New DNAscent executable that detects DNA breaks at replication forks

Admin
Sep 5

We are excited to release DNAscent v4.1.1, which includes the new executable seeBreaks that will sit alongside existing executables like detect and forkSense. It adds new functionality to DNAscent: the ability to determine whether there is elevated DNA breaking at replication forks. seeBreaks was named after the three talented and dedicated Cambridge undergraduates who came up with it: Katy Sherborne, Eva Zeng, and Emma Cohen.

DNAscent works by sequentially pulsing the thymidine analogues EdU and BrdU into replicating DNA so that they are incorporated into the nascent strand by replication forks. This creates tracks of continuous analogue incorporation that leave a footprint of where the fork was moving during the pulse. We then take these molecules, sequence them on the Oxford Nanopore Technologies platform, and use DNAscent detect to determine the positions of the base analogues on each read. DNAscent forkSense then parses these positions into continuous analogue tracks that indicate fork direction, speed, and stalling. You can see one of our recent publications for more details on how this works.

Analogue tracks on Oxford Nanopore sequencing reads of human RPE1 cells.

Fork stalls, as detected by forkSense, are characterised by the sudden drop in BrdU incorporation that happens when a fork abruptly stops moving. However, to measure that, we have to be able to see the drop, which means the drop can't occur at the end of the read. Hence, the stalls detected by forkSense are limited to those that eventually restart (so more of a "pause") or are rescued by another fork moving in the opposing direction. forkSense doesn't allow us to say anything about fork stalls that result in breaking.

Oxford Nanopore sequencing works by threading DNA through a nanopore and inferring its base sequence. If a fork stall results in a break and we sequence that DNA molecule on the Oxford Nanopore platform, then the site of the break will be at the end of the read. Because replication forks were incorporating BrdU or EdU into the nascent strand, we would then expect to see the analogue track reach the end of the sequenced molecule.
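To make "the analogue track reaches the end of the sequenced molecule" concrete, here is a minimal sketch in Python that flags a track whose boundary lies within a small tolerance of either read end. The function name, the coordinate convention, and the 500 bp tolerance are all assumptions made for illustration; this is not how DNAscent represents or classifies tracks internally.

```python
def track_reaches_read_end(read_start, read_end, track_start, track_end, tolerance=500):
    """Return True if an analogue track touches either end of its read.

    All coordinates are reference positions in bp. The tolerance is a
    hypothetical allowance for analogue calls thinning out near read ends.
    """
    if not (read_start <= track_start < track_end <= read_end):
        raise ValueError("track must lie within the read")
    near_left = track_start - read_start <= tolerance
    near_right = read_end - track_end <= tolerance
    return near_left or near_right
```

For example, on a 30 kb read, a track ending 200 bp before the read end would be flagged, while a track sitting in the middle of the read would not.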

The tricky part is that an analogue track reaching the end of the read can happen for two reasons: it can happen due to a fork stall and break, or it can happen by chance due to random fragmentation of the sequencing library. How often it happens by chance will also differ widely between sequencing runs and experimental conditions; it will depend on the distribution of read lengths, the analogue pulse durations, and the fork speed. For instance, suppose we want to test whether an inhibitor causes breaks at replication forks, so we do two DNAscent sequencing runs, one with the inhibitor and one without. We need to be able to compare between those two runs, but they will have different (sometimes very different) read length distributions even if the pulse times were the same. The run with the shorter read lengths will show more analogue tracks reaching the end of the read by chance, so we need to be able to normalise for that.
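A toy model helps show why read length matters so much. The sketch below is not the seeBreaks algorithm; it assumes a single track of fixed length and a read of fixed length whose position relative to the track is uniformly random among all positions where the two overlap. The function names and the example lengths are made up for illustration.

```python
import random

def chance_fraction_at_read_end(read_len, track_len):
    """Closed form for the toy model.

    The track occupies [0, track_len] and the read occupies [s, s + read_len],
    with s uniform over (-read_len, track_len), i.e. every overlapping offset.
    The track avoids both read ends only when it lies fully inside the read.
    """
    inside = max(read_len - track_len, 0)
    return 1.0 - inside / (read_len + track_len)

def simulate_fraction_at_read_end(read_len, track_len, n=100_000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        s = rng.uniform(-read_len, track_len)
        fully_inside = s <= 0 and s + read_len >= track_len
        hits += not fully_inside
    return hits / n

# Same 5 kb track, two hypothetical read lengths.
short_reads = chance_fraction_at_read_end(20_000, 5_000)
long_reads = chance_fraction_at_read_end(50_000, 5_000)
```

Under these assumptions, a 5 kb track reaches a read end by chance far more often on 20 kb reads than on 50 kb reads, even though nothing about the biology has changed. Real runs have full distributions of read and track lengths, which is why the comparison has to be normalised per run.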

That's exactly what our new DNAscent executable seeBreaks does: it determines whether there are more fork tracks at the ends of reads than would be expected by chance given the read length and fork track length distributions. The software will automatically tune itself to whatever read length and fork track length distributions you have. All you have to do is feed it the output from detect and forkSense. No further inputs are required. There are also no changes to the wet lab protocol, which means you can re-analyse your old DNAscent results (even those from R9.4.1 flow cells) using our new tool.

The seeBreaks output for each of two replicates from untreated RPE1 cells. Given the lack of replication stress in RPE1 cells (confirmed here by the lack of γH2AX markers in S-phase), the observed fraction of tracks at read ends closely matches what we would expect by chance for the read length distribution and analogue track length.

The output from seeBreaks is an estimate and a standard error for the number of analogue tracks expected to reach the read end by chance, given the read length distribution and analogue track lengths that you have. It also computes the same thing for what it actually observes in your sequencing run, which allows you to determine whether the observed value is significantly higher than the expected one. seeBreaks then computes the difference between observed and expected and gives you a 95% confidence interval for that difference. If the whole interval lies above 0, we can infer that there is an elevated level of breaking at replication forks.
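The comparison described above can be sketched in a few lines of Python. This is an illustration only, assuming the observed and expected estimates are independent and approximately normal, so that their standard errors combine in quadrature; seeBreaks may compute its interval differently. The function names and the example numbers are hypothetical, not seeBreaks output.

```python
import math

def difference_ci(observed, observed_se, expected, expected_se, z=1.96):
    """Return (difference, lower, upper) for observed minus expected.

    Assumes independent, roughly normal estimates; z = 1.96 gives an
    approximate 95% confidence interval.
    """
    diff = observed - expected
    se = math.sqrt(observed_se ** 2 + expected_se ** 2)
    return diff, diff - z * se, diff + z * se

def elevated_breaking(observed, observed_se, expected, expected_se):
    """True when the whole interval for the difference lies above zero."""
    _, lower, _ = difference_ci(observed, observed_se, expected, expected_se)
    return lower > 0

# Hypothetical values, chosen only to show both outcomes.
treated = elevated_breaking(0.30, 0.01, 0.25, 0.01)
untreated = elevated_breaking(0.26, 0.01, 0.25, 0.01)
```

With these made-up numbers, the first comparison has an interval that sits entirely above 0, while the second has an interval that spans 0 and so gives no evidence of elevated breaking.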

When treated with 1000 nM of ATRi for 8 hours, the seeBreaks output shows elevated breaking that matches the γH2AX profile.

Katy, Eva, and Emma did a fantastic job leading the computational side of this project, and we're grateful to BDIAP for funding summer studentships for Katy and Emma to join the lab. This was also done in partnership with our longstanding collaborator Mathew Jones at the University of Queensland and, in particular, the data was generated by Subash Rai and David Cullen. We were all excited to release the software ahead of publication so that we can get it into your hands as soon as possible, so a huge thanks to the whole team. We're really looking forward to seeing the cool science you do with it!

Team

Boemo Group
