Transcription activity contributes to the firing of non-constitutive origins in African trypanosomes helping to maintain robustness in S-phase duration

The co-synthesis of DNA and RNA potentially generates conflicts between replication and transcription, which can lead to genomic instability. In trypanosomatids, eukaryotic parasites that perform polycistronic transcription, this phenomenon and its consequences are still little studied. Here, we showed that the number of constitutive origins mapped in the Trypanosoma brucei genome is less than the minimum required to complete replication within the S-phase duration. By developing a mechanistic model of DNA replication that considers replication-transcription conflicts, and by using immunofluorescence assays and DNA combing approaches, we demonstrated that the activation of non-constitutive (backup) origins is indispensable for replication to be completed within the S-phase period. Together, our findings suggest that transcription activity during S phase generates R-loops, which contribute to the emergence of DNA lesions, leading to the firing of backup origins that help maintain robustness in S-phase duration. The usage of this increased pool of origins, contributing to the maintenance of DNA replication, seems to be of paramount importance for the survival of this parasite, which affects millions of people around the world.


Correctness of Equations 3 and 4
In this section, we present proofs of correctness for Equations 3 and 4 of the main paper. For both equations, we assume that the replication fork speed v has negligible variance throughout S phase.

Minimum number of origins required to complete S phase
The lower bound on the number of origins needed to replicate an entire chromosome, given as a function of chromosome size, S-phase duration, and replication fork speed, was presented in both Table 1 and Figure 2 of the main paper. The following theorem guarantees the correctness of that equation.
Theorem S1. Let v be the average replication fork speed, S be the measured S-phase duration, and N be the chromosome size. The lower bound for the number of origins required to replicate the entire chromosome is given by:

$$\left\lceil \frac{N}{2vS} \right\rceil$$

Proof. Consider two cases for Equation 3:

• N ≤ 2vS: in this case, we set a single origin in the middle of the chromosome. Since each fork must replicate N/2 bases with velocity v in at most S time, we have:

$$\frac{N}{2v} \le S \implies \left\lceil \frac{N}{2vS} \right\rceil = 1$$

• N > 2vS: in this case, we assume for this chromosome a minimal set Θ of replication origins such that |Θ| ≥ 2. We also assume that the chromosome is divided into |Θ| + 1 pieces, and that each piece demands exactly S time to replicate. These assumptions imply that the size of the first and the last pieces is N/(2|Θ|) each, while the size of the remaining |Θ| − 1 pieces is N/|Θ| each (if we slide any origin either to the right or to the left, then the piece on the opposite side will demand a time greater than S, contradicting the given S-phase duration). Now, let us apply Equation 4, which returns a lower-bound time as a function of a set of origins, chromosome size and fork velocity. The first and last pieces are each replicated by a single fork, and each of the remaining pieces by two converging forks:

$$\max\left\{ \frac{N/(2|\Theta|)}{v},\ \frac{N/|\Theta|}{2v},\ \frac{N/(2|\Theta|)}{v} \right\} = \frac{N}{2|\Theta|v}$$

Observe that the last right-hand side of the equation above is equal to S, since each piece requires the same amount of time to replicate. Thus, we have:

$$\frac{N}{2|\Theta|v} = S \implies |\Theta| = \frac{N}{2vS}$$

Finally, since |Θ| is an integer greater than or equal to 2, it holds that:

$$|\Theta| = \left\lceil \frac{N}{2vS} \right\rceil$$
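The bound of Theorem S1 can be checked numerically. The sketch below is only an illustration and is not part of ReDyMo: it assumes that Equation 3 has the form ⌈N/(2vS)⌉ and that Equation 4 is the maximum over the two end stretches (one fork each) and the inter-origin stretches (two converging forks each). The function names, the chromosome size and the fork speed of 60 bp/s are hypothetical; only S = 2.31 h (8316 s) comes from the text.

```python
import math


def min_origins(chromosome_size, fork_speed, s_phase_duration):
    """Assumed form of Equation 3: ceil(N / (2 v S))."""
    return math.ceil(chromosome_size / (2 * fork_speed * s_phase_duration))


def lower_bound_time(origins, chromosome_size, fork_speed):
    """Assumed form of Equation 4 for a non-empty set of origins.

    End stretches are covered by a single fork; each stretch between
    two neighbouring origins is covered by two converging forks.
    """
    theta = sorted(origins)
    first = (theta[0] - 1) / fork_speed
    last = (chromosome_size - theta[-1]) / fork_speed
    inner = max(
        ((b - a) / (2 * fork_speed) for a, b in zip(theta, theta[1:])),
        default=0.0,
    )
    return max(first, inner, last)


def evenly_spaced_origins(chromosome_size, count):
    """Place origins so end pieces have size N/(2k) and inner pieces N/k."""
    piece = chromosome_size / (2 * count)
    return [piece * (2 * i + 1) for i in range(count)]


if __name__ == "__main__":
    N = 5_000_000        # hypothetical chromosome size (bp)
    v = 60               # hypothetical fork speed (bp/s)
    S = 2.31 * 3600      # measured S-phase duration from the text (s)
    k = min_origins(N, v, S)
    print("minimum number of origins:", k)
    for count in (k - 1, k):
        t = lower_bound_time(evenly_spaced_origins(N, count), N, v)
        print(count, "origins finish within S:", t <= S)
```

With these hypothetical values, one origin fewer than the bound cannot finish within S even under ideal spacing, while the bound itself can. Floating-point division is used for brevity; exact integer arithmetic would be safer when N is an exact multiple of 2vS.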

Lower-bound time for DNA replication with constitutive origins only
We start with formal definitions of a chromosome and of constitutive origins. A chromosome [1, N] is the ordered set of the first N positive integers. A constitutive origin θ is a positive integer in [1, N]. The correctness of Equation 4 is assured by the following theorem.
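Theorem S2. Let Θ = {θ_1, ..., θ_|Θ|}, with θ_1 < ... < θ_|Θ|, be a set of constitutive origins of a chromosome [1, N], and let v be the average replication fork speed. The lower-bound time to replicate the entire chromosome is given by:

$$\max\left\{ \frac{\theta_1 - 1}{v},\ \max_{1 \le i < |\Theta|} \frac{\theta_{i+1} - \theta_i}{2v},\ \frac{N - \theta_{|\Theta|}}{v} \right\}$$

Proof. The proof is by induction on |Θ|. If |Θ| = 1, then the firing of the single origin yields two forks. One of them slides, at velocity v, from location θ_1 to 1; the other one slides, also at velocity v, from θ_1 to N. Thus, we have:

$$\max\left\{ \frac{\theta_1 - 1}{v},\ \frac{N - \theta_1}{v} \right\}$$

If |Θ| > 1, replicating the chromosome is equivalent to splitting the original instance into two pieces: the first one contains the last origin (θ_|Θ|) and the final stretch of the chromosome, which goes from halfway between θ_|Θ|−1 and θ_|Θ| (that is, where the encounter of replisomes from these two origins takes place) to N; and the second one contains the remaining origins (θ_1, ..., θ_|Θ|−1) and the first stretch of the chromosome. Applying the induction hypothesis to the second piece and the base case to the first one, we have:

$$\max\left\{ \frac{\theta_1 - 1}{v},\ \max_{1 \le i < |\Theta|} \frac{\theta_{i+1} - \theta_i}{2v},\ \frac{N - \theta_{|\Theta|}}{v} \right\}$$

Figure S2. Chromosome representation in our dynamic model. A. Example of chromosome description, which includes its size N, the locations of constitutive origins (in this example we have two of them, θ_1 and θ_2) and the locations of polycistronic regions in both strands (yellow arrows). B. Example of mapping a chromosome description into a binary vector, where "0" and "1" mean that a given nucleotide is non-replicated and replicated, respectively. In this example, we assume N = 30 nucleotides. Additional data are stored to give the coordinates of constitutive origins and polycistronic regions; for the latter, negative values mean that the region is located in the complementary strand with respect to the initial nucleotide.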
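The representation described in Figure S2B can be sketched as a small data structure. This is an illustrative assumption, not ReDyMo's actual implementation: the class name, its fields, the example coordinates and the choice of encoding a complementary-strand region with negative start and end coordinates are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Chromosome:
    """Hypothetical container mirroring the description in Figure S2B."""

    size: int                        # N, number of nucleotides
    origins: List[int]               # 1-based constitutive origin positions
    regions: List[Tuple[int, int]]   # polycistronic regions; negative
                                     # coordinates = complementary strand
                                     # (assumed encoding)
    replicated: List[int] = field(init=False)

    def __post_init__(self) -> None:
        if any(not 1 <= theta <= self.size for theta in self.origins):
            raise ValueError("origin outside the chromosome")
        # Binary vector: 0 = non-replicated, 1 = replicated.
        self.replicated = [0] * self.size

    def mark_replicated(self, position: int) -> None:
        """Flag the nucleotide at a 1-based position as replicated."""
        self.replicated[position - 1] = 1

    def is_complementary(self, region_index: int) -> bool:
        """True if the region is stored with negative coordinates."""
        start, _ = self.regions[region_index]
        return start < 0

    def is_fully_replicated(self) -> bool:
        return all(self.replicated)


if __name__ == "__main__":
    # N = 30 as in the figure; origins and regions are made-up values.
    chrom = Chromosome(size=30, origins=[8, 22],
                       regions=[(3, 12), (-28, -18)])
    chrom.mark_replicated(8)
    print("".join(str(bit) for bit in chrom.replicated))
```

A plain list keeps the sketch dependency-free; for chromosomes of realistic size, a compact array type would likely be preferable.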

Figure S3. Example of DNA replication dynamics. A. Two origins (θ_1 and θ_2) are fired, with two replisomes (blue ellipses) binding to each origin. B-C. For each fired origin, its pair of replisomes slide in opposite directions, leaving behind replicated chromosome (in green). D-E. If there is a head-to-head collision between two replisomes, they unbind from the chromosome. F-G. A replisome also unbinds from the chromosome if it reaches an extremity of the chromosome.
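The dynamics in Figure S3 can be mimicked by a toy simulation. It is a minimal sketch under simplifying assumptions that are not taken from ReDyMo: all origins fire at iteration 0, every replisome advances one nucleotide per iteration, there is no limit on available replisomes, and transcription is ignored. The function name and the example chromosome are hypothetical.

```python
def simulate_replication(size, origins):
    """Toy replication run; returns (iterations, fully_replicated).

    Each fired origin receives two replisomes sliding in opposite
    directions. A replisome unbinds when it would leave the chromosome
    or when the next nucleotide is already replicated, which is how a
    head-to-head encounter shows up in this simplified setting.
    """
    replicated = [False] * (size + 1)  # index 0 unused (1-based positions)
    replisomes = []
    for theta in origins:
        replicated[theta] = True
        replisomes.append((theta, -1))  # slides towards position 1
        replisomes.append((theta, +1))  # slides towards position N

    iterations = 0
    while replisomes:
        still_bound = []
        for position, direction in replisomes:
            target = position + direction
            if target < 1 or target > size or replicated[target]:
                continue  # chromosome end or collision: unbind
            replicated[target] = True
            still_bound.append((target, direction))
        if still_bound:
            iterations += 1  # count only iterations that replicated DNA
        replisomes = still_bound
    return iterations, all(replicated[1:])


if __name__ == "__main__":
    # Hypothetical 30-nucleotide chromosome with origins at 8 and 22.
    print(simulate_replication(30, [8, 22]))  # -> (8, True)
```

In this example the longest stretch is the one from position 22 to the right extremity (8 nucleotides), so the run ends after 8 iterations, in line with a lower-bound time of the kind Equation 4 provides.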

Figure S4. Example of DNA transcription dynamics. A-B. At a given constant frequency, an RNA polymerase (RNAP) binds to the beginning of each polycistronic region. At each iteration, all RNAPs slide along the chromosome at a constant velocity (B-F). When an RNAP reaches the end of a polycistronic region, it releases the chromosome (e.g., compare E with F, looking at the second rightmost yellow arrow).

Figure S5. The immediate phosphorylation of H2A does not depend directly on nascent RNAs inhibited by α-amanitin. Distribution profile of γH2A fluorescence throughout the cell cycle of non-treated (control), ionizing radiation-treated (50 Gy), and α-amanitin + 50 Gy-treated parasites. The scale bar on the fluorescence images corresponds to 2 µm. K and N mean kinetoplast and nucleus, respectively. The graph shows the γH2A fluorescence intensity (red) per cell in the samples previously described. Error bars indicate SD. * and NS mean, respectively, p < 0.001 and non-significant using Student's t-test (n = 100 cells per group).

Figure S6. Flowchart of the DNA replication simulator. This chart depicts the major steps of a simulation of DNA replication dynamics in T. brucei TREU927. The simulator implementation, called ReDyMo, was coded in the Python programming language and is available at GitHub: github.com/msreis/ReDyMo. A C++ port of this program is also available at that repository: github.com/msreis/ReDyMo-CPP.

Table S2. Results with the DNA replication model. For each number of available replisomes during simulation (F) and for each transcription period, 30 simulations were carried out and their results averaged.
To evaluate S-phase duration robustness under increasing levels of transcription, for each F value, the assay with no transcription was assumed to take exactly the measured S-phase duration (2.31 hours); thus, the S-phase durations of the remaining assays for this F value were obtained through a normalization using the mean number of iterations. A similar procedure was used to yield the transcription frequency of each assay. AU = arbitrary unit; IOD = inter-origin distance. We highlight in yellow the S-phase duration that is within the interval [2.07, 2.55] (10% above or below the measured S-phase duration). We also highlight in green a replisome velocity (v) that is within either the interval [55.49, 67.82]