Abstract
In Escherichia coli, DNA replication yields interlinked chromosomes. Controlling topological changes associated with replication and returning the newly replicated chromosomes to an unlinked monomeric state is essential to cell survival. In the absence of the topoisomerase topoIV, the site-specific recombination complex XerCD-dif-FtsK can remove replication links by local reconnection. We previously showed mathematically that there is a unique minimal pathway of unlinking replication links by reconnection while stepwise reducing the topological complexity. However, the possibility that reconnection preserves or increases topological complexity is biologically plausible. In this case, are there other unlinking pathways? Which is the most probable? We consider these questions in an analytical and numerical study of minimal unlinking pathways. We use a Markov Chain Monte Carlo algorithm with Multiple Markov Chain sampling to model local reconnection on 491 different substrate topologies, 166 knots and 325 links, and distinguish between pathways connecting a total of 881 different topologies. We conclude that the minimal pathway of unlinking replication links that was found under more stringent assumptions is the most probable. We also present exact results on unlinking a 6-crossing replication link. These results point to a general process of topology simplification by local reconnection, with applications going beyond DNA.
Introduction
Flexible circular chains appear often in nature, from microscopic DNA plasmids to macroscopic loops in the solar corona. Such chains entrap rich geometrical and topological complexity which can give insight into the processes underlying their formation or modification. Knotted and interlinked states often coincide with higher energy states in physical systems and are usually undesired. Topology-simplifying reconnection processes involving one or two cleavages are observed. Examples in biology include the action of type II topoisomerases and of site-specific recombinases. Type II topoisomerases bind to two segments of double-stranded DNA, cleave one of the segments, transport the other through the break (strand-passage) and reseal the break. Site-specific recombinases bind to two specific sites (short segments of double-stranded DNA), introduce a double-stranded break on each site, recombine the ends and reseal the breaks. The action of recombination enzymes is a local reconnection event. We here investigate pathways of unlinking of newly replicated DNA links by local reconnection. The results presented and the numerical methods proposed are not restricted to the biological example and are applicable to any local reconnection process.
In genetics, the observation of topological links dates back to studies in plants in the 1930s. In a study of chromosomal variation in Crepis tectorum, M. Navashin observed ring chromosomes, noting “in one case, the two daughter strands composing a normal chromosome failed to separate”. Navashin reported on a metaphase involving four rings, two of which were “united in the fashion of chain links”, thus documenting the appearance of two newly replicated circular chromosomes forming a singly-linked catenane, or 2-crossing link^{1}. In her study of ring chromosomes in maize, Barbara McClintock observed the accumulation of several rings in the same cell and hypothesized that “lack of uniformity in the splitting plane could give rise to a double sized ring with two insertion regions or cause split halves of the ring to become interlocked”, thus introducing the ideas of chromosome dimers and links (also called catenanes)^{2}. Three decades later, DNA links were studied in vitro via random cyclization of circular DNA in the presence of an excess of DNA circles^{3} and, in 1980, interlinked dimers formed by nicked newly replicated 5.2 kb circular dsDNA minichromosomes from SV40 were observed by electron microscopy^{4}. The mechanisms of replication and segregation of circular DNA predict products that can be topologically characterized as right-hand (RH) 2m-crossing torus links with parallel sites, which we here refer to as parallel 2m-cats (denoted mathematically as parallel \((2m)_{1}^{2}\) or \(T(2,2m)_{p}\))^{5}. These topological forms were confirmed by characterizing the linked replication intermediates that accumulate in topoIV mutants^{6} (Fig. 1A).
Sogo et al.^{7} hypothesized that catenanes appeared as replication intermediates of bacteriophage λ DNA and observed that, in order to secure proper segregation of circular chromosomes at cell division, the linking number of the two newly replicated molecules must be reduced to zero. However, the topology of a circular double-stranded (ds)DNA molecule is insensitive to any manipulation that does not allow a double-stranded break^{5}. Nicking of a single DNA strand, however extensive, is insufficient to unlink two newly replicated DNA circles unless pre-existing nicks are present along the second strand. The type II topoisomerase topoIV is a major decatenase in E. coli^{6,8}. Grainge et al. showed that in the absence of topoIV, the XerCD-dif-FtsK molecular machine can act in vivo to separate two interlinked, newly replicated chromosomes^{9}. The XerCD complex consists of the site-specific tyrosine recombinases XerC and XerD. The dif site is a 28 bp long recombination site located within the terminus region of the E. coli chromosome. FtsK is a powerful translocase that assembles at the division septum, where it activates XerCD-dif recombination. Their experimental data suggested a gradual reduction in topological complexity of the substrates, which were RH 2m-cats with parallel dif sites^{9}. The proposed unlinking pathway, through which the enzymes unlink the replication links in a stepwise fashion, is illustrated in Fig. 1A. In the figure, each closed curve represents a circular dsDNA molecule. The components of a two-component link represent two newly replicated DNA chains.
A rigorous mathematical analysis of the recombination experiments of Grainge et al.^{9} showed that at least 2m steps are needed in order to unlink any RH 2m-cat with parallel sites^{10}. This result relied simply on the assumption that the XerCD tetramer binds the two dif sites and that a simple cut-reconnect-paste reaction ensues (Fig. 1C). If the shortest pathway of unlinking a 2m-crossing replication link has exactly 2m steps, it is natural to ask how many such pathways exist and whether some are more likely than others. Under the assumption that each step strictly reduces the topological complexity of its substrate (as measured by minimal crossing number), Shimokawa et al.^{10} showed that the only possible pathway of unlinking a 2m-crossing replication link is that in Fig. 1A. Using tangle calculus, they proposed a 3-dimensional topological mechanism to take the parallel 2m-cat to the unlink. This mechanism incorporates three solutions obtained by tangle calculus at each step of the process, and the last three steps are fully characterized. The results in Shimokawa et al.^{10} provide unprecedented detail in the study of the topological mechanism of DNA unlinking by site-specific recombination. Going beyond the original problem of unlinking newly replicated circular chromosomes, these results apply to any reconnection event that can be modeled using tangles as in Fig. 1. For example, the same unlinking pathway proposed for DNA links under site-specific recombination has been observed during reconnection events in physical fields such as vortices in fluid flow^{11,12,13}. Further mathematical research on this subject can be found in the literature^{14,15,16,17,18}.
Successful unlinking by XerCD-FtsK of newly replicated plasmids containing dif sites was shown in ref.^{9}. Quantification of these data gave only weak justification for the assumption of stepwise reduction in complexity during the unlinking reaction^{10}. As can be seen in Fig. 2, the gel quantification clearly illustrates the reduction of replication links by XerCD-FtsK site-specific recombination at dif sites. However, because of the complexity of the data, in order to confirm stepwise reduction one would need to repeat the time course experiments^{9} for each individual topology. This motivates the current work, where we remove the assumption of stepwise decrease in complexity and design mathematical and numerical methods to assess the different unlinking pathways and identify the most probable ones. We ask whether there are other minimal unlinking pathways and hypothesize that the minimal pathway previously proposed^{9,10,19} and illustrated in Fig. 1A is the most likely among all the possible minimal pathways that arise. First, we allow the complexity of the products to decrease or remain the same at each step of the reaction. We provide analytical proof that there are exactly nine minimal pathways of unlinking a parallel 6-cat; many of the resulting transitions are fully characterized. Characterizing minimal pathways of unlinking by local reconnection and resolving the topological mechanisms involved are problems of high theoretical complexity since the number of possibilities quickly increases with the number of crossings of the substrate. Likewise, characterizing the topological mechanism(s) taking a link L_{i} to a knot K_{j} is equivalent to characterizing all band surgeries between L_{i} and K_{j} (see Fig. 1C).
In order to discriminate between different minimal unlinking pathways for a given substrate and to extend the study to higher crossing numbers, we eliminate the complexity assumption and develop a Monte Carlo method to simulate local reconnection events. The method can be applied to a substrate with any topology, allows products of varying topological complexity, and facilitates the rigorous quantification of the transition probabilities along each obtained pathway. Using this method we embark on a numerical study relevant to unlinking of DNA replication links by site-specific recombination at dif sites. More specifically, we restrict the numerical study to knotted chains of fixed length with two reconnection sites (representing the dif sites) that are evenly spaced along the chain, and linked chains consisting of the union of two circles of the same length with one reconnection site in each component. Details on the numerical experiments can be found in the Numerical Methods section and in the Supplementary Methods.
The computational approach provides a rigorous means to discriminate between mathematically equivalent unlinking pathways. The combination of the mathematical and computational studies provides strong quantitative support for the hypothesis that the unlinking pathway from Fig. 1A is the most likely, even under the weakened assumptions.
Nomenclature for knots and links
It is important at the outset to say a word about the naming convention used for the knots and links which arise in this study (490 knots and 391 two-component links). A local reconnection event on a two-component link with one cleavage site in each component yields a knotted chain with two sites in direct repeats (cf. Fig. 1A). Rolfsen’s Knot Table^{20} summarizes the knot nomenclature used in the mathematics community, which was not intended to distinguish between mirror images nor between oriented links, an important consideration when dealing with circular DNA and other biopolymers. Chirality is relevant, and indeed crucial, to characterize biological and chemical compounds. In this paper, we use the writhe-based knot nomenclature proposed in Brasher et al.^{21}. The writhe is a geometrical invariant that provides a measure of a chain’s entanglement complexity and chirality. It is computed analytically using a Gauss double integral and can be estimated numerically by taking the average of the writhe of a planar diagram taken over all projection directions (the projected writhe). The mean writhe of a knot K refers to the average of the writhes of all knotted chains of type K. Numerically this is estimated by averaging over a sufficiently large, randomly generated ensemble of conformations of type K. A representative of a chiral pair is chosen based on its mean writhe^{21}. We extend this nomenclature to the 2-component links depicted in Fig. 3. For prime 2-component links with 9 or more crossings we use the default notation from Knotplot^{22}. For more details and a comparison with other published nomenclature for links refer to the Supplementary Methods and to Supplementary Fig. S5.
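For reference, the Gauss double integral mentioned above can be written in its standard form as follows. The symbols \(\mathbf{r}_1\) and \(\mathbf{r}_2\), denoting two points running independently along the closed chain K, are notation assumed here and are not taken from the original text.

\[
\mathrm{Wr}(K) \;=\; \frac{1}{4\pi}\oint_{K}\oint_{K}\frac{(d\mathbf{r}_1\times d\mathbf{r}_2)\cdot(\mathbf{r}_1-\mathbf{r}_2)}{|\mathbf{r}_1-\mathbf{r}_2|^{3}}
\]

Equivalently, \(\mathrm{Wr}(K)\) is the signed crossing count of the planar diagram of K averaged over all projection directions, which is the quantity estimated numerically as the projected writhe.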
Results
There are exactly 9 shortest pathways to unlink the 6-cat that do not increase substrate complexity
We consider an event where two oriented sites come together and undergo cleavage followed by reconnection. If the substrate is a single circle, then the oriented sites are in direct repeat, i.e. they induce the same orientation into the circle. If the substrate consists of two circular chains, then there is one site in each chain. Note that such an event always changes the topology of the substrate: reconnection between two sites in separate components of a link yields a knot with two sites in direct repeats, and reconnection on a knot with two directly repeated sites yields a 2-component link with one site in each component. The reconnection event is modeled as a system of tangle equations as described in Fig. 1B. In the context of DNA unlinking, as in Shimokawa et al.^{10}, we model dsDNA as a curve defined by the axis of the DNA double helix, and the synapse formed by the enzymes bound to the core regions of the dif recombination sites as the 2-string tangle P. Reconnection changes P into R. If we assume that each reconnection is modeled as a coherent band surgery, i.e. P = (0) and R = (w, 0) for some integer w, then any minimal pathway to unlink an n-crossing torus link with parallel sites (e.g. \(4_{1}^{2}\) or \(6_{1}^{2}\)) has exactly n steps. Furthermore, if each reconnection step is assumed to strictly reduce the complexity of its substrate, then the minimal pathway is unique: i.e. RH 2m-cat, RH \((2,\,2m-1)\)-torus knot, RH \((2m-2)\)-cat, \(\cdots\), RH trefoil, Hopf link, trivial knot, trivial link. Figure 1A illustrates the 6-cat case. Since the experimental data^{9} only give weak support to the assumption that the complexity goes strictly down at each step of the reaction (Fig. 2), we here examine the case where no reconnection step increases the number of crossings and provide analytical characterization of all shortest pathways from the 6-cat to the unlink.
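The unique strictly simplifying pathway described above can be listed programmatically. The following is a minimal sketch; the function name `torus_unlinking_pathway` and the string labels are illustrative choices, not notation from the paper. It only enumerates the sequence T(2, 2m), T(2, 2m-1), ..., T(2, 2), unknot, unlink and confirms that it has 2m steps.

```python
from typing import List


def torus_unlinking_pathway(m: int) -> List[str]:
    """Return the stepwise pathway from the RH 2m-cat to the unlink.

    Even n gives a 2-component torus link T(2, n) (a cat), odd n gives a
    torus knot; T(2, 3) is the trefoil and T(2, 2) is the Hopf link.
    Labels are hypothetical names chosen for this sketch.
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    pathway = []
    for n in range(2 * m, 1, -1):
        kind = "link" if n % 2 == 0 else "knot"
        pathway.append(f"T(2,{n}) {kind}")
    # The last two reconnections give the trivial knot, then the trivial link.
    pathway += ["unknot", "unlink"]
    return pathway


if __name__ == "__main__":
    path = torus_unlinking_pathway(3)
    print(" -> ".join(path))
    print("steps:", len(path) - 1)
```

For m = 3 the list has seven topologies and therefore six reconnection steps, matching the 2m-step count stated in the text.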
Assumption 1
Consider a reconnection pathway from a parallel RH 2m-cat to the unlink. Assume that each product along the pathway is a knot or a 2-component link, that the pathway is shortest, and that no reconnection event increases the number of crossings of its substrate.
Recall that any shortest reconnection pathway from \((2m)_{1}^{2}\) to the unlink has exactly 2m steps^{10}. In Theorem 2 we show that there are exactly nine unlinking pathways satisfying Assumption 1.
Theorem 2
A pathway from the parallel RH 6-cat that satisfies Assumption 1 is one of the 9 shown in Fig. 4.
The 9 pathways found in Theorem 2 involve 16 possible transitions taking a knot to a link or vice versa; 6 of the transitions have fully characterized mechanisms. The proof of the theorem and the characterization of the mechanisms are presented in the Supplementary Methods. Figure 4 summarizes the results as an oriented graph where each node is a knot/link type and each edge represents the transition between two topologies by one reconnection step. All minimal pathways taking the parallel \({6}_{1}^{2}\) to the unlink \({0}_{1}^{2}\), and satisfying Assumption 1 are shown. In the next section we undertake a thorough computational study with the objective of discriminating between minimal pathways while minimizing the number of assumptions. In particular, we use the numerical work to assign frequencies to each transition in the pathway graph (represented in Fig. 4 as weights on the edges).
We here give a sketch of the proof of Theorem 2. More details, including Lemmas S1-S8, Propositions S9-S17, and Figs S1 and S2 exhibiting the steps of the proof and relevant band surgeries for each of the transitions in Fig. 4, are included in the Supplementary Methods. In order to characterize the minimal pathways starting from the parallel \(6_{1}^{2}\) link, we first investigate the effect of band surgeries on certain topological invariants such as the signature, the Jones polynomial, the Q polynomial and the Arf invariant of the knots and links involved in those pathways. By Lemma S6, the sequence of the signatures of knots and links is −5, −4, −3, −2, −1, 0, 0. Lemma S7 shows that split links cannot appear in a shortest pathway. Lemma S8 identifies the candidate topologies for the minimal pathways from \(6_{1}^{2}\).
Outline of the proof
(First step) From Proposition S9, the product knot obtained from \(6_{1}^{2}\) is either 5_{1} or \(3_{1}\#3_{1}\).
(Second step) From Proposition S10, the product link obtained from 5_{1} is either \(4_{1}^{2}\) or \(3_{1}\#2_{1}^{2}\). From Proposition S11, the product link obtained from \(3_{1}\#3_{1}\) is either \(6_{3}^{2}\) or \(3_{1}\#2_{1}^{2}\).
(Third step) From Proposition S12, the product knot obtained from \(6_{3}^{2}\) is 5_{2}. From Proposition S13, the product knot obtained from \(3_{1}\#2_{1}^{2}\) is either 5_{2} or 3_{1}. From Proposition S14, the product knot obtained from \(4_{1}^{2}\) is 3_{1}.
(Fourth step) From Proposition S15, the product link obtained from 5_{2} is either \(2_{1}^{2}\) or \({4_{1}^{2\ast}}'\). From Proposition S16, the product link obtained from 3_{1} is \(2_{1}^{2}\).
(Fifth step) From Proposition S17, the product knot obtained from \({4_{1}^{2\ast}}'\) is 0_{1}. The product obtained from \(2_{1}^{2}\) is 0_{1}. In the last step, the recombination event changes 0_{1} into \(0_{1}^{2}\). These steps cover all transitions satisfying Assumption 1.
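The transitions listed in the outline above define a small directed graph, and counting its paths recovers the numbers stated in Theorem 2. The sketch below encodes exactly the transitions named in the five steps; the plain-text topology labels and the function name `enumerate_pathways` are illustrative choices made here.

```python
from typing import Dict, List

# Directed transitions read off the proof outline (substrate -> products).
# Labels are ASCII stand-ins for the knot and link names in the text.
TRANSITIONS: Dict[str, List[str]] = {
    "6_1^2": ["5_1", "3_1#3_1"],
    "5_1": ["4_1^2", "3_1#2_1^2"],
    "3_1#3_1": ["6_3^2", "3_1#2_1^2"],
    "6_3^2": ["5_2"],
    "3_1#2_1^2": ["5_2", "3_1"],
    "4_1^2": ["3_1"],
    "5_2": ["2_1^2", "4_1^2*'"],
    "3_1": ["2_1^2"],
    "4_1^2*'": ["0_1"],
    "2_1^2": ["0_1"],
    "0_1": ["0_1^2"],
}


def enumerate_pathways(graph: Dict[str, List[str]], start: str, goal: str) -> List[List[str]]:
    """Depth-first enumeration of all directed paths from start to goal."""
    if start == goal:
        return [[goal]]
    paths = []
    for product in graph.get(start, []):
        for rest in enumerate_pathways(graph, product, goal):
            paths.append([start] + rest)
    return paths


if __name__ == "__main__":
    pathways = enumerate_pathways(TRANSITIONS, "6_1^2", "0_1^2")
    n_transitions = sum(len(v) for v in TRANSITIONS.values())
    print(f"{len(pathways)} pathways over {n_transitions} transitions")
    for p in pathways:
        print(" -> ".join(p))
```

Every enumerated pathway has six steps, and the graph contains 16 transitions, in agreement with the counts reported for Fig. 4.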
Topological mechanisms of reconnection
The topological mechanisms of events between the following (substrate, product) pairs have been fully characterized^{10}: \((3_{1},\,2_{1}^{2})\), \((2_{1}^{2},\,0_{1})\), \((0_{1},\,0_{1}^{2})\). The topological mechanisms between pairs \((5_{2},\,2_{1}^{2})\), \((5_{2},\,{4_{1}^{2\ast}}')\), \(({4_{1}^{2\ast}}',\,0_{1})\) are characterized in the proposition below. For all transitions along the 9 minimal pathways, Fig. 4 illustrates one possible band surgery relating the knot to the link. The proof of Proposition 3 is given in the Supplementary Methods, Characterization of Mechanisms section (Supplementary Fig. S3, Proposition S18, Theorem S19, Lemma S20).
Proposition 3
A^{23}. Suppose \(N(O+P)=5_{2}\), \(N(O+R)=2_{1}^{2}\), P = (0) and R = (w, 0). Then \(O=\left(\frac{7}{7w-2}\right)\).
B^{23}. Suppose \(N(O+P)=5_{2}\), \(N(O+R)={4_{1}^{2\ast}}'\), P = (0) and R = (w, 0). Then \(O=\left(\frac{7}{7w-4}\right)\).
C^{24}. Suppose \(N(O+P)={4_{1}^{2\ast}}'\), \(N(O+R)=0_{1}\), P = (0) and R = (w, 0). Then \(O=\left(\frac{4}{4w-1}\right)\).
Because XerC and XerD are tyrosine recombinases and act through a Holliday junction intermediate, the tangle pairs (P, R) that are relevant to unlinking of DNA replication links by Xer recombination are \((P,R)=((0)_{p},(-1))\), \((P,R)=((0)_{a},(0,\,0))\) and \((P,R)=((0)_{p},(1))\), as illustrated in Fig. 1C. The above proposition allows us to determine all the topological mechanisms for each of the three combinations of substrate and product in the statement. We illustrate the solutions in Proposition S18 and in Supplementary Fig. S3 in the Supplementary Methods. Just as in Shimokawa et al.^{10}, here each system of tangle equations yields three solutions, and the three solutions can be interpreted as representing a unique 3-dimensional topological mechanism.
Which unlinking pathways are most probable?
In the previous section, we proved analytically that under Assumption 1 there are 9 minimal pathways of unlinking the parallel 6-cat, \(6_{1}^{2}\). The mathematical analysis that includes enumeration of pathways and characterization of topological mechanisms becomes difficult for substrates with high crossing numbers. Furthermore, if the assumption of reduction in complexity (which is equivalent to imposing a topological filter on the physical system) is lifted, then the number of possible pathways increases rapidly and the detailed mathematical analysis quickly becomes intractable. We here remove Assumption 1 and set out on a numerical exploration of reconnection pathways starting from a broader set of substrate topologies. We develop software which finds reconnection sites along polygonal chains in the simple cubic lattice and simulates the reconnection event. Figure 5C illustrates the basic reconnection move on a simplified polygon. Figure 5A shows a lattice trefoil with one single reconnection site, before and after local reconnection. We simulate reconnection to explore different topological transitions, to quantify transition probabilities and to discriminate between unlinking pathways that are mathematically indistinguishable when only substrate, product and length are specified.
We provide numerical evidence that, of all minimal pathways starting with the RH parallel 6-cat, the one in Fig. 1A is the most likely. The weights in Fig. 4 correspond to the transition probabilities obtained in the numerical simulations. More generally, our numerical data suggest that this trend holds for any substrate that is a RH 2m-cat with parallel sites, or a RH \((2m-1)\)-torus knot with two sites in direct repeats. It is important to emphasize that the simulations do not use Assumption 1. Figure 5B is a circos figure that shows all observed reconnection transitions that maintain or decrease minimal crossing number and that belong to an observed minimal pathway from the 9_{1} knot. The thickness of the arcs corresponds to the directed transition probability between two topologies. Transitions in the most probable minimal pathway from 9_{1} are colored red. The predominance of these most probable unlinking pathways is consistent with the experimental observations for XerCD-FtsK dif site-specific recombination on DNA replication links^{9}, and for reconnection in fluid vortices^{12}, and is also consistent with the predictions in the literature^{10,11}.
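One simple way to rank pathways from edge weights of this kind is to score each pathway by the product of its transition probabilities. The sketch below illustrates that idea only; whether the study ranks pathways exactly this way is an assumption, the function names are hypothetical, and the numerical weights are made-up placeholders, not values from Fig. 4 or the Supplementary Data.

```python
from math import prod
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def path_probability(path: List[str], weights: Dict[Edge, float]) -> float:
    """Product of the transition probabilities along consecutive topologies."""
    return prod(weights[(a, b)] for a, b in zip(path, path[1:]))


def most_probable_path(paths: List[List[str]], weights: Dict[Edge, float]) -> List[str]:
    """Return the candidate pathway with the largest probability product."""
    return max(paths, key=lambda p: path_probability(p, weights))


if __name__ == "__main__":
    # PLACEHOLDER weights for illustration only; not results from the study.
    toy_weights = {
        ("6_1^2", "5_1"): 0.8,
        ("6_1^2", "3_1#3_1"): 0.2,
        ("5_1", "4_1^2"): 0.9,
        ("3_1#3_1", "6_3^2"): 0.5,
    }
    candidates = [
        ["6_1^2", "5_1", "4_1^2"],
        ["6_1^2", "3_1#3_1", "6_3^2"],
    ]
    best = most_probable_path(candidates, toy_weights)
    print(best, path_probability(best, toy_weights))
```

In practice the weights would be read from the transition matrix in the Supplementary Data, and the candidate list would come from an enumeration of minimal pathways.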
The minimum distance between the link type L_{i} and the knot type K_{j} in terms of band surgeries is called the nullification distance^{25,26}. In the numerical experiment we started by choosing knots and 2-component links that are at nullification distance 1 to 3 from one of the 11 knots or links along one of the 9 minimal pathways of Theorem 2 and Fig. 4, or are obtained from these topologies by taking mirror images or reversing the orientation of one of the components. For completeness, we expanded the initial set to include 491 substrate topologies representing almost all knots and links with 9 or fewer crossings. Reasons for omitting a handful of 9-crossing split links from the substrate set are described in detail below. We use the BFACF algorithm to generate large independent ensembles of conformations for each substrate topology. BFACF is a dynamic Monte Carlo method which samples uniformly the set of all lattice polygons of fixed topology for a given mean length^{27}. The BFACF moves used to perturb each chain are illustrated in Fig. S4 in the Supplementary Methods. Split links such as the unlink \(0_{1}^{2}\) or \(0_{1}\cup 3_{1}\) (see Fig. 3), even though they appear as reconnection products, are not used as substrates due to the difficulty of keeping the components together without altering the Monte Carlo procedure. In order to improve the efficiency of sampling statistically independent conformations we implemented BFACF as a Composite Markov Chain (CMC). Details of the simulations, including a description of the algorithms and different parameters, are included in the Numerical Methods section and in the Supplementary Methods. Fig. S6 in the Supplementary Methods illustrates all the transitions observed between 881 topologies in the numerical experiment, including those that do not appear in minimal pathways from 9_{1}.
The resulting transition probabilities are available in matrix form in the data spreadsheet provided as Supplementary Information (Supplementary Data).
Figure 5D contains exact counts for the number of minimal unlinking pathways for torus knots and links with up to 6 crossings, and the corresponding numerical estimates for 7 and 8 crossings. Under Assumption 1 there are 9 minimal pathways of unlinking the \(6_{1}^{2}\) link. In the numerical study, we find 36 minimal unlinking pathways for the 7_{1} knot and 208 minimal unlinking pathways for the \(8_{1}^{2}\) link, under Assumption 1 (\(P_{\min}(L)\)). Once the Assumption is removed, we observe \(P(7_{1})=2760\) minimal pathways for the knot 7_{1} and \(P(8_{1}^{2})=6434\) minimal pathways for the link \(8_{1}^{2}\) (in this case the crossing number can increase at any given step). However, it has been shown analytically that there are infinitely many possible minimal pathways between any 2n-crossing torus link with parallel sites and the unlink^{17}. The numerical data can provide biologically relevant information by establishing a ranking of the most likely pathways. The third row in Fig. 5D indicates the number of distinct product topologies (as detected by the HOMFLY-PT polynomial) observed for torus knots and links of the type \(T(2,\,n)\) with 8 or fewer crossings after a single reconnection step.
Discussion
In Theorem 2 we prove that there are exactly 9 shortest unlinking pathways for the \({6}_{1}^{2}\), assuming that at every step the complexity of the substrate goes down or remains the same. The 9 pathways are illustrated in Fig. 4. We solve the topological mechanisms involved for 6 of the 16 steps along these pathways. We develop a new Monte Carlo based numerical method which allows us to model local reconnection on chains of fixed length and topology. We run the numerical simulation on each topology found to be within 3 nullification steps from any topology in Fig. 4. Notice that in these experiments there is nothing preventing the complexity of a substrate from going up at any given step. We can determine the set of all minimal pathways from any of the substrate topologies, and single out the most probable pathway. In Fig. 5 we provide numerical estimates for the number of minimal pathways for torus knots and links with 7 and 8 crossings. In our numerical data the most probable minimal pathway from a torus link (or knot) to the unlink is the one where every intermediate is also in the torus family as in Fig. 1A. The data from the numerical experiments can be found in the Supplementary Data.
Mathematically, extending Theorem 2 to determine all minimal pathways for T(2, N) torus knots and links is difficult. In general, if the substrate is a torus knot or link T(2, N) one can find multiple pathways that preserve the minimal crossing number at many steps. The complexity of the problem grows with the minimal crossing number of the substrate. For example, using numerical simulation we estimate the number of minimal pathways from the 7_{1} (resp. \(8_{1}^{2}\)) to the unlink to be at least 36 (resp. 208) under Assumption 1. These are not tight bounds due to the limitations with using links of the form \(K\#2_{1}^{2}\) as substrates in the numerical experiments. It is known that when the assumption is removed, there are infinitely many shortest pathways between the \(T(2,2N)_{p}\) torus link and the unlink^{17}. In our numerical work, once Assumption 1 is removed we count at least 744, 2760 and 6434 shortest unlinking pathways for \(6_{1}^{2}\), 7_{1} and \(8_{1}^{2}\), respectively.
The problem of computing the nullification distance between a knot and a link is of interest to the mathematical community^{17,25,26,28,29}. In cases where the analytical tools fail to provide an exact nullification distance, one can estimate the distance between two topologies using the numerical method and possibly remove ambiguities by exhibiting the relevant band surgeries.
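A numerical estimate of this kind can be read off a table of observed transitions by a breadth-first search, which returns the fewest reconnection steps connecting two topologies within the observed graph. The sketch below is illustrative: it uses only the 16 transitions of the Theorem 2 outline, treats them as reversible (an assumption made here because a band surgery can be undone by another band surgery), and uses hypothetical names throughout. The value it returns is an upper bound on the true nullification distance, since transitions absent from the table are ignored.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# The 16 transitions from the proof outline of Theorem 2 (ASCII labels).
EDGES: List[Tuple[str, str]] = [
    ("6_1^2", "5_1"), ("6_1^2", "3_1#3_1"),
    ("5_1", "4_1^2"), ("5_1", "3_1#2_1^2"),
    ("3_1#3_1", "6_3^2"), ("3_1#3_1", "3_1#2_1^2"),
    ("6_3^2", "5_2"), ("3_1#2_1^2", "5_2"), ("3_1#2_1^2", "3_1"),
    ("4_1^2", "3_1"), ("5_2", "2_1^2"), ("5_2", "4_1^2*'"),
    ("3_1", "2_1^2"), ("4_1^2*'", "0_1"), ("2_1^2", "0_1"),
    ("0_1", "0_1^2"),
]


def reconnection_distance(start: str, goal: str,
                          edges: List[Tuple[str, str]] = EDGES) -> Optional[int]:
    """Fewest reconnection steps from start to goal in the given graph."""
    adjacency: Dict[str, Set[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)  # treat each transition as reversible
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return distance[node]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return None  # goal not reachable through the listed transitions


if __name__ == "__main__":
    print(reconnection_distance("6_1^2", "0_1^2"))
    print(reconnection_distance("3_1", "0_1^2"))
```

On this small graph the search gives 6 steps from the parallel 6-cat to the unlink and 3 steps from the trefoil to the unlink, consistent with the step counts quoted in the text.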
The numerical simulations in this study posed a number of challenges. For example, in order to generate an ensemble of essentially independent unknots 0_{1} of length 120 we had to go through at least twice as many iterations of the BFACF algorithm as for any other substrate topology. Further, these unknots contained synapses meeting the reconnection criteria approximately once every 7.5 × 10^{9} iterations. In order to improve the efficiency of such runs, we implemented the BFACF algorithm as a Composite Markov Chain process^{30,31,32,33}. Similar challenges extend to any topology consisting of a connected sum of a knot and a Hopf link \(K\#2_{1}^{2}\), or the disjoint union of a knot and an unknot \(K\cup 0_{1}\) (see examples in Fig. 3). In the first case, the unknotted component tends to shrink, making it difficult to satisfy the equal-length criterion for recombination. In the second case, even though these topologies appear as reconnection products, they cannot be used as substrates due to the difficulty of keeping the components together (without biasing the simulations for those specific substrates). Now consider an example where a bacterial chromosome dimer forms a 3_{1} knot with two equidistant directly repeated dif sites. In our simulations we see that 0.025% of trefoils transition to \(0_{1}\cup 3_{1}\), the disjoint union of an unknot and a trefoil, and 95.2% of trefoils transition to \(2_{1}^{2}\). In the first case the knotted dimer is effectively unlinked in one step, but one of the components will remain knotted, which can pose problems during chromosome segregation. In the second case unlinking of the trefoil can be achieved in 3 steps, with a combined probability of 0.925; the final product is \(0_{1}^{2}\), a union of two circles which can then segregate at cell division.
In the case of unlinking of DNA replication links, each component of the link corresponds to a newly replicated chromosome from E. coli with one dif site in each component. This example motivated our choice to let two reconnection sites within a single circle be equidistant, and the two components of a linked product or substrate have the same length. In different contexts, such as that of site-specific recombination between non-equidistant sites, more general homologous recombination, and possibly other reconnections in physics, the distance between sites will be an important parameter, requiring further exploration of the length and topology dependence of the transition probabilities obtained by the numerical method.
Furthermore, in nature, DNA molecules are often found tightly packaged in crowded environments. A study of reconnection on confined chains would shed light on whether confinement plays a role in driving topological simplification by any process involving local reconnection. Existing studies of the confinement of polygonal chains inside and outside the lattice suggest methods for generating ensembles of conformations^{34,35}.
Materials and Methods
Mathematical Methods
The tangle method is briefly summarized in Fig. 1. The naming convention used for knots and links is reviewed in the Introduction and in Fig. 3. More detailed mathematical methods and results used in the proof of Theorem 2 are provided in Fig. 4 and in the Supplementary Methods. A site-specific recombination event is modeled as a local reconnection and is represented mathematically as a system of tangle equations as described in Fig. 1B. The circular chain represents the starting knot or link, and P is a 2-string tangle that encloses the reconnection sites. Reconnection changes P into R. We assume that each reconnection is modeled as a coherent band surgery, i.e. P = (0) and R = (w, 0) for some integer w (Fig. 1C).
Numerical Methods: modeling reconnection
Computer simulations of local reconnection
We use an integrated set of computational tools to generate and filter ensembles of conformations, perform reconnection, identify product topologies, generate transition probabilities and facilitate statistical analysis of the results. Given an ensemble of lattice conformations with fixed length and constant topology, our algorithm searches for possible synapses along each conformation, selects one uniformly at random, and performs reconnection as illustrated in Fig. 5A. Our original motivation came from XerC/D site-specific recombination at dif sites in newly replicated chromosomes with one site in each component, or in chromosome dimers with two equidistant directly repeated sites. In this case reconnection events are constrained by the position and orientation of the dif sites. We therefore impose a set of constraints on where to perform reconnection. These can be seen as topological filters that can be adjusted to best fit the scenario to be modeled. Here, a reconnection synapse is defined as a pair of coplanar edges at distance one from each other with antiparallel orientation; each of the two oriented edges is a reconnection site. Reconnection exchanges each edge of the synapse for one perpendicular to it, as shown in Fig. 5C. The set of possible edge pairs on which to form a synapse is further constrained by step distance along the conformation. Here we adjust this parameter to constrain the location of the synapse so that the arc lengths on each side are approximately equal (each within ±6 of half the total length), while enforcing the total length of the knotted polygon, or the sum of the lengths of the components of interlinked polygons, to be fixed. For knots this models two equidistant sites in the synapse. For two-component links, it models two components of equal length with a single site in each of the two components. We exclusively sampled conformations of total length 120 which contain at least one reconnection synapse.
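As an illustration of the synapse criteria described above, the following minimal Python sketch scans a single lattice polygon (the knot case) for candidate synapses and performs one reconnection. It is not the software used in this study: the function names find_synapses and reconnect, the vertex-list representation and the tol parameter are hypothetical, and the sketch assumes that "coplanar edges at distance one with antiparallel orientation" means two opposite sides of a unit lattice square.

```python
from itertools import combinations


def diff(p, q):
    """Component-wise difference p - q of two lattice points."""
    return tuple(a - b for a, b in zip(p, q))


def find_synapses(polygon, tol):
    """Return index pairs (i, j), i < j, of edges that form a synapse.

    polygon is a cyclic list of integer 3-tuples; edge k runs from
    polygon[k] to polygon[(k + 1) % n]. tol is the allowed deviation of
    each arc length from half the polygon length (assumed parameter).
    """
    n = len(polygon)
    found = []
    for i, j in combinations(range(n), 2):
        a0, a1 = polygon[i], polygon[(i + 1) % n]
        b0, b1 = polygon[j], polygon[(j + 1) % n]
        e = diff(a1, a0)
        if e != diff(b0, b1):  # antiparallel: direction of edge j is -e
            continue
        gap = diff(b0, a1)  # offset between the two edges
        unit = sum(abs(c) for c in gap) == 1
        perpendicular = sum(x * y for x, y in zip(gap, e)) == 0
        # The two arcs have lengths j - i and n - (j - i).
        balanced = abs(2 * (j - i) - n) <= 2 * tol
        if unit and perpendicular and balanced:
            found.append((i, j))
    return found


def reconnect(polygon, i, j):
    """Replace edges i and j (i < j) by the two perpendicular edges.

    On a single polygon this yields two closed components.
    """
    first = polygon[i + 1 : j + 1]                # closes with b0 -> a1
    second = polygon[j + 1 :] + polygon[: i + 1]  # closes with a0 -> b1
    return first, second


# A 3 x 1 rectangle in the xy-plane, used only as a toy substrate.
rect = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0),
        (3, 1, 0), (2, 1, 0), (1, 1, 0), (0, 1, 0)]
print(find_synapses(rect, tol=0))
print(reconnect(rect, 1, 5))
```

Under these assumptions, a polygon of length 120 with tol=6 restricts the two sites to be between 54 and 66 steps apart. The two-component case, where one site lies in each component, would need a search over pairs of edges taken from different polygons and is not covered by this sketch.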
Generation of reconnection substrates
Self-avoidance is an important property when modeling biopolymers such as circular DNA. Here, conformations in the simple cubic lattice, \({{\rm{Z}}}^{3}\), are self-avoiding polygons whose vertices have integer coordinates and whose edges are parallel to one of the three coordinate axes. The BFACF algorithm is a dynamic Monte Carlo method which samples from the space of lattice conformations of a fixed topology^{27}. The states of the resulting Markov Chain are conformations obtained by first randomly selecting an edge, then attempting one of the three moves shown in Fig. S4 in the Supplementary Methods: the (−2)-move, the (+2)-move or the (0)-move. None of these moves changes the knot or link type of the conformation^{27,36}.
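The geometry of the three BFACF moves can be sketched in a few lines of Python. The sketch below is an assumption-laden illustration rather than the constant-time implementation used in this study: the names bfacf_move and is_polygon and the list-of-vertices representation are hypothetical, and the fugacity-dependent acceptance probabilities of the actual Monte Carlo step are deliberately omitted. The selected edge is pushed one lattice unit in a perpendicular direction; depending on whether the displaced endpoints land on the neighbouring vertices, the result is a (+2)-, (0)- or (−2)-move.

```python
def shift(p, d):
    """Translate lattice point p by vector d."""
    return tuple(a + b for a, b in zip(p, d))


def is_polygon(vertices):
    """Check that a cyclic vertex list is a self-avoiding lattice polygon."""
    n = len(vertices)
    unit_steps = all(
        sum(abs(a - b) for a, b in zip(vertices[k], vertices[(k + 1) % n])) == 1
        for k in range(n)
    )
    return n >= 4 and unit_steps and len(set(vertices)) == n


def bfacf_move(polygon, i, d):
    """Push edge i by the unit vector d; return the new polygon or None.

    None means the attempt is rejected (d not perpendicular to the edge,
    a new vertex already occupied, or the polygon would degenerate).
    """
    rotated = polygon[i:] + polygon[:i]  # chosen edge is now p0 -> p1
    p0, p1, rest = rotated[0], rotated[1], rotated[2:]
    edge = tuple(b - a for a, b in zip(p0, p1))
    if sum(abs(c) for c in d) != 1 or sum(x * y for x, y in zip(edge, d)) != 0:
        return None
    a, b = shift(p0, d), shift(p1, d)
    occupied = set(polygon)
    new = []
    if a != rest[-1]:  # a is a genuinely new vertex, p0 is kept
        if a in occupied:
            return None
        new += [p0, a]
    if b != rest[0]:   # b is a genuinely new vertex, p1 is kept
        if b in occupied:
            return None
        new += [b, p1]
    new += rest
    # Both new vertices: (+2)-move; one: (0)-move; none: (-2)-move.
    return new if len(new) >= 4 else None


square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
grown = bfacf_move(square, 0, (0, 0, 1))    # (+2)-move
flipped = bfacf_move(grown, 3, (0, 0, 1))   # (0)-move
shrunk = bfacf_move(grown, 1, (0, 0, -1))   # (-2)-move
print(len(grown), len(flipped), len(shrunk))
```

A full sampler would select the edge and direction at random and accept each proposed move with a probability that depends on the move type and on the fugacity parameter of the chain.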
Generating large ensembles of conformations for each topology with at least one valid synapse posed significant technical challenges. The 0_{1} knot and links of the type \(K{\mathrm{\#2}}_{1}^{2}\), where K is a knot with high crossing number, were particularly problematic. This is because the component with trivial topology tends to have a short average length, making sampled conformations that form a reconnection synapse very rare. For example, the 0_{1} forms such a synapse in fewer than 1 in 1.3 × 10^{6} sampled conformations. To address these challenges and gain the computational performance needed for this study, we here extend the efficient, constant-time (in knot length) implementation of the BFACF algorithm used in previous work^{34,35,37,38} by employing it as a Composite Markov Chain (CMC) Monte Carlo process^{30,31,32,33,39}. CMC BFACF iterates simultaneously on multiple Markov chains with different fugacity parameters, swapping conformations between chains when certain weighted random criteria are met; more details of the implementation are included in the Supplementary Methods. CMC Monte Carlo improves efficiency by exchanging conformational states between chains, thus improving the speed at which the conformations are randomized. We sample conformations at a frequent fixed rate and correct for dependent samples using block mean analysis^{40}, thereby standardizing the sampling methodology across all of the topologies in the study and avoiding reliance on direct estimates of the integrated autocorrelation time. With this methodology, we generated in the range of 10^{7} conformations for every substrate topology. Of the topologies for which a reconnection event was observed, the number of conformations containing at least one reconnection synapse ranged from approximately 1.5 × 10^{6} for the \({9}_{13}^{\ast }\) knot to as few as 86 for the \({6}_{2}{\mathrm{\#2}}_{1}^{2}\) link.
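To make the role of block mean analysis concrete, the sketch below estimates a mean and its standard error from correlated samples using non-overlapping blocks. This is a generic textbook version of the technique, written under the assumption that the observable is a per-sample number (for instance an indicator of whether a given product topology was obtained); the function name batch_means and the choice of the number of blocks are hypothetical and do not describe the exact procedure used in this study.

```python
import math


def batch_means(samples, n_batches):
    """Estimate the mean and its standard error from correlated samples.

    The series is cut into n_batches consecutive blocks of equal size
    (any trailing remainder is dropped). If the blocks are long compared
    with the correlation time, the block means are nearly independent,
    so their spread gives a usable standard error.
    """
    size = len(samples) // n_batches
    if n_batches < 2 or size < 1:
        raise ValueError("need at least two non-empty batches")
    means = [
        sum(samples[k * size : (k + 1) * size]) / size
        for k in range(n_batches)
    ]
    grand = sum(means) / n_batches
    var = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    return grand, math.sqrt(var / n_batches)


# Toy series with strong short-range correlation (runs of equal values).
series = [0, 0, 1, 1, 0, 0, 1, 1]
estimate, std_err = batch_means(series, n_batches=4)
print(estimate, std_err)
```

In this toy series the block-based standard error is larger than the value obtained by treating the eight samples as independent, which is the effect the correction is meant to capture.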
Two-component topologies in which the two components are of different topology are difficult to sample efficiently because of the rarity of conformations that meet our stringent arc-length criteria. Split links, i.e. those topologies in which the two components are not interlinked, are even more problematic because the two components tend to drift apart, dramatically reducing the probability of sampling conformations that contain a valid synapse. We identified those topologies as products of reconnection, but did not include them in the set of substrate topologies described in the next paragraph.
Recall that 9 minimal unlinking pathways from the 6-cat were obtained analytically in Theorem 2 under the assumption that each reconnection step either preserves or reduces the complexity of the substrate. Our simulations eliminate that assumption, enabling wider exploration of possible topological reconnection pathways. We start with 491 substrate topologies, including those along the 9 unique pathways from Fig. 4 (excluding the unlink \({0}_{1}^{2}\)). With CMC BFACF we generate ensembles of conformations with fixed topology to be used as reconnection substrates. The number of substrate conformations generated ranges from 1.2 × 10^{7} for the \({7}_{6}^{2}\) link to more than 6.9 × 10^{8} for the 0_{1}. We perform one reconnection per conformation and identify the resulting topology. Including all substrate topologies and the identifiable products after reconnection, there are 881 topologies being analyzed in the study (490 knots and 391 two-component links).
Knot identification
Our simulations require a rigorous, unambiguous way of identifying the knot or link conformation types in \({{\rm{Z}}}^{3}\). With the exception of chiral knots 8_{17} and 9_{42}, which have the same HOMFLYPT as their mirror images, and 9_{12}, which has the same HOMFLYPT as 4_{1}#5_{2}, all prime knots with nine or fewer crossings can be unambiguously identified using the HOMFLYPT polynomial^{41,42}. Our knot identification software is based on previously published algorithms^{43,44}. In order to identify product topologies, we first perform 20,000 BFACF iterations with randomly chosen (0) and (−2) moves. At each step, the conformation either remains the same length or becomes shorter, in many cases approaching the minimal length for that topology^{38}. The final conformation is then passed through an energy minimization algorithm^{22}; we compute an extended Gauss code and identify the topology using the HOMFLYPT polynomial. Information on those oriented knots or links with 10 or fewer crossings that HOMFLYPT fails to identify uniquely is included in the Supplementary Methods.
Recombination between two directly repeated sites along a single circular chain yields a 2-component link. The number of product topologies increases dramatically with the complexity of the substrate. Figure 3 shows a selection of some of the expected products, including composite links that are not normally shown in knot tables. Composites are of two types: connected sums of prime knots or links; and disjoint unions. In this study, we perform recombination on two types of substrates: (i) knots with two (approximately) equidistant directly repeated sites; and (ii) links with 2 components of identical total length and with one site in each component. More specifically, each substrate knot is a self-avoiding lattice polygon of length 120 and recombination occurs on two directly repeated sites that are between 54 and 66 units apart (Fig. 5A). Each linked substrate consists of two self-avoiding polygons between 54 and 66 units long, such that the sum of their lengths is exactly 120. Recombination is restricted to synapses where two sites, one in each component, are found at unit distance apart and in antiparallel alignment as illustrated in Fig. 5(A and C). A small representative subset of the knot and link types used in the simulations is shown in Fig. 3, and the naming convention is described in the nomenclature section, in the Supplementary Methods and in Supplementary Fig. S5.
References
 1.
Navashin, M. S. Unbalanced somatic chromosomal variation in Crepis. Univ. Calif. Pub. Agr. Sci. 6, 95–106 (1930).
 2.
McClintock, B. A correlation of ring-shaped chromosomes with variation in Zea Mays. Proc. Natl. Acad. Sci. USA 18, 677–681 (1932).
 3.
Wang, J. C. & Schwartz, H. Noncomplementarity in base sequences between the cohesive ends of coliphages 186 and lambda and the formation of interlocked rings between the two DNA’s. Biopolymers 5, 953–966 (1967).
 4.
Sundin, O. & Varshavsky, A. Terminal stages of SV40 DNA replication proceed via multiply intertwined catenated dimers. Cell 21, 103–114 (1980).
 5.
Wasserman, S. & Cozzarelli, N. Biochemical topology: applications to DNA recombination and replication. Science 232, 951–960 (1986).
 6.
Adams, D. E., Shekhtman, E. M., Zechiedrich, E. L., Schmid, M. B. & Cozzarelli, N. R. The role of topoisomerase IV in partitioning bacterial replicons and the structure of catenated intermediates in DNA replication. Cell 71, 277–288 (1992).
 7.
Sogo, J., Greenstein, M. & Skalka, A. The circle mode of replication of bacteriophage lambda: the role of covalently closed templates and the formation of mixed catenated dimers. J. Mol. Biol. 103, 537–562 (1976).
 8.
Zechiedrich, E. L., Khodursky, A. B. & Cozzarelli, N. R. Topoisomerase IV, not gyrase, decatenates products of site-specific recombination in Escherichia coli. Genes Dev. 11, 2580–2592 (1997).
 9.
Grainge, I. et al. Unlinking chromosomes catenated in vivo by site-specific recombination. EMBO J. 26, 4228–4238 (2007).
 10.
Shimokawa, K., Ishihara, K., Grainge, I., Sherratt, D. J. & Vazquez, M. FtsK-dependent XerCD-dif recombination unlinks replication catenanes in a stepwise manner. Proc. Natl. Acad. Sci. USA 110, 20906–20911. arXiv:http://www.pnas.org/content/110/52/20906.full.pdf+html (2013).
 11.
Kleckner, D., Kauffman, L. H. & Irvine, W. T. M. How superfluid vortex knots untie. Nat. Phys. 12, 650–655 (2016).
 12.
Kleckner, D. & Irvine, W. T. M. Creation and dynamics of knotted vortices. Nat. Phys. 9, 253–258 (2013).
 13.
Laing, C. E., Ricca, R. L. & Sumners, D. W. L. Conservation of writhe helicity under antiparallel reconnection. Scientific Reports 5, 9224; doi:10.1038/srep09224 (2015).
 14.
Ishihara, K. & Shimokawa, K. Band surgeries between knots and links with small crossing numbers. Prog. Theor. Phys. Supplement 191, 245–255, arXiv:http://ptps.oxfordjournals.org/content/191/245.full.pdf+html (2011).
 15.
Ishihara, K., Shimokawa, K. & Vazquez, M. Site-specific recombination modeled as a band surgery: applications to Xer recombination, 387–401. In: Jonoska N., Saito M. (eds) Discrete and Topological Models in Molecular Biology. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40193-0_18.
 16.
Yoshida, M. Applications of band surgery and signed crossing changes of knots and links to molecular biology. Master’s thesis, Department of Mathematics, Saitama University (2013).
 17.
Buck, D. & Ishihara, K. Coherent band pathways between knots and links. J. Knot Theory Ramifications 24, 1550006–27 (2015).
 18.
Buck, D., Ishihara, K., Rathbun, M. & Shimokawa, K. Band surgeries and crossing changes between fibered links. J. London Math. Soc. 94, 557–582 (2016).
 19.
Ip, S. C. Y., Bregu, M., Barre, F.-X. & Sherratt, D. J. Decatenation of DNA circles by FtsK-dependent Xer site-specific recombination. EMBO J. 22, 6399–6407 (2003).
 20.
Rolfsen, D. Knots and Links. AMS Chelsea, vol. 346H, Providence, RI (2003).
 21.
Brasher, R., Scharein, R. G. & Vazquez, M. New biologically motivated knot table. Biochem Soc. Trans. 41, 606–611 (2013).
 22.
Scharein, R. G. Interactive topological drawing. Ph.D. thesis, Department of Computer Science, The University of British Columbia. https://open.library.ubc.ca/cIRcle/collections/831/items/1.0051670 (1998).
 23.
Darcy, I. K., Ishihara, K., Medikonduri, R. K. & Shimokawa, K. Rational tangle surgery and Xer recombination on catenanes. Algebr. Geom. Topol. 12, 1183–1210. Preprint: https://arxiv.org/abs/1108.0724 (2012).
 24.
Vazquez, M., Colloms, S. & Sumners, D. Tangle analysis of Xer recombination reveals only three solutions, all consistent with a single three-dimensional topological pathway. J. Mol. Biol. 346, 493–504 (2005).
 25.
Diao, Y., Ernst, C. & Montemayor, A. Nullification of knots and links. J. Knot Theory Ramifications 21, 1250046–70 (2012).
 26.
Ernst, C. & Montemayor, A. Nullification of torus knots and links. J. Knot Theory Ramifications 23, 1450058–77 (2014).
 27.
Madras, N. & Slade, G. The Self-Avoiding Walk (Modern Birkhäuser Classics, Cambridge, MA, 1996).
 28.
Kanenobu, T. Band surgery on knots and links. J. Knot Theory Ramifications 19, 1535–1547, https://doi.org/10.1142/S0218216510008522 (2010).
 29.
Kanenobu, T. Band surgery on knots and links, II. J. Knot Theory Ramifications 21, 1250086–108, https://doi.org/10.1142/S0218216512500861 (2012).
 30.
Geyer, C. J. Practical Markov chain Monte Carlo. Statistical Science 7, 473–483 (1992).
 31.
Orlandini, E. Monte Carlo Study of Polymer Systems by Multiple Markov Chain Method, in Numerical Methods for Polymeric Systems, 33–57. https://doi.org/10.1007/978-1-4612-1704-6_3 (Springer New York, New York, NY, 1998).
 32.
Szafron, M. Monte Carlo Simulations of Strand Passage in Unknotted Self-Avoiding Polygons. Master’s thesis, Department of Mathematics and Statistics, University of Saskatchewan (2000).
 33.
Szafron, M. Knotting statistics after a local strand passage in unknotted self-avoiding polygons in Z^{3}. Ph.D. thesis, Department of Mathematics and Statistics, University of Saskatchewan (2009).
 34.
Ishihara, K. et al. Bounds for the minimum step number of knots confined to slabs in the simple cubic lattice. J. Phys. A: Math. Theor. 45, 065003–27 (2012).
 35.
Arsuaga, J. et al. Current theoretical models fail to predict the topological complexity of the human genome. Front. Mol. Biosci. 2, 48 (2015).
 36.
Janse van Rensburg, E. J., Orlandini, E., Sumners, D. W., Tesi, M. C. & Whittington, S. G. The writhe of knots in the cubic lattice. J. Knot Theory Ramifications 6, 31–44 (1997).
 37.
Hua, X., Nguyen, D., Raghavan, B., Arsuaga, J. & Vazquez, M. Random state transitions of knots: a first step towards modeling unknotting by type II topoisomerases. Topol. Appl. 154, 1381–1397 (2007).
 38.
Scharein, R. et al. Bounds for the minimum step number of knots in the simple cubic lattice. J. Phys. A: Math. Theor. 42, 475006 (2009).
 39.
Orlandini, E., Janse van Rensburg, E. J., Tesi, M. C. & Whittington, S. G. Entropic Exponents of Knotted Lattice Polygons, in Topology and Geometry in Polymer Science, vol. 103 (Springer, Berlin, 1998).
 40.
Fishman, G. Discrete-event simulation: modeling, programming, and analysis (Springer-Verlag, London, 2001).
 41.
Freyd, P. et al. A new polynomial invariant of knots and links. Bull. Amer. Math. Soc. 12, 239–246 (1985).
 42.
Przytycki, J. H. & Traczyk, P. Conway algebras and skein equivalence of links. Proc. Amer. Math. Soc. 100, 744–748 (1987).
 43.
Gouesbet, G., Meunier-Guttin-Cluzel, S. & Letellier, C. Computer evaluation of homfly polynomials by using gauss codes, with a skein-template algorithm. Appl. Math. Comput. 105, 271–289 (1999).
 44.
Jenkins, R. J. Knot Theory, Simple Weaves, and an Algorithm for Computing the HOMFLY Polynomial. Master’s thesis, Carnegie Mellon University (1989).
Acknowledgements
This research was supported by the following: Japan Society for the Promotion of Science KAKENHI grant numbers 25400080, 26310206, 16H03928, 16K13751, 17H06463 (to K.S.), 26800081 (to K.I.); National Science Foundation DMS-1716987 (MF, MV) and CAREER Grant DMS-1057284 (MV, RS, MF, RB) and NIH R01GM109457 (MV); Wellcome Trust SIA 099204/Z/12Z and 200782/Z/16/Z (DJS). The authors are grateful to R. Scharein for providing assistance with KnotPlot and for his work on the first version of the reconnection software; C. Soteros, M. Szafron and M. Schmirler for contributing their statistical expertise; J. Arsuaga, D.W. Sumners and S. Witte for helpful discussions; and Barbara Ustanko, ELS, for editorial assistance with this manuscript.
Author information
Contributions
M.V. conceived the overall research project. M.V., K.S. and D.S. conceived the detailed research plan. M.V. and K.S. directed the mathematical component of the paper. M.V. and R.B. directed the computational component of the paper. M.Y. and K.I. performed the details of the mathematical research. R.S., M.F. and R.B. performed the details of the computational component. M.V. and K.S. wrote the main manuscript text; M.V., R.B. and R.S. wrote the numerical methods; M.V., K.S. and K.I. wrote the mathematical methods and proofs. R.S., K.I., K.S., M.F., M.V. and M.Y. prepared figures for publication. All authors reviewed the manuscript.
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Stolz, R., Yoshida, M., Brasher, R. et al. Pathways of DNA unlinking: A story of stepwise simplification. Sci Rep 7, 12420 (2017). https://doi.org/10.1038/s41598-017-12172-2