RecA finds homologous DNA by reduced dimensionality search

Homologous recombination is essential for the accurate repair of double-stranded DNA breaks (DSBs)1. Initially, the RecBCD complex2 resects the ends of the DSB into 3′ single-stranded DNA on which a RecA filament assembles3. Next, the filament locates the homologous repair template on the sister chromosome4. Here we directly visualize the repair of DSBs in single cells, using high-throughput microfluidics and fluorescence microscopy. We find that, in Escherichia coli, repair of DSBs between segregated sister loci is completed in 15 ± 5 min (mean ± s.d.) with minimal fitness loss. We further show that the search takes less than 9 ± 3 min (mean ± s.d) and is mediated by a thin, highly dynamic RecA filament that stretches throughout the cell. We propose that the architecture of the RecA filament effectively reduces search dimensionality. This model predicts a search time that is consistent with our measurement and is corroborated by the observation that the search time does not depend on the length of the cell or the amount of DNA. Given the abundance of RecA homologues5, we believe this model to be widely conserved across living organisms.

Homologous recombination is essential for the accurate repair of double-stranded DNA breaks (DSBs) 1 . Initially, the RecBCD complex 2 resects the ends of the DSB into 3′ single-stranded DNA on which a RecA filament assembles 3 . Next, the filament locates the homologous repair template on the sister chromosome 4 . Here we directly visualize the repair of DSBs in single cells, using high-throughput microfluidics and fluorescence microscopy. We find that, in Escherichia coli, repair of DSBs between segregated sister loci is completed in 15 ± 5 min (mean ± s.d.) with minimal fitness loss. We further show that the search takes less than 9 ± 3 min (mean ± s.d) and is mediated by a thin, highly dynamic RecA filament that stretches throughout the cell. We propose that the architecture of the RecA filament effectively reduces search dimensionality. This model predicts a search time that is consistent with our measurement and is corroborated by the observation that the search time does not depend on the length of the cell or the amount of DNA. Given the abundance of RecA homologues 5 , we believe this model to be widely conserved across living organisms.
To study the mechanism of homologous recombination in living bacteria, we created an inducible DSB system consisting of (1) an inducible Cas9 nuclease to create DSBs at a specific chromosomal location (the 'cut site'), (2) a fluorescent parS/mCherry-ParB system (hereafter referred to as ParB) to visualize the chromosomal location of the break, and (3) an SOS-response 6 reporter to select the cells undergoing DSB repair (Fig. 1a, Extended Data Fig 1a). We used a variant of the microfluidic mother machine 7 that allows for brief induction of the Cas9 nuclease (Fig. 1a, Extended Data Fig. 1b).

DSB repair is fast and accurate
Formation of a DSB is followed by end processing by RecBCD, which removes ParB markers close to the break 8 , and by activation of the SOS-response reporter (Fig. 1b). A pulse of Cas9 induces specific DSBs in cells with the chromosomal cut site, which is accompanied by an increase in the fraction of SOS-activated cells shortly after the induction (Fig. 1c, d, Extended Data Fig. 2a). Expression of Cas9 in cells without the cut site or expression of catalytically dead Cas9 (dCas9) did not induce an SOS response (Fig. 1d, Extended Data Fig. 2b). Activation of the SOS response depended on homologous recombination and was absent in cells lacking recA or recB genes (Extended Data Fig. 2b). The ParB foci lost owing to DSBs were recovered and segregated in wild-type cells, but not in cells lacking critical components of recombination (Extended Data Fig. 2c, f). These results show that the induced DSBs were repaired by homologous recombination. The repair is notably robust: 95.5% (447 out of 468) of cells that retained an uncleaved template repaired the DSB and subsequently divided. Repair was impaired in cells without a repair template, as none of the cells that cleaved all copies of the cut site divided during 4-h experiments (n = 27 cells, 3 experiments).
Next, we focused on the dynamics of the cut site loci during repair. Typically, after a DSB, the uncut locus first translocated to the middle of the cell and then split into two foci that subsequently segregated (Fig. 1c). This pattern was visible when we plotted the positions of the ParB foci along the cell length against the time relative to the DSB event (Fig. 1e). We then measured repair times in individual cells, that is, the time between the loss and reappearance of the ParB focus (Fig. 1c). We limited the analysis to cells that had two separated ParB foci before the DSB. We found that a DSB is repaired in 15.2 ± 5.0 min (Fig. 1g). These results were consistent between replicates (Extended Data Fig. 2d), and notably, when I-SceI was used instead of Cas9 to induce breaks (Extended Data Fig. 2e), or when the ParB marker was replaced by a malO array bound by MalI-Venus proteins (referred to as 'MalI') (Extended Data Fig. 3a). When the DSB was flanked by ParB on one side and MalI on the other, both ends were processed simultaneously and followed the same dynamics (Extended Data Fig. 3b). Given that the repair time is just a fraction of the generation time (here 35 ± 10 min), we tested whether it has negative effects on fitness. A single DSB delayed the division to 55 ± 10 min; however, this delay was compensated by faster divisions in the following generations (Fig. 1f). The growth rate was temporarily reduced by 4% in cells undergoing DSB repair (Fig. 1f).
loci moved to the middle of the cell, where they colocalized (Fig. 2a, c, Extended Data Fig. 4a, c). The movement of the sister MalI markers was symmetric, unlike previously reported dynamics in which the cleaved sister locus moved much further after a DSB 10,11 . To test whether the colocalization of sister loci is specific or caused by a global alignment of the chromosomes, we used a distant MalI marker integrated on the opposite arm of the chromosome (ygaY). The distant ygaY marker maintained its typical position during repair (Fig. 2d, Extended Data Fig. 4b), excluding a model in which homologous recombination repair induces pairing of entire chromosomes. Instead, it appears colocalization is specific for the cleaved cut site and its homology. As the sister locus is no different from any other chromosomal locus until it has been located through search, we concluded that the colocalization of sister loci implies completed homology search (Extended Data Fig. 4e). The −25-kb markers colocalized 9.1 ± 3.3 min (Fig. 2e) after the DSB, and thus the homology search is faster than this.

RecA filaments are thin and dynamic
The homology search is mediated by a RecA filament assembled on single-stranded DNA (ssDNA) 3 and structures made by fluorescent RecA fusions have previously been imaged [10][11][12][13][14] . We visualized RecA activity during DSB repair using a RecA-yellow fluorescent protein (YFP) fusion integrated in tandem with wild-type recA, a construct found to be fully active (Fig. 3c, Extended Data Fig. 5b, g). A similar construct has previously been shown to be fully functional 14 but has not been used to characterize distant repair events. Induction of DSBs led to the recB-dependent formation of RecA structures at the DSB sites (Fig. 3a, Extended Data Fig. 5a, f). These structures appeared 35 ± 98 s (Extended Data Fig. 5h) before the loss of the ParB focus, and disassembled 6.6 ± 5.2 min (Extended Data Fig. 5i) before repair was complete, as defined by segregation of the ParB foci. The SOS response is activated by RecA filament assembled on ssDNA 3 . We predicted that if the structures are RecA-ssDNA filaments, their lifetime would correlate with the strength of the SOS response. As this was indeed the case (Pearson's r = 0.36, P = 3 × 10 −12 ) (Extended Data Fig. 5d, e), we are confident that the structures are RecA-ssDNA filaments.
The RecA structures were highly dynamic on a timescale of tens of seconds (Fig. 3b). Their average lifetime was 8.8 ± 3.0 min (Fig. 3d), and during that time they displayed two forms: the initial spot at the DSB (existing for 2.6 ± 1.4 min) that increased in intensity (Extended Data Fig. 5a), suggesting RecA loading on ssDNA; and a filament extruding from the initial spot and extending throughout the cell (existing for 6.2 ± 2.8 min).
It has been reported that RecA forms bundle-like structures during bacterial DSB repair 10 . To test whether we could observe such structures, we used stimulated emission depletion super-resolution microscopy (STED) to image RecA tagged with the ALFA epitope 15 . RecA-ALFA alone was fully functional in DSB repair (Extended Data Fig. 5b, g). Immunostaining of cells undergoing DSB repair revealed that RecA forms thin, filamentous structures (Fig. 3f, h, Extended Data Fig. 6a) that were not associated with the membrane-rather, they appeared in the central region of the cell (Fig. 3i, j, Extended Data Fig. 6b-d, Supplementary Video 1). Deconvolution of the observed 60 ± 13 nm full width at half maximum (FWHM) with the point spread function (PFS) of the imaging system showed that the filament width is 37.5 ± 23.5 nm (Fig. 3g, Extended Data Fig. 6e, f). As expected, given the high mobility of the RecA-YFP structures in living cells, fixed cells exhibited a large variety of conformations, including complex, tangled threads or single, winding filaments spanning the length of the cell (Fig. 3h). Notably, we observed the same type of structures when immunostaining RecA-YFP in the tandem construct (Extended Data Fig. 6g), or the RecA-ALFA, RecA-YFP tandem construct (Extended Data Fig. 6h-k). These results show that the homology search is performed by thin and flexible RecA filaments.

RecA reduces dimensionality of search
The fast target search by RecA compared with other systems that also rely on homology-directed search 16 requires a different quantitative model. Owing to the slow diffusion rates of both the large RecA-ssDNA complex and the chromosomal loci 17 , the search cannot be explained by bi-molecular reaction-diffusion in three dimensions 18 . We propose that the RecA homology search is facilitated by a 'reduced dimensionality' mechanism that accelerates the process in two ways. First, ATP hydrolysis enables mechanical extension of the RecA-ssDNA filament across the cell in less than a minute, rapidly covering most of the distance between the broken ends and the search target as shown in Fig. 3a. Second, the extended filament interacts with many different sequences in parallel. Such simultaneous probing has previously been suggested on the basis of single-molecule experiments 19 and cryo-electron microscopy structures 20 . Our addition to the model is the realization that at any z coordinate (that is, along the long axis of the cell) there is always at least one segment of the stretched RecA-ssDNA filament that is homologous to a double-stranded DNA (dsDNA) segment at the same z position (Fig. 4a). This makes the search problem independent of the z coordinate and reduces the complexity from three to two dimensions. That is, the time of homology pairing is equal to the time it takes for a segment of chromosomal dsDNA to diffuse radially to the RecA-ssDNA filament, and not the time it takes for two segments to find each other by 3D diffusion in the whole cell. In this case, 2D search is approximately 100 times faster than 3D search 21 (for a detailed description, see Methods).
In the Supplementary Information, we derive an expression for the expected time for the homologous sequences to encounter each other. Our model predicts that the search will be completed within 5 min on the basis of the rate of diffusion of the target DNA at the length scale of the radius of the nucleoid 17 . Of note, the model also predicts that the search time should be invariant with respect to the cell length, as only the radial distance is relevant. This distinguishes our model from previous models, such as the one presented in ref. 22 . Experimental data confirm that the length of a cell has a minuscule effect on the search time estimated by either ParB foci splitting, RecA structure lifetimes or MalI foci colocalization in the mid-cell (Fig. 4b), despite larger cells having more DNA to explore (Extended Data Fig. 8).

Discussion
We propose that the stretched RecA-ssDNA filament-in a simple and elegant way-positions at least one ssDNA segment in the proximity of its homologous sister, such that the homologous dsDNA segment can find the ssDNA segment using a fast, short-range search. RecA is a prototypic member of the strand-exchange protein family, which is found in all forms of life and shares a common mechanism 3,5 . A reduced-dimensionality mechanism may be a conserved property of these proteins. The advantage of the stretched filament is obvious in elongated cells, and long RecA structures have indeed been observed in Caulobacter crescentus 11 and Bacillus subtilis 13 . When DSB repair occurs at the replication fork, where the sister template is nearby, fluorescent RecA forms only a transient focus at the site of the break 14 , suggesting that the search may finish before the RecA filament is fully extended. Previous work has suggested that the repair of DSBs induced by I-SceI is carried out by 'bundles', which are complex structures formed by RecA 10 . These bundles are characterized by a thick central body, low mobility, tens-of-minutes lifetime, and are positioned between the nucleoid and the inner membrane. We have found that the RecA filaments involved in DSB repair are markedly different from these bundles: they are thin, dynamic, last only for minutes, and are present within the nucleoid (Fig. 3h, i). Notably, according to the reduced-dimensionality model, the search time is not affected by an increase in the amount of DNA, as long as the length of the filament scales with the amount of DNA. The mechanism therefore enables search in organisms with genomes larger than those of bacteria.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-021-03877-6.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Strain construction
Strains used in this work are derivatives of E. coli TB28 23 in which we restored the rph-1 mutation to wild type and deleted the malI gene and malO operator. Genetic integrations were done using lambda red integration 24 , resistance cassettes were removed using the pCP20 plasmid 24 , and markers were combined using P1 phage transduction. The labels mCherry-ParBMt1 25 and MalI-Venus were expressed by constitutive promoters integrated into the gtrA locus. The malO/MalI marker consists of an array of 12 maltose operators, which are the binding sites for MalI-Venus.
The DSB cassette consists of the I-SceI recognition site flanked by two lac operators, a parSMt1 site and three chi sites positioned outside the parSMt1 and I-SceI recognition sequence (Extended Data Fig. 1). The Cas9 target site was chosen about 100 bp away from the parSMt1 site. Notably, the construct is designed in a way that there are no chi sites between the DSB site and parSMt1 site. The DSB cassette was integrated into codA locus.
The RecA-YFP fusion was expressed directly downstream from the endogenous recA gene, and made by replacing mCherry with SYFP2 in the construct by ref. 14 . The RecA-ALFA fusions were made by introducing ALFA C-terminal to recA in either the wild-type recA locus or into the tandem construct mentioned above.
List of the strains used in this study can be found in Extended Data Table 1.
psgRNA-CS1 was cloned in E. coli Top10 by blunt-end ligation of a fragment generated with a PCR with Jwu267 and Jwu184 and psgRNA 26 plasmid as a template (psgRNA was a gift from D. Bikard; Addgene plasmid no. 114005), the guide RNA (gRNA) sequence is ACTGGCTAAT GCACCCAGTA.
A list of primers used in this study can be found in Extended Data Table 2.
High-throughput DSB imaging Growth conditions. For the microfluidic experiments cells were grown at 37 °C in M9 medium supplemented with 0.4% glucose, 0.08% RPMI 1640 amino acids (Sigma-Aldrich R7131), surfactant Pluronic F-109 (Sigma-Aldrich 542342, 21 µg ml −1 ), and when relevant carbenicillin (40 µg ml −1 ) or kanamycin (20 µg ml −1 ). Cells from −80 °C freezer stock were used to inoculate LB medium supplemented with adequate antibiotics and grown overnight at 37 °C. On the next day, the cells were diluted 1/250 in M9 0.4% glucose 0.08% RPMI medium and grown for 2 h, then cells were loaded onto the microfluidic chip. Cells were grown in the microfluidic chip for at least 2 h before the start of the experiments. Induction of Cas9 and dCas9 was done with a 5-min (or 6-min for spot counting experiments in Extended Data Fig. 2c) pulse of aTc (20 pg ml −1 ).

Microfluidic experiments.
Microfluidics experiments were performed using PDMS mother machine microfluidic chips developed previously 7 . This chip design allows for loading of two different strains and for automated switching of the medium. Medium pressure was controlled using an OB1 MK3+ microfluidic flow controller (Elveflow). The aTc was pulsed at the beginning of the for 3-h experiments, or 20 min after the start of image acquisition for 4-h experiments.
Chromosome staining with DAPI. Strains EL2504 and EL1743 (with dnaC2 28 ) were grown overnight in LB medium. The next day, cell cultures were diluted in 5 ml of M9 with 0.4% glucose and 0.08% RPMI 1640 amino acids (Sigma-Aldrich R7131), and incubated at 37 °C (strain EL2504). The EL1734 strain was grown at 30 °C for 2 h before imaging one aliquot of the culture was grown at 42 °C to induce replication arrest. DAPI was added to a final concentration of 1 µg ml −1 , and cells with DAPI were incubated for 30 min at the growth temperatures. Then 1 ml of cells was centrifuged at 4 °C, 7,000 rpm (5424 R, Eppendorf) for 3 min and resuspended in 50 µl of cold ITDE (Integrated DNA Technologies) buffer supplemented with 10 mM MgCl 2 . Two µl of concentrated cells were mounted on an agarose pad for imaging. Imaging was done with a 445-nm laser at the power 12 mW cm −2 and exposure time of 220 ms. Phase-contrast images were segmented using Nested-Unet neural network 29 , trained in-house. DAPI images were corrected for background by subtracting the mean pixel intensity of the area outside of the cell. Image analysis. Data analysis was done in MATLAB (Mathworks), with the exception of the cell segmentation, which was done in Python. Microscopy data were processed using an automated analysis pipeline developed previously 30 , with several modifications. First, segmentation of phase-contrast images was done using Nested-Unet neural network 29 trained in-house, specifically for our microscopy setup. Pytorch 1.7.1 was used for the neural networks. We trained two networks, one to segment cells, and another to detect microfluidic growth channels on the phase-contrast images (Extended Data Fig. 1c). Transformation matrices between images acquired with different filter cubes were measured and fluorescent images were transformed to correct for the pixel shifts between fluorescence and phase-contrast images. Gramm 31 toolbox was used to generate some of the plots in Matlab.

Selection and analysis of cells undergoing DSB.
Cells undergoing a DSB repair were selected on the basis of an increase of at least fourfold in CFP signal from plasmid-borne SOS reporter. First, the CFP image was background corrected by subtracting an image filtered with a Gaussian filter with the kernel size of 20 pixels (using the Matlab function imgaussfilt) from the original CFP image. We limited the analysis to the cells that had no major errors in segmentation, lived for at least 9 min, and divided during the experiment. Manual repair dynamics annotation was done only on cells that contained two ParB foci prior to the DSB, and divided after the repair. Cells that had >2 ParB foci, or that induced a DSB more than once, were excluded from the analysis. In the experiment in Extended Data Fig. 2c ParB foci were detected automatically using a radial symmetry-based method. Spot positions were mapped on the cell length using a custom written Matlab code. All images showing example cells come from the experiments that were repeated at least twice, with exception of costaining of RecA-YFP and RecA-ALFA, which was done once (Extended Data Fig. 6 h-k).
2D and 3D STED microscopy. Super-resolution imaging was performed with a custom-built STED setup 32 . Excitation of the dyes was done with pulsed diode lasers; at 561 nm (PDL561, Abberior Instruments), 640 nm (LDH-D-C-640, PicoQuant) and 510 nm (LDF-D-C-510, PicoQuant). A laser at 775 nm (KATANA 08 HP, OneFive) was used as the depletion beam, which was split into two orthogonally polarized beams that were separately shaped to a donut and a top-hat respectively in the focal plane using a spatial light modulator (LCOS-SLM X10468-02, Hamamatsu Photonics), enabling both 2D-and 3D-STED imaging. The laser beams were focused onto the sample using a HC PL APO 100×/1.40 Oil STED White objective (15506378, Leica Microsystems), through which the fluorescence signal was also collected. The imaging was done with a 561-nm excitation laser power of 8-20 µW, a 640 nm excitation laser power of 4-10 µW and a 775 nm depletion laser power of 128 mW, measured at the first conjugate back focal plane of the objective.
Two-color STED imaging of RecA-YFP together with RecA-ALFA was done in a line-by-line scanning modality, averaging over 4 or 8 lines; while ParB and RecA-ALFA was recorded frame by frame, with the first channel in confocal and the second channel in STED. The pixel size for all 2D images was set to 20 nm with a pixel dwell time of 50 µs.
Volumetric two-color 3D-STED imaging of RecA-ALFA together with Nile red was recorded in a line-by-line scanning modality, while a single confocal frame of Pico488 was recorded at the middle of the bacterial cell afterwards. The voxel size for xyz volumes was set to 25 × 25 × 80 nm 3 . The pixel dwell time was set at either 30 or 50 µs.
Raw images were processed and visualized using the ImSpector software (Max-Planck Innovation) and ImageJ 33,34 . Brightness and contrast were linearly adjusted for the entire images. The size of the RecA filaments and RecA bundles were calculated tracing line profiles perpendicular to the structure orientation and averaged on two pixels on the raw images. The data were then fitted with a Gaussian function with the software OriginPro2020, from which the full width half maximum was extracted. For xz representation, images were deconvolved using the Richardson Lucy deconvolution algorithm implemented in ImSpector. 3D volumetric rendering was done with Huygens deconvolution in Imaris 9.1 (Bitplane).
The resolution of the microscope was measured on a calibration sample, made of sparse antibodies attached to the glass, coupled with the Star635P dye. The line profiles were extracted and fitted with a Lorentzian function 35 , from which the width (W) was extracted as the dot size.
We estimated the diameter of the filaments to be 37.5 ± 23.5 nm as a deconvolution of observed 60 ± 13 nm FWHM width (Fig. 3g) with the 35 ± 11 nm (FWHM) Lorentzian PFS of the imaging system (Extended Data Fig. 6f) assuming that filament is a cylinder with a 3 nm layer of fluorophores at the surface.

I-SceI experiments
For experiments with the I-SceI meganuclease, cells with the p15a-I-SceI plasmid were cultured in M9 minimal medium, loaded in a microfluidic chip and then incubated at 37 °C in M9 medium with glucose (0.4%), RPMI 1640 amino acid supplement (Sigma-Aldrich R7131, 0.05%), carbenicillin (20 µg ml −1 ) and Pluronic F-108 (21 µg ml −1 ). DSBs were induced by switching for 3 min to medium also containing aTc (20 ng ml −1 ) and IPTG (1 mM), and then for three min to medium with only IPTG. The cells were then imaged while repairing and recovering in the initial medium.

Serial-dilution plating assay
Bacteria cultures were streaked on LA plates from freezer stock and grown overnight at 37 °C. The following day, 5 ml of LB medium was inoculated with a single colony from the overnight plate and cultured for 6 h at 37 °C. Next, tenfold serial dilutions were made in LB and 4 µl of each dilution was plated on a LA plate, or LA plate containing 1 µg ml −1 of nalidixic acid. Plates were incubated overnight at 37 °C.

Search model by extended RecA filament
Homology search will be treated as a diffusion-limited reaction with a transport time for the homologous sequences to reach a reaction radius and a probing time once the sequences have met. The probing time for the correct sequence will be negligible compared to the time for getting the homologous sequences in contact, but the overall reaction will be slowed down by the overwhelming number of incorrect sentences that will need to be probed.
To quantify the situations we start with few approximations. Assume that the RecA-ssDNA filament is a thin rod in the centre of the cylindrical nucleoid reaching from pole to pole in the z direction, whereas the homologous dsDNA sequence is coiled up at a random position in the nucleoid. The relevant time for the homologous sequences to find each other is the time for a segment in the coiled up dsDNA to diffuse radially into the rod in the centre of the cell. The central realization in our model is that it does not matter at which z-coordinate it reaches the rod. This transforms the search problem from 3D to 2D, since we can describe the search process from the perspective of the dsDNA fragment that is homologous to the ssDNA sequence that happens to be at the z-position at which the rod is reached first.
An equivalent way to think about the situation is to consider that the first binding event for many independent searchers (that is, the chromosomal dsDNA segments), that each can bind to one out of many targets (that is, the RecA-bound ssDNA segments), has the same rate as one searcher that can bind all targets. The total rate of binding is r r = ∑ i i , in which r i is the rate for template segment i to find its homologous ssDNA segment. If we write out the dependance of at which z-coordinate, z j , the filament is reached, the total rate can be expressed as r r z p z = ∑ ∑ ( ) ( ) j i i j j , in which p z ( ) j is the probability to reach the filament at position z j and r z ( ) i j is the conditional rate for segment i of binding given that the filament is reached at z j . Here, the rate of binding is zero unless the template segment matches the ssDNA that is at the specific z-position, that is, r z ( )= 0 , that is, the total binding rate is the same as the binding rate for a single dsDNA segment that can bind at any position at the filament, irrespective of z position, and for which each position is homologous.

Search time prediction for E. coli
In a first-order approximation of how long it takes for a chromosomal dsDNA segment to diffuse from a random radial position in the nucleoid to the filament in its centre, we can use the diffusion limited rate for reaching a rod in the centre of a cylinder 21 of length 2L, that is, k = 2π(2L)D/ln(R/r), in which r is the reaction radius of the rod, which is assumed to be in the order of a nucleotide (1 nm), and R is the nucleoid radius. The concentration of the searching dsDNA fragment is c = 1/V = 1/(2LπR 2 ), in which V is the nucleoid volume. The average time to reach the rod is therefore T = V/k = R 2 ln(R/r)/2D. The nucleoid radius R ≈ 200 nm and the reaction radius is in the order of r ≈ 1 nm, although the exact value is inconsequential as only its logarithm enters into the time. The complicated parameter is the diffusion rate constant D, as DNA loci movement is subdiffusive and D is therefore lower at a long length scale than a short. The process will however be limited by the long distance movement corresponding to R. At the length scale 17 of R = 200 nm, D R ≈ 0.0007 µm 2 s −1 . The association step of the search process is thus T ≈ (0.2 µm) 2 × ln(200 nm/ 1 nm)/0.0007 µm 2 s −1 = 300 s = 5 min. If we consider that also the RecA filament is moving on the minute timescale this will only speed up search further.

Time for probing
The RecA filament will, however, not be accessible for binding all the time since it also needs to probe all other dsDNA segments. If only half of the measured search time (about 10 min) is needed for homologous sequences to meet, 5 min is still available for probing other sequences. Is this sufficient to probe all sequences? If the dsDNA is probed in n-bp-long segments, each of the equally long segments of the 2-kb-long ssDNA in the RecA filament will have to interrogate, on average, every 2,000/n segments of the chromosome. There are 9.6 Mb (4.8 Mb per genome × approximately 2 genomes per cell) of dsDNA in the cell, which corresponds to 9.6 × 10 6 /n probing segments. This, in turn, means that each ssDNA segment in the RecA filament needs to test (9.6 × 10 6 /n)/(2,000/n) = 4,800 dsDNA segments. The average time for each test cannot be longer than 300 s/4,800 ≈ 63 ms, which should be plenty of time, considering that Cas9-sgRNA takes on average 30 ms to perform a similar task 16 .

ATP binding and hydrolysis
ATP-RecA has high affinity towards ssDNA 15 and upon binding results in a stretched and rigid filament with persistence length 16 of about 900 nm. ATP hydrolysis lowers the affinity of RecA to ssDNA and is needed for rapidly discarding mismatched sequences 17,36,37 . ATP turnover is therefore needed to stretch the filament in the intracellular environment where it otherwise would get stuck at partial homologies.

Alternative models
Alternative models for how to get sequences sufficiently close to probe for homology can come in many other flavours.
The most naive comparison is considering the diffusion limited bi-molecular reaction of a particle with diffusion rate corresponding to the dsDNA segment and a non-diffusive segment of the filament with a reaction radius corresponding to r (about 1 nm) anywhere in nucleiod. Here, the rate of the diffusion limited reaction is k = 4πrD L , in which D L is the diffusion 38,39 rate at the length scale of the cell, whereas the concentration of the segment is the same as above (c = 1/V = 1/(2LπR 2 )). This results in T = V/k = LR 2 /2rD L . This should be compared to R 2 ln(R/r)/2D R in the 2D situation. Importantly, the diffusion rate at the length scale of the nucleoid radius, D R , is one order of magnitude 17 faster than D L . The ratio is (L/rD L )/(ln(R/r)/D R ) ≈ 1,750, considering L = 1 µm, R = 300 nm, r = 1 nm and D R /D L ≈ 10. The actual value for r is more important in this case, because the number of rebinding events is more important than in the 2D case.
An intermediate step towards the 2D model is to parallelize the naive model and think of the homologous dsDNA as divided in segments that can search in parallel and independently for their respective homology in the RecA filament. For example, if we divide a 1,750-bases-long filament into segments of 10, the speed would increase by 175 times and the remaining difference compared to the 2D model would be (D R /D L ) − fold difference of short-and long-range diffusion.
A final model is to consider the homologous DNA as static and that only the RecA filament moves similarly to a knitting needle in a ball of yarn. This situation is not as straightforward to quantify, since we do not know how rigid the RecA filament is on different length scales in the cell. It appears to be flexible on the 100-nm length scale, but the probing interactions will have to occur on the 1-nm scale and we therefore have no knowledge of how fast the filament would explore the genome. It can clearly probe many DNA segments simultaneously 19 but it causes complex constraints to probe sequences with one part of the filament at the same time as the filament should move to explore the rest of the genome. Detailed simulations may be needed to predict the expected search times for this type of model.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability
Raw microscopy data are available at https://doi.org/10.17044/scilifelab.14815802. Source data are provided with this paper.

Code availability
The computational code used for analysis and plotting is available at https://doi.org/10.17044/scilifelab.14815802.  Fig. 3f. b, Volumetric three dimensional view of cells (also presented in Fig. 3i) with extended RecA-ALFA structures, imaged by 3D STED. c, Top: 3D STED crosssections of cells with RecA-ALFA and Nile red membrane stain as indicated in Fig. 3j. Bottom: Line profiles of RecA and membrane intensities as indicated above. Peak-to-peak distances between RecA and membrane are indicated.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted

Software and code
Policy information about availability of computer code Data collection High-throughput microfludic experiments were acquired using Micro-manager 2.0.0. STED microscopy experiments were acquired using ImSpector 0.10 revision 8576

Data analysis
High-throughput microfluidic experiments: Phase image segmentation was done using custom Python (3.7.3) code using Pytorch 1.7.1 library. Cell tracking was done using Baxter algorithm (free software under MIT license). Gramm library (free software under MIT license) was used to create plots. STED microscopy: Images were analyzed using ImageJ 1.52p and plots were created with OriginPro2020 For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: -Accession codes, unique identifiers, or web links for publicly available datasets -A list of figures that have associated raw data -A description of any restrictions on data availability Raw microscopy data will be publicly deposited and accession codes will be provided at publication.

nature research | reporting summary
April 2020 Field-specific reporting Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
No statistical methods were done to predefine sample size. Sample sizes were chosen based on best practices in the field for experimental methods used.
Data exclusions To measure the dynamics of DSB repair (ParB, MalI, RecA) we analyzed the cells that could be segmented without errors and were tracked for at least 9 minutes. The errors include events when two segmented cells that merge into one, large jumps in size of segmented cells, large movements of the centre of the mass of segmented cell, lost of segmentation between consecutive cells. Next, only the cells that activated the SOS response (by increasing the CFP signal by at least 4-times) were selected for analysis. Finally, only the cells that showed clear DSB repair, that is had 2 separated ParB foci before the DSB that segregated into 2 foci after the repair, were analyzed.