Design and comparative characterization of RecA variants

RecA plays a central role in DNA repair and is a main actor involved in recombination and activation of the SOS response. It is also used in the context of biotechnological applications in recombinase polymerase isothermal amplification (RPA). In this work, we studied the biological properties of seven RecA variants, in particular their recombinogenic activity and their ability to induce the SOS response, to better understand the structure–function relationship of RecA and the effect of combined mutations. We also investigated the biochemical properties of RecA variants that may be useful for the development of biotechnological applications. We showed that Dickeya dadantii RecA (DdRecA) had an optimum strand exchange activity at 30 °C and in the presence of a dNTP mixture that inhibited Escherichia coli RecA (EcRecA). The differences between the CTD and C-tail of the EcRecA and DdRecA domains could explain the altered behaviour of DdRecA. D. radiodurans RecA (DrRecA) was unable to perform recombination and activation of the SOS response in an E. coli context, probably due to its inability to interact with E. coli recombination accessory proteins and SOS LexA repressor. DrRecA strand exchange activity was totally inhibited in the presence of chloride ions but worked well in acetate buffer. The overproduction of Pseudomonas aeruginosa RecA (PaRecA) in an E. coli context was responsible for a higher SOS response and defects in cellular growth. PaRecA was less inhibited by the dNTP mixture than EcRecA. Finally, the study of three variants, namely, EcPa, EcRecAV1 and EcRecAV2, that contained a combination of mutations that, taken independently, are described as improving recombination, led us to raise new hypotheses on the structure–function relationship and on the monomer–monomer interactions that perturb the activity of the protein as a whole.

First, the nucleoprotein filament is formed by binding of singlestranded DNA (ssDNA) to site I (step1). Then, the filament is activated by binding of the cofactor ATP to the ATP binding site (step 2). Next, RecA performs a homology search to find a homologous double-stranded DNA (dsDNA). In this process, dsDNA interacts first with the NTD (step 3) and then with the CTD (step 4), by which it can move to site II, where it binds (step 5). If the bound dsDNA is homologous to the ssDNA, strand exchange is performed; if it is not, the dsDNA is released (step 6). During these steps, the nucleoprotein filament presents hydrolytic activity, carried out by the [KR] × [KR] hydrolysis motif containing Lys248 and Lys250, which cooperate with Glu96 on the other monomer. The figure was generated using the PyMol program 18  www.nature.com/scientificreports/ In fact, many amino-acid point mutations have been identified to improve the different properties of RecA. Many of them modify the monomer-monomer interface because of the importance of subunit interactions for all RecA activities.
The L29M mutation alters the monomer-monomer interface and the flexibility of the whole protein. It is located in the N-terminal region, which is involved in the interaction with dsDNA. The study of chimaeras between RecA proteins of E. coli and Pseudomonas aeruginosa (Ec-Pa proteins) suggested that this mutation caused a threefold increase in the frequency of recombination exchanges (FRE) in vivo. In vitro, it caused an increase in dsDNA affinity, an increase in SSB displacement and a decrease in filament dissociation 8,9 . In the same region and interface of the protein, R28D increased the FRE by 38-fold in vivo but was not studied in vitro. However, since the R28A mutation that also increased the FRE (but to an extent that was less than R28D, 27-fold) increased ssDNA binding affinity, we believe that the R28D mutation may also have an increased ssDNA binding affinity in vitro 8,10,11 .
Comparison of other Ec-Pa RecA chimaeras highlights other mutations located at the monomer-monomer interface, such as I102D, which increases the FRE by twofold, but it has not been studied in vitro. Additionally, the combination of I159M + S162A + M164V mutations has the same effects as the mutation I102D, increasing the FRE by 1.5-fold. This combination of mutations also modifies site I of the protein 8 .
Mutations V79L and I102L have been shown to increase the conjugational capacity in vivo, but they displayed a greater persistence on DNA and caused barriers to replication and transcription processes. In vitro, they showed a more rapid displacement of SSB. I102L modifies the monomer-monomer interface 12 .
Furthermore, at least one of the L178I, A179T or L182I mutations that modify site I of the protein, where ssDNA binds, was supposed to be responsible for the increase in ssDNA affinity of P. aeruginosa RecA (PaRecA) 13 . However, this hypothesis has never been investigated in vivo or in vitro.
A289S was found during directed evolution of enhanced ionizing radiation resistance. It has not been studied in vitro, but we hypothesize that it could, similar to D276A or D276N, have an impact on RecA strand exchange activity because these three mutations were found by the same process and have the same in vivo phenotype. It is also thought that this mutation modifies the interaction with the LexA protein. It is located in the CTD of the protein, that is the dsDNA gateway 14,15 .
Finally, suppression of a part of the C-terminus of the EcRecA protein (ΔC17 or ΔC25) has been shown to increase the FRE by fourfold in vivo and to increase dsDNA binding and pairing and to display a more efficient SSB displacement and a more efficient binding to secondary structures in vitro 11,16,17 .
The main objective of this study was to characterize the ability of RecA variants to perform DNA strand exchange reaction required for RPA amplification. Thus, firstly, we highlighted the biological properties of RecA variants, in particular their recombinogenic activity and their ability to induce the SOS response, to better understand the structure-function relationship of RecA and the effect of combined mutations. Then, we highlighted the biochemical properties of RecA variants that may be useful for the development of biotechnological applications and in particular RPA amplification. Indeed, it is fundamentally important to understand the enzymatic mechanisms and to have a panel of different enzymatic properties to design devices that can be used in a wide variety of contexts and applications.

Results
RecA variants: selection and design. In this work, we selected four RecA proteins originating from different bacteria. EcRecA was chosen as a reference because it is the most studied RecA protein. RecA from Dickeya dadantii (DdRecA) was retained because it is a close relative of EcRecA (88% identity, Fig. 2). The major differences are located in the C-terminal domain of these two proteins. However, the two bacteria have different ecological niches and different optimal growth temperatures. While E. coli colonizes the gut and grows at 37 °C, D. dadantii colonizes plants and grows at 30 °C 20 . Therefore, it is hypothesized that DdRecA may have specific characteristics with regard to EcRecA. PaRecA was included in the study because of its hyperrecombinogenic activity. Indeed, previous works have revealed that in vivo PaRecA has an FRE that is 6.5 times higher than EcRecA 21 . In vitro, PaRecA displays a higher affinity for ssDNA, a more efficient SSB displacement, a higher salt and temperature stability and disassembles at only half the rate of EcRecA 8,9,13 . Finally, RecA from Deinococcus radiodurans (DrRecA) was selected because of its distinct biochemical characteristics. D. radiodurans is well known for its extreme resistance to ionizing radiation and other DNA damage-and oxidative stress-generating agents. More than 100 genes have been reported to contribute to this resistance in Deinococcus. These genes mainly encode proteins involved in DNA repair and oxidative stress defence 22 . DrRecA plays a central role in DNA repair 23 . DrRecA displays 55% sequence identity to EcRecA, with an extension in the NTD and major differences in the CTD, and uses an inverse pathway in which the filament is formed on dsDNA instead of ssDNA. It also tends to create a large number of shorter filaments than EcRecA, while it also nucleates more rapidly but extends the filaments more slowly than EcRecA. Furthermore, it has high DNA binding activity [24][25][26] .
Additionally, three E. coli RecA variants (EcPa, EcRecAV1, and EcRecAV2) were designed. These variants contain a combination of mutations that, taken independently, are described as improving recombination.
Some of these mutated amino acids are close to each other in 3D structure or in direct contact, either within a monomer or at the interface between two monomers, and are susceptible to interaction (Fig. 3 www.nature.com/scientificreports/ the L29M mutation from monomer 1 is in the vicinity of the I102D or I102L mutation from monomer 2 (mutations present in EcPa and EcRecAV2). Mutations L178I, A179T and L182I from monomer 1 are close to the I159M mutation from monomer 2 (mutations present in EcPa). The L29M mutation from monomer 1 is close to the V79L mutation from monomer 2 (mutations present in EcRecAV1 and EcRecAV2), and finally, mutations R28D and L29M present in EcRecAV2 are consecutive within the same monomer. From the 3D structure, it is difficult to predict whether combinations of these mutations will have additive, synergistic or antagonistic effects on recombination. The biological and biochemical properties of the seven RecA proteins were studied. In particular, we analysed the effects of RecA production on growth and transcription, their recombinogenic activity and their ability to induce the SOS response.
Among the biochemical properties relevant for biotechnological applications, we determined the optimal temperature to be between 30 and 37 °C, the optimal buffer to be between Tris-chloride or Tris-acetate buffer, the impact of the addition of a mixture of the four dNTPs needed in the context of isothermal amplification, as well as ssDNA binding. It was decided to not study the ATPase activity of the different proteins because there is no consistent correlation or coupling between the ATPase activity and the other RecA activities 26 .
Expression of some RecA variants produces a growth defect. Since the RecA variants engineered in this study have the potential to interact tightly with DNA, we first explored their impact on cell growth. MG1655 nalR lacZΔP recA::Cm cells producing any of the seven variant forms of RecA protein expressed under an IPTGinducible promoter were cultured in LB broth at 37 °C with 0, 0.05 or 1 mM IPTG and 100 µg/ml Ampicillin in microplate. The MG1655 nalR lacZΔP RecA + strain transformed with pQE-80L was used as control.
With 1 mM ITPG, relative overproduction of the different RecA proteins ranged from 0.8 to 1.2 compared to EcRecA ( Supplementary Fig. 1). Cells producing EcRecA, DdRecA and EcPaRecA showed no growth deficiency, as the growth profile was identical for cells with and without a plasmid expressing RecA (Fig. 4). However, growth was altered in the case of cells expressing PaRecA: the density of cells was lower in the steady state. In the case of cells producing DrRecA, EcRecAV1 and EcRecAV2, growth was even more altered, with a significant lag and AAF11887: Deinococcus radiodurans R1 RecA. The three E. coli variants EcRecAV1, EcRecAV2 and EcPa were also aligned. The filled red boxes indicate strict identity between residues, and the filled yellow boxes indicate strong homology between residues. No difference is made between black and red characters in this case because this feature is applied for other usages. At the top of the first sequence, the 2D structure of E. coli RecA is shown: the squiggles represent α-helices; the arrows, β-strands; the TT letters, strict β-turns; and the star, a residue with multiple conformations revealed by crystallography. The 2D structure is from the PDB entry 4TWZ. Additionally, the different domains of RecA are indicated in boxes: in red, the N-terminal domain (NTD); in pink, the ATP binding site; in blue, the hydrolytic residue and the hydrolytic motif; in orange, site I; in cyan, site II; in yellow, the dsDNA gateway in the CTD; and in green, the C-tail and the Mg 2+ binding sites. The first methionine is present in the alignments to allow the program to perform well. Thus, the numbering of each amino acid is shifted by one in the figure compared to the text. For example, in EcRecA, L29 is in the 30 th position in the figure. The figure was generated using the ENDscript server 27  www.nature.com/scientificreports/ time before entering the exponential phase, in addition to a lower density of cells at steady state. With 0.05 mM IPTG, toxicity is avoided, except for DrRecA and EcV1RecA for which the toxicity is still present albeit limited ( Supplementary Fig. 2). Cellular toxicity could arise due to extreme and potentially deleterious levels of genetic exchange, strong activation of the SOS response that inhibits cell division and indirectly induces toxin-antitoxin production or the creation of barriers to replication and/or transcription by persistently bound RecA filaments 12 .
Expression of some RecA variants impacts transcription. The capacity of RecA variants to create barriers impacting transcription was evaluated in MG1655 recA::Cm (recA − lacZ + ) cells by measuring lacZ transcription. This strain was transformed with pQE-80L plasmid derivatives containing each of the seven recA variants, induced with 0.05 mM IPTG, a less toxic concentration for growth, and lacZ transcription was estimated through a β-galactosidase activity assay. The effect of RecA variants was compared to that of wild-type EcRecA, taken as a reference, by calculating the ratio of β-galactosidase activity in the presence of the variant to β-galactosidase activity in the presence of EcRecA (Fig. 5). No significant differences were found between EcRecA and the DdRecA, PaRecA or EcPa variant. However, the production of the DrRecA, EcRecAV1 and EcRecAV2 variants induced a significant decrease in gene transcription. This finding suggested that overproduction of these three proteins created barriers that hindered transcription and might contribute to their cellular toxicity.
RecA variants exhibit different capacities to activate the SOS response. The ability of RecA variants to induce the SOS response was examined by monitoring the expression of the sulA::gfp reporter gene. The SOS response was evaluated in two settings: without any induction of the SOS response and after the addition of nalidixic acid, an antibiotic that induces the SOS response.
First, none of the proteins produces a constitutive SOS response: in fact, without nalidixic acid, the fluorescence measures were comparable to those without any recA expression, with the exception of PaRecA; nevertheless, the fluorescence measured was far lower than in the presence of the inducer antibiotic (Fig. 6).
DdRecA and EcPa variants presented a fluorescence value comparable to that of EcRecA, with and without induction by nalidixic acid. Thus, the SOS response is assumed to be similar for these three proteins and indicates that they are able to interact with LexA. PaRecA resulted in a higher fluorescence value than EcRecA, with and without induction, indicating that this protein induced the SOS more efficiently. The high activation of the SOS www.nature.com/scientificreports/ response by PaRecA contributed to the inhibition of cellular division and growth deficiency. In the case of the EcRecAV2 variant, fluorescence levels were higher only under nalidixic acid induction. This finding suggested that SOS induction promoted the formation of EcRecAV2 filaments tightly interacting with DNA that were probably unable to disassemble and were consequently responsible for SOS superinduction. No significant SOS response induction was detected with the DrRecA and EcRecAV1 variants, suggesting that these two proteins were probably unable to interact with E. coli LexA.
Recombinogenic activity of RecA variants. The recombinogenic activity of RecA variants was estimated by restoration of an intact lacZ gene after conjugation between a donor strain Hfr3000 lacZΔT::CmR (lacZ -) containing a lacZ C-terminal deletion and sharing 2.7 kb homology within lacZ with the recipient strain MG1655 nalR lacZΔP recA::CmR (recA − lacZ − ), which contained a lacZ N-terminal deletion. Plasmid derivatives containing each of the seven recA variants were introduced in the recipient strain, and conjugation was then performed with the donor strain. Reconstitution of an intact lacZ gene was estimated through a β-galactosidase activity assay 28,29 . Recombinogenic activity of the RecA variants was compared to the wild-type EcRecA as a reference by calculating the ratio of the β-galactosidase activity in the presence of the variant to the β-galactosidase activity in the presence of EcRecA (Fig. 7). A conjugation between Hfr3000 lacZΔT::CmR and MG1655 nalR lacZΔP RecA + was introduced as control.
It is important to note that a lower recombinogenic activity can result from an alteration of lacZ transcription caused by some RecA variants, as seen above. However, it was decided to not apply a correction factor to avoid introducing a bias. To exclude the effect on cell growth, traditional recombination assays requiring cell survival were not considered, but transient recombination assays were instead used 30 .
As expected, without any RecA, very low recombinogenic activity was detected. The RecA + control gave similar results to plasmidic EcRecA expressed protein.
The DrRecA and EcRecAV1 and EcRecAV2 variants showed very low recombinogenic activities, which were not significantly different from the negative control (p-values = 0.3192; 0.9656 and 0.4157, respectively, not shown in the figure).
The recombinogenic activity of DdRecA was 2 times lower than that of EcRecA, but the measured activity was significantly different from the negative control (p-value = 0.0015, not shown in the figure).
The recombinogenic activity of PaRecA displayed no significant differences from EcRecA in our transient recombination experimental setup. In the literature, the frequency of recombination exchanges was assayed to be 6.5 times higher for PaRecA than for EcRecA, but the experimental conditions were different. While the monitoring of recombination here was based on Hfr transient conjugation and reconstitution of the lacZ gene www.nature.com/scientificreports/ between donors and recipients, the calculations in the literature were based on the transfer of donor markers after conjugation and selection of transconjugants. The recombinogenic activity of the EcPa variant tended to be 2 times higher than that of EcRecA, with a p-value of 0.11. These results will be discussed further below.

Study of different reaction conditions for the strand exchange activity of RecA variants.
The strand exchange activity of each protein was tested in the standard 3-strand exchange assay using M13mp18 substrates in four settings, namely, two buffers, Tris-acetate pH 7.5 and Tris-HCl pH 7.6, classically used in strand exchange reactions (the first is used commonly 11,12,15 and the second is the commercial buffer developed by New England BioLabs 31 ), and two temperatures, 30 and 37 °C, corresponding to the two optimal growth temperatures of the organisms studied. The efficiency of strand exchange was evaluated semi-quantitatively by calculating the fraction of fluorescence corresponding to the nicked circular product (NCP) compared to the total fluorescence excluding ssDNA and joint molecules (the total fluorescence considered is the sum of the fluorescence of the NCP and the fluorescence of the linear dsDNA) (Fig. 8, Supplementary Fig. 3). The strains analysed are RecA + strain (green dot filled bar), ΔrecA strain (unfilled bar), ∆recA strain producing EcRecA (green filled bar, it is the reference but is presented for visual comprehension) and ∆recA strains producing the variants PaRecA, DrRecA, EcPa, EcRecAV1 and EcRecAV2 (green dot filled bars). NS indicated a non-significant p-value (from the one sample t test compared to 1, the EcRecA value), * indicates a significant p-value < 0.05 and ** a significant p-value < 0.005. Error-bars indicate SD. www.nature.com/scientificreports/ This method allowed us to determine which protein was most efficient in performing strand exchange for each condition, as well as the best buffer and temperature for each protein.
EcRecA was the most efficient variant at 37 °C in both buffers, whereas DdRecA was the most efficient variant at 30 °C.
EcRecA and PaRecA performed best in Tris-HCl buffer, whereas the DdRecA, DrRecA and EcPa performed best in Tris-Ac buffer. EcRecA, PaRecA, DrRecA variants functioned better at 37 °C, whereas DdRecA and the EcPa variants functioned better at 30 °C.
In addition, EcRecAV1 and EcRecAV2 variants were unable to catalyze significant strand exchange independently of the conditions used (< 10%).
The impact of the addition of a mixture of the four dNTPs needed in the context of DNA isothermal amplification for infectious disease diagnostic was further analysed using the same experimental setup and the best conditions established above for the different proteins.
The addition of the dNTPs had a negative effect on the activities of EcRecA, DrRecA and EcPa variant, while it had no effect on PaRecA and a positive effect of DdRecA, for which the activity was increased by twofold (Fig. 9). The effects on the EcRecAV1 and EcRecAV2 variants were not evaluated due to the absence of significant strand exchange activity. These results will be discussed further below.
Study of ssDNA binding to RecA variants. ssDNA binding is the first step performed by RecA and leads to strand exchange and to activation of the SOS response. Altered ssDNA binding can impact all cellular processes involved in DNA repair. Binding of ssDNA should also be evaluated to design effective assays for biotechnological applications.
Therefore, ssDNA binding was assessed by an electrophoretic mobility shift assay. A short 35-nt FAM-labelled ssDNA was incubated with each RecA variant for 10 min under the conditions determined in the previous section. The fraction of bound primer was then determined by fluorescence analysis using electrophoresis on native polyacrylamide gels.
Binding was performed in the presence of ATP or in the presence of non-hydrolysable ATPγS. In the presence of ATP, ATP is hydrolysed, and the filament dissociates, so a low fraction of bound primer was expected. In the presence of ATPγS, the filament is in an active P-state configuration that does not dissociate due to the absence of hydrolysis 32 . Indeed, with ATPγS, RecA filaments have high stability 33 .
Thus, without hydrolysis, binding was optimal for EcRecA, PaRecA and DrRecA (> 90%). Binding was lower but substantial for DdRecA and EcPa variant (> 20%), and it was low for the EcRecAV1 and EcRecAV2 variants (< 10%) (Fig. 10, Supplementary Fig. 4). Note that in the case of DdRecA, binding was assessed after 20 min, rather than 10 min, due to the observed slow binding kinetics (results not shown).
In the presence of ATP, binding was at least fivefold lower than binding in the non-dissociating conformation for EcRecA, DdRecA, PaRecA and DrRecA and twofold lower for the EcPa variant. However, for the EcRecAV1 www.nature.com/scientificreports/ and EcRecAV2 variants, contrary to expectations, the binding of ssDNA was higher in the presence of ATP than in the presence of ATPγS. These data highlighted an altered cofactor interaction for these two RecA variants.

Discussion
The RecA protein structure, shared by RecA homologues in virtually all organisms, is highly malleable and tailored by evolution to the requirements for genome maintenance in each particular organism. Indeed, comparisons of RecA proteins from different bacteria highlight subtle variations.
In this work, we compared the biological and biochemical properties of four RecA proteins originating from E. coli, D. dadantii, P. aeruginosa and D. radiodurans and three EcRecA variants containing a combination of mutations that, taken independently, are described to improve recombination.
Both in vivo and in vitro results revealed variations in the biological and biochemical behaviours of the RecA variants compared to wild-type EcRecA, which was chosen as a reference.
DdRecA and EcRecA behaved similarly. Their overproduction had no impact on growth and gene transcription. In addition, both activated SOS response. Homologous recombination activity was twofold lower for DdRecA than for EcRecA. As expected from the high sequence similarity, the in vivo activity was similar between DdRecA and EcRecA. However, in vitro, the differences were more pronounced: the activity of DdRecA was optimum at 30 °C, at which temperature the activity was higher than that of EcRecA, which is optimum at 37 °C. This probably reflects an adaptation of RecA proteins from these two enterobacteria to their respective ecological niches during evolution. The ratios of β-galactosidase activity in the presence of the variant versus β-galactosidase activity in the presence of EcRecA, used as a reference, were calculated for 3 different biological experiments. These ratios are plotted as a histogram. The strains analysed are RecA + strain (blue dot filled bar), ΔrecA strain (unfilled bar), the ∆recA strain producing EcRecA (blue filled bar, it is the reference but is presented for visual comprehension) and ∆recA strains producing PaRecA, DrRecA, EcPa, EcRecAV1 and EcRecAV2 variants (blue dot filled bars). NS indicates a non-significant p-value (from the one sample t test compared to 1, the EcRecA value), * indicates a significant p-value < 0.05 and ** a significant p-value < 0.005. Error-bars indicates SD. www.nature.com/scientificreports/ Furthermore, the addition of dNTPs at a concentration of 250 µM used for the biotechnological application RPA, which inhibits EcRecA activity, increased DdRecA activity by twofold. Inhibition of wild-type EcRecA by GTP, CTP and TTP is well known and is explained by the competition to occupy the ATP binding site and a decrease in the binding affinity to DNA, affecting filament structure 4,34-38 . As dATP acts as a cofactor for RecA in the same way as ATP, it is hypothesized that the dNTP would interact in a similar way to the corresponding NTP and that adding competitive cofactors to the reaction would explain the inhibition observed for EcRecA strand exchange activity 39 . The positive effect of dNTPs on DdRecA is particularly interesting but could not be correlated with any sequence differences between EcRecA and DdRecA ATP binding sites or hydrolytic residues,. As the CTD and the C-tail of DdRecA are slightly preserved compared to EcRecA, we hypothesized that this domain plays a role in the conformation changes induced by cofactor binding and could explain the modifications of the dNTP effect (Fig. 2). These two findings can be used in the development of isothermal RPA. Indeed, the temperature of 30 °C is closer to room temperature than 37 °C in a large number of countries. On the other hand, a slower interaction with short ssDNA has also been observed, a property that should be considered during the conception of biotechnological applications. As this last property can slow down the homology search between the oligonucleotide and the DNA of the sample, it can also modify the balance with the activity of the polymerase, which could be less obstructed by RecA with this type of kinetics. www.nature.com/scientificreports/ Overexpression of PaRecA altered cell growth and induced a higher SOS response than the EcRecA-activated response. PaRecA has been described earlier as lacking constitutive SOS function and having a normal ability to cleave the E. coli LexA repressor. Nevertheless, it has been shown that in an E. coli genetic context, a higher SOS response is observed: when an operator promoter from the E. coli lac gene is used instead of the operator promoter from the P. aeruginosa recA gene, the basal SOS response without inducer is 1.8-fold higher than that of E. coli, which is explained by the higher protein expression relative to the very weak expression in the P. aeruginosa background 8,21 . Thus, our results could be explained by the overproduction of PaRecA from pQE-80L plasmid. In addition, the higher SOS response in the presence of the inducer nalidixic acid confirmed that the SOS response induced by PaRecA is not constitutive. On the other hand, no impact on transcription was found, indicating that PaRecA filaments did not persistently bind to DNA and did not form significant barriers hindering transcription. Homologous recombination activity was close to that of EcRecA. However, the activity was  www.nature.com/scientificreports/ expected to be higher since a ΔFRE of 6.5 has been reported in the literature. This discrepancy can be explained by the difference between the two experimental procedures, highlighting the variability of RecA activity depending on the experimental and cellular conditions. In vitro, the best working conditions of this protein were close to those of EcRecA, except that the activity of PaRecA did not appear to be inhibited by dNTPs. PaRecA, similar to DdRecA, also presents differences in the CTD suggesting that these differences are responsible for the insensitivity to dNTP inhibition. In addition, it is interesting to note that although the optimum growth temperature of P. aeruginosa is 30 °C, the RecA activity was higher at 37 °C compared to 30 °C. This may be related to the fact that P. aeruginosa is an opportunistic pathogen capable of colonizing the lungs and causes severe infection in individuals with cystic fibrosis. In the lungs, P. aeruginosa is confronted with 37 °C temperature and with the host immune response and must deal with oxidative stress that causes DNA damage by various mechanisms. For example, among the reactive oxygen species, the highly reactive hydroxyl radical (•OH) reacts with DNA by addition to double bonds of DNA bases and by abstraction of an H atom from the methyl group of thymine and each of the C-H bonds of 2′-deoxyribose 40 . Optimal activity of RecA is therefore necessary under these conditions to repair DNA. Recently, it has been shown that EcRecA is inactivated under oxidative stress following oxidation of two important methionine residues (Met 35 and Met 164) to methionine sulfoxide. These two residues are not conserved in PaRecA and DrRecA originating from bacteria that are highly resistant to oxidative stress (Fig. 2) 41 .
Overexpression of DrRecA altered cell growth and had a negative impact on gene transcription, indicating that it bound DNA permanently and created barriers to RNA polymerase and probably also to DNA polymerase in the E. coli cellular environment. In addition, we observed that DrRecA did not perform homologous recombination in E. coli, suggesting that it could not interact with E. coli recombination accessory proteins. These behaviours could also be due to the absence of specific proteins interacting with DrRecA in its native cellular environment. In particular, D. radiodurans, encodes an unusual SSB that is twice the size of EcSSB and plays a more central role in DNA pairing and strand exchange 42 . In addition, Deinococcus encodes several proteins that contribute to extreme radioresistance, including RqkA or PprA, which are absent in E. coli. RqkA is a serine threonine/tyrosine kinase that catalyses the phosphorylation of DrRecA at Tyr-77 and Thr-318 and modifies its activity in several ways, including increasing its affinity for dsDNA and its preference for dATP over ATP. It enables DrRecA to use abundant dsDNA substrates for efficient dsDNA break repair 43 . The PprA protein negatively regulates DrRecA functions by inhibiting DNA strand exchange and ATP hydrolysis to protect the D. radiodurans genome from hyperrecombination and its associated negative effects 44 .
In the same way, the absence of the SOS response could be due to the absence of an interaction between DrRecA and EcLexA. This can be partly explained by the substitution of arginine, which is involved in the EcReca-LexA interaction, by a lysine in DrRecA at position 243 21,45 . In addition, although Deinococcus radiodurans encodes two LexA homologues that possess proteolytic activity that can be stimulated by RecA, these proteins do not participate in the induction of recA, indicating the malfunction of the classical SOS response system [46][47][48] .
Concerning the in vitro activities of DrRecA, the main difference from EcRecA was the buffer specificity. DrRecA was totally inhibited by the Tris-HCl buffer. Its highest activity was found in Tris-acetate buffer, and the activity increased in the presence of glycerol and glutamate potassium (results not shown). In fact, buffer composition affects the biophysical and structural properties of the proteins 52 . The different molecules of the buffer can affect the surface protein and its electrostatic properties, thus having an effect on the protein itself and on the interaction with its different partners. However, because of the extremely diverse properties of proteins, it is difficult to predict a priori the effect of the buffer composition in a theoretical way 52 . Previous studies on the buffer impact on EcRecA have shown that anion present in the buffer influences the binding affinity, which is higher when acetate ions are present instead of chloride ions at the same concentration, suggesting the modification of ionic interactions 33 . Indeed, ion nature and its concentration influence RecA structure, aggregation, unfolding transitions and stability and thus protein-protein interactions 53,54 .
Additionally, similar to PaRecA, DrRecA activity performed better at 37 °C than 30 °C, despite the optimal cell growth temperature of 30 °C. Numerous Deinococcus species have been isolated from arid deserts where they are exposed to high temperature and UV radiation 55 , and D. radiodurans has been shown to be an extremely resistant bacterium that survives under various environmental stress conditions, including heat stress 56 . The activity of RecA has probably been adapted during evolution to function under these conditions. The EcPa variant behaved similarly to EcRecA, with unaltered cell growth, gene transcription and SOS response activation. Its homologous recombination activity was twice that of EcRecA. However, in vitro, the activity was not improved compared to EcRecA under the conditions tested. Interestingly, this is an E. coli RecA variant with P. aeruginosa mutations, both EcRecA and PaRecA proteins being optimal at 37 °C and in Tris-HCl buffer, while the EcPa variant performed better at 30 °C in Tris-Ac buffer. Once again, these results highlight that, the impact of the combination of mutations and the modulation of protein surface properties by different buffers are unpredictable 52 .
As expected, the mutated amino acids present in EcPa led to an increase in homologous recombination activity. This activity is higher than that of PaRecA in this configuration, leading us to believe that the mutated amino acids interact with amino acids only present in EcRecA.
However, it was thought that one of the mutations between L178I, A179T and L182I would modify site I of the ssDNA such that the affinity for ssDNA would be increased 13 . This was not observed; in contrast, the affinity for ssDNA decreased for this variant. Therefore, we hypothesize that these mutations are not responsible for the increased affinity for ssDNA of PaRecA or that they need to interact with other amino acids present in PaRecA but not present in the EcPa variant.
Both the EcRecAV1 and EcRecAV2 variants have a strong impact on cell growth, gene transcription and recombination inhibiting all activities. In vitro, strand exchange was also strongly inhibited. Evaluation of ssDNA binding showed that the in vivo and in vitro observations can be explained by an altered behaviour of these www.nature.com/scientificreports/ proteins with its cofactor, ATP, in the presence of which DNA binding was strong. This can prevent DNA interactions with the rest of the cellular machinery and be the origin of the inhibition of all cellular processes. Globally, both EcV1RecA and EcV2RecA variants are loss of function mutants. The two proteins have the L29M and V79L mutations that are spatially close in the 3D structure with the L29M mutation from monomer 1 contacting the V79L mutation from monomer 2. Thus, the combination of the two mutations is antagonistic, resulting in an inadequate interaction with the cofactor that causes the absence of homologous recombination activity in vivo and strand exchange activity in vitro.
The main difference between EcRecAV1 and EcRecAV2 variants concerned the ability to induce SOS response. EcRecAV1 did not induce the SOS response while EcRecAV2 activated strongly SOS response in the presence of inducer nalidixic acid. The absence of SOS activation for EcRecAV1 could be related to the A289S mutation that has been hypothesized to modify the interaction with LexA 14 . Interestingly, this mutation is also present in DrRecA that does not exhibit the SOS response in the cellular context tested.
On the contrary, the strong SOS response activation by EcRecAV2 variant could be related to the R28D mutation that has been shown to slightly increase the SOS response by 2.2-fold 11 .
In conclusion, we characterized seven RecA variants with different characteristics in vivo and in vitro. Our study has revealed unexpected effects of combinations of different mutations and that in vivo and in vitro RecA activities do not have direct correlations.
DdRecA was studied for the first time, and three important and distinct features were discovered in vitro: a high activity at 30 °C, the potentiation of strand exchange activity in the presence of a dNTP mixture and the slow binding of ssDNA. These features can be considered in the development of new biotechnological applications, using recombinase polymerase DNA isothermal amplification.

Materials and methods
Biochemicals, bacterial strains, enzymes and DNA. The recA genes were cloned into the plasmid pQE-80L, with a 6-His tag at the N-terminus, Amp resistance gene and lacIq repressor, by Agentide. An empty pQE-80L plasmid was used as a negative control.
The bacterial strains used were Hfr3000 lacZΔT CmR (with a lacZ gene truncated from its terminal part + 2798 to 3051) and MG1655 nalR lacZΔP recA::CmR (with a lacZ gene truncated from its initial part − 138 to + 10) 28 . In addition, MG1655 recA::CmR was prepared by P1 transduction. For the study of the SOS response, ME12 ∆recA::kan was used 58 .
M13mp18 ssDNA, M13mp18 RFI and dNTPs (10 mM each) were purchased from New England BioLabs (NEB). M13mp18 RFI was linearized with PstI from NEB. ATPϒS was purchased from Roche. ATP, singlestranded binding protein from E. coli (SSB), creatine kinase and phosphocreatine were purchased from Sigma. A 35-nt oligonucleotide labelled with 5'-FAM was synthesized from Eurofins. Double-stranded DNA is a 1-kb preparation from an E. coli gene prepared by PCR amplification and purification with the QIAquick PCR purification kit from Qiagen.
Effect on cell growth. MG1655 nalR lacZΔP recA::CmR transformed strains were used for this experiment. The MG1655 nalR lacZΔP RecA + strain transformed with the empty pQE-80L plasmid was used as control. Cells were grown in LB broth at 37 °C with 0, 0.05 or 1 mM IPTG and 100 µg/mL ampicillin in microplate. Cell growth was analysed by measuring the OD600 with a Tecan Spark for 24 h. Each experiment was performed in triplicate.
Effect on gene transcription. Overnight cultures of MG1655 recA::CmR transformed strains were diluted 50-fold and grown to OD600 = 0.4 in LB broth in the presence of ampicillin at 100 µg/mL. The MG1655 RecA + strain transformed with the empty pQE-80L plasmid was used as control. IPTG was added to a final concentration of 0.05 mM. 0.05 mM was used to avoid the toxic effect on growth. Cells were then grown to OD600 = 1-1.5 and washed in M63 medium. The β-galactosidase activity was measured by Miller's assay: 100 µL of cells was diluted in 1 mL of buffer Z (60 mM Na 2 HPO 4 , 40 mM NaH 2 PO 4 , 10 mM KCl, 1 mM MgSO4, 0.05 M β-mercaptoethanol, pH = 7) with ONPG to a final concentration of 4 mg/mL. After a 15-min incubation at 37 °C, the reaction was stopped with 500 µL of Na 2 CO 3 at 1 M and centrifuged for 10 min at 13.2 rpm, and the OD420 was measured. The β-galactosidase activity was calculated in Miller units: 1000 × OD420/(OD600 × volume (0.1 mL) × reaction time (15 min)). Activity was compared with EcRecA activity by calculating the ratio between the two activities. For each strain, three independent biological experiments were performed. For the different RecA variants, the p-value was calculated by performing a one-sample t test compared to 1 (corresponding to EcRecA). Protein purification. The recombinant pQE-80L-recA plasmids were transformed into the expression host E. coli BL21 (DE3) for overexpression of RecA proteins. The overnight culture was diluted 50-fold in 500 mL of fresh LB medium with ampicillin at 100 µg/mL. The culture was incubated at 37 °C until OD = 0.5-0.6. The culture was then incubated for an additional 2 h at 15 °C until OD = 0.6-0.7. IPTG was added to a final concentration of 1 mM, and the cultures were incubated for an additional 4 h at 15 °C. Cells were harvested by centrifugation, flash frozen and stored at −80 °C. Thawed cells were lysed after resuspending them in 20 mL of lysis buffer (50 mM NaH 2 P0 4 , 300 mM NaCl, 10 mM imidazole, pH 0.8, with protease inhibitor) supplemented with 10 mM MgCl 2 using a French press. The cell lysate was clarified by centrifugation at 10,000g for 30 min at 4 °C. DNAse treatment was performed on the supernatant using DNAse I at a final concentration of 0.5 µg/mL for 10 min at 37 °C. The experiment was continued at 4 °C. The supernatant was loaded onto a column containing 1 mL of equilibrated Ni-NTA resin, and contact was made for 1 h with shaking to allow binding of the proteins to the resin. The lysate was then washed extensively with washing buffer (same as the lysis buffer but with 20 mM imidazole). The bound protein was eluted with elution buffer (same as the lysis buffer but with 250 mM imidazole and 0.1% Triton X-100) in 0.5-mL fractions. After gel analysis, fractions containing a high concentration of RecA were pooled and dialyzed against dialysis buffer (10 mM Tris-HCl, pH 7.3, EDTA 0.1 mM, DTT 1 mM). The sample was then filtered with a 0.45-µm filter and loaded onto an equilibrated MonoQ column. The sample was washed extensively with dialysis buffer, and the bound protein was eluted using a linear gradient of 150-1000 mM NaCl in dialysis buffer. After gel analysis, fractions containing pure RecA were pooled and dialyzed against dialysis buffer and against storage buffer (10 mM Tris-HCl, pH 7.3, 0.1 mM EDTA, 1 mM DTT, 50% glycerol). The protein concentration was checked by the Bradford method using BSA as a standard, and the protein was stored at −80 °C. Protein quality was assessed by 12% SDS-PAGE followed by Coomassie blue staining.
Buffers. For the in vitro reactions, two different buffers were selected after a literature review and experimental testing (results not shown). The first one corresponded to the NEB RecA reaction buffer (70 mM Tris-HCl, 10 mM MgCl2, 5 mM DTT, pH = 7.6); it is referred to as Tris-HCl buffer. The second, Tris-Ac buffer, was composed of 25 mM Tris acetate, 10 mM Mg acetate, 1 mM DTT, 5% glycerol and 3 mM potassium glutamate, pH 7.5.
Strand exchange reaction. Twenty microlitres of the reaction mixture containing the selected buffer, 10 µM/nt M13mp18 ssDNA and 10.5 µM RecA (RecA was in excess) was incubated at 30 or 37 °C for 10 min. ATP at 3 mM, an ATP regeneration system (12 mM phosphocreatine and 10 U/mL creatine kinase) and SSB at 0.5 µM were added, and incubation was continued for another 10 min. Finally, linearized M13mp18 RFI l was added at 10 µM/bp. When studied, a dNTP mix was added at this step at 250 µM each. Incubation was continued for 1.5 h. The reaction was stopped by adding 4 µL of 10% SDS and heating at 70 °C for 15 min. Next, 4.8 µL of 6X loading solution was added, and the samples were loaded onto a 0.7% agarose gel and electrophoresed in Trisacetate buffer at 70 V for 2 h. The products of the reaction were visualized with a Bio-Rad ChemiDoc imager. The bands were quantified with Image Lab software from Bio-Rad.