Homologous recombination (HR) is essential for all cellular life because it serves to rescue replication fork catastrophes and repair double-strand breaks [4]. HR is also utilized by many bacteriophages for competitive fitness and by molecular biologists for recombineering and precise genome engineering. A vital early step in HR involves the pairing of complementary DNA strands, which is promoted by recombinases that bind single-stranded DNA (ssDNA) in order to enhance the discovery of complementary regions and subsequently stabilize the pairing. Due to the importance of HR, the ubiquitous cellular HR recombinases, bacterial RecA and its eukaryotic homologue Rad51, have been extensively studied [5]. The apparent diversity amongst the other, non-RecA/Rad51, HR recombinases, broadly termed single-strand annealing proteins (SSAPs), was partly rationalized when three groups were identified within it. The three groups were named after their most prominent members: Rad52, RecT/Redβ and Erf [6]. Electron microscopy imaging of the λ phage SSAP, Redβ, revealed spectacular undecameric or dodecameric rings [7], which resonated with the rings displayed by human RAD52 [8,9,10] to suggest the intriguing possibility that SSAP action may have an underlying unity. Subsequently, advanced bioinformatic tools identified a very distant relationship amongst SSAPs based on three short motifs that are also the most conserved sequences in the Rad52 group, which led to the proposition that the SSAPs comprise a superfamily and are not merely functionally related [11,12,13].

However, securing the Rad52 SSAP superfamily hypothesis required more substantial evidence than a very faint sequence relationship or a propensity to multimerize, and several labs have pursued structural and mechanistic studies to resolve the issue. Although the structure of the human RAD52 annealing domain has been known for 20 years [9,10], a representative structure from the diversity of bacterial and phage SSAPs has been lacking. Somewhat like London buses, two such structures have just arrived after decades in which various labs were frustrated by inherent SSAP multimerization at the high in vitro concentrations required for X-ray crystallography and NMR. Cryo-EM is not disadvantaged by this problem, and exquisite annealed DNA filament structures have now been solved for the annealing domain of λ phage Redβ [2] and a full-length RecT from a Listeria phage [3]. Despite the near-complete absence of alignable amino acid sequence, these structures are clearly similar to each other as well as to the known RAD52 structure. The Rad52 SSAP superfamily proposition is thereby conclusively established and a new class of protein fold has been identified (Fig. 1a).

Fig. 1: The Rad52 SSAP superfamily protein fold and AlphaFold predictions.

a Diagram of the Rad52 SSAP protein fold secondary structure, using RAD52 secondary structural designations [9,10], with the most conserved elements in blue and the most variable in grey. b–d Selected examples of the Rad52 SSAP fold from the three SSAP classes. b Erf: E. coli phage P2Erf, E. coli phage T1Erf. c Rad52: H. sapiens RAD52, S. cerevisiae Rad59, L. lactis phage ul36 SakRad52. d RecT/Redβ: λ phage Redβ, E. coli rac phage RecT, Listeria innocua (Li) RecT. e AlphaFold projected full-length Redβ filament displaying electrostatic surfaces presenting the positively charged ssDNA binding groove (red) between negatively charged ridges (blue). The C-terminal three α-helical bundle [27], which is not part of the annealing domain or the published cryo-EM structure [2] but is required for HR [23], is the perpendicular projection away from the helical axis of the filament.

The Rad52 SSAP superfamily protein fold involves five conserved elements: (i) an antiparallel three-stranded β-sheet, which forms the inner surface of the helical filament; (ii) an α-helix (α3 in Fig. 1) that is packed across the three strands of the β-sheet; (iii) a second α-helix (α2, green) that packs with α3; followed by (iv) a β-hairpin and (v) α-helix 1. Together, α3 and the β-sheet are the most conserved part of this protein fold (blue in Fig. 1) and set the inherent curvature of the filaments formed by multimerization. Modelling SSAPs with AlphaFold [14,15] revealed that not only other RecT/Redβ SSAPs but also members of the Erf group are based on the same design (Fig. 1b), as previously anticipated [12]. All SSAPs contain the first three elements but variability in the other two elements (Fig. 1b, grey) is evident. Notably, the most diverse Rad52 members, S. cerevisiae Rad59 [16] and the prokaryotic SakRad52 [17], appear to either lack the β-hairpin or present a greatly extended version of it. To further validate the Rad52 superfamily, we screened the AlphaFold library of one million structures using Foldseek [18] with the five-element structure from HsRAD52. Of the 25 top hits, 18 were Rad52 variations and another 5 were unknown proteins. Therefore, we conclude that the Rad52 SSAP fold is neither common nor previously identified, and is probably unique to SSAPs. The list of 25 hits, plus a variety of AlphaFold SSAP structures for viewing in PyMOL, can be found at https://sharing.biotec.tu-dresden.de/index.php/s/8cZI7i8EdZhENoN

The new bacteriophage filament structures not only secure the Rad52 SSAP superfamily hypothesis but also deliver pioneering insight into SSAP annealing mechanisms, because no high-resolution structure of RAD52, or any other SSAP, bound to annealed DNA strands had previously been achieved. In accordance with a previous deduction [11], the annealed DNA strands lie on the outside of the helical filament and the two strands do not cross each other with respect to the underlying protein helix. Therefore, despite Watson–Crick base pairing, the annealed strands must be underwound relative to B-form dsDNA. This explains how SSAPs can be tightly bound to the annealed intermediate but have little or no affinity for B-form dsDNA.

Both bacteriophage structures show one DNA strand bound in a deep groove through electrostatic interactions and hydrogen bonds to the phosphodiester backbone, so that the bases are presented outward. This groove is the same as the known RAD52 ssDNA binding groove [19] and includes the only identifiable amino acid sequence signature in the Rad52 SSAP superfamily [11]. However, in keeping with the substantial divergence of amino acid sequence, the binding of the ssDNA in the groove differs in detail. For example, LiRecT presents a repetitively kinked regularity of 5 nucleotides/monomer whereas Redβ and HsRAD52 appear to bind 4 nucleotides/monomer. Notably, in both cases the bound ssDNA is stretched about 1.5-fold, which is the same as ssDNA stretching by RecA/Rad51. The second ssDNA strand is bound to the first by Watson–Crick base pairs, with little evidence for extensive binding into a second groove or trans interactions with another filament, so it is unlikely that the second strand is stretched before annealing. Consequently, the homology search by SSAPs is likely to be similar to the stretched versus unstretched search mechanism utilized by RecA/Rad51, where an initial match can be found and then expanded as the second strand is zipped into position through Watson–Crick pairing [20]. The evidence for a cis-based zipping mechanism accords with observations from atomic force microscopy and optical tweezer single-molecule studies with Redβ, which also revealed a substantial increase in complex stability upon the annealing of ~10 bases (now revealed to correspond to dimerization of Redβ) and a transition to a remarkably stable complex, termed a DNA clamp, resistant to 200 pN of pulling force [11,21]. The basis for a DNA clamp is evident in both of the new bacteriophage filaments, although it apparently involves different secondary structural elements that move to secure the DNA after annealing. Once again, the principle appears to be the same but the details differ.
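
As a rough numerical illustration of the stretching described above, the following Python sketch converts a 1.5-fold stretch into an approximate spacing between nucleotides. It assumes that the stretch is measured relative to the rise of about 3.4 Å per nucleotide in B-form DNA, which is a textbook reference value and not a measurement from the structures discussed here. It also assumes uniform spacing and so ignores the kinking noted for LiRecT. The function names are hypothetical and chosen only for this illustration.

```python
# Back-of-envelope arithmetic only; none of these numbers are measurements
# from the Redβ or LiRecT structures.

B_FORM_RISE_A = 3.4    # assumed reference: approximate rise per nucleotide in B-form DNA (angstroms)
STRETCH_FACTOR = 1.5   # "about 1.5-fold", as stated in the text


def stretched_rise(reference_rise_a=B_FORM_RISE_A, factor=STRETCH_FACTOR):
    """Spacing between adjacent nucleotides after uniform stretching."""
    return reference_rise_a * factor


def span_per_monomer(nt_per_monomer, rise_a):
    """Length of DNA path covered by one monomer, assuming uniform spacing."""
    return nt_per_monomer * rise_a


if __name__ == "__main__":
    rise = stretched_rise()
    for label, nt in (("Redβ / HsRAD52", 4), ("LiRecT", 5)):
        span = span_per_monomer(nt, rise)
        print(f"{label}: {nt} nt/monomer, {span:.1f} Å of DNA path per monomer")
```

Under these assumptions the stretched spacing is about 5.1 Å per nucleotide. The per-monomer values describe length along the DNA path, not rise along the filament axis, and should be read only as an order-of-magnitude guide.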

Outlook

The perception that all SSAPs are ancestrally anchored in the Rad52 superfamily has functional implications. Notably, helical filaments have not been reported for RAD52, only rings that may be heptamers [8] or undecamers [9,10]. Despite this evidence that RAD52 multimerization is flexible, ring-to-ring trans-annealing models have been favoured over a cis-zipping mechanism [19,22]. In light of the new SSAP structures, a reappraisal of the RAD52 annealing mechanism may be rewarding.

Recent progress with the simpler Redβ HR mechanism [23,24] could also illuminate Rad52 action. The annealing domains of both Rad52 and Redβ, which like those of all members of the Rad52 SSAP superfamily occupy ~180 amino acids at the N-terminus, are insufficient for HR; protein–protein interactions involving their C-terminal regions are also required [23,25]. One of these interactions involves the major cellular single-strand binding protein, termed replication protein A (RPA) in eukaryotes and single-strand binding protein (SSB) in prokaryotes [26]. For eukaryotic Rad52, the RPA interaction with a specific Rad52 C-terminal region was defined some time ago [25]. Interaction between the C-terminus of λ phage Redβ and E. coli SSB was recently identified by inspired deduction [27]. Concomitantly, the first functional evidence for an SSB contribution to phage SSAP-mediated HR was reported [28]. This emergent commonality involving eukaryotic Rad52/RPA and prokaryotic phage SSAPs/SSB is another indicator that Rad52 and phage SSAP HR mechanisms are related. Consequently, now that the Rad52 SSAP superfamily is secured, a new light is cast on Rad52 action and the vast diversity of prokaryotic SSAPs can be confidently evaluated for structural and mechanistic variations around a central theme.

Methods

For HsRAD52 (P43351) we used the PDB structure with ID 1kn0 (PMID 12191481). For the remaining proteins, with IDs Q12223, Q9MC33, P04892, K7P860, Q6XQB4, P03698, UPI00004B3CF7, UPI00006DD4A7, Q92FL9, P33228 and Q9T172, we predicted the structures using AlphaFold (version 2.0.1 with the full_dbs option for all but Q92FL9, for which we used version 2.2). Structures were aligned against HsRAD52 residues 79–156. To search for proteins with a similar fold to Rad52, we submitted the HsRAD52 79–156 monomers to Foldseek.
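
The Methods do not state how the 79–156 window was cut from the coordinate file, so the following is only a minimal Python sketch of one way such a residue range could be extracted from PDB-format text before alignment or a Foldseek query. It is not the procedure used here. The function name, the file name in the usage comment and the chain identifier "A" are all assumptions made for illustration.

```python
def extract_residue_range(pdb_text, chain_id, first_res, last_res):
    """Keep ATOM records of one chain with residue numbers in [first_res, last_res]."""
    kept = []
    for line in pdb_text.splitlines():
        # Only ATOM coordinate records are kept; HETATM and other records are skipped.
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        # Fixed-column PDB format: chain ID in column 22, residue number in columns 23-26.
        if line[21] != chain_id:
            continue
        try:
            res_num = int(line[22:26])
        except ValueError:
            continue  # malformed residue number; ignore the record
        if first_res <= res_num <= last_res:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical usage (file name and chain "A" are assumptions):
# with open("1kn0.pdb") as handle:
#     window = extract_residue_range(handle.read(), "A", 79, 156)
```

The sketch relies on the fixed-column layout of legacy PDB files and ignores insertion codes and alternate locations, so a dedicated structure library would be preferable for anything beyond a quick check.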