CRISPR (clustered regularly interspaced short palindromic repeats) and the nearby Cas (CRISPR-associated) operon establish an RNA-based adaptive immunity system in prokaryotes [1–5]. Molecular memory is created when a short foreign DNA-derived prespacer is integrated into the CRISPR array as a new spacer [6–9]. Whereas the RNA-guided CRISPR interference mechanism varies widely among CRISPR–Cas systems, the spacer integration mechanism is essentially identical [7–9]. The conserved Cas1 and Cas2 proteins form an integrase complex consisting of two distal Cas1 dimers bridged by a Cas2 dimer [6,10]. The prespacer is bound by Cas1–Cas2 as a dual-forked DNA, and the terminal 3′-OH of each 3′ overhang serves as an attacking nucleophile during integration [11–14]. The prespacer is preferentially integrated into the leader-proximal region of the CRISPR array [1,7,10,15], guided by the leader sequence and a pair of inverted repeats inside the CRISPR repeat [7,15–20]. Spacer integration in the well-studied Escherichia coli type I–E CRISPR system also relies on the bacterial integration host factor [21,22]. In type II–A CRISPR, however, Cas1–Cas2 alone integrates spacers efficiently in vitro [18]; other Cas proteins (such as Cas9 and Csn2) have accessory roles in the biogenesis phase of prespacers [17,23]. Here we present four structural snapshots from the type II–A system [24] of Enterococcus faecalis Cas1 and Cas2 during spacer integration. Enterococcus faecalis Cas1–Cas2 selectively binds to a splayed 30-base-pair prespacer bearing 4-nucleotide 3′ overhangs. Three molecular events take place upon encountering a target: first, the Cas1–Cas2–prespacer complex searches for half-sites stochastically, then it preferentially interacts with the leader-side CRISPR repeat, and finally, it catalyses a nucleophilic attack that connects one strand of the leader-proximal repeat to the prespacer 3′ overhang. Recognition of the spacer half-site requires DNA bending and leads to full integration. We derive a mechanistic framework to explain the stepwise spacer integration process and the leader-proximal preference.
Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712 (2007)
Bolotin, A., Quinquis, B., Sorokin, A. & Ehrlich, S. D. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561 (2005)
Mojica, F. J., Díez-Villaseñor, C., García-Martínez, J. & Soria, E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60, 174–182 (2005)
Pourcel, C., Salvignol, G. & Vergnaud, G. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151, 653–663 (2005)
Marraffini, L. A. & Sontheimer, E. J. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science 322, 1843–1845 (2008)
Jackson, S. A. et al. CRISPR–Cas: adapting to change. Science 356, eaal5056 (2017)
Yosef, I., Goren, M. G. & Qimron, U. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 40, 5569–5576 (2012)
Shmakov, S. et al. Discovery and functional characterization of diverse class 2 CRISPR–Cas systems. Mol. Cell 60, 385–397 (2015)
Makarova, K. S. et al. An updated evolutionary classification of CRISPR–Cas systems. Nat. Rev. Microbiol. 13, 722–736 (2015)
Nuñez, J. K. et al. Cas1–Cas2 complex formation mediates spacer acquisition during CRISPR–Cas adaptive immunity. Nat. Struct. Mol. Biol. 21, 528–534 (2014)
Wang, J. et al. Structural and mechanistic basis of PAM-dependent spacer acquisition in CRISPR–Cas systems. Cell 163, 840–853 (2015)
Nuñez, J. K., Harrington, L. B., Kranzusch, P. J., Engelman, A. N. & Doudna, J. A. Foreign DNA capture during CRISPR–Cas adaptive immunity. Nature 527, 535–538 (2015)
Nuñez, J. K., Lee, A. S., Engelman, A. & Doudna, J. A. Integrase-mediated spacer acquisition during CRISPR–Cas adaptive immunity. Nature 519, 193–198 (2015)
Rollie, C., Schneider, S., Brinkmann, A. S., Bolt, E. L. & White, M. F. Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition. eLife 4, e08716 (2015)
Díez-Villaseñor, C., Guzmán, N. M., Almendros, C., García-Martínez, J. & Mojica, F. J. CRISPR–spacer integration reporter plasmids reveal distinct genuine acquisition specificities among CRISPR–Cas I-E variants of Escherichia coli. RNA Biol. 10, 792–802 (2013)
McGinn, J. & Marraffini, L. A. CRISPR–Cas systems optimize their immune response by specifying the site of spacer integration. Mol. Cell 64, 616–623 (2016)
Wei, Y., Chesne, M. T., Terns, R. M. & Terns, M. P. Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus. Nucleic Acids Res. 43, 1749–1758 (2015)
Wright, A. V. & Doudna, J. A. Protecting genome integrity during CRISPR immune adaptation. Nat. Struct. Mol. Biol. 23, 876–883 (2016)
Goren, M. G. et al. Repeat size determination by two molecular rulers in the type IE CRISPR array. Cell Rep. 16, 2811–2818 (2016)
Wang, R., Li, M., Gong, L., Hu, S. & Xiang, H. DNA motifs determining the accuracy of repeat duplication during CRISPR adaptation in Haloarcula hispanica. Nucleic Acids Res. 44, 4266–4277 (2016)
Nuñez, J. K., Bai, L., Harrington, L. B., Hinder, T. L. & Doudna, J. A. CRISPR immunological memory requires a host factor for specificity. Mol. Cell 62, 824–833 (2016)
Yoganand, K. N., Sivathanu, R., Nimkar, S. & Anand, B. Asymmetric positioning of Cas1–2 complex and Integration Host Factor induced DNA bending guide the unidirectional homing of protospacer in CRISPR-Cas type I-E system. Nucleic Acids Res. 45, 367–381 (2017)
Heler, R. et al. Cas9 specifies functional viral targets during CRISPR–Cas adaptation. Nature 519, 199–202 (2015)
Nam, K. H., Kurinov, I. & Ke, A. Crystal structure of clustered regularly interspaced short palindromic repeats (CRISPR)-associated Csn2 protein revealed Ca2+-dependent double-stranded DNA binding activity. J. Biol. Chem. 286, 30759–30768 (2011)
Schumacher, M. A., Choi, K. Y., Zalkin, H. & Brennan, R. G. Crystal structure of LacI member, PurR, bound to DNA: minor groove binding by alpha helices. Science 266, 763–770 (1994)
Garvie, C. W. & Wolberger, C. Recognition of specific DNA sequences. Mol. Cell 8, 937–946 (2001)
Yang, W. Nucleases: diversity of structure, function and mechanism. Q. Rev. Biophys. 44, 1–93 (2011)
Shipman, S. L., Nivala, J., Macklis, J. D. & Church, G. M. CRISPR–Cas encoding of a digital movie into the genomes of a population of living bacteria. Nature 547, 345–349 (2017)
Mojica, F. J., Díez-Villaseñor, C., García-Martínez, J. & Almendros, C. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733–740 (2009)
Marraffini, L. A. & Sontheimer, E. J. Self versus non-self discrimination during CRISPR RNA-directed immunity. Nature 463, 568–571 (2010)
Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326 (1997)
Sheldrick, G. M. Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Crystallogr. D 66, 479–485 (2010)
Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D 60, 2126–2132 (2004)
Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D 66, 213–221 (2010)
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr. D 53, 240–255 (1997)
McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007)
Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011)
Robert, X. & Gouet, P. Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res. 42, W320–W324 (2014)
Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. D 67, 235–242 (2011)
This work is supported by NIH/NIGMS awards GM118174 and GM102543 to A.K. We thank D. Neau for assistance with data collection, and K. Rajashankar and I. Kriksunov for beamtime allocation. We thank P. Nguyen, R. Battaglia and A. Dolan for technical assistance and discussions. This work is based upon research conducted at NECAT, supported by NIH/NIGMS awards P41-GM103403 and S10-RR029205; and at CHESS and MACCHESS, supported by NSF award DMR-1332208 and NIH/NIGMS award GM-103485. This research used resources of the Advanced Photon Source, a US Department of Energy facility under contract no. DE-AC02-06CH11357.
Cornell University is in the process of applying for a patent, listing A.K. and Y.X. as inventors, that covers the use of the Cas1–Cas2-mediated DNA integration mechanism revealed in this study for potential genome manipulation applications.
Reviewer Information Nature thanks F. Dyda, M. White and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Figure 1 Comparison of E. faecalis Cas1–Cas2 structures in prespacer-bound, target-sampling, half-integration, and full-integration states.
Extended Data Figure 2 Comparison of the Cas1 and Cas2 dimer structures in E. faecalis and E. coli Cas1–Cas2–DNA complexes.
a, Structural analysis of an individual Cas1 subunit in the E. faecalis Cas1–Cas2–DNA complex. Cas1 consists of an N-terminal β-sandwich (yellow circle) and a C-terminal helical domain (blue circle). These two domains are connected by a flexible hinge loop (red circle). b, Superposition of the catalytic (Cas1B, in cyan) and non-catalytic (Cas1A, in violet) Cas1 subunits in the complex. Note the ~33° hinge motion between the NTD and CTD, taking place at the circled region. c, Structural analysis of an individual Cas2 subunit in the E. faecalis Cas1–Cas2–DNA complex. The monomeric E. faecalis Cas2 structure contains a ferredoxin domain. d, Comparison of E. faecalis and E. coli Cas1 in the corresponding Cas1–Cas2–DNA complexes, viewed along the two-fold symmetry axis of the Cas1-NTD dimer. The Cas1-CTD dimer tilts at different angles in these two complex structures. e, E. faecalis Cas2 dimerizes at a tilted angle whereas E. coli Cas2 dimerizes in a juxtaposed fashion (follow the angle between major helices in the dimer). E. faecalis Cas2 features long positively charged spikes at its dorsal region, which are inserted into the major grooves of dsDNA for prespacer binding. Overall, the structures of individual Cas1 and Cas2 domains are fairly conserved. The altered overall dimensions of the Cas1–Cas2 complex are due to the altered domain orientation at each subunit interface.
a, Overall structure of the E. faecalis Cas1–Cas2–prespacer complex. The Cas1B molecule (pink) is positioned between Cas1A (cyan) and Cas2A (orange). There is no direct contact between Cas1A and Cas2A. b, Location of the Cas1A–Cas1B and Cas1B–Cas2A interfaces. The β-sandwich domain in Cas1B-NTD bridges Cas1A and Cas2A. c–e, Close-up views of the β-sheet interfaces for Cas1A–Cas1B (c, d) and Cas1B–Cas2A (e).
The prespacer is illustrated as a simple splayed duplex, with the interactions to the attacking 3′ overhang at the half-integration site highlighted. The catalytic centre is denoted by a beige circle. Target DNA-contacting residues are organized into groups, and coloured according to the subunit they reside in. The lines distinguish base-specific versus sugar-phosphate contacts. The colouring scheme follows that of Fig. 4.
a, Size-exclusion chromatography (HiLoad 16/60 Superdex 200) of two E. faecalis Cas1–Cas2–prespacer–target ternary complexes. The solid line corresponds to the ternary complex with the 5-bp-leader DNA that yielded the half-integration structure; the dotted line corresponds to the 9-bp-leader DNA-containing complex that yielded the full-integration structure. Red and blue traces correspond to UV absorbance at 260 and 280 nm, respectively. The two complexes eluted at the same retention volume of 69 ml, which corresponds to an estimated molecular mass of 200 kDa. This suggests that the two complexes had the same stoichiometry before crystallization (Cas1₄:Cas2₂:prespacer:target). b, SDS–PAGE analysis of the dissolved crystals, side-by-side with the pre-crystallization sample. Note the relative intensity of the target DNA band and the integration product band. The 5-leader crystal contained less integration product than the starting sample, which is consistent with the resulting structure containing two Cas1–Cas2–prespacer complexes bound to one DNA target. By contrast, the 9-leader crystal contained the same extent of integration product as the pre-crystallization sample, which is consistent with it yielding a fully integrated crystal structure. The cleaved leader DNA ran off the gel owing to its much smaller size.
Extended Data Figure 6 Unbiased Se-Met experimental phases superimposed with the half-integration structure.
a, Overall view. b, Zoomed-in view into the half-integration site. c, Further magnified view at the integration site. The reactants including the 3′-OH, scissile phosphate, and the leaving 3′-O are labelled. All maps are contoured at 1.5σ. The structure is modelled in the post half-integration state; however, the density in c is consistent with either the pre- or the post-integration scenario.
a, Cas1–Cas2 loads a 30-bp prespacer. b, Cas2 serves as a fulcrum; nonspecific contacts tilt the target DNA stochastically for half-site searching by Cas1. c, Cas1 preferentially binds to the leader–inverted-repeat-containing half-site, and catalyses half-site integration. d, The spacer-side inverted repeat is captured through DNA bending, and full integration takes place. e, Although this step is still under investigation, the post-integration complex is speculated to be resolved by DNA replication, with the CRISPR repeat duplicated on one side. f, Opposite-side DNA replication duplicates the repeat on the opposite side; ligation finalizes spacer incorporation. The spacer flips its orientation during the process.
a, Cas1 sequence alignments; b, Cas2 sequence alignments. Homologues from Enterococcus faecalis (Efa) TX0027 (accession code E6GPD7), Streptococcus thermophilus (Sth; G3ECR2), Streptococcus pyogenes serotype M1 (Spy; Q99ZW1), Agathobacter rectalis (Are; C4ZA17), and Treponema denticola (Tde; Q73QW5) are used in the alignment. Absolutely conserved residues are boxed in red; highly conserved residues are shown in unfilled boxes with red letters.
About this article
Cite this article
Xiao, Y., Ng, S., Nam, K. et al. How type II CRISPR–Cas establish immunity through Cas1–Cas2-mediated spacer integration. Nature 550, 137–141 (2017). https://doi.org/10.1038/nature24020