CRISPR–Cas immunity protects prokaryotes against invading genetic elements [1]. It uses the highly conserved Cas1–Cas2 complex to establish inheritable memory (spacers) [2–5]. How Cas1–Cas2 acquires spacers from foreign DNA fragments (prespacers) and integrates them into the CRISPR locus in the correct orientation is unclear [6,7]. Here, using the high spatiotemporal resolution of single-molecule fluorescence, we show that Cas1–Cas2 selects precursors of prespacers from DNA in various forms, including single-stranded DNA and partial duplexes, in a manner that depends on both the length of the DNA strand and the presence of a protospacer adjacent motif (PAM) sequence. We also identify DnaQ exonucleases as enzymes that process the Cas1–Cas2-loaded prespacer precursors into mature prespacers of a suitable size for integration. Cas1–Cas2 protects the PAM sequence from maturation, which results in the production of asymmetrically trimmed prespacers and the subsequent integration of spacers in the correct orientation. Our results demonstrate the kinetic coordination of prespacer precursor selection and PAM trimming, providing insight into the mechanisms that underlie the integration of functional spacers in the CRISPR loci.
Hille, F. et al. The biology of CRISPR–Cas: backward and forward. Cell 172, 1239–1259 (2018).
Nuñez, J. K., Harrington, L. B., Kranzusch, P. J., Engelman, A. N. & Doudna, J. A. Foreign DNA capture during CRISPR–Cas adaptive immunity. Nature 527, 535–538 (2015).
Nuñez, J. K. et al. Cas1–Cas2 complex formation mediates spacer acquisition during CRISPR–Cas adaptive immunity. Nat. Struct. Mol. Biol. 21, 528–534 (2014).
Nuñez, J. K., Lee, A. S., Engelman, A. & Doudna, J. A. Integrase-mediated spacer acquisition during CRISPR–Cas adaptive immunity. Nature 519, 193–198 (2015).
Wang, J. et al. Structural and mechanistic basis of PAM-dependent spacer acquisition in CRISPR–Cas Systems. Cell 163, 840–853 (2015).
Jackson, S. A. et al. CRISPR–Cas: adapting to change. Science 356, eaal5056 (2017).
McGinn, J. & Marraffini, L. A. Molecular mechanisms of CRISPR–Cas spacer acquisition. Nat. Rev. Microbiol. 17, 7–12 (2019).
Brouns, S. J. et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964 (2008).
Marraffini, L. A. & Sontheimer, E. J. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science 322, 1843–1845 (2008).
Deveau, H. et al. Phage response to CRISPR-encoded resistance in Streptococcus thermophilus. J. Bacteriol. 190, 1390–1400 (2008).
Mojica, F. J., Díez-Villaseñor, C., García-Martínez, J. & Almendros, C. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733–740 (2009).
Levy, A. et al. CRISPR adaptation biases explain preference for acquisition of foreign DNA. Nature 520, 505–510 (2015).
Künne, T. et al. Cas3-derived target DNA degradation fragments fuel primed CRISPR adaptation. Mol. Cell 63, 852–864 (2016).
Musharova, O. et al. Spacer-length DNA intermediates are associated with Cas1 in cells undergoing primed CRISPR adaptation. Nucleic Acids Res. 45, 3297–3307 (2017).
Semenova, E. et al. Highly efficient primed spacer acquisition from targets destroyed by the Escherichia coli type I-E CRISPR–Cas interfering complex. Proc. Natl Acad. Sci. USA 113, 7626–7631 (2016).
Yeeles, J. T., Gwynn, E. J., Webb, M. R. & Dillingham, M. S. The AddAB helicase-nuclease catalyses rapid and processive DNA unwinding using a single superfamily 1A motor domain. Nucleic Acids Res. 39, 2271–2285 (2011).
Yeeles, J. T., van Aelst, K., Dillingham, M. S. & Moreno-Herrero, F. Recombination hotspots and single-stranded DNA binding proteins couple DNA translocation to DNA unwinding by the AddAB helicase-nuclease. Mol. Cell 42, 806–816 (2011).
Mulepati, S. & Bailey, S. In vitro reconstitution of an Escherichia coli RNA-guided immune system reveals unidirectional, ATP-dependent degradation of DNA target. J. Biol. Chem. 288, 22184–22192 (2013).
Nuñez, J. K., Bai, L., Harrington, L. B., Hinder, T. L. & Doudna, J. A. CRISPR immunological memory requires a host factor for specificity. Mol. Cell 62, 824–833 (2016).
Xiao, Y., Ng, S., Nam, K. H. & Ke, A. How type II CRISPR–Cas establish immunity through Cas1–Cas2-mediated spacer integration. Nature 550, 137–141 (2017).
Lopez-Sanchez, M. J. et al. The highly dynamic CRISPR1 system of Streptococcus agalactiae controls the diversity of its mobilome. Mol. Microbiol. 85, 1057–1071 (2012).
Shmakov, S. et al. Pervasive generation of oppositely oriented spacers during CRISPR adaptation. Nucleic Acids Res. 42, 5907–5916 (2014).
Shiimori, M., Garrett, S. C., Graveley, B. R. & Terns, M. P. Cas4 nucleases define the PAM, length, and orientation of DNA fragments integrated at CRISPR loci. Mol. Cell 70, 814–824 (2018).
Rollie, C., Graham, S., Rouillon, C. & White, M. F. Prespacer processing and specific integration in a type I-A CRISPR system. Nucleic Acids Res. 46, 1007–1020 (2018).
Lee, H., Zhou, Y., Taylor, D. W. & Sashital, D. G. Cas4-dependent prespacer processing ensures high-fidelity programming of CRISPR arrays. Mol. Cell 70, 48–59 (2018).
Kieper, S. N. et al. Cas4 facilitates PAM-compatible spacer selection during CRISPR adaptation. Cell Rep. 22, 3377–3384 (2018).
Hou, Z. & Zhang, Y. Insights into a mysterious CRISPR adaptation factor, Cas4. Mol. Cell 70, 757–758 (2018).
Moch, C., Fromant, M., Blanquet, S. & Plateau, P. DNA binding specificities of Escherichia coli Cas1–Cas2 integrase drive its recruitment at the CRISPR locus. Nucleic Acids Res. 45, 2714–2723 (2017).
Savitskaya, E., Semenova, E., Dedkov, V., Metlitskaya, A. & Severinov, K. High-throughput analysis of type I-E CRISPR/Cas spacer acquisition in E. coli. RNA Biol. 10, 716–725 (2013).
Shiriaeva, A. A. et al. Detection of spacer precursors formed in vivo during primed CRISPR adaptation. Nat. Commun. 10, 4603 (2019).
Drabavicius, G. et al. DnaQ exonuclease-like domain of Cas2 promotes spacer integration in a type I-E CRISPR–Cas system. EMBO Rep. 19, e45543 (2018).
Lovett, S. T. The DNA exonucleases of Escherichia coli. Ecosal Plus 4, https://doi.org/10.1128/ecosalplus.4.4.7 (2011).
Dillard, K. E. et al. Assembly and translocation of a CRISPR–Cas primed acquisition complex. Cell 175, 934–946 (2018).
Redding, S. et al. Surveillance and processing of foreign DNA by the Escherichia coli CRISPR–Cas system. Cell 163, 854–865 (2015).
Loeff, L., Brouns, S. J. J. & Joo, C. Repetitive DNA reeling by the Cascade–Cas3 complex in nucleotide unwinding steps. Mol. Cell 70, 385–394 (2018).
Wu, Y. H., Franden, M. A., Hawker, J. R. Jr & McHenry, C. S. Monoclonal antibodies specific for the alpha subunit of the Escherichia coli DNA polymerase III holoenzyme. J. Biol. Chem. 259, 12117–12122 (1984).
Sheth, R. U. & Wang, H. H. DNA-based memory devices for recording cellular events. Nat. Rev. Genet. 19, 718–732 (2018).
Kim, S. et al. Temporal landscape of microRNA-mediated host–virus crosstalk during productive human cytomegalovirus infection. Cell Host Microbe 17, 838–851 (2015).
Thieme, F., Engler, C., Kandzia, R. & Marillonnet, S. Quick and clean cloning: a ligation-independent cloning strategy for selective cloning of specific PCR products from non-specific mixes. PLoS One 6, e20556 (2011).
Aslanidis, C. & de Jong, P. J. Ligation-independent cloning of PCR products (LIC-PCR). Nucleic Acids Res. 18, 6069–6074 (1990).
Jergic, S. et al. A direct proofreader-clamp interaction stabilizes the Pol III replicase in the polymerization mode. EMBO J. 32, 1322–1333 (2013).
Hamdan, S. et al. Hydrolysis of the 5′-p-nitrophenyl ester of TMP by the proofreading exonuclease (ε) subunit of Escherichia coli DNA polymerase III. Biochemistry 41, 5266–5275 (2002).
Lewis, J. S. et al. Single-molecule visualization of fast polymerase turnover in the bacterial replisome. eLife 6, e23932 (2017).
Carrico, I. S., Carlson, B. L. & Bertozzi, C. R. Introducing genetically encoded aldehydes into proteins. Nat. Chem. Biol. 3, 321–322 (2007).
Chandradoss, S. D. et al. Surface passivation for single-molecule protein studies. J. Vis. Exp. 86, e50549 (2014).
Pan, H., Xia, Y., Qin, M., Cao, Y. & Wang, W. A simple procedure to improve the surface passivation for single molecule fluorescence studies. Phys. Biol. 12, 045006 (2015).
Dekking, M. A Modern Introduction to Probability and Statistics: Understanding Why and How (Springer, 2005).
We thank A. C. Haagsma and T. Künne for providing Cas1–Cas2 vectors and proteins; S. Leachman, N. Dekker and the members of the C.J. and S.J.J.B. laboratories for discussions; and T. J. Cui for discussions on kinetic models. S.J. thanks N. Dixon for guidance. S.K. was partly funded by a Marie Skłodowska-Curie grant (753528); C.J. and S.J.J.B. were funded by the Foundation for Fundamental Research on Matter (15PR3188); and S.J. was funded by a collaborative grant from King Abdullah University of Science and Technology, Saudi Arabia (OSR-2015-CRG4-2644).
The authors declare no competing interests.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1
a, Schematic of the single-molecule TIRF set-up that was used for measuring Cas1–Cas2 binding to canonical or precursor prespacer DNA. b, FRET efficiency histograms of binding events observed for canonical and precursor prespacer DNAs with 3′-overhangs of various lengths. c, Dwell-time (Δτ) distributions and average binding time (τoff) determination for canonical and precursor prespacer DNAs with 3′-overhangs of various lengths. d, Left, representative time trace from a binding assay in the absence of Cas1–Cas2. DNA was added at t = 5 s. Right, image of a field of view with Cy3 signals on the left and Cy5 signals on the right. The image was recorded 1 min after the addition of DNA. e, Schematics of precursor prespacer DNAs with different labelling positions. f, FRET efficiency histograms of individual precursor prespacer DNAs with different labelling positions bound to Cas1–Cas2. g, FRET distribution and fractions of precursor prespacer DNAs from a single-molecule competition assay using precursor prespacer DNAs with different labelling positions. Histograms were obtained by incubating equal concentrations of precursor prespacer DNAs. To track the stably bound population, the flow chamber was washed and the fluorescence signals of the remaining population were measured. h, Cumulative distribution of the arrival times for binding events and kon for precursor prespacer DNAs with different labelling positions. i, Dwell-time distributions and koff of binding events for precursor prespacer DNAs with distinct labelling positions. j, Cumulative probability of the arrival times for precursor prespacer DNAs with 3′-overhangs of various lengths. k, Schematics of precursor prespacer DNAs with 3′-overhangs of various lengths. Each DNA construct consists of a 23-bp central duplex and 5-nt 3′-overhangs at both ends, which are further extended with N number of single-stranded deoxythymidine (dT) nucleotides. 
Both strands were labelled with a Cy3 fluorophore at the 5′-end of the top strand and a Cy5 fluorophore at the 16th nucleotide (T) from the 5′-end. l, Cumulative distribution of the arrival times for binding events and kon for precursor prespacer substrates with 3′-overhangs of various lengths. m, Dwell-time distributions of binding events and koff for precursor prespacer DNAs with 3′-overhangs of various lengths. n, Survival probability of stably bound substrates with 3′-overhangs of various lengths. The solid lines represent single-exponential fits using maximum-likelihood estimation. o, Schematics of canonical and precursor prespacer DNAs with different labelling positions and 3′-overhang lengths for a single-molecule competition experiment. p, q, FRET distributions (p) and fractions (q) of each FRET population from single-molecule competition experiments for canonical and precursor prespacer DNAs with 3′-overhangs of various lengths. ‘Before washing’ includes both transient and stably bound molecules; ‘after washing’ includes only the stably bound molecules. r, EMSAs on various canonical and precursor prespacer DNA substrates with increasing amounts of wild-type Cas1–Cas2. The top and bottom strands were labelled at the 5′-end with Cy3 and Cy5, respectively. Cas1–Cas2-bound and unbound precursor prespacer DNAs are indicated on the right. For b, f, g, p, solid lines represent Gaussian fits; the centre of each peak corresponds to the predetermined position of each individual construct in g, p. For c, h–j, l, m, solid lines represent single-exponential fits (maximum-likelihood estimation) that were used to determine the binding frequency (kon) (h, j, l) and dissociation rate (koff) (c, i, m). For the kon bar plots (from the cumulative probability of the arrival times) and the koff bar plots (from the dwell times), data are mean ± 95% CI, obtained by bootstrap analysis of a single replicate with n ≥ 300 (h, j, l) or n ≥ 500 (c, i, m) individual molecules.
For the FRET fractions, data are mean ± s.e.m. from three independent measurements (n = 3) with n ≥ 5,000 molecules for each measurement (g, q). Data are representative of three replicates with similar results (b, c, f, h–j, l–n, q).
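The single-exponential maximum-likelihood fits and bootstrap confidence intervals mentioned in this legend can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the dwell times are synthetic, the rate of 0.5 s^-1 and the function names are hypothetical, and the sketch ignores the finite time resolution and observation window of a real measurement. For an exponential distribution the maximum-likelihood estimate of the rate is the reciprocal of the mean dwell time.

```python
import random
import statistics


def fit_exponential_rate(dwell_times):
    """Maximum-likelihood rate of an exponential distribution: k = 1 / mean."""
    return 1.0 / statistics.fmean(dwell_times)


def bootstrap_ci(dwell_times, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the fitted rate."""
    rng = random.Random(seed)
    n = len(dwell_times)
    # Refit the rate on each resample drawn with replacement.
    rates = sorted(
        fit_exponential_rate(rng.choices(dwell_times, k=n))
        for _ in range(n_resamples)
    )
    low_index = int((1 - level) / 2 * n_resamples)
    high_index = int((1 + level) / 2 * n_resamples) - 1
    return rates[low_index], rates[high_index]


# Synthetic dwell times: 500 molecules, hypothetical k_off of 0.5 s^-1.
rng = random.Random(1)
dwell_times = [rng.expovariate(0.5) for _ in range(500)]

koff = fit_exponential_rate(dwell_times)
ci_low, ci_high = bootstrap_ci(dwell_times)
print(f"k_off = {koff:.3f} s^-1 (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

The same estimator applies to arrival times if binding is treated as a single-step process, in which case the fitted rate corresponds to the binding frequency.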
Extended Data Fig. 2
a, Schematics of precursor prespacer DNA with the PAM sequence at different positions in the 3′-overhang. b, Average number of molecules bound per field of view after a 30-min incubation with precursor prespacer DNAs. Data are mean ± s.e.m. (n = 3). Representative CCD images (acceptor channel) are included as insets. Scale bars, 5 µm. c, Structural comparison of Cas1–Cas2 precursor prespacer complexes. Cas1–Cas2 in complex with a non-PAM (5′-TTT-3′, orange)-containing substrate with 10-nt 3′-overhangs (PDB: 5DLJ); a PAM (5′-CTT-3′, red)-containing substrate with 8-nt 3′-overhangs (PDB: 5DQZ); and a non-PAM substrate with 5-nt 3′-overhangs that end with a T (PDB: 5DS5; left) or a C (PDB: 5DS5; right). The C-terminal proline-rich tail of Cas1b and the flexible internal lid-like loop region of Cas1a are highlighted in blue and magenta, respectively. The magnified image on the right represents the molecular architecture of the PAM-recognizing residues of Cas1–Cas2. The residues of the PAM sequence (C28, red; T29, blue; T30, orange) are coloured, together with the PAM-interacting residues of Cas1. d, Cumulative probability of the arrival times for precursor prespacer DNAs with different PAM sequences. A single-exponential fit (solid line) was used to determine the binding frequency (kon). Data are mean ± 95% CI, obtained by bootstrap analysis from a single replicate with n ≥ 300 individual molecules. e, Schematics of the design of precursor prespacer DNAs with different PAM sequences. f, g, FRET distributions (f) and fractions (g) of each FRET population from single-molecule competition experiments for different sequences at the PAM position. Each population was fitted with a Gaussian distribution (solid line); the centre of each peak corresponds to the predetermined position of each individual construct (f). For the fractions, data are mean ± s.e.m. from three independent measurements (n = 3) with n ≥ 5,000 molecules for each measurement (g).
Data are representative of three replicates with similar results (d, f, g).
Extended Data Fig. 3
a, Representative time traces of a single Cas1–Cas2 complex binding to ssDNAs that contain different numbers of PAM sites. DNA was added at t = 5 s. The insets show a snapshot of the field of view taken after a 10-min incubation. b, Cumulative probability of the arrival times with a single-exponential fit (solid line) that was used to determine the binding frequency (kon). Data are mean ± 95% CI, obtained by bootstrap analysis from a single replicate with n ≥ 300 individual molecules. c, Dwell-time distributions of binding events for ssDNAs that contain different numbers of PAM sites. Average dwell times (τoff) are mean ± 95% CI, obtained by bootstrap analysis of a single replicate with n ≥ 500 individual molecules. d, kon of ssDNA substrates of various lengths, containing one PAM site. Data are mean ± 95% CI, obtained by bootstrap analysis from a single replicate with n ≥ 300 individual molecules. e, koff of ssDNA substrates of various lengths, containing one PAM site. Data are mean ± 95% CI, obtained by bootstrap analysis of a single replicate with n ≥ 500 individual molecules. f, Model of a facilitated diffusion mechanism for PAM-dependent ssDNA binding by Cas1–Cas2. Cas1–Cas2 binds a non-specific (non-PAM) region on ssDNA, which is followed by rapid facilitated diffusion and PAM recognition. Although the diffusive movement cannot be directly observed with the time resolution of our single-molecule assay (0.1 s), the effects can be seen when the ssDNA substrate is extended (see d). When the length is increased, the measured binding frequency increases (commonly referred to as the antenna effect), which suggests that Cas1–Cas2 uses facilitated diffusion to locate PAM sequences. g, EMSAs on various ssDNA and dsDNA substrates with increasing amounts of wild-type Cas1–Cas2. Top, EMSAs with ssDNAs without (Cy3) or with (Cy5) a PAM sequence. Bottom, EMSAs with a precursor prespacer DNA substrate that is annealed, or with two ssDNAs added simultaneously.
The bands that correspond to the bound and unbound fractions are indicated on the right. For gel source data, see Supplementary Fig. 1. Data are representative of three replicates with similar results (a–e, g).
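The antenna effect described in f can be illustrated with a toy lattice model. This is a sketch under stated assumptions, not the authors' kinetic model: the ssDNA is a row of equivalent landing sites, the PAM sits at one end, the protein hops to a neighbouring site at each step unless it dissociates with a hypothetical probability `p_off`, and every site has the same non-specific landing rate. Under these assumptions the PAM-binding frequency is proportional to the summed probability, over all landing sites, of reaching the PAM before dissociating.

```python
def pam_capture_weight(n_sites, pam_site=0, p_off=0.01, sweeps=3000):
    """Sum over landing sites of the probability of reaching the PAM site.

    Toy 1D lattice: at each step the protein dissociates with probability
    p_off, otherwise it hops to a neighbouring site (end sites hop inward).
    With equal landing rates per site, the returned sum is proportional to
    the PAM-binding frequency in this model.
    """
    reach = [0.0] * n_sites
    reach[pam_site] = 1.0  # landing on the PAM counts as immediate capture
    # Gauss-Seidel iteration of the hitting-probability equations:
    # reach[i] = (1 - p_off) * average(reach of neighbouring sites).
    for _ in range(sweeps):
        for i in range(n_sites):
            if i == pam_site:
                continue
            neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n_sites]
            hop = sum(reach[j] for j in neighbours) / len(neighbours)
            reach[i] = (1.0 - p_off) * hop
    return sum(reach)


# Hypothetical strand lengths, in lattice sites rather than nucleotides.
weights = {n: pam_capture_weight(n) for n in (5, 10, 20)}
for n, weight in weights.items():
    print(f"{n:3d} sites: relative PAM-binding frequency {weight:.2f}")
```

In this toy model the relative binding frequency rises with strand length and then levels off once the strand is longer than the distance the protein typically slides before dissociating, which is the qualitative signature of an antenna effect.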
Extended Data Fig. 4 In vitro integration assay with precursor prespacer DNAs with 3′-overhangs of various lengths.
In vitro integration assay using a linear CRISPR DNA and canonical or precursor prespacer DNAs with 3′-overhangs of different lengths. Full-site integration of a mature prespacer DNA (28 nt) results in a 78-nt leader-side integration (L-I) product and a 113-nt spacer-side integration (S-I) product. The top and bottom strands of the canonical or precursor prespacer DNA substrates were labelled with Cy5 and Cy3, respectively. Samples were run on a 7 M urea denaturing 20% TBE–PAGE, after which images were collected with a Typhoon scanner. Only those precursor prespacers with the canonical size of 5 nt (5′-TTTTC-3′) in the 3′-overhang(s) were efficiently incorporated into the CRISPR locus to yield leader-side (spacer-side) integration products. For gel source data, see Supplementary Fig. 1. Data are representative of three replicates with similar results.
Extended Data Fig. 5
a, In vitro trimming assay with a precursor prespacer DNA in the presence of various 3′−5′ exonucleases. The precursor prespacer DNAs used were the same as in Fig. 3a. b, In vitro assay for PAM-dependent trimming. c, d, In vitro trimming assay with wild-type Cas1–Cas2 or mutant Cas1(Q287A/I291G)–Cas2 in the presence of DNA PolIII core (c) or ExoT (d). The two strands of the precursor prespacer DNA were internally labelled with Cy3 and Cy5. Samples were collected after the indicated times of incubation with exonucleases. Samples were run on a 7 M urea denaturing 20% TBE–PAGE, after which images were collected with a Typhoon scanner. The canonical size (28 nt) of trimmed strands is indicated with red arrowheads. For gel source data, see Supplementary Fig. 1. Data are representative of three replicates with similar results.
Extended Data Fig. 6 Single-molecule leader-side integration assay for the PAM-deficient end of asymmetrically trimmed precursor prespacer DNAs.
a, Schematic of the single-molecule FRET assay that was used to observe the orientation of integrated spacers. Biotinylated CRISPR DNA was labelled with Cy5 in the repeat region (5 nt away from the leader–repeat junction). Precursor prespacer DNA was labelled with Cy3 at the 5′-end of the top strand. b, Expected FRET from the single-molecule assay based on structural modelling (PDB: 5WFE). Representative CCD images in donor (green box) and acceptor (red box) channels are included as insets and indicated with representative high and low FRET states. c, smFRET design for assessing the orientation of integrated products. The 5′-end of the top strand was labelled with Cy3. Integration of the 3′-end of the bottom strand at the leader side exhibits high FRET, and integration of the 3′-end of the top strand shows low FRET. d, Fractions of high- and low-FRET events after the integration reaction. Data are mean ± s.e.m. from three independent measurements (n = 3) with n ≥ 3,000 molecules for each measurement. e, FRET efficiency histograms of precursor prespacer DNAs with a 3′-overhang length that is optimal (28-nt) for the PAM-deficient strand and non-optimal for the PAM-containing strand. Solid lines represent Gaussian fits to obtain the high and low FRET populations. Data are representative of three replicates with similar results.
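The high- and low-FRET readout in c and d can be sketched in a few lines. This is an illustrative simplification with hypothetical intensities: it uses the apparent FRET efficiency (acceptor intensity divided by total intensity) without background, leakage or detection-efficiency corrections, and it replaces the Gaussian fits used in the legend with a fixed threshold of 0.5, which is an assumption made only for this example.

```python
def fret_efficiency(donor, acceptor):
    """Apparent FRET efficiency from donor (Cy3) and acceptor (Cy5) intensities."""
    total = donor + acceptor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return acceptor / total


def fret_fractions(efficiencies, threshold=0.5):
    """Fractions of molecules at or above (high) and below (low) the threshold."""
    n = len(efficiencies)
    high = sum(1 for e in efficiencies if e >= threshold)
    return {"high_fret": high / n, "low_fret": (n - high) / n}


# Hypothetical (donor, acceptor) intensities for four molecules.
intensities = [(200, 800), (100, 900), (800, 200), (150, 850)]
efficiencies = [fret_efficiency(d, a) for d, a in intensities]
fractions = fret_fractions(efficiencies)
print(fractions)
```

Following the design in c, the high-FRET fraction would report integration of the 3′-end of the bottom strand at the leader side, and the low-FRET fraction integration of the 3′-end of the top strand.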
Extended Data Fig. 7 Maturation and integration of the PAM-containing end in half-site intermediates.
a, Design for in vitro and single-molecule trimming-driven integration assays. The last three backbone phosphodiester bonds from the 3′-end of CRISPR DNA were modified with PTO (purple) to prevent degradation by 3′−5′ exonucleases. b, Schematic of the substrates used for the in vitro trimming-driven full-site integration assay. c, d, Gel images from the in vitro trimming-driven full-site integration assay with DNA PolIII core (c) or ExoT (d). Unreacted half-site (H-S) intermediates and disintegrated products are shown in the Cy3 image. Spacer-side integration products and processed top prespacer strands are shown in the Cy5 images. For clarity, the bottom part (below 50 nt) of the Cy5 image was separated from the top part (around 70–130 nt) and adjusted with a different contrast. For gel source data, see Supplementary Fig. 1.
Extended Data Fig. 8
a, Design of in vitro and single-molecule trimming-driven integration assays. The last three backbone phosphodiester bonds from the 3′-end of CRISPR DNA were modified with PTO (purple) to prevent degradation by 3′−5′ exonucleases. b, Representative gel images from the in vitro trimming-driven integration assay with DNA PolIII core. The contrast of areas of spacer-side and leader-side integration products was adjusted for optimal visibility. For gel source data, see Supplementary Fig. 1. c, d, Single-molecule assay for biased integration of PAM-containing precursor prespacer DNA. c, Schematic of the experimental procedure used for trimming-driven integration assays at the single-molecule level. d, FRET efficiency histograms from the trimming-driven integration assay. The Cy3-labelled top strand of precursor prespacer DNAs had either CGT (non-PAM) or CTT (PAM). Solid lines represent Gaussian fits to obtain the high and low FRET populations. The bar plot displays fractions of high and low FRET populations after the integration reaction. Data are presented as mean ± s.e.m. (n = 6). Data are representative of three replicates with similar results.
In this model, the 3′-overhangs of PAM-containing Cas1–Cas2-selected prespacer precursors are trimmed symmetrically to the canonical length of 5 nt. The PAM-derived 3′-end of the trimmed prespacer can then be integrated at either the leader side or the spacer side of the CRISPR DNA. Therefore, the probability of a correctly oriented spacer in the CRISPR DNA is 50% (which does not agree with spacer acquisition in vivo).
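The 50% figure follows from treating the two trimmed ends as indistinguishable, and a short simulation makes the contrast with biased integration explicit. This is a sketch under assumptions: the parameter `p_pam_free_end_at_leader` is hypothetical, and "correct" is taken to mean that the PAM-deficient end is integrated at the leader side, in line with the leader-side integration assay described for asymmetrically trimmed prespacers.

```python
import random


def fraction_correct(p_pam_free_end_at_leader, n_events=10000, seed=0):
    """Fraction of simulated integration events with the correct orientation.

    Each event is correct when the PAM-deficient end lands at the leader
    side, which happens with the given probability (an assumed parameter).
    """
    rng = random.Random(seed)
    correct = sum(
        rng.random() < p_pam_free_end_at_leader for _ in range(n_events)
    )
    return correct / n_events


# Symmetric trimming: both ends are equivalent, so either can go first.
symmetric = fraction_correct(0.5)
# Idealized asymmetric trimming: only the PAM-deficient end is ready first.
fully_biased = fraction_correct(1.0)
print(f"symmetric: {symmetric:.2f}, fully biased: {fully_biased:.2f}")
```

The fully biased case is an idealized limit, not a measured value; real integration would fall somewhere between the two settings.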
Our model suggests that ssDNA loops generated from RecBCD- and Cascade–Cas3-mediated DNA degradation provide the initiation sites for Cas1–Cas2 docking and prespacer selection.
Raw images of gels scanned by Typhoon imager. Fluorescence scans of Cy3 (green) and Cy5 (red) are indicated. Related to Figure 3a, 3b, 3c, and Extended Data Fig. 1r, 3g, 4, 5a–d, 7c–d, 8b.
List of synthetic oligonucleotides used in this study. Related to Figure 1–3, Extended Data Fig. 1–8, and Methods.
Cite this article
Kim, S., Loeff, L., Colombo, S. et al. Selective loading and processing of prespacers for precise CRISPR adaptation. Nature 579, 141–145 (2020). https://doi.org/10.1038/s41586-020-2018-1