

Spacer prioritization in CRISPR–Cas9 immunity is enabled by the leader RNA

Abstract

CRISPR–Cas systems store fragments of foreign DNA, called spacers, as immunological recordings used to combat future infections. Of the many spacers stored in a CRISPR array, the most recent are known to be prioritized for immune defence. However, the underlying mechanism remains unclear. Here we show that the leader region upstream of CRISPR arrays in CRISPR–Cas9 systems enhances CRISPR RNA (crRNA) processing from the newest spacer, prioritizing defence against the matching invader. Using the CRISPR–Cas9 system from Streptococcus pyogenes as a model, we found that the transcribed leader interacts with the conserved repeats bordering the newest spacer. The resulting interaction promotes transactivating crRNA (tracrRNA) hybridization with the second of the two repeats, accelerating crRNA processing. Accordingly, disruption of this structure reduces the abundance of the associated crRNA and immune defence against targeted plasmids and bacteriophages. Beyond the S. pyogenes system, bioinformatics analyses revealed that leader-repeat structures appear across CRISPR–Cas9 systems. CRISPR–Cas systems thus possess an RNA-based mechanism to prioritize defence against the most recently encountered invaders.


Fig. 1: pre-crRNA from the CRISPR–Cas system in S. pyogenes forms a stem-loop between the leader RNA and R1 that interferes with extraneous crRNA function.
Fig. 2: Disruption of the leader-repeat stem-loop impairs immune defence through the most recent CRISPR spacers.
Fig. 3: The leader-repeat stem-loop is important for the increased abundance and enhanced processing of the crRNA derived from the most recent spacer.
Fig. 4: Interaction between the leader-repeat stem-loop and R2 promotes tracrRNA hybridization to R2.
Fig. 5: A stem-loop formed between the leader RNA and R1 is found across CRISPR–Cas9 systems.
Fig. 6: Proposed model for the role of the leader region in prioritization of crRNA biogenesis associated with the most recent spacer for CRISPR–Cas9 systems.

Data availability

Next-generation sequencing data for RNA immunoprecipitation sequencing are accessible through NCBI Gene Expression Omnibus accession no. GSE158637 using the link https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE158637 (Supplementary Table 4). Source data for Figs. 1b,d,e, 2a,b, 3b–d and 4b–d and Extended Data Figs. 1b,d, 3a,c,d, 4a–d, 5a,b,d, 6a, 7a,c, 8b,c,e,f and 9a,d are included in the Source Data files. Source data are provided with this paper.

Code availability

Custom scripts analysing folding of the leader-repeat region of different CRISPR–Cas systems are available on GitHub at https://github.com/zashaweinberglab/type-II-A-leader-repeat.

References

  1. Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712 (2007).


  2. van der Oost, J., Westra, E. R., Jackson, R. N. & Wiedenheft, B. Unravelling the structural and mechanistic basis of CRISPR-Cas systems. Nat. Rev. Microbiol. 12, 479–492 (2014).


  3. Jackson, S. A. et al. CRISPR-Cas: adapting to change. Science 356, eaal5056 (2017).

  4. Bolotin, A., Quinquis, B., Sorokin, A. & Ehrlich, S. D. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561 (2005).


  5. Mojica, F. J. M., Díez-Villaseñor, C., García-Martínez, J. & Soria, E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60, 174–182 (2005).


  6. Sorek, R., Kunin, V. & Hugenholtz, P. CRISPR—a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol. 6, 181–186 (2008).


  7. Arslan, Z., Hermanns, V., Wurm, R., Wagner, R. & Pul, Ü. Detection and characterization of spacer integration intermediates in type I-E CRISPR–Cas system. Nucleic Acids Res. 42, 7884–7893 (2014).


  8. Xiao, Y., Ng, S., Nam, K. H. & Ke, A. How type II CRISPR-Cas establish immunity through Cas1-Cas2-mediated spacer integration. Nature 550, 137–141 (2017).


  9. McGinn, J. & Marraffini, L. A. Molecular mechanisms of CRISPR-Cas spacer acquisition. Nat. Rev. Microbiol. 17, 7–12 (2019).


  10. Brouns, S. J. J. et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964 (2008).


  11. Charpentier, E., Richter, H., van der Oost, J. & White, M. F. Biogenesis pathways of RNA guides in archaeal and bacterial CRISPR-Cas adaptive immunity. FEMS Microbiol. Rev. 39, 428–441 (2015).


  12. Garneau, J. E. et al. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67–71 (2010).


  13. Meeske, A. J., Nakandakari-Higa, S. & Marraffini, L. A. Cas13-induced cellular dormancy prevents the rise of CRISPR-resistant bacteriophage. Nature 570, 241–245 (2019).


  14. Rostøl, J. T. et al. The Card1 nuclease provides defence during type III CRISPR immunity. Nature 590, 624–629 (2021).


  15. Elmore, J. R. et al. Programmable plasmid interference by the CRISPR-Cas system in Thermococcus kodakarensis. RNA Biol. 10, 828–840 (2013).


  16. Carte, J. et al. The three major types of CRISPR-Cas systems function independently in CRISPR RNA biogenesis in Streptococcus thermophilus. Mol. Microbiol. 93, 98–112 (2014).


  17. Crawley, A. B., Henriksen, E. D., Stout, E., Brandt, K. & Barrangou, R. Characterizing the activity of abundant, diverse and active CRISPR-Cas systems in lactobacilli. Sci. Rep. 8, 11544 (2018).


  18. Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602–607 (2011).


  19. McGinn, J. & Marraffini, L. A. CRISPR-Cas systems optimize their immune response by specifying the site of spacer integration. Mol. Cell 64, 616–623 (2016).


  20. Martynov, A., Severinov, K. & Ispolatov, I. Optimal number of spacers in CRISPR arrays. PLoS Comput. Biol. 13, e1005891 (2017).


  21. Rao, C., Chin, D. & Ensminger, A. W. Priming in a permissive type I-C CRISPR-Cas system reveals distinct dynamics of spacer acquisition and loss. RNA 23, 1525–1538 (2017).

  22. Liao, C. & Beisel, C. L. The tracrRNA in CRISPR biology and technologies. Annu. Rev. Genet. 55, 161–181 (2021).


  23. Karvelis, T. et al. crRNA and tracrRNA guide Cas9-mediated DNA interference in Streptococcus thermophilus. RNA Biol. 10, 841–851 (2013).


  24. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).


  25. Pickar-Oliver, A. & Gersbach, C. A. The next generation of CRISPR–Cas technologies and applications. Nat. Rev. Mol. Cell Biol. 20, 490–507 (2019).


  26. Bikard, D. et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system. Nucleic Acids Res. 41, 7429–7437 (2013).


  27. Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 31, 233–239 (2013).


  28. Citorik, R. J., Mimee, M. & Lu, T. K. Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases. Nat. Biotechnol. 32, 1141–1145 (2014).


  29. Leenay, R. T. & Beisel, C. L. Deciphering, communicating, and engineering the CRISPR PAM. J. Mol. Biol. 429, 177–191 (2017).


  30. Dugar, G. et al. CRISPR RNA-dependent binding and cleavage of endogenous RNAs by the Campylobacter jejuni Cas9. Mol. Cell 69, 893–905 (2018).


  31. Xue, C. et al. CRISPR interference and priming varies with individual spacer sequences. Nucleic Acids Res. 43, 10831–10847 (2015).


  32. Collias, D. et al. A positive, growth-based PAM screen identifies noncanonical motifs recognized by the Cas9. Sci. Adv. 6, eabb4054 (2020).


  33. Altuvia, Y. et al. In vivo cleavage rules and target repertoire of RNase III in Escherichia coli. Nucleic Acids Res. 46, 10530–10531 (2018).


  34. Wei, Y., Chesne, M. T., Terns, R. M. & Terns, M. P. Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus. Nucleic Acids Res. 43, 1749–1758 (2015).


  35. Pougach, K. et al. Transcription, processing and function of CRISPR cassettes in Escherichia coli. Mol. Microbiol. 77, 1367–1379 (2010).


  36. Yosef, I., Goren, M. G. & Qimron, U. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 40, 5569–5576 (2012).


  37. Jiao, C. et al. Noncanonical crRNAs derived from host transcripts enable multiplexable RNA detection by Cas9. Science 372, 941–948 (2021).

  38. Jabbari, H., Wark, I. & Montemagno, C. RNA secondary structure prediction with pseudoknots: contribution of algorithm versus energy model. PLoS ONE 13, e0194583 (2018).


  39. Wei, Y., Terns, R. M. & Terns, M. P. Cas9 function and host genome sampling in Type II-A CRISPR-Cas adaptation. Genes Dev. 29, 356–361 (2015).


  40. Laanto, E., Hoikkala, V., Ravantti, J. & Sundberg, L.-R. Long-term genomic coevolution of host-parasite interaction in the natural environment. Nat. Commun. 8, 111 (2017).


  41. Zhang, Y. et al. Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Mol. Cell 50, 488–503 (2013).


  42. Dugar, G. et al. High-resolution transcriptome maps reveal strain-specific regulatory features of multiple Campylobacter jejuni isolates. PLoS Genet. 9, e1003495 (2013).


  43. Haurwitz, R. E., Jinek, M., Wiedenheft, B., Zhou, K. & Doudna, J. A. Sequence- and structure-specific RNA processing by a CRISPR endonuclease. Science 329, 1355–1358 (2010).


  44. Li, R. & Bowerman, B. Symmetry breaking in biology. Cold Spring Harb. Perspect. Biol. 2, a003475 (2010).


  45. McCarty, N. S., Graham, A. E., Studená, L. & Ledesma-Amaro, R. Multiplexed CRISPR technologies for gene editing and transcriptional regulation. Nat. Commun. 11, 1281 (2020).


  46. Al-Hashimi, H. M. & Walter, N. G. RNA dynamics: it is about time. Curr. Opin. Struct. Biol. 18, 321–329 (2008).


  47. Watters, K. E., Strobel, E. J., Yu, A. M., Lis, J. T. & Lucks, J. B. Cotranscriptional folding of a riboswitch at nucleotide resolution. Nat. Struct. Mol. Biol. 23, 1124–1131 (2016).


  48. Liao, C. et al. Modular one-pot assembly of CRISPR arrays enables library generation and reveals factors influencing crRNA biogenesis. Nat. Commun. 10, 2948 (2019).


  49. Wimmer, F. & Beisel, C. L. CRISPR-Cas systems and the paradox of self-targeting spacers. Front. Microbiol. 10, 3078 (2019).


  50. Leenay, R. T. et al. Genome editing with CRISPR-Cas9 in Lactobacillus plantarum revealed that editing outcomes can vary across strains and between methods. Biotechnol. J. 14, e1700583 (2019).


  51. Gruber, A. R., Lorenz, R., Bernhart, S. H., Neubock, R. & Hofacker, I. L. The Vienna RNA Websuite. Nucleic Acids Res. 36, W70–W74 (2008).


  52. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).

  53. Sharma, C. M. et al. The primary transcriptome of the major human pathogen Helicobacter pylori. Nature 464, 250–255 (2010).


  54. Papenfort, K. et al. σE-Dependent small RNAs of Salmonella respond to membrane stress by accelerating global omp mRNA decay. Mol. Microbiol. 62, 1674–1688 (2006).


  55. Pernitzsch, S. R., Tirier, S. M., Beier, D. & Sharma, C. M. A variable homopolymeric G-repeat defines small RNA-mediated posttranscriptional regulation of a chemotaxis receptor in Helicobacter pylori. Proc. Natl Acad. Sci. USA 111, E501–E510 (2014).


  56. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 1 (2011).


  57. Förstner, K. U., Vogel, J. & Sharma, C. M. READemption-a tool for the computational analysis of deep-sequencing-based transcriptome data. Bioinformatics 30, 3421–3423 (2014).


  58. Hoffmann, S. et al. Fast mapping of short sequences with mismatches, insertions and deletions using index structures. PLoS Comput. Biol. 5, e1000502 (2009).


  59. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).


  60. Lopez-Delisle, L. et al. pyGenomeTracks: reproducible plots for multivariate genomic data sets. Bioinformatics 37, 422–423 (2020).


  61. Kent, W. J., Zweig, A. S., Barber, G., Hinrichs, A. S. & Karolchik, D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26, 2204–2207 (2010).


  62. Padilha, V. A., Alkhnbashi, O. S., Shah, S. A., de Carvalho, A. C. P. L. F. & Backofen, R. CRISPRcasIdentifier: machine learning for accurate identification and classification of CRISPR-Cas systems. Gigascience 9, giaa062 (2020).

  63. Padilha, V. A. et al. Casboundary: automated definition of integral Cas cassettes. Bioinformatics 37, 1352–1359 (2020).


  64. Mitrofanov, A. et al. CRISPRidentify: identification of CRISPR arrays using machine learning approach. Nucleic Acids Res. 49, e20 (2021).


  65. Alkhnbashi, O. S. et al. CRISPRstrand: predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci. Bioinformatics 30, i489–i496 (2014).


  66. Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).


  67. Ding, Y. & Lawrence, C. E. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res. 31, 7280–7301 (2003).


  68. Altschul, S. F. & Erickson, B. W. Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage. Mol. Biol. Evol. 2, 526–538 (1985).



Acknowledgements

We thank T. Achmedov for extensive assistance with RNA preparation and RNA-blotting, F. Tippel from NanoTemper Technologies (Munich) for technical support and J. Vogel and G. Storz for critical feedback on the manuscript. This work was supported by funding through the European Research Council Consolidator Award (no. 865973 to C.L.B.), Deutsche Forschungsgemeinschaft SPP 2141 (nos. BE 6703/1-1 to C.L.B., SH 580/9-1 to C.M.S. and BA 2168/23-1 to R.B.) and the Interdisciplinary Center for Clinical Research Würzburg project Z-6.

Author information


Contributions

C.L. and C.L.B. conceived this study. C.L. and C.L.B. designed the experiments. C.L. performed plasmid cloning, in vivo assays in E. coli and L. rhamnosus, in vitro RNA transcription and purification, RNase III cleavage assays and RNA-blotting. C.L. and C.L.B. analysed the associated data. S.S. conducted immunoblotting and RNA immunoprecipitation for RIP–seq and helped analyse the data. S.L.S. conducted RNA structural probing and RNase III cleavage site mapping and helped analyse data. C.M.S. supervised the work performed by S.S. and S.L.S. A.K. designed and performed the in vitro assay for RNA–RNA binding affinity and analysed the data, with supervision by N.C. O.S.A. identified the repeat-leaders and computed mutations, with supervision by R.B. Z.W. assessed base-pairing probabilities. T.B. analysed RIP–seq data. C.L.B. and C.L. wrote the manuscript, which was read and approved by all authors. C.L.B. supervised the project.

Corresponding author

Correspondence to Chase L. Beisel.

Ethics declarations

Competing interests

C.L.B. is a cofounder and member of the scientific advisory board for Locus Biosciences and is a member of the scientific advisory board for Benson Hill. C.L.B. and C.M.S. have submitted patent applications on CRISPR technologies unrelated to this work. The other authors declare no conflicts of interest.

Peer review

Peer review information

Nature Microbiology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 The leader-repeat stem-loop from the CRISPR-Cas9 system native to Streptococcus pyogenes SF370.

Accession #: NC_002737.2. a, Array sequence and context within the CRISPR-Cas system. Repeats are in gray, spacers match the corresponding color in the cartoon, and mutations to the consensus repeat are shown in red. The underlined sequence encodes the transcribed RNA leader as determined in S. pyogenes SF370 (ref. 18). The bold and italicized sequence is the putative −10 promoter element, while the lowercase letters designate the stop codon of csn2. The red box indicates the mapped transcriptional start site in E. coli determined using 5′ RACE. b, PCR product generated by 5′ RACE. Biological duplicates are shown. M: DNA marker. c, Minimal free-energy structure of the native and mutated leader-repeat RNA predicted by NUPACK. Left: nucleotide (nt) identities. Right: base-pairing probabilities. d, In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with SpyCas9. The transcription start site was extended by 17 nts using the sequence from S. pyogenes to allow visualization of shorter RNAs. Vertical bars: unstructured regions. C: full-length (untreated) control. T1: Ladder of G’s generated by incubating the RNA with RNase T1. OH: single-nucleotide ladder generated by incubating the RNA under basic conditions. Dark and light red arrows indicate the most and second most preferred sites of RNase III cleavage, respectively. Results are representative of triplicate independent experiments. e, Corresponding secondary structure of the leader-repeat RNA. Circles indicate unstructured bases identified by in-line probing. The preferred site of RNase III cleavage lies within one nt of the equivalent site within the crRNA:tracrRNA duplex (see Fig. 1c). R1: first repeat. S1: first spacer.


Extended Data Fig. 2 Capillary scans and thermophoretic time-traces of microscale thermophoresis (MST) measurements of binding between the leader-repeat RNA and tracrRNA associated with different CRISPR-Cas9 systems.

a, Streptococcus pyogenes SF370 with an RNA spanning the leader to the first spacer. b, Streptococcus pyogenes SF370 with an RNA spanning the leader to the second spacer. c, Lactobacillus rhamnosus GG with an RNA spanning the leader to the first spacer. d, Streptococcus thermophilus DGCC 7710 (CRISPR1) with an RNA spanning the leader to the first spacer. In all cases, the tracrRNA was fluorescently labeled while unlabeled leader-repeat RNA was added at different concentrations. Capillary scans and traces of one of three independent experiments are shown. The gray boxes in the capillary scans mark 20% above and below the average peak fluorescence indicated in orange, the acceptable limit of deviations across the fluorescence scans. Blue and red boxes in the time-course traces represent the temperature jump and MST-on time, respectively. In all cases, there is no adsorption of the labeled tracrRNAs to the capillaries, and the time traces indicate no aggregation. See Figs. 1d and 4d and Extended Data Figs. 8b and 8e for the resulting binding curves. Values in a-d represent the mean and standard deviation of triplicate independent measurements.
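The acceptance criterion described for the capillary scans (peak fluorescence within 20% above or below the average) can be illustrated with a minimal sketch. The function name and the fluorescence values below are hypothetical and are not taken from the paper or from any MST instrument software.

```python
from statistics import mean

def capillaries_within_limit(peak_fluorescence, tolerance=0.20):
    """Return (average, flags); each flag tells whether a capillary's peak
    fluorescence lies within +/- tolerance of the average peak fluorescence."""
    avg = mean(peak_fluorescence)
    lower, upper = avg * (1 - tolerance), avg * (1 + tolerance)
    flags = [lower <= value <= upper for value in peak_fluorescence]
    return avg, flags

# Hypothetical peak fluorescence counts, one per capillary (illustrative only).
example_peaks = [410.0, 395.0, 402.0, 388.0, 520.0]
average, within = capillaries_within_limit(example_peaks)
print(average, within)
```

In this made-up series the last capillary exceeds the upper bound and would be flagged, whereas the caption reports that all measured scans stayed inside the limit.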

Extended Data Fig. 3 Data rejecting alternative explanations for the impact of mutating the leader region associated with SpyCas9.

a, Assessing targeting by the mutated ecrRNA guide by plasmid clearance in E. coli. The native and mutated ecrRNAs were encoded as single-spacer arrays with the native leader. There was no significant difference in plasmid clearance with (Student’s two-tailed t-test with unequal variance, P = 0.36, n = 3) or without (Student’s two-tailed t-test with unequal variance, P = 0.80, n = 3) outgrowth. b, Western blotting analysis of SpyCas9-3xFLAG levels with the native or mutated leader. Results are representative of two independent experiments. c, Plasmid clearance with SpyCas9-3xFLAG in E. coli. The SpyCas9-3xFLAG fusions were tested using an sgRNA with a guide derived from spacer 1 (S1) in the native array. The transformations were conducted without non-selective outgrowth. The results showed that the fusion did not compromise clearance activity by SpyCas9, and introducing the mutations into the CRISPR leader did not significantly affect SpyCas9 activity (Student’s two-tailed t-test with unequal variance, P = 0.168, n = 3). d, Assessing transcription of the CRISPR array with the mutated leader. The native or mutated leader through the first spacer was cloned upstream of gfp in the pUA66 plasmid. E. coli cells harboring either plasmid were then subjected to flow cytometry analysis. There was no significant difference (Student’s two-tailed t-test with unequal variance, P = 0.103, n = 3) in the background-subtracted GFP fluorescence between the constructs. Values represent the mean and standard deviation of triplicate independent measurements starting from separate colonies. Values in a, c and d represent the geometric mean and standard deviation from independent experiments starting from three separate colonies. n.s.: not significant (P > 0.05). Statistical tests were performed using a two-tailed Student’s t-test with unequal variance, n = 3.
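The t-test with unequal variance named in this caption (often called Welch's t-test) can be sketched as follows. This is a standard-library illustration under stated assumptions, not the authors' analysis code; the function name and the input values are hypothetical. It returns only the test statistic and the Welch–Satterthwaite degrees of freedom, since the two-tailed P value additionally needs the Student's t distribution.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for a two-sample t-test with unequal variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variances (n - 1 denominator), each divided by its sample size.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical triplicate measurements (n = 3 per condition), illustrative only.
native = [1.0, 2.0, 3.0]
mutated = [2.0, 4.0, 6.0]
t_stat, dof = welch_t(native, mutated)
print(t_stat, dof)
```

Where SciPy is available, `scipy.stats.ttest_ind(native, mutated, equal_var=False)` performs the same test and also reports the two-tailed P value.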


Extended Data Fig. 4 RIP-seq analysis using SpyCas9 combined with the native or mutated leader in E. coli.

The left and right sides of the figure represent the results from two independent experiments. RIP-seq was performed using E. coli BW25113 harboring the SpyCas9/tracrRNA/CRISPR or SpyCas9-3xFLAG/tracrRNA/CRISPR plasmid. a, Western blotting confirmed enrichment of SpyCas9-FLAG. Co-immunoprecipitated RNAs were isolated and subjected to next-generation sequencing. b, Distribution of RNA classes based on total mapped reads. c, Mapped reads for the CRISPR locus with the native or mutated leader. The scale above the plot indicates the location in the plasmid. Positional coverage for total aligned reads and reads aligning with a reference length ≤ 50 nts was normalized based on the total number of aligned reads in each sample. The reduction in reads upon applying the size filter indicates an excess of pre-crRNA and immature crRNAs, which parallels Northern blotting analysis for the ecrRNA and individual crRNAs (see Fig. 3b and Extended Data Fig. 5a-b). We also note that the reads begin ~12 nts upstream of the transcriptional start site mapped by 5′ RACE (see Extended Data Fig. 1), suggesting that a slightly upstream transcriptional start site or processing site from a longer transcript also exists. d, Direct comparison of mapped reads with the native or mutated leader. The plot corresponds to that shown in Fig. 3a. The read score for the first crRNA downstream of the native leader extends above the vertical limit of 1,500. The relative read scores for the ecrRNA and each crRNA are indicated below the plots. Values below one indicate a reduction in (e)crRNA abundance with the introduced mutations. See Supplementary Table 1 for statistics about the RIP-seq analyses.
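The normalization and relative read score described in this caption can be sketched as below. The per-million scaling, the summation over a region and the function names are assumptions made for illustration, since the caption states only that coverage was normalized to the total number of aligned reads and that values below one indicate reduced abundance with the mutated leader. All counts are hypothetical.

```python
def normalize_coverage(coverage, total_aligned_reads, scale=1_000_000):
    """Scale per-position read counts by library size
    (assumed here: counts per million aligned reads)."""
    return [count * scale / total_aligned_reads for count in coverage]

def relative_read_score(native_cov, mutated_cov, start, end):
    """Ratio of mutated to native normalized coverage summed over [start, end).
    A value below one indicates lower abundance with the mutated leader."""
    return sum(mutated_cov[start:end]) / sum(native_cov[start:end])

# Hypothetical per-position counts over one crRNA and hypothetical library sizes.
native_raw = [100, 200, 200, 100]
mutated_raw = [30, 60, 60, 30]
native_norm = normalize_coverage(native_raw, total_aligned_reads=2_000_000)
mutated_norm = normalize_coverage(mutated_raw, total_aligned_reads=1_000_000)
score = relative_read_score(native_norm, mutated_norm, 0, 4)
print(score)
```

Normalizing before taking the ratio keeps a difference in sequencing depth between the two samples from being mistaken for a change in crRNA abundance.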


Extended Data Fig. 5 Impact of mutating the leader-repeat stem-loop from the CRISPR-Cas9 system from Streptococcus pyogenes SF370.

a, Northern blotting analysis of the produced crRNAs with the native or mutated RNA leader. The system’s CRISPR array was expressed in E. coli with SpyCas9 and the tracrRNA, and the ecrRNA (probe #1), crRNA1 (probe #2), and crRNA5 (probe #3) were detected. The ecrRNA and mecrRNA were detected using an equimolar mixture of both probes. b, Northern blotting analysis of the produced crRNAs with different mutant backgrounds. See a for details. Experiments were conducted with the native or mutated leader or with the tracrRNA, cas9, or rnc deleted. The results for probe #1 are those shown in Fig. 3b. All probing was performed with the same blot. The indicated RNA spanning the leader through the processed crRNA1 corresponds to that observed by RIP-seq (see Extended Data Fig. 4c) and is supported by the band’s absence when probing for crRNA2. Results in a and b are representative of duplicate independent experiments. c, Predicted secondary structures of three different restoring mutant sets. Disruptive mutations were made to the mutated leader depicted in Fig. 1c. In each case, a stable stem was created by making restoring mutations, although the upper structure deviates from that found in the native leader-repeat. d, Impact of the mutations on plasmid clearance by SpyCas9 in E. coli. The clearance assays were conducted with or without a non-selective outgrowth, where the non-selective outgrowth improves the extent of plasmid clearance. Values represent the geometric mean and standard deviation from independent experiments starting from three separate colonies.
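For readers unfamiliar with the summary statistic reported for the clearance assays, the sketch below shows one common way to compute a geometric mean with a matching spread on the log scale. The caption does not state how the standard deviation was calculated, so the geometric standard deviation factor used here is an assumption, and the input values are hypothetical.

```python
from math import log10
from statistics import mean, stdev

def geometric_summary(values):
    """Return (geometric mean, geometric SD factor) computed on log10 values."""
    logs = [log10(v) for v in values]
    geo_mean = 10 ** mean(logs)
    # Assumed convention: the spread is the SD of the log10 values,
    # expressed back on the linear scale as a multiplicative factor.
    geo_sd_factor = 10 ** stdev(logs)
    return geo_mean, geo_sd_factor

# Hypothetical fold-reduction values from three separate colonies.
clearance = [10.0, 100.0, 1000.0]
gm, gsd = geometric_summary(clearance)
print(gm, gsd)
```

Averaging on the log scale suits measurements that span orders of magnitude, because a single large value would otherwise dominate an arithmetic mean.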


Extended Data Fig. 6 CRISPR arrays from other CRISPR-Cas9 systems within the II-A subtype that appear to possess a leader-repeat stem-loop.

a, Array sequence and context within the CRISPR-Cas system native to Lactobacillus rhamnosus GG. Accession #: GCF_000026505.1. The sequence begins within csn2 (annotated as LGG_02201) and ends after the terminal repeat. See Extended Data Fig. 1a for details. The underlined sequence encodes the transcribed RNA leader as determined by 5′ RACE in L. rhamnosus in this work. Lowercase letters designate the stop codon of csn2. The promoter(s) driving expression of the cas genes has not been mapped. b, PCR product as part of 5′ RACE using total RNA from L. rhamnosus GG. See Extended Data Fig. 1b for details. Only one major product was visible in both replicates. Biological duplicates are shown. M: DNA marker. c, Secondary structure of the native and mutated leader-repeat RNA predicted by NUPACK. See Extended Data Fig. 1c for details. The 5′ end of the leader was truncated to match the sequence used in the structural probing and RNase III cleavage assays (see Extended Data Fig. 7b). Mutations were selected to disrupt the original secondary structure of the native leader-repeat RNA. d, Array sequence and context within the CRISPR-Cas system native to Streptococcus thermophilus DGCC 7710 (CRISPR1 locus). Accession #: CP025216.1. The sequence begins downstream of csn2 and ends after the terminal repeat. See Extended Data Fig. 1a for details. The underlined sequence encodes the transcribed RNA leader as determined previously by RNA sequencing analysis of transcripts (ref. 16). The promoter(s) driving expression of the cas genes has not been mapped. e, Secondary structure of the native and mutated leader-repeat RNA predicted by NUPACK. See Extended Data Fig. 1c for details.


Extended Data Fig. 7 In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with LrhCas9 and Sth1Cas9.

a, In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with LrhCas9. The probed RNA was 5′ radiolabeled and resolved by denaturing PAGE. The 5′ end was truncated to focus on the predicted secondary structure involving the repeat. Vertical bars on the right indicate unstructured regions. C: full-length control. T1: Ladder of G’s generated by incubating the RNA with RNase T1. OH: single-nucleotide ladder generated by incubating the RNA under basic conditions. RNase III: the RNA was incubated with the indicated units of E. coli RNase III (0, 0.0016, 0.008, 0.04, 0.2, 1) for 5 min at 37 °C. Dark and light red arrows indicate the most preferred and second most preferred sites of RNase III cleavage, respectively. Results are representative of triplicate independent experiments. b, Corresponding secondary structure of the leader-repeat RNA. Circles indicate unstructured bases identified by in-line probing. The preferred site of RNase III cleavage lies below the equivalent site within the crRNA:tracrRNA duplex (see Extended Data Fig. 8a). R1: first repeat. S1: first spacer. c, In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with Sth1Cas9. See a for details. The 5′ end was truncated to focus on the predicted secondary structure involving the repeat. Results are representative of triplicate independent experiments. d, Corresponding secondary structure of the leader-repeat RNA. Circles indicate unstructured bases identified by in-line probing. The preferred site of RNase III cleavage lies above the equivalent site within the crRNA:tracrRNA duplex (see Extended Data Fig. 8d). R1: first repeat. S1: first spacer.


Extended Data Fig. 8 II-A CRISPR-Cas9 systems form distinct leader-repeat stem-loops.

a, The CRISPR-Cas system from L. rhamnosus GG and the secondary structure of the leader-repeat RNA. The structure was predicted by NUPACK and confirmed in vitro (see Extended Data Fig. 6c and 7a-b). Mutations indicated in red were made to disrupt stems formed between the leader RNA and the first repeat. b, Measured equilibrium binding between the tracrRNA and native or mutated leader-repeat RNA. See Extended Data Fig. 2c for supporting data. Values represent the mean and standard deviation of triplicate independent measurements. c, RNase III cleavage of the native and mutated leader-repeat RNA in vitro. See Extended Data Fig. 7a-b for the mapped secondary structure and RNase III cleavage sites. Results are representative of duplicate independent experiments. d, The CRISPR-Cas system associated with the CRISPR1 locus of S. thermophilus and the secondary structure of the leader-repeat RNA. The structure was predicted by NUPACK and confirmed in vitro (see Extended Data Fig. 6e and 7c-d). Mutations indicated in red were made to disrupt the stem formed between the leader RNA and the first repeat. The three mutations in the loop were introduced to disrupt alternative structures formed by the other mutations. Pairing between the repeat and the tracrRNA is provided as a basis of comparison. Red arrows indicate the previously mapped site cleaved by RNase III (ref. 16). R1: first repeat. R2: second repeat. S1: first spacer. S2: second spacer. e, Measured equilibrium binding affinity between the leader-repeat RNA and the tracrRNA under in vitro conditions. See Extended Data Fig. 2d for supporting data. Values represent the mean and standard deviation of triplicate independent measurements. f, RNase III cleavage of the native and mutated leader-repeat RNA in vitro. Results are representative of duplicate independent experiments.

Extended Data Fig. 9 RIP-seq analysis of RNAs bound to Cas9 from Lactobacillus rhamnosus GG.

LrhCas9 with or without a 3xFLAG affinity tag was expressed from a plasmid, and the lysate was subjected to RIP-seq analysis. a, Western blotting analysis of samples for RIP-seq using LrhCas9 in Lactobacillus rhamnosus GG. Western blotting confirmed enrichment of LrhCas9-FLAG. Co-immunoprecipitated RNAs were isolated and subjected to next-generation sequencing. Results from duplicate independent experiments are shown on the left and right. b, Distribution of RNA classes based on total mapped reads. hkRNAs: house-keeping RNAs. ncRNAs: non-coding RNAs. c, Mapped reads for the CRISPR locus within the genome of L. rhamnosus GG (NC_013198.1). The scale above the plot indicates the location in the genome. The CRISPR locus is encoded on the negative strand. Positional coverage for total reads and reads aligning with a reference length ≤ 50 nts was normalized based on the total number of aligned reads in each sample. The maximum read length for the NGS run was 76 nts, explaining the drop in unfiltered read counts shortly downstream of the transcriptional start site. See Supplementary Table 1 for statistics from the RIP-seq analyses. Results in b and c are representative of duplicate independent experiments. d, Plasmid clearance by the CRISPR-Cas9 system in L. rhamnosus GG. The corresponding target of the ecrRNA or crRNA1 was encoded within the transformed plasmid. L.O.D.: limit of detection. There was no detectable ecrRNA-directed plasmid clearance. Values represent the geometric mean and standard deviation from three independent experiments starting from separate colonies. **: P < 0.01. n.s.: P > 0.05. Statistical tests were performed using a two-tailed Student’s t-test with unequal variance, n = 3.
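
The statistics named in panel d (geometric mean; two-tailed t-test with unequal variance, often called Welch's t-test) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names `t_pdf` and `welch_t_test` and all input values are hypothetical, and whether the original test was run on raw or log-transformed values is not stated in the legend, so the log10 transformation below is an assumption. Only the Python standard library is used, and the P value is obtained by numerically integrating the t density.

```python
import math
from statistics import geometric_mean, mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def welch_t_test(a, b, steps=2000):
    """Two-tailed t-test with unequal variance (Welch).

    Returns (t statistic, Welch-Satterthwaite degrees of freedom, P value).
    Both samples need at least two values and non-zero combined variance.
    """
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))

    # Simpson's rule for the area under the t density between 0 and |t|
    # (steps must be even).
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3

    # Two-tailed P value: probability mass outside [-|t|, |t|].
    p = max(0.0, 1 - 2 * area)
    return t, df, p


# Hypothetical transformation counts (illustrative only, not data from the article).
control = [2.1e5, 3.4e5, 1.8e5]
targeted = [1.2e3, 4.0e3, 2.5e3]

print("geometric means:", geometric_mean(control), geometric_mean(targeted))

# Assumption: compare log10-transformed counts, a common choice for such data.
t, df, p = welch_t_test([math.log10(v) for v in control],
                        [math.log10(v) for v in targeted])
print(f"t = {t:.3f}, df = {df:.2f}, P = {p:.4f}")
```

With n = 3 per group, the Welch degrees of freedom fall between 2 and 4, so the test has limited power; this is why reporting the individual replicates alongside the summary statistic is useful.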

Extended Data Fig. 10 The CRISPR array from the CRISPR-Cas9 system native to Alkalihalobacillus pseudalcaliphilus DSM 8725.

The system falls within the II-C subtype. Accession #: LFJO01000002.1. a, Array sequence and context within the CRISPR-Cas9 system. The sequence begins immediately downstream of the AB990_04425 gene unrelated to the CRISPR-Cas9 system and ends after the last repeat of the CRISPR2 array. Repeats are in gray, spacers match the corresponding color in the cartoon, and mutations to the consensus repeat are shown in red. The underlined sequence denotes the upstream region used for the folding predictions for the CRISPR1 array. The transcriptional start sites for both arrays are unknown, although there is a clear Rho-independent terminator downstream of each array. The promoters driving expression of the cas genes, the CRISPR arrays, or the tracrRNA have not been mapped. The predicted directions of transcription for the tracrRNA and CRISPR array are indicated with black arrows. b, tracrRNA sequence and context within the CRISPR-Cas9 system. The sequence begins ~2.7 kb upstream of the AB990_04405 gene unrelated to the CRISPR-Cas9 system and ends immediately upstream of cas9. The sequence in orange corresponds to the putative tracrRNA used in the folding predictions. c, Predicted stem-loop between the first repeat and upstream region for the CRISPR1 array. The predicted stem-loop is part of the minimum free energy structure and reflects base-pairing probabilities principally between 90% and 100%. Pairing between the second repeat and the tracrRNA is provided as a basis of comparison. The tracrRNA ends with a canonical Rho-independent terminator.

Supplementary information

Source data

Numerical Source Data Fig. 1

Numerical data for Fig. 1b,d.

Numerical Source Data Fig. 2

Numerical data for Fig. 2a.

Numerical Source Data Fig. 3

Numerical data for Fig. 3c,d.

Numerical Source Data Fig. 4

Numerical data for Fig. 4b–d.

Numerical Source Data Extended Data Fig. 3

Numerical data for Extended Data Fig. 3a,c,d.

Numerical Source Data Extended Data Fig. 5

Numerical data for Extended Data Fig. 5d.

Numerical Source Data Extended Data Fig. 8

Numerical data for Extended Data Fig. 8b,e.

Numerical Source Data Extended Data Fig. 9

Numerical data for Extended Data Fig. 9d.

Source Data Fig. 1

Representative gating for flow cytometry and uncropped gel image for Fig. 1.

Source Data Fig. 2

Six uncropped plate images for Fig. 2b.

Source Data Fig. 3

Two uncropped gel images for Fig. 3b.

Source Data Extended Data Fig. 1

Two uncropped gel images for Extended Data Fig. 1b,d.

Source Data Extended Data Fig. 3

Four uncropped gel images for Extended Data Fig. 3b.

Source Data Extended Data Fig. 4

Sixteen uncropped gel images for Extended Data Fig. 4a.

Source Data Extended Data Fig. 5

Four uncropped gel images for Extended Data Fig. 5a and one uncropped gel image for Extended Data Fig. 5b.

Source Data Extended Data Fig. 6

Uncropped gel image for Extended Data Fig. 6b.

Source Data Extended Data Fig. 7

Uncropped gel images for Extended Data Fig. 7a,c.

Source Data Extended Data Fig. 8

Uncropped gel images for Extended Data Fig. 8c,f.

Source Data Extended Data Fig. 9

Eight uncropped gel images for Extended Data Fig. 9a.

About this article

Cite this article

Liao, C., Sharma, S., Svensson, S.L. et al. Spacer prioritization in CRISPR–Cas9 immunity is enabled by the leader RNA. Nat Microbiol 7, 530–541 (2022). https://doi.org/10.1038/s41564-022-01074-3

  • DOI: https://doi.org/10.1038/s41564-022-01074-3
