Abstract
CRISPR–Cas systems store fragments of foreign DNA, called spacers, as immunological recordings used to combat future infections. Of the many spacers stored in a CRISPR array, the most recent are known to be prioritized for immune defence. However, the underlying mechanism remains unclear. Here we show that the leader region upstream of CRISPR arrays in CRISPR–Cas9 systems enhances CRISPR RNA (crRNA) processing from the newest spacer, prioritizing defence against the matching invader. Using the CRISPR–Cas9 system from Streptococcus pyogenes as a model, we found that the transcribed leader interacts with the conserved repeats bordering the newest spacer. The resulting interaction promotes transactivating crRNA (tracrRNA) hybridization with the second of the two repeats, accelerating crRNA processing. Accordingly, disruption of this structure reduces the abundance of the associated crRNA and immune defence against targeted plasmids and bacteriophages. Beyond the S. pyogenes system, bioinformatics analyses revealed that leader-repeat structures appear across CRISPR–Cas9 systems. CRISPR–Cas systems thus possess an RNA-based mechanism to prioritize defence against the most recently encountered invaders.
Data availability
Next-generation sequencing data for RNA immunoprecipitation sequencing are accessible through NCBI Gene Expression Omnibus accession no. GSE158637 using the link https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE158637 (Supplementary Table 4). Source data for Figs. 1b,d,e, 2a,b, 3b–d and 4b–d and Extended Data Figs. 1b,d, 3a,c,d, 4a–d, 5a,b,d, 6a, 7a,c, 8b,c,e,f and 9a,d are included in the Source Data files. Source data are provided with this paper.
Code availability
Custom scripts analysing folding of the leader-repeat region of different CRISPR–Cas systems are available on GitHub at https://github.com/zashaweinberglab/type-II-A-leader-repeat.
References
Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712 (2007).
van der Oost, J., Westra, E. R., Jackson, R. N. & Wiedenheft, B. Unravelling the structural and mechanistic basis of CRISPR-Cas systems. Nat. Rev. Microbiol. 12, 479–492 (2014).
Jackson, S. A. et al. CRISPR-Cas: adapting to change. Science 356, eaal5056 (2017).
Bolotin, A., Quinquis, B., Sorokin, A. & Ehrlich, S. D. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561 (2005).
Mojica, F. J. M., Díez-Villaseñor, C., García-Martínez, J. & Soria, E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60, 174–182 (2005).
Sorek, R., Kunin, V. & Hugenholtz, P. CRISPR—a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol. 6, 181–186 (2008).
Arslan, Z., Hermanns, V., Wurm, R., Wagner, R. & Pul, Ü. Detection and characterization of spacer integration intermediates in type I-E CRISPR–Cas system. Nucleic Acids Res. 42, 7884–7893 (2014).
Xiao, Y., Ng, S., Nam, K. H. & Ke, A. How type II CRISPR-Cas establish immunity through Cas1-Cas2-mediated spacer integration. Nature 550, 137–141 (2017).
McGinn, J. & Marraffini, L. A. Molecular mechanisms of CRISPR-Cas spacer acquisition. Nat. Rev. Microbiol. 17, 7–12 (2019).
Brouns, S. J. J. et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964 (2008).
Charpentier, E., Richter, H., van der Oost, J. & White, M. F. Biogenesis pathways of RNA guides in archaeal and bacterial CRISPR-Cas adaptive immunity. FEMS Microbiol. Rev. 39, 428–441 (2015).
Garneau, J. E. et al. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67–71 (2010).
Meeske, A. J., Nakandakari-Higa, S. & Marraffini, L. A. Cas13-induced cellular dormancy prevents the rise of CRISPR-resistant bacteriophage. Nature 570, 241–245 (2019).
Rostøl, J. T. et al. The Card1 nuclease provides defence during type III CRISPR immunity. Nature 590, 624–629 (2021).
Elmore, J. R. et al. Programmable plasmid interference by the CRISPR-Cas system in Thermococcus kodakarensis. RNA Biol. 10, 828–840 (2013).
Carte, J. et al. The three major types of CRISPR-Cas systems function independently in CRISPR RNA biogenesis in Streptococcus thermophilus. Mol. Microbiol. 93, 98–112 (2014).
Crawley, A. B., Henriksen, E. D., Stout, E., Brandt, K. & Barrangou, R. Characterizing the activity of abundant, diverse and active CRISPR-Cas systems in lactobacilli. Sci. Rep. 8, 11544 (2018).
Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602–607 (2011).
McGinn, J. & Marraffini, L. A. CRISPR-Cas systems optimize their immune response by specifying the site of spacer integration. Mol. Cell 64, 616–623 (2016).
Martynov, A., Severinov, K. & Ispolatov, I. Optimal number of spacers in CRISPR arrays. PLoS Comput. Biol. 13, e1005891 (2017).
Rao, C., Chin, D. & Ensminger, A. W. Priming in a permissive type I-C CRISPR-Cas system reveals distinct dynamics of spacer acquisition and loss. RNA 23, 1525–1538 (2017).
Liao, C. & Beisel, C. L. The tracrRNA in CRISPR biology and technologies. Annu. Rev. Genet. 55, 161–181 (2021).
Karvelis, T. et al. crRNA and tracrRNA guide Cas9-mediated DNA interference in Streptococcus thermophilus. RNA Biol. 10, 841–851 (2013).
Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
Pickar-Oliver, A. & Gersbach, C. A. The next generation of CRISPR–Cas technologies and applications. Nat. Rev. Mol. Cell Biol. 20, 490–507 (2019).
Bikard, D. et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system. Nucleic Acids Res. 41, 7429–7437 (2013).
Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 31, 233–239 (2013).
Citorik, R. J., Mimee, M. & Lu, T. K. Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases. Nat. Biotechnol. 32, 1141–1145 (2014).
Leenay, R. T. & Beisel, C. L. Deciphering, communicating, and engineering the CRISPR PAM. J. Mol. Biol. 429, 177–191 (2017).
Dugar, G. et al. CRISPR RNA-dependent binding and cleavage of endogenous RNAs by the Campylobacter jejuni Cas9. Mol. Cell 69, 893–905 (2018).
Xue, C. et al. CRISPR interference and priming varies with individual spacer sequences. Nucleic Acids Res. 43, 10831–10847 (2015).
Collias, D. et al. A positive, growth-based PAM screen identifies noncanonical motifs recognized by the Cas9. Sci. Adv. 6, eabb4054 (2020).
Altuvia, Y. et al. In vivo cleavage rules and target repertoire of RNase III in Escherichia coli. Nucleic Acids Res. 46, 10530–10531 (2018).
Wei, Y., Chesne, M. T., Terns, R. M. & Terns, M. P. Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus. Nucleic Acids Res. 43, 1749–1758 (2015).
Pougach, K. et al. Transcription, processing and function of CRISPR cassettes in Escherichia coli. Mol. Microbiol. 77, 1367–1379 (2010).
Yosef, I., Goren, M. G. & Qimron, U. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 40, 5569–5576 (2012).
Jiao, C. et al. Noncanonical crRNAs derived from host transcripts enable multiplexable RNA detection by Cas9. Science 372, 941–948 (2021).
Jabbari, H., Wark, I. & Montemagno, C. RNA secondary structure prediction with pseudoknots: contribution of algorithm versus energy model. PLoS ONE 13, e0194583 (2018).
Wei, Y., Terns, R. M. & Terns, M. P. Cas9 function and host genome sampling in Type II-A CRISPR-Cas adaptation. Genes Dev. 29, 356–361 (2015).
Laanto, E., Hoikkala, V., Ravantti, J. & Sundberg, L.-R. Long-term genomic coevolution of host-parasite interaction in the natural environment. Nat. Commun. 8, 111 (2017).
Zhang, Y. et al. Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Mol. Cell 50, 488–503 (2013).
Dugar, G. et al. High-resolution transcriptome maps reveal strain-specific regulatory features of multiple Campylobacter jejuni isolates. PLoS Genet. 9, e1003495 (2013).
Haurwitz, R. E., Jinek, M., Wiedenheft, B., Zhou, K. & Doudna, J. A. Sequence- and structure-specific RNA processing by a CRISPR endonuclease. Science 329, 1355–1358 (2010).
Li, R. & Bowerman, B. Symmetry breaking in biology. Cold Spring Harb. Perspect. Biol. 2, a003475 (2010).
McCarty, N. S., Graham, A. E., Studená, L. & Ledesma-Amaro, R. Multiplexed CRISPR technologies for gene editing and transcriptional regulation. Nat. Commun. 11, 1281 (2020).
Al-Hashimi, H. M. & Walter, N. G. RNA dynamics: it is about time. Curr. Opin. Struct. Biol. 18, 321–329 (2008).
Watters, K. E., Strobel, E. J., Yu, A. M., Lis, J. T. & Lucks, J. B. Cotranscriptional folding of a riboswitch at nucleotide resolution. Nat. Struct. Mol. Biol. 23, 1124–1131 (2016).
Liao, C. et al. Modular one-pot assembly of CRISPR arrays enables library generation and reveals factors influencing crRNA biogenesis. Nat. Commun. 10, 2948 (2019).
Wimmer, F. & Beisel, C. L. CRISPR-Cas systems and the paradox of self-targeting spacers. Front. Microbiol. 10, 3078 (2019).
Leenay, R. T. et al. Genome editing with CRISPR-Cas9 in Lactobacillus plantarum revealed that editing outcomes can vary across strains and between methods. Biotechnol. J. 14, e1700583 (2019).
Gruber, A. R., Lorenz, R., Bernhart, S. H., Neubock, R. & Hofacker, I. L. The Vienna RNA Websuite. Nucleic Acids Res. 36, W70–W74 (2008).
Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
Sharma, C. M. et al. The primary transcriptome of the major human pathogen Helicobacter pylori. Nature 464, 250–255 (2010).
Papenfort, K. et al. σE-Dependent small RNAs of Salmonella respond to membrane stress by accelerating global omp mRNA decay. Mol. Microbiol. 62, 1674–1688 (2006).
Pernitzsch, S. R., Tirier, S. M., Beier, D. & Sharma, C. M. A variable homopolymeric G-repeat defines small RNA-mediated posttranscriptional regulation of a chemotaxis receptor in Helicobacter pylori. Proc. Natl Acad. Sci. USA 111, E501–E510 (2014).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 1 (2011).
Förstner, K. U., Vogel, J. & Sharma, C. M. READemption-a tool for the computational analysis of deep-sequencing-based transcriptome data. Bioinformatics 30, 3421–3423 (2014).
Hoffmann, S. et al. Fast mapping of short sequences with mismatches, insertions and deletions using index structures. PLoS Comput. Biol. 5, e1000502 (2009).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Lopez-Delisle, L. et al. pyGenomeTracks: reproducible plots for multivariate genomic data sets. Bioinformatics 37, 422–423 (2020).
Kent, W. J., Zweig, A. S., Barber, G., Hinrichs, A. S. & Karolchik, D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26, 2204–2207 (2010).
Padilha, V. A., Alkhnbashi, O. S., Shah, S. A., de Carvalho, A. C. P. L. F. & Backofen, R. CRISPRcasIdentifier: machine learning for accurate identification and classification of CRISPR-Cas systems. Gigascience 9, giaa062 (2020).
Padilha, V. A. et al. Casboundary: automated definition of integral Cas cassettes. Bioinformatics 37, 1352–1359 (2020).
Mitrofanov, A. et al. CRISPRidentify: identification of CRISPR arrays using machine learning approach. Nucleic Acids Res. 49, e20 (2021).
Alkhnbashi, O. S. et al. CRISPRstrand: predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci. Bioinformatics 30, i489–i496 (2014).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Ding, Y. & Lawrence, C. E. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res. 31, 7280–7301 (2003).
Altschul, S. F. & Erickson, B. W. Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage. Mol. Biol. Evol. 2, 526–538 (1985).
Acknowledgements
We thank T. Achmedov for extensive assistance with RNA preparation and RNA-blotting, F. Tippel from NanoTemper Technologies (Munich) for technical support and J. Vogel and G. Storz for critical feedback on the manuscript. This work was supported by funding through the European Research Council Consolidator Award (no. 865973 to C.L.B.), Deutsche Forschungsgemeinschaft SPP 2141 (nos. BE 6703/1-1 to C.L.B., SH 580/9-1 to C.M.S. and BA 2168/23-1 to R.B.) and the Interdisciplinary Center for Clinical Research Würzburg project Z-6.
Author information
Contributions
C.L. and C.L.B. conceived this study. C.L. and C.L.B. designed the experiments. C.L. performed plasmid cloning, in vivo assays in E. coli and L. rhamnosus, in vitro RNA transcription and purification, RNase III cleavage assays and RNA-blotting. C.L. and C.L.B. analysed the associated data. S.S. conducted immunoblotting and RNA immunoprecipitation for RIP–seq and helped analyse the data. S.L.S. conducted RNA structural probing and RNase III cleavage site mapping and helped analyse data. C.M.S. supervised the work performed by S.S. and S.L.S. A.K. designed and performed the in vitro assay for RNA–RNA binding affinity and analysed the data, with supervision by N.C. O.S.A. identified the repeat-leaders and computed mutations, with supervision by R.B. Z.W. assessed base-pairing probabilities. T.B. analysed RIP–seq data. C.L.B. and C.L. wrote the manuscript, which was read and approved by all authors. C.L.B. supervised the project.
Ethics declarations
Competing interests
C.L.B. is a cofounder and member of the scientific advisory board for Locus Biosciences and is a member of the scientific advisory board for Benson Hill. C.L.B. and C.M.S. have submitted patent applications on CRISPR technologies unrelated to this work. The other authors declare no conflicts of interest.
Peer review
Peer review information
Nature Microbiology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 The leader-repeat stem-loop from the CRISPR-Cas9 system native to Streptococcus pyogenes SF370.
Accession #: NC_002737.2. a, Array sequence and context within the CRISPR-Cas system. Repeats are in gray, spacers match the corresponding color in the cartoon, and mutations to the consensus repeat are shown in red. The underlined sequence encodes the transcribed RNA leader as determined in S. pyogenes SF370 (ref. 18). The bold and italicized sequence is the putative -10 promoter element, while the lowercase letters designate the stop codon of csn2. The red box indicates the mapped transcriptional start site in E. coli determined using 5′ RACE. b, PCR product generated by 5′ RACE. Biological duplicates are shown. M: DNA marker. c, Minimum free-energy structure of the native and mutated leader-repeat RNA predicted by NUPACK. Left: nucleotide (nt) identities. Right: base-pairing probabilities. d, In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with SpyCas9. The transcription start site was extended by 17 nts using the sequence from S. pyogenes to allow visualization of shorter RNAs. Vertical bars: unstructured regions. C: full-length (untreated) control. T1: Ladder of G’s generated by incubating the RNA with RNase T1. OH: single-nucleotide ladder generated by incubating the RNA under basic conditions. Dark and light red arrows indicate the most and second most preferred sites of RNase III cleavage, respectively. Results are representative of triplicate independent experiments. e, Corresponding secondary structure of the leader-repeat RNA. Circles indicate unstructured bases identified by in-line probing. The preferred site of RNase III cleavage lies within one nt of the equivalent site within the crRNA:tracrRNA duplex (see Fig. 1c). R1: first repeat. S1: first spacer.
Extended Data Fig. 2 Capillary scans and thermophoretic time-traces of microscale thermophoresis (MST) measurements of binding between the leader-repeat RNA and tracrRNA associated with different CRISPR-Cas9 systems.
a, Streptococcus pyogenes SF370 with an RNA spanning the leader to the first spacer. b, Streptococcus pyogenes SF370 with an RNA spanning the leader to the second spacer. c, Lactobacillus rhamnosus GG with an RNA spanning the leader to the first spacer. d, Streptococcus thermophilus DGCC 7710 (CRISPR1) with an RNA spanning the leader to the first spacer. In all cases, the tracrRNA was fluorescently labeled while unlabeled leader-repeat RNA was added at different concentrations. Capillary scans and traces of one of three independent experiments are shown. The gray boxes in the capillary scans mark 20% above and below the average peak fluorescence indicated in orange, the acceptable limit of deviations across the fluorescence scans. Blue and red boxes in the time-course traces represent the temperature jump and MST-on time, respectively. In all cases, there is no adsorption of the labeled tracrRNAs to the capillaries, and the time traces indicate no aggregation. See Figs. 1d and 4d and Extended Data Figs. 8b and 8e for the resulting binding curves. Values in a-d represent the mean and standard deviation of triplicate independent measurements.
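The ±20% acceptance window described for the capillary scans can be written as a simple check. The sketch below is illustrative only: the function name and the fluorescence values are hypothetical and are not taken from the study or from the instrument software.

```python
from statistics import mean

def capillaries_within_tolerance(peak_fluorescence, tolerance=0.20):
    """Return (average, flags), where flags[i] is True when capillary i
    lies within +/- tolerance of the average peak fluorescence."""
    avg = mean(peak_fluorescence)
    lower, upper = avg * (1 - tolerance), avg * (1 + tolerance)
    flags = [lower <= value <= upper for value in peak_fluorescence]
    return avg, flags

# Hypothetical peak fluorescence counts for five capillaries.
peaks = [500.0, 520.0, 480.0, 510.0, 490.0]
average, ok = capillaries_within_tolerance(peaks)
```

A capillary flagged False would fall outside the gray box in the corresponding scan.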
Extended Data Fig. 3 Data rejecting alternative explanations for the impact of mutating the leader region associated with SpyCas9.
a, Assessing targeting by the mutated ecrRNA guide by plasmid clearance in E. coli. The native and mutated ecrRNAs were encoded as single-spacer arrays with the native leader. There was no significant difference in plasmid clearance with (Student’s two-tailed t-test with unequal variance, P = 0.36, n = 3) or without (Student’s two-tailed t-test with unequal variance, P = 0.80, n = 3) outgrowth. b, Western blotting analysis of SpyCas9-3xFLAG levels with the native or mutated leader. Results are representative of two independent experiments. c, Plasmid clearance with SpyCas9-3xFLAG in E. coli. The SpyCas9-3xFLAG fusions were tested using an sgRNA with a guide derived from spacer 1 (S1) in the native array. The transformations were conducted without non-selective outgrowth. The results showed that the fusion did not compromise clearance activity by SpyCas9, and introducing the mutations into the CRISPR leader did not significantly affect SpyCas9 activity (Student’s two-tailed t-test with unequal variance, P = 0.168, n = 3). d, Assessing transcription of the CRISPR array with the mutated leader. The native or mutated leader through the first spacer was cloned upstream of gfp in the pUA66 plasmid. E. coli cells harboring either plasmid were then subjected to flow cytometry analysis. There was no significant difference (Student’s two-tailed t-test with unequal variance, P = 0.103, n = 3) in the background-subtracted GFP fluorescence between the constructs. Values represent the mean and standard deviation of triplicate independent measurements starting from separate colonies. Values in a, c and d represent the geometric mean and standard deviation from independent experiments starting from three separate colonies. n.s.: not significant (P > 0.05). Statistical tests were performed using a two-tailed Student’s t-test with unequal variance, n = 3.
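The unequal-variance t-test used throughout these legends is commonly called Welch's t-test. As a minimal sketch of the statistic, the standard-library code below computes the t value and the Welch–Satterthwaite degrees of freedom for two small samples. The function name and the example numbers are hypothetical and are not data from the study. The two-tailed P value would then be read from a Student's t distribution with the returned degrees of freedom, for instance with `scipy.stats.ttest_ind(a, b, equal_var=False)` if SciPy is available.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical triplicate measurements (n = 3 per group).
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

With n = 3 per group, the degrees of freedom range from 2 to 4 depending on how similar the two variances are, which is why such tests have limited power to detect small differences.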
Extended Data Fig. 4 RIP-seq analysis using SpyCas9 combined with the native or mutated leader in E. coli.
The left and right sides of the figure represent the results from two independent experiments. RIP-seq was performed using E. coli BW25113 harboring the SpyCas9/tracrRNA/CRISPR or SpyCas9-3xFLAG/tracrRNA/CRISPR plasmid. a, Western blotting confirmed enrichment of SpyCas9-FLAG. Co-immunoprecipitated RNAs were isolated and subjected to next-generation sequencing. b, Distribution of RNA classes based on total mapped reads. c, Mapped reads for the CRISPR locus with the native or mutated leader. The scale above the plot indicates the location in the plasmid. Positional coverage for total aligned reads and reads aligning with a reference length ≤ 50 nts was normalized based on the total number of aligned reads in each sample. The reduction in reads upon applying the size filter indicates an excess of pre-crRNA and immature crRNAs, which parallels Northern blotting analysis for the ecrRNA and individual crRNAs (see Fig. 3b and Extended Data Fig. 5a-b). We also note that the reads begin ~12 nts upstream of the transcriptional start site mapped by 5′ RACE (see Extended Data Fig. 1), suggesting that a slightly upstream transcriptional start site or processing site from a longer transcript also exists. d, Direct comparison of mapped reads with the native or mutated leader. The plot corresponds to that shown in Fig. 3a. The read score for the first crRNA downstream of the native leader extends above the vertical limit of 1,500. The relative read scores for the ecrRNA and each crRNA are indicated below the plots. Values below one indicate a reduction in (e)crRNA abundance with the introduced mutations. See Supplementary Table 1 for statistics about the RIP-seq analyses.
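The normalization and comparison described in c and d can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the per-million scale factor, the choice to sum coverage over a window and all function names and numbers are hypothetical, and the legend does not define exactly how the read score was computed.

```python
def normalize_coverage(coverage, total_aligned_reads, scale=1_000_000):
    """Scale per-position read counts by the sample's total aligned reads.
    The per-million scale factor is an assumption for illustration."""
    return [count * scale / total_aligned_reads for count in coverage]

def relative_read_score(native, mutated, start, end):
    """Ratio (mutated / native) of summed normalized coverage over
    positions [start, end); values below one indicate a reduction."""
    return sum(mutated[start:end]) / sum(native[start:end])

# Hypothetical per-position counts across one crRNA in each sample.
native_norm = normalize_coverage([10, 20, 30, 40], total_aligned_reads=2_000_000)
mutated_norm = normalize_coverage([2, 4, 6, 8], total_aligned_reads=1_000_000)
score = relative_read_score(native_norm, mutated_norm, 0, 4)
```

Dividing by the total aligned reads before taking the ratio keeps differences in sequencing depth between the native and mutated samples from being mistaken for differences in crRNA abundance.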
Extended Data Fig. 5 Impact of mutating the leader-repeat stem-loop from the CRISPR-Cas9 system from Streptococcus pyogenes SF370.
a, Northern blotting analysis of the produced crRNAs with the native or mutated RNA leader. The system’s CRISPR array was expressed in E. coli with SpyCas9 and the tracrRNA, and the ecrRNA (probe #1), crRNA1 (probe #2), and crRNA5 (probe #3) were detected. The ecrRNA and mecrRNA were detected using an equimolar mixture of both probes. b, Northern blotting analysis of the produced crRNAs with different mutant backgrounds. See a for details. Experiments were conducted with the native or mutated leader or with the tracrRNA, cas9, or rnc deleted. The results for probe #1 are those shown in Fig. 3b. All probing was performed with the same blot. The indicated RNA spanning the leader through the processed crRNA1 corresponds to that observed by RIP-seq (see Extended Data Fig. 4c) and is supported by the band’s absence when probing for crRNA2. Results in a and b are representative of duplicate independent experiments. c, Predicted secondary structures of three different restoring mutant sets. Disruptive mutations were made to the mutated leader depicted in Fig. 1c. In each case, a stable stem was created by making restoring mutations, although the upper structure deviates from that found in the native leader-repeat. d, Impact of the mutations on plasmid clearance by SpyCas9 in E. coli. The clearance assays were conducted with or without a non-selective outgrowth, where the non-selective outgrowth improves the extent of plasmid clearance. Values represent the geometric mean and standard deviation from independent experiments starting from three separate colonies.
Extended Data Fig. 6 CRISPR arrays from other CRISPR-Cas9 systems within the II-A subtype that appear to possess a leader-repeat stem-loop.
a, Array sequence and context within the CRISPR-Cas system native to Lactobacillus rhamnosus GG. Accession #: GCF_000026505.1. The sequence begins within csn2 (annotated as LGG_02201) and ends after the terminal repeat. See Extended Data Fig. 1a for details. The underlined sequence encodes the transcribed RNA leader as determined by 5′ RACE in L. rhamnosus in this work. Lowercase letters designate the stop codon of csn2. The promoter(s) driving expression of the cas genes has not been mapped. b, PCR product as part of 5′ RACE using total RNA from L. rhamnosus GG. See Extended Data Fig. 1b for details. Only one major product was visible in both replicates. Biological duplicates are shown. M: DNA marker. Results from duplicate independent experiments are shown. c, Secondary structure of the native and mutated leader-repeat RNA predicted by NUPACK. See Extended Data Fig. 1c for details. The 5′ of the leader was truncated to match the sequence used in the structural probing and RNase III cleavage assays (see Extended Data Fig. 7b). Mutations were selected to disrupt the original secondary structure of the native leader-repeat RNA. d, Array sequence and context within the CRISPR-Cas system native to Streptococcus thermophilus DGCC 7710 (CRISPR1 locus). Accession #: CP025216.1. The sequence begins downstream of csn2 and ends after the terminal repeat. See Extended Data Fig. 1a for details. The underlined sequence encodes the transcribed RNA leader as determined previously by RNA sequencing analysis of transcripts (ref. 16). The promoter(s) driving expression of the cas genes has not been mapped. e, Secondary structure of the native and mutated leader-repeat RNA predicted by NUPACK. See Extended Data Fig. 1c for details.
Extended Data Fig. 7 In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with LrhCas9 and Sth1Cas9.
a, In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with LrhCas9. The probed RNA was 5′ radiolabeled and resolved by denaturing PAGE. The 5′ end was truncated to focus on the predicted secondary structure involving the repeat. Vertical bars on the right indicate unstructured regions. C: full-length control. T1: Ladder of G’s generated by incubating the RNA with RNase T1. OH: single-nucleotide ladder generated by incubating the RNA under basic conditions. RNase III: the RNA was incubated with the indicated units of E. coli RNase III (0, 0.0016, 0.008, 0.04, 0.2, 1) for 5 min at 37 °C. Dark and light red arrows indicate the most preferred and second most preferred sites of RNase III cleavage, respectively. Results are representative of triplicate independent experiments. b, Corresponding secondary structure of the leader-repeat RNA. Circles indicate unstructured bases identified by in-line probing. The preferred site of RNase III cleavage lies below the equivalent site within the crRNA:tracrRNA duplex (see Extended Data Fig. 8a). R1: first repeat. S1: first spacer. c, In vitro determination of the secondary structure and RNase III cleavage sites for the leader-repeat RNA associated with Sth1Cas9. See a for details. The 5′ end was truncated to focus on the predicted secondary structure involving the repeat. Results are representative of triplicate independent experiments. d, Corresponding secondary structure of the leader-repeat RNA. Circles indicate unstructured bases identified by in-line probing. The preferred site of RNase III cleavage lies above the equivalent site within the crRNA:tracrRNA duplex (see Extended Data Fig. 8d). R1: first repeat. S1: first spacer.
Extended Data Fig. 8 II-A CRISPR-Cas9 systems form distinct leader-repeat stem-loops.
a, The CRISPR-Cas system from L. rhamnosus GG and the secondary structure of the leader-repeat RNA. The structure was predicted by NUPACK and confirmed in vitro (see Extended Data Fig. 6c and 7a-b). Mutations indicated in red were made to disrupt stems formed between the leader RNA and the first repeat. b, Measured equilibrium binding between the tracrRNA and native or mutated leader-repeat RNA. See Extended Data Fig. 2c for supporting data. Values represent the mean and standard deviation of triplicate independent measurements. c, RNase III cleavage of the native and mutated leader-repeat RNA in vitro. See Extended Data Fig. 7a-b for the mapped secondary structure and RNase III cleavage sites. Results are representative of duplicate independent experiments. d, The CRISPR-Cas system associated with the CRISPR1 locus of S. thermophilus and the secondary structure of the leader-repeat RNA. The structure was predicted by NUPACK and confirmed in vitro (see Extended Data Fig. 6e and 7c-d). Mutations indicated in red were made to disrupt the stem formed between the leader RNA and the first repeat. The three mutations in the loop were introduced to disrupt alternative structures formed by the other mutations. Pairing between the repeat and the tracrRNA is provided as a basis of comparison. Red arrows indicate the previously mapped site cleaved by RNase III (ref. 16). R1: first repeat. R2: second repeat. S1: first spacer. S2: second spacer. e, Measured equilibrium binding affinity between the leader-repeat and the tracrRNA under in vitro conditions. See Extended Data Fig. 2d for supporting data. Values represent the mean and standard deviation of triplicate independent measurements. f, RNase III cleavage of the native and mutated leader-repeat RNA in vitro. Results are representative of duplicate independent experiments.
Extended Data Fig. 9 RIP-seq analysis of RNAs bound to Cas9 from Lactobacillus rhamnosus GG.
LrhCas9 with or without a 3xFLAG affinity tag was expressed from a plasmid, and the lysate was subjected to RIP-seq analysis. a, Western blotting analysis of samples for RIP-seq using LrhCas9 in Lactobacillus rhamnosus GG. Western blotting confirmed enrichment of LrhCas9-FLAG. Co-immunoprecipitated RNAs were isolated and subjected to next-generation sequencing. Results from duplicate independent experiments are shown on the left and right. b, Distribution of RNA classes based on total mapped reads. hkRNAs: house-keeping RNAs. ncRNAs: non-coding RNAs. c, Mapped reads for the CRISPR locus with the genome of L. rhamnosus GG (NC_013198.1). The scale above the plot indicates the location in the genome. The CRISPR locus is encoded on the negative strand. Positional coverage for total reads and reads aligning with a reference length ≤ 50 nts was normalized based on the total number of aligned reads in each sample. The maximum read length for the NGS run was 76 nts, explaining the drop in unfiltered read counts shortly downstream of the transcriptional start site. See Supplementary Table 1 for statistics from the RIP-seq analyses. Results in b and c are representative of duplicate independent experiments. d, Plasmid clearance by the CRISPR-Cas9 system in L. rhamnosus GG. The corresponding target of the ecrRNA or crRNA1 was encoded within the transformed plasmid. L.O.D.: limit of detection. There was no detectable ecrRNA-directed plasmid clearance. Values represent the geometric mean and standard deviation from three independent experiments starting from separate colonies. **: P < 0.01. n.s.: P > 0.05. Statistical tests were performed using a two-tailed Student’s t-test with unequal variance, n = 3.
Extended Data Fig. 10 The CRISPR array from the CRISPR-Cas9 system native to Alkalihalobacillus pseudalcaliphilus DSM 8725.
The system falls within the II-C subtype. Accession #: LFJO01000002.1. a, Array sequence and context within the CRISPR-Cas9 system. The sequence begins immediately downstream of the AB990_04425 gene unrelated to the CRISPR-Cas9 system and ends after the last repeat of the CRISPR2 array. Repeats are in gray, spacers match the corresponding color in the cartoon, and mutations to the consensus repeat are shown in red. The underlined sequence denotes the upstream region used for the folding predictions for the CRISPR1 array. The transcriptional start sites for both arrays are unknown, although there is a clear Rho-independent terminator downstream of each array. The promoters driving expression of the cas genes, the CRISPR arrays, or the tracrRNA have not been mapped. The predicted directions of transcription for the tracrRNA and CRISPR array are indicated with black arrows. b, tracrRNA sequence and context within the CRISPR-Cas9 system. The sequence begins ~2.7 kb upstream of the AB990_04405 gene unrelated to the CRISPR-Cas9 system and ends immediately upstream of cas9. The sequence in orange corresponds to the putative tracrRNA used in the folding predictions. c, Predicted stem-loop between the first repeat and upstream region for the CRISPR1 array. The predicted stem-loop is part of the minimum free-energy structure and reflects base-pairing probabilities principally between 90% and 100%. Pairing between the second repeat and the tracrRNA is provided as a basis of comparison. The tracrRNA ends with a canonical Rho-independent terminator.
Supplementary information
Source data
Numerical Source Data Fig. 1
Numerical data for Fig. 1b,d.
Numerical Source Data Fig. 2
Numerical data for Fig. 2a.
Numerical Source Data Fig. 3
Numerical data for Fig. 3c,d.
Numerical Source Data Fig. 4
Numerical data for Fig. 4b–d.
Numerical Source Data Extended Data Fig. 3
Numerical data for Extended Data Fig. 3a,c,d.
Numerical Source Data Extended Data Fig. 5
Numerical data for Extended Data Fig. 5d.
Numerical Source Data Extended Data Fig. 8
Numerical data for Extended Data Fig. 8b,e.
Numerical Source Data Extended Data Fig. 9
Numerical data for Extended Data Fig. 9d.
Source Data Fig. 1
Representative gating for flow cytometry and uncropped gel image for Fig. 1.
Source Data Fig. 2
Six uncropped plate images for Fig. 2b.
Source Data Fig. 3
Two uncropped gel images for Fig. 3b.
Source Data Extended Data Fig. 1
Two uncropped gel images for Extended Data Fig. 1b,d.
Source Data Extended Data Fig. 3
Four uncropped gel images for Extended Data Fig. 3b.
Source Data Extended Data Fig. 4
Sixteen uncropped gel images for Extended Data Fig. 4a.
Source Data Extended Data Fig. 5
Four uncropped gel images for Extended Data Fig. 5a and one uncropped gel image for Extended Data Fig. 5b.
Source Data Extended Data Fig. 6
Uncropped gel image for Extended Data Fig. 6b.
Source Data Extended Data Fig. 7
Uncropped gel images for Extended Data Fig. 7a,c.
Source Data Extended Data Fig. 8
Uncropped gel images for Extended Data Fig. 8c,f.
Source Data Extended Data Fig. 9
Eight uncropped gel images for Extended Data Fig. 9a.
About this article
Cite this article
Liao, C., Sharma, S., Svensson, S.L. et al. Spacer prioritization in CRISPR–Cas9 immunity is enabled by the leader RNA. Nat Microbiol 7, 530–541 (2022). https://doi.org/10.1038/s41564-022-01074-3
DOI: https://doi.org/10.1038/s41564-022-01074-3