CRISPR adaptation biases explain preference for acquisition of foreign DNA

Journal name: Nature
Volume: 520
Pages: 505–510
DOI: doi:10.1038/nature14302

Abstract

CRISPR–Cas (clustered, regularly interspaced short palindromic repeats coupled with CRISPR-associated proteins) is a bacterial immunity system that protects against invading phages or plasmids. In the process of CRISPR adaptation, short pieces of DNA (‘spacers’) are acquired from foreign elements and integrated into the CRISPR array. So far, it has remained a mystery how spacers are preferentially acquired from the foreign DNA while the self chromosome is avoided. Here we show that spacer acquisition is replication-dependent, and that DNA breaks formed at stalled replication forks promote spacer acquisition. Chromosomal hotspots of spacer acquisition were confined by Chi sites, which are sequence octamers highly enriched on the bacterial chromosome, suggesting that these sites limit spacer acquisition from self DNA. We further show that the avoidance of self is mediated by the RecBCD double-stranded DNA break repair complex. Our results suggest that, in Escherichia coli, acquisition of new spacers largely depends on RecBCD-mediated processing of double-stranded DNA breaks occurring primarily at replication forks, and that the preference for foreign DNA is achieved through the higher density of Chi sites on the self chromosome, in combination with the higher number of forks on the foreign DNA. This model explains the strong preference to acquire spacers both from high copy plasmids and from phages.
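
The enrichment of Chi octamers can be made concrete with a small calculation. The sketch below is illustrative only and is not the authors' analysis code. It assumes the canonical E. coli Chi sequence 5'-GCTGGTGG-3' (not spelled out in this text) and a simple random-DNA model with equal base frequencies, under which a given octamer is expected once every 4^8 = 65,536 positions on one strand. The function names are hypothetical. Under these assumptions, the spacing of 4.6 kb quoted in the Figure 5 legend corresponds to roughly a 14-fold enrichment.

```python
# Illustrative sketch; helper names are hypothetical, not from the paper.
# Assumption: canonical E. coli Chi octamer (not stated in the text above).
CHI = "GCTGGTGG"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def count_sites(seq, motif=CHI):
    """Count motif occurrences on the forward and reverse strands.

    Returns a (forward, reverse) tuple. Reverse-strand sites are found by
    searching the forward sequence for the motif's reverse complement.
    """
    seq = seq.upper()
    rc_motif = reverse_complement(motif)
    last_start = len(seq) - len(motif) + 1
    forward = sum(1 for i in range(last_start) if seq.startswith(motif, i))
    reverse = sum(1 for i in range(last_start) if seq.startswith(rc_motif, i))
    return forward, reverse


def expected_spacing_random(motif_length):
    """Expected spacing (bp) of a fixed motif on one strand of random DNA,
    assuming all four bases are equally likely."""
    return 4 ** motif_length


def fold_enrichment(observed_spacing_bp, motif_length=len(CHI)):
    """Ratio of the random-DNA spacing to the observed spacing."""
    return expected_spacing_random(motif_length) / observed_spacing_bp


if __name__ == "__main__":
    print(count_sites("GCTGGTGGAAACCACCAGC"))
    print(round(fold_enrichment(4600), 1))
```

The uniform-base model is a deliberate simplification; a composition-aware background would shift the expected spacing somewhat.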

At a glance

Figures

  1. Figure 1: Chromosome-scale hotspots for spacer acquisition.

    a, Distribution of protospacers across the E. coli BL21-AI genome. Protospacers were deduced from aligning new spacers, acquired into the CRISPR I array after 16 h growth with no arabinose, to the bacterial genome. Only unique protospacers are presented, to avoid possible biases stemming from PCR amplification of the CRISPR array. Pooled protospacers from two replicates are presented. b, Protospacer density across a circular representation of the E. coli genome, normalized to the DNA content of the culture. Dark brown, normalized protospacer numbers; orange, PAM density. c, Protospacer distribution at the Ter region. Protospacer density is shown in 1-kb windows. d, Protospacer density in an E. coli BL21-AI in which the native 23 base pair (bp)-long TerB site was engineered into the pheA locus.

  2. Figure 2: Dependence of spacer acquisition on replication.

    a, Spacer acquisition rates in antibiotic-treated E. coli BL21-AI cells. Cells induced to express Cas1–Cas2 were grown for 16 h, with addition of the replication inhibitor nalidixic acid (Nal) or the transcription inhibitor rifampicin (Rif). b, Spacer acquisition rates of the K-12ΔcasCdnaC2 strain and the isogenic K-12ΔcasC strain during overnight Cas1–Cas2 induction. c, Spacer acquisition patterns measured after transfer of K-12ΔcasCdnaC2 cells from 39 °C to 30 °C, during induction of Cas1–Cas2. For all panels, average and error margins for two biological replicates are shown.

  3. Figure 3: Chi sites define boundaries of protospacer hotspots.

    a–d, Protospacer hotspot peaks. Each panel shows a 100 kb window around a major hotspot for spacer acquisition. Short blue and red ticks mark positive- and negative-strand Chi sites, respectively. Green lines mark a replication fork stalling site (TerA, TerC) or putative stalling site (CRISPR array). Dashed lines mark the first properly oriented Chi site upstream relative to the fork stalling site. a, The CRISPR region in E. coli BL21-AI. b, The CRISPR region in E. coli K-12. c, The TerC region and d, the TerA region in E. coli BL21-AI. In c, the Chi site drawn at ~2,260 kb represents a cluster of three consecutive Chi sites found in the same 1 kb window.

  4. Figure 4: Involvement of the dsDNA break repair machinery in defining spacer acquisition patterns.

    a, The overall number of protospacers around all Chi sites in E. coli BL21-AI that are not included in the CRISPR region (950,000–1,050,000) or the Ter region (2–2.5 Mb) is shown in windows of 0.5 kb. WT, wild-type. b, Protospacer hotspot peak resulting from a dsDNA break formed by the homing endonuclease I-SceI. c, The overall number of protospacers around all Chi sites that are not included in the CRISPR or the Ter regions in a BL21-AIΔrecB strain. d, The protospacer hotspot at the CRISPR region in the BL21-AIΔrecB strain is not confined by a Chi site (compare with the same hotspot in the wild-type strain, Fig. 3a). e, Adaptation levels in wild-type BL21-AI and BL21-AIΔrecB, ΔrecC or ΔrecD strains after overnight growth without arabinose induction of Cas1–Cas2. f, Percentage of new spacers derived from the self chromosome in the experiment described in e. g, Percentage of new spacers derived from the self chromosome in the presence of a plasmid containing a cluster of four Chi sites (pChi) compared with an identical plasmid lacking Chi sites (pCtrl-Chi). For e–g, average and error margins for two biological replicates are shown.

  5. Figure 5: A model explaining the preference for foreign DNA in spacer acquisition.

    a, RecBCD localizes to a dsDNA break (DSB) and unwinds/degrades the DNA until reaching the nearest properly oriented Chi site. The RecBCD activity generates significant amounts of DNA ‘debris’, including short and long ssDNA fragments and degraded dsDNA, all of which may serve as substrates for spacer acquisition by Cas1–Cas2. b, The high density of Chi sites on the chromosome reduces spacer acquisition from self DNA. On average, the 8-bp-long Chi sites are found every 4.6 kb on the E. coli chromosome, 14 times more often than on random DNA. When a dsDNA break occurs on the chromosome, RecBCD DNA degradation activity will quickly be moderated by a nearby Chi site, but a similar dsDNA break on a foreign DNA will lead to much more extensive DNA processing, providing more substrate for spacer acquisition. c, Preference for spacer acquisition from high-copy plasmids. In a replicating cell, most replication forks (blue circles) localize to the multiple copies of the plasmid. Since most dsDNA breaks occur during replication (refs 23, 26) at stalled replication forks (refs 24, 25), plasmid DNA would become more amenable for spacer acquisition. d, Most phages inject linear DNA into the infected cell. When such linear DNA is not protected, RecBCD will quickly degrade it, providing an intrinsic preference for spacer acquisition from phage DNA.

  6. Extended Data Fig. 1: Graphic overview of the procedure for characterizing the frequency and sequence of newly acquired spacers.

    DNA from cultures of either E. coli K-12 (left) or E. coli BL21-AI (right) strains expressing Cas1–Cas2 from two different plasmids was used as the template for PCR. Round 1 was used to determine the frequency of spacer acquisition by comparing occurrences of expanded arrays to wild-type (WT) arrays. Round 2 amplified only the expanded arrays and, followed by deep sequencing, was used to determine the sequence, location and source of newly acquired spacers.

  7. Extended Data Fig. 2: PAMs and DNA content along the E. coli BL21-AI genome.

    a, Distribution of PAM (AAG) sequences. Each data point represents the number of PAMs in a window of 10 kb. b, DNA content of a culture growing in log phase. Genomic DNA was extracted from E. coli BL21-AI cells carrying the pCas plasmid, grown at log phase, and was sequenced using the Illumina technology. The resulting reads were mapped to the sequenced E. coli BL21(DE3) genome (GenBank accession number NC_012947). Areas where few or no reads map to the genome represent regions that are present in the reference BL21(DE3) genome but are missing from the genome of the sequenced strain (BL21-AI).

  8. Extended Data Fig. 3: Distribution of newly acquired spacers on the genome during synchronized replication.

    E. coli K-12ΔcasCdnaC2 cells were transferred from 39 °C (replication-restrictive temperature) to 30 °C (replication-permissive temperature). Cas1–Cas2 were induced in these cells 30 min before the transfer to 30 °C and during growth at 30 °C. Newly acquired spacers were sequenced at the given time points: a, 20 min; b, 40 min; c, 60 min after replication initiation. The positions of the newly acquired spacers are shown in windows of 100 kb, together with their fraction of the total new spacers in the sample.

  9. Extended Data Fig. 4: A model explaining the preference for spacer acquisition near TerC compared with TerA in E. coli BL21-AI.

    The DNA manipulation at the CRISPR region forms a replication fork stalling site, and leads to extensive spacer acquisition upstream of the CRISPR. While the clockwise fork is stalled at the CRISPR, the anticlockwise fork reaches the Ter region and is stalled at the respective Ter site, TerC, leading to extensive spacer acquisition upstream of TerC. Another factor that can contribute to the observed TerC/TerA bias may be that the clockwise replichore in E. coli (oriC to TerA) is longer than the anticlockwise one (oriC to TerC), leading the forks to stall at TerC more often than at TerA.

  10. Extended Data Fig. 5: The protein product of T7 gene 5.9 inhibits spacer acquisition activity.

    E. coli BL21-AI strains harbouring pBAD-Cas1+2 and pBAD33-gp5.9 (lane 1) or pBAD33 vector control (lane 2) were grown overnight in the presence of inducers (0.4% l-arabinose). Gel shows PCR products amplified from the indicated cultures using primers annealing to the leader and to the fifth spacer of the CRISPR array. Results represent one of three independent experiments.

  11. Extended Data Fig. 6: Distribution of protospacers across the plasmids.

    a, Distribution across pCtrl-Chi; b, distribution across pChi plasmids. Circular representation of the 4.7 kb plasmid is presented, with the inserted 4-Chi cluster present at the top of the circle. Black bars indicate the number of PAM-derived spacers sequenced from each position; green bars represent non-PAM spacers. Scale bar, 100,000 spacers. Pooled protospacers from two replicates are presented for each panel.
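
Several of the legends above describe counts in fixed genomic windows (for example, PAMs per 10 kb, protospacers per 1 kb) and protospacer density normalized to DNA content. The following is a minimal sketch of that kind of binning, not the authors' pipeline. The function names are hypothetical, the PAM search is restricted to the forward strand for simplicity, and the normalization (dividing by DNA content relative to its mean) is an assumption, since the exact procedure is not described in this text.

```python
# Illustrative sketch; function names and the normalization are assumptions.

def find_motif_positions(seq, motif):
    """Return 0-based start positions of motif on the forward strand."""
    seq = seq.upper()
    motif = motif.upper()
    last_start = len(seq) - len(motif) + 1
    return [i for i in range(last_start) if seq.startswith(motif, i)]


def windowed_counts(positions, genome_length, window):
    """Count positions falling in consecutive windows of `window` bp.

    The last window may be shorter than `window` if the genome length is
    not an exact multiple of the window size.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n_windows = -(-genome_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} outside genome")
        counts[pos // window] += 1
    return counts


def normalize_by_dna_content(counts, dna_content):
    """Divide each window's count by its DNA content relative to the mean.

    `dna_content` holds one value per window (for example, mean read
    coverage); windows with zero content yield None.
    """
    if len(counts) != len(dna_content):
        raise ValueError("counts and dna_content must have equal length")
    mean_content = sum(dna_content) / len(dna_content)
    normalized = []
    for count, content in zip(counts, dna_content):
        if content == 0:
            normalized.append(None)
        else:
            normalized.append(count / (content / mean_content))
    return normalized


if __name__ == "__main__":
    toy = "AAGTTAAGCCAAG"
    pam_positions = find_motif_positions(toy, "AAG")
    print(windowed_counts(pam_positions, len(toy), 5))
```

On real data the window size would be 1,000 or 10,000 bp and the positions would come from aligning spacers to the genome; the toy sequence here only demonstrates the binning.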

Tables

  1. Extended Data Table 1: Spacer acquisition in normal and perturbed conditions
  2. Extended Data Table 2: Replication-dependent spacer acquisition
  3. Extended Data Table 3: Involvement of the DNA repair machinery in spacer acquisition

References

  1. Terns, M. P. & Terns, R. M. CRISPR-based adaptive immune systems. Curr. Opin. Microbiol. 14, 321–327 (2011)
  2. Westra, E. R. et al. The CRISPRs, they are a-changin’: how prokaryotes generate adaptive immunity. Annu. Rev. Genet. 46, 311–339 (2012)
  3. Wiedenheft, B., Sternberg, S. H. & Doudna, J. A. RNA-guided genetic silencing systems in bacteria and archaea. Nature 482, 331–338 (2012)
  4. Koonin, E. V. & Makarova, K. S. CRISPR-Cas: evolution of an RNA-based adaptive immunity system in prokaryotes. RNA Biol. 10, 679–686 (2013)
  5. Sorek, R., Lawrence, C. M. & Wiedenheft, B. CRISPR-mediated adaptive immune systems in Bacteria and Archaea. Annu. Rev. Biochem. 82, 237–266 (2013)
  6. Barrangou, R. & Marraffini, L. A. CRISPR-Cas systems: prokaryotes upgrade to adaptive immunity. Mol. Cell 54, 234–244 (2014)
  7. Yosef, I., Goren, M. G. & Qimron, U. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 40, 5569–5576 (2012)
  8. Nunez, J. K. et al. Cas1–Cas2 complex formation mediates spacer acquisition during CRISPR–Cas adaptive immunity. Nature Struct. Mol. Biol. 21, 528–534 (2014)
  9. Swarts, D. C., Mosterd, C., van Passel, M. W. & Brouns, S. J. CRISPR interference directs strand specific spacer acquisition. PLoS ONE 7, e35888 (2012)
  10. Datsenko, K. A. et al. Molecular memory of prior infections activates the CRISPR/Cas adaptive bacterial immunity system. Nat. Commun. 3, 945 (2012)
  11. Diez-Villasenor, C., Guzman, N. M., Almendros, C., Garcia-Martinez, J. & Mojica, F. J. CRISPR-spacer integration reporter plasmids reveal distinct genuine acquisition specificities among CRISPR-Cas I-E variants of Escherichia coli. RNA Biol. 10, 792–802 (2013)
  12. Yosef, I. et al. DNA motifs determining the efficiency of adaptation into the Escherichia coli CRISPR array. Proc. Natl Acad. Sci. USA 110, 14396–14401 (2013)
  13. Arslan, Z., Hermanns, V., Wurm, R., Wagner, R. & Pul, U. Detection and characterization of spacer integration intermediates in type I-E CRISPR-Cas system. Nucleic Acids Res. 42, 7884–7893 (2014)
  14. Savitskaya, E., Semenova, E., Dedkov, V., Metlitskaya, A. & Severinov, K. High-throughput analysis of type I-E CRISPR/Cas spacer acquisition in E. coli. RNA Biol. 10, 716–725 (2013)
  15. Fineran, P. C. et al. Degenerate target sites mediate rapid primed CRISPR adaptation. Proc. Natl Acad. Sci. USA 111, E1629–E1638 (2014)
  16. Skovgaard, O., Bak, M., Lobner-Olesen, A. & Tommerup, N. Genome-wide detection of chromosomal rearrangements, indels, and mutations in circular chromosomes by short read sequencing. Genome Res. 21, 1388–1393 (2011)
  17. Neylon, C., Kralicek, A. V., Hill, T. M. & Dixon, N. E. Replication termination in Escherichia coli: structure and antihelicase activity of the Tus-Ter complex. Microbiol. Mol. Biol. Rev. 69, 501–526 (2005)
  18. Waldminghaus, T., Weigel, C. & Skarstad, K. Replication fork movement and methylation govern SeqA binding to the Escherichia coli chromosome. Nucleic Acids Res. 40, 5465–5476 (2012)
  19. Breier, A. M., Weier, H. U. & Cozzarelli, N. R. Independence of replisomes in Escherichia coli chromosomal replication. Proc. Natl Acad. Sci. USA 102, 3942–3947 (2005)
  20. del Solar, G. et al. Replication and control of circular bacterial plasmids. Microbiol. Mol. Biol. Rev. 62, 434–464 (1998)
  21. Erdmann, S., Le Moine Bauer, S. & Garrett, R. A. Inter-viral conflicts that exploit host CRISPR immune systems of Sulfolobus. Mol. Microbiol. 91, 900–917 (2014)
  22. Smith, G. R. How RecBCD enzyme and Chi promote DNA break repair and recombination: a molecular biologist’s view. Microbiol. Mol. Biol. Rev. 76, 217–228 (2012)
  23. Dillingham, M. S. & Kowalczykowski, S. C. RecBCD enzyme and the repair of double-stranded DNA breaks. Microbiol. Mol. Biol. Rev. 72, 642–671 (2008)
  24. Kuzminov, A. Single-strand interruptions in replicating chromosomes cause double-strand breaks. Proc. Natl Acad. Sci. USA 98, 8241–8246 (2001)
  25. Michel, B. et al. Rescue of arrested replication forks by homologous recombination. Proc. Natl Acad. Sci. USA 98, 8181–8188 (2001)
  26. Shee, C. et al. Engineered proteins detect spontaneous DNA breakage in human and bacterial cells. eLife 2, e01222 (2013)
  27. Babu, M. et al. A dual function of the CRISPR-Cas system in bacterial antivirus immunity and DNA repair. Mol. Microbiol. 79, 484–502 (2011)
  28. Lin, L. Study of Bacteriophage T7 Gene 5.9 and Gene 5.5. PhD thesis, State Univ. New York. (1992)
  29. Brouns, S. J. et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964 (2008)
  30. Guzman, L. M. et al. Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J. Bacteriol. 177, 4121–4130 (1995)
  31. Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 1–11 (2006)
  32. Sharan, S. K. et al. Recombineering: a homologous recombination-based method of genetic engineering. Nature Protocols 4, 206–223 (2009)
  33. Datta, S., Costantino, N. & Court, D. L. A set of recombineering plasmids for gram-negative bacteria. Gene 379, 109–115 (2006)
  34. Waldminghaus, T., Weigel, C. & Skarstad, K. Replication fork movement and methylation govern SeqA binding to the Escherichia coli chromosome. Nucleic Acids Res. 40, 5465–5476 (2012)
  35. Bidnenko, V., Ehrlich, S. D. & Michel, B. Replication fork collapse at replication terminator sequences. EMBO J. 21, 3898–3907 (2002)
  36. Svenningsen, S. L. et al. On the role of Cro in lambda prophage induction. Proc. Natl Acad. Sci. USA 102, 4465–4469 (2005)
  37. Yu, D. et al. An efficient recombination system for chromosome engineering in Escherichia coli. Proc. Natl Acad. Sci. USA 97, 5978–5983 (2000)
  38. Tischer, B. K. et al. Two-step red-mediated recombination for versatile high-efficiency markerless DNA manipulation in Escherichia coli. Biotechniques 40, 191–197 (2006)

Author information

  1. These authors contributed equally to this work.

    • Asaf Levy &
    • Moran G. Goren
  2. These authors jointly supervised this work.

    • Udi Qimron &
    • Rotem Sorek

Affiliations

  1. Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel

    • Asaf Levy,
    • Gil Amitai &
    • Rotem Sorek
  2. Department of Clinical Microbiology and Immunology, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel

    • Moran G. Goren,
    • Ido Yosef,
    • Oren Auster,
    • Miriam Manor,
    • Rotem Edgar &
    • Udi Qimron

Contributions

M.G., U.Q., A.L. and R.S. conceived and designed the research studies; M.G., I.Y., O.A., M.M., G.A. and R.E. performed the experiments; A.L., M.G., U.Q. and R.S. analysed data; A.L., M.G., U.Q. and R.S. wrote the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Accession codes

RNA sequencing data are available in the National Center for Biotechnology Information Sequence Read Archive database under accession numbers SRX862155–SRX862158 in study SRP053013. Raw data of spacer sequences are accessible at http://www.weizmann.ac.il/molgen/Sorek/files/CRISPR_adaptation_2015/crispr_adaptation_2015_data.html.

Supplementary information

Excel files

  1. Supplementary Table 1 (657 KB)

    This file contains the data for Supplementary Table 1.

  2. Supplementary Table 2 (16 KB)

    This table contains the bacterial strains, plasmids and oligonucleotides used in this study.
