Abstract

Despite widespread use of CRISPR, comprehensive data on the frequency and impact of Cas9-mediated off-targets in modified rodents are limited. Here we present deep-sequencing data from 81 genome-editing projects on mouse and rat genomes at 1,423 predicted off-target sites, 32 of which were confirmed, and show that high-fidelity Cas9 versions reduced off-target mutation rates in vivo. Using whole-genome sequencing data from ten mouse embryos, treated with a single guide RNA (sgRNA), and from their genetic parents, we found 43 off-targets, 30 of which were predicted by an adapted version of GUIDE-seq.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Frock, R. L. et al. Nat. Biotechnol. 33, 179–186 (2015).
  2. Tsai, S. Q. et al. Nat. Biotechnol. 33, 187–197 (2015).
  3. Kim, D. et al. Nat. Methods 12, 237–243 (2015).
  4. Crosetto, N. et al. Nat. Methods 10, 361–365 (2013).
  5. Cameron, P. et al. Nat. Methods 14, 600–606 (2017).
  6. Tsai, S. Q. et al. Nat. Methods 14, 607–614 (2017).
  7. Hsu, P. D. et al. Nat. Biotechnol. 31, 827–832 (2013).
  8. Doench, J. G. et al. Nat. Biotechnol. 34, 184–191 (2016).
  9. Haeussler, M. et al. Genome Biol. 17, 148 (2016).
  10. Hnisz, D. et al. Science 351, 1454–1458 (2016).
  11. Paquet, D. et al. Nature 533, 125–129 (2016).
  12. Singh, P., Schimenti, J. C. & Bolcun-Filas, E. Genetics 199, 1–15 (2015).
  13. Dow, L. E. et al. Nat. Biotechnol. 33, 390–394 (2015).
  14. Slaymaker, I. M. et al. Science 351, 84–88 (2016).
  15. Kleinstiver, B. P. et al. Nature 529, 490–495 (2016).
  16. Chen, J. S. et al. Nature 550, 407–410 (2017).
  17. Lin, Y. et al. Nucleic Acids Res. 42, 7473–7485 (2014).
  18. Takeo, T. & Nakagata, N. PLoS One 10, e0128330 (2015).
  19. Hu, L.-L. et al. Zygote 20, 361–369 (2012).
  20. Ye, J. et al. BMC Bioinformatics 13, 134 (2012).
  21. Wu, T. D. & Nacu, S. Bioinformatics 26, 873–881 (2010).
  22. Darlington, G. J., Bernhard, H. P., Miller, R. A. & Ruddle, F. H. J. Natl. Cancer Inst. 64, 809–819 (1980).
  23. Topp, W. C. Virology 113, 408–411 (1981).
  24. Klebe, R. J. & Ruddle, F. H. J. Cell Biol. 43, 69A (1969).
  25. Wickham, H. ggplot2 (Springer, New York, 2009).

Acknowledgements

We thank the Genentech animal core groups for animal care and preparation of genomic DNA; B. Haley and V. Dixit for helpful discussions; S. Seshagiri for next-generation sequencing support; K. Kawamura, C. Reyes and P.-Z. Tang (Thermo Fisher) for generating the TEG-seq data; and A. Bruce for generating Supplementary Fig. 1a. Most authors were Genentech employees at the time of the study, and all studies were funded by Genentech, a member of the Roche Group. M. Haeussler was funded by NIH/NHGRI grant 5U41HG002371-15.

Author information

Affiliations

  1. Department of Molecular Biology, Genentech, Inc., South San Francisco, CA, USA

    Keith R. Anderson, Vasantharajan Janakiraman, Jessica Lund, Zora Modrusan, Jeremy Stinson, Qixin Bei, Andrew Buechler, Charles Yu, Sobha R. Thamminana, Lucinda Tam, Merone Roose-Girma, Steffen Durinck & Søren Warming

  2. Jack Baskin School of Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA

    Maximilian Haeussler

  3. Bioinformatics & Computational Biology, Genentech, Inc., South San Francisco, CA, USA

    Colin Watanabe

  4. Laboratory Animal Resources, Genentech, Inc., South San Francisco, CA, USA

    Michael-Anne Sowick, Tuija Alcantar, Natasha O’Neil, Jinjie Li, Linda Ta, Lisa Lima & Xin Rairdan

Contributions

K.R.A. and S.W. designed the study and wrote the manuscript, with input from all other coauthors. K.R.A., M.H., C.W. and S.W. analyzed the data. S.D., C.W. and Q.B. generated the primer design and deep-sequencing analysis pipeline. S.D. analyzed the whole-genome sequencing data. K.R.A., A.B., M.R.-G., L. Tam, C.Y. and S.R.T. designed the CRISPR strategies and analyzed the founders and G1 progeny. X.R., T.A., N.O., J. Li, L. Ta and L.L. performed in vitro fertilization, microinjections and tissue/embryo collection. M.-A.S. coordinated mosaic founder breeding and colony management. V.J., J. Lund, Z.M. and J.S. generated sequencing libraries and performed deep sequencing.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Søren Warming.

Integrated supplementary information

  1. Supplementary Figure 1 NGS-based analysis of G0 founders.

    a, Schematic overview of the analysis workflow resulting in the identification of G1 animals without off-targets. OT-number: Refers to algorithm-predicted off-targets on the top-15 list. b, Fraction of sgRNAs with off-targets. c, Fraction of algorithm-predicted analyzed loci with off-targets.

  2. Supplementary Figure 2 Predicted lists for all off-target positive sgRNAs (1).

    List of top 11 or top 15 predicted off-targets with mismatches to target region highlighted by colored bases. Specificity and off-target scores (MIT algorithm) reflect values at time of sgRNA design. Validated off-targets are highlighted in yellow. Predicted off-targets without NGS data are marked by grey boxes. d, k, l: Some off-targets with higher scores are placed lower in the table due to requirement for NAG PAM.

  3. Supplementary Figure 3 Predicted lists for all off-target positive sgRNAs (2).

    List of top 11 or top 15 predicted off-targets with mismatches to target region highlighted by colored bases. Specificity and off-target scores (MIT algorithm) reflect values at time of sgRNA design. Validated off-targets are highlighted in yellow. Predicted off-targets without NGS data are marked by grey boxes. c, f: Some off-targets with higher scores are placed lower in the table due to requirement for NAG PAM.

  4. Supplementary Figure 4 Summary table of all 32 identified off-targets.

    Off-targets are aligned to a generic target sequence and generic NGG PAM. Number of mismatches in each sgRNA position is summarized by a histogram. Off-target scores as in Supplementary Figs. 2 and 3.

  5. Supplementary Figure 5 Thirty-two off-targets identified from 81 CRISPR projects.

    a-o, Box-and-whisker plots showing the distribution of the remaining 17/21 targets (black dots) and 27/32 off-targets (red dots) not depicted in Fig. 1, sorted by project. Y-axes show fraction of NGS-reads with evidence of Cas9 activity at each locus (one minus fraction of wildtype reads). a, EYFP, n=6 mice. n/d: no data; the EYFP sgRNA targets a mouse knock-in allele and on-target efficiency was analyzed by Sanger sequencing. b, Ehmt2, n=14 mice. c, Gsdma, n=10 mice. d, Il17rb, n=10 mice. e, Mertk, n=6 mice. f, Mrgprb2, n=19 mice. g, Pdcd1, n=33 mice. h, Peg10, n=41 mice (sgRNA1) and n=8 mice (sgRNA2). i, Pnpla3, n=3 mice. j, Tigit, n=4 mice. k, Wisp1, n=10 mice. l, Il13, n=16 rats. m, Map3k14, n=10 rat embryos. n, Trpa1, n=24 rats. o, Usp30, n=1 rat. All box-and-whisker plots depict min-max range, four quartiles, and center line represents median. Data points depict individual animals. OT: Off-target rank on top 15 list of predicted off-target locations.

  6. Supplementary Figure 6 Germline transmission of off-target alleles.

    a, Allele frequencies for G0 mosaic founders from projects with off-targets. Data from projects where G0 founders with off-target alleles were bred is included in this figure. For the remaining projects, G0 founders without off-target alleles were identified and used for subsequent breeding. Animal ID for founders with germline transmission data is highlighted in red. Sum of alleles <3%: Combined reads of alleles that did not individually meet the 3% threshold, including off-target allele reads with PCR/sequencing errors. WT allele: sequencing reads matching the genome reference sequence, including reads with PCR/sequencing errors that do not show evidence of Cas9 activity. b, Pie charts showing transmission of alleles from G0 mosaic parents to G1 progeny. Number of G1 animals assessed for each project is indicated. Transmitted alleles are color-coded as in “a”. No sequencing data available for transmission of Usp30 OT1 alleles.

  7. Supplementary Figure 7 G0 mosaic founder allele frequency distribution.

    Allele frequencies for G0 mosaic animals not included in Supplementary Figure 6. For all projects except rat Map3k14 (analysis of embryos), G0 founders without off-target alleles were chosen for breeding. Gsdmc G0 7989-11 allele 1 and Tigit G0 8190-6 allele 1 were SNPs (defined as deletion+insertion of one bp). All other off-target alleles are InDels. Sum of alleles <3%: Combined reads of alleles that did not individually meet the 3% threshold, including off-target allele reads with PCR/sequencing errors. WT allele: sequencing reads matching the genome reference sequence, including reads with PCR/sequencing errors that do not show evidence of Cas9 activity.

  8. Supplementary Figure 8 Examples of genomic off-target alignments.

    Integrative Genomics Viewer (IGV) snapshots for 4 off-targets, each with a single positive animal. a, EYFP, animal 8014-5 is positive. b, Gsdma, animal 7941-44 is positive. c, Il17rb (2), animal 8085-8 is positive. d, Pdcd1, animal 7919-41 is positive. A nearby SNP is also apparent. Alignment and orientation of the sgRNA:genomic target as well as the PAM site are indicated. Windows are centered on the position of the Cas9-induced double-strand break. Off-target negative G0 founders (a-d) and a wildtype control (a,c,d) are included.

  9. Supplementary Figure 9 Limited predictive power from algorithms.

    a, Dot plots showing the distribution of average on-target activities for sgRNAs with (red, n=20 sgRNAs) or without (black, n=79 sgRNAs) off-target activity. Average on-target activities are shown as percent NGS reads. On-target deep sequencing data are available for 20/21 off-target-positive sgRNAs (EYFP on-target efficiency assessed by Sanger sequencing). On-target deep sequencing data are available for 79/98 off-target-negative sgRNAs. For the remaining 19 sgRNAs, on-target activity was analyzed by PCR and Sanger sequencing. Center line depicts mean and error bars depict s.e.m. Unpaired two-tailed t-test for means being identical: t=1.473, df=97, p=0.144. Raw data in Supplementary Table 5. b, Fraction of analyzed G0 founders with off-targets as a function of calculated sgRNA specificity score (MIT). n=119 sgRNAs. Pearson r²=0.034. Two-tailed t-test: r not significantly different from zero (p=0.046). Vertical line denotes a specificity score cut-off: above a score of 66, the odds ratio of identifying an sgRNA without off-targets is 18 (two-tailed Fisher’s exact test for independence, p=0.0001). c, Histogram showing number (n=32) of validated off-targets at each position on the predicted top 15 lists for all 21 off-target positive sgRNAs. OT: Off-target.

  10. Supplementary Figure 10 Reduction of off-targets in mouse and rat cell lines with re-engineered Cas9.

    a, Percent mutation reads at the mouse Pnpla3 genomic target and the 4 originally identified off-target loci in a mouse cell line (Hepa1-6) from plasmid-based transfection of wildtype and re-engineered Cas9 variants + sgRNA. n=3 independent transfections. Off-target numbering as in Supplementary Fig. 3c. Unpaired two-tailed t-test for on-target means being identical (df=4): wt vs. 1.1, t=1.991, p=0.1173; wt vs. HF1, t=3.473, p=0.0255; 1.1 vs. HF1, t=1.996, p=0.1167. b, Percent mutation reads at the rat Map3k14 target and originally identified off-target in a rat cell line (Rat-2) from plasmid-based transfection of wildtype and re-engineered Cas9 + sgRNA. n=3 independent transfections. Off-target numbering as in Supplementary Figure 3g. Unpaired two-tailed t-test for on-target means being identical (df=4): wt vs. 1.1, t=3.928, p=0.0171; wt vs. HF1, t=5.279, p=0.0062; 1.1 vs. HF1, t=0.9775, p=0.3837. Center lines represent mean and error bars represent s.e.m. For controls (ctrl), n=2 independent mock transfections. No statistical analysis for ctrl. experiments. Raw data in Supplementary Table 6.

  11. Supplementary Figure 11 Forty-three Pnpla3 off-targets identified by whole-genome sequencing.

    Venn diagram summarizing overlap between Pnpla3 algorithm-predicted off-targets (10,360, 5 mismatch, NAG and NGA PAMs allowed), confirmed TEG-seq off-targets (105), and off-targets identified from whole-genome sequencing (43). Notes: (I) two of these off-targets were identified by TEG-seq but not confirmed by follow-up ampli-seq. (II,III) Not identified by TEG-seq but 4/7 (II) and 3/6 (III) were subsequently identified in the original TEG-seq data set by targeted re-sequencing, albeit at very low frequencies.

  12. Supplementary Figure 12 Summary of 43 off-targets identified from whole-genome sequencing.

    Off-targets with mismatches to the Pnpla3 target region highlighted by colored bases. MIT and CFD off-target scores were calculated using CRISPOR (ref. 9) and are based on a 20 nt sgRNA sequence. TEG-seq reads: Original number of sequence reads at each off-target locus. Ampli-seq indel%: Percent of reads with evidence of Cas9 activity from targeted amplicon re-sequencing of all loci identified by TEG-seq. Avg. emb. indel%: Average embryo indel percentage (n=10 mouse embryos). Embryos w/indel: Number of embryos with Cas9 activity at the given off-target locus. OT: Off-target. OT1, 4, 6, 13 correspond to originally validated OT7, 9, 11, 5, respectively, in Supplementary Figure 3c (Pnpla3).

  13. Supplementary Figure 13 Examples of genomic alignments for Pnpla3 whole-genome sequencing hits.

    Integrative Genomics Viewer (IGV) snapshots for 4/43 identified off-targets. a, Off-target 1, high mutation rates for embryos 180 and 183. b, Off-target 2, high mutation rate for embryo 181 and medium rate for embryo 184. c, Off-target 27, an insertion occurring in embryo 181 (purple I symbol). d, Off-target 39, an insertion in embryo 231. Alignment and orientation of the sgRNA:genomic target as well as the PAM site are indicated. Windows are not centered on the position of the Cas9-induced double-strand break. Sequence reads from off-target negative littermates (c, d) and both genetic parents (a-d) are included. OT: Off-target.

  14. Supplementary Figure 14 Whole-genome sequencing off-target frequency distribution in embryos.

    Radar plots for the eight embryos not depicted in Fig. 2a. The plots show the distribution of Cas9 activity per embryo across the 43 off-target loci and the Pnpla3 target (ON). Percent mutation read quartiles are indicated by grey circles.

  15. Supplementary Figure 15 Whole-genome sequencing embryo data are not strongly correlated with CFD and MIT scores.

    a, Box-and-whisker plots showing percent mutation reads for individual embryos at all 43 off-targets identified from WGS analysis. n=10 independent mouse embryos (individually identified in the figure legend). Plots show the four quartiles and median (center line); the ends of the whiskers depict minimum and maximum values of the distribution. b, Correlation between number of TEG-seq reads in Neuro-2a cells and average embryo mutation frequency (n=120 loci, 119 off-targets, black dots, and Pnpla3 target, red dot). TEG-seq reads are shown as reads per million (RPM: normalized to the total number of reads in the TEG-seq experiment). c, MIT off-target scores for the 43 WGS off-target loci (Supplementary Figure 12) plotted against average embryo mutation frequency (n=43 off-target loci). On-target not included. d, Same as "c", but CFD off-target scores (n=43 off-target loci). For 9/43 loci, scores could not be calculated (number of mismatches exceeding 5). For b, c, d, r² values (coefficient of determination) are indicated. Two-tailed t-test for r values not different from zero: b, p<0.0001; c, p=0.0001; d, p<0.0001. e, Histogram showing percentage and number of off-targets with more than 10% average mutation reads captured at a given ampli-seq mutation rate cut-off. f, Histogram showing fraction and number of off-targets with more than 10% average mutation reads captured at a given MIT score cut-off. Number of predicted off-targets to analyze for each cut-off is superimposed in white lettering. g, Same as in “f”, but for CFD score cut-off.
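
Several of the legends above describe two per-locus quantities: the fraction of reads with evidence of Cas9 activity (one minus the fraction of wildtype reads) and an allele breakdown in which alleles below a 3% threshold are pooled. The Python sketch below illustrates that arithmetic only. It is not the authors' pipeline: the function name, the input layout (a mapping from allele label to read count), the "WT" label, the example counts, and the choice to always report the wildtype allele separately are all assumptions made for illustration.

```python
THRESHOLD = 0.03  # the 3% per-allele cut-off named in the legends


def summarize_alleles(allele_reads, wt_label="WT", threshold=THRESHOLD):
    """Summarize read counts at one locus (hypothetical helper).

    allele_reads maps an allele label to its read count. Returns the
    per-allele read fractions, with sub-threshold non-wildtype alleles
    pooled, and the mutation fraction (1 - wildtype fraction).
    """
    total = sum(allele_reads.values())
    if total == 0:
        raise ValueError("no reads at this locus")

    summary = {}
    pooled_reads = 0
    for allele, reads in allele_reads.items():
        fraction = reads / total
        # Assumption: the wildtype allele is reported even if it is rare.
        if allele == wt_label or fraction >= threshold:
            summary[allele] = fraction
        else:
            pooled_reads += reads
    summary["sum_below_threshold"] = pooled_reads / total

    # Fraction of reads with evidence of Cas9 activity.
    mutation_fraction = 1 - allele_reads.get(wt_label, 0) / total
    return summary, mutation_fraction


# Made-up read counts for a single mosaic founder at one locus.
example_reads = {"WT": 700, "del7": 200, "ins1": 80, "del2": 15, "del11": 5}
example_summary, example_fraction = summarize_alleles(example_reads)
print(example_summary)
print(round(example_fraction, 3))
```

In this made-up example the two rare deletions fall below the threshold and are reported together, while the mutation fraction still counts every non-wildtype read.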

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–15 and Supplementary Notes 1–6

  2. Reporting Summary

  3. Supplementary Table 1

    Full list of sgRNAs and predicted off-targets

  4. Supplementary Table 2

    Allele breakdown for all G0 animals in this study

  5. Supplementary Table 3

    Summary of TEG-seq and whole-genome sequencing data for Pnpla3

  6. Supplementary Table 4

    Summary of TEG-seq analysis of Casp7 and Il17rb sgRNA

  7. Supplementary Table 5

    Raw data for Supplementary Fig. 9a

  8. Supplementary Table 6

    Raw data for Supplementary Fig. 10

  9. Source Data, Fig. 1g,h

About this article

DOI

https://doi.org/10.1038/s41592-018-0011-5
