Article

Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9

Abstract

CRISPR-Cas9–based genetic screens are a powerful new tool in biology. By simply altering the sequence of the single-guide RNA (sgRNA), one can reprogram Cas9 to target different sites in the genome with relative ease, but the on-target activity and off-target effects of individual sgRNAs can vary widely. Here, we use recently devised sgRNA design rules to create human and mouse genome-wide libraries, perform positive and negative selection screens and observe that the use of these rules produced improved results. Additionally, we profile the off-target activity of thousands of sgRNAs and develop a metric to predict off-target sites. We incorporate these findings from large-scale, empirical data to improve our computational design rules and create optimized sgRNA libraries that maximize on-target activity and minimize off-target effects to enable more effective and efficient genetic screens and genome engineering.

Accessions

Primary accessions

Sequence Read Archive

Acknowledgements

We thank M. Tomko, M. Greene, A. Brown, D. Alan and T. Green for software engineering support, and T. Nguyen, N. Tran and X. Yang for library production support (Broad Institute). Z.T. is funded by NIH 5K12CA087723-12, an ASCO Young Investigator Award and an LLS Special Fellow Award. J.G.D. is a Merkin Institute Fellow and is supported by the Next Generation Fund at the Broad Institute of MIT and Harvard.

Author information

Author notes

    • John G Doench
    • Nicolo Fusi
    • Meagan Sullender
    • Mudra Hegde
    • Emma W Vaimberg
    • Jennifer Listgarten

    These authors contributed equally to this work.

Affiliations

  1. Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.

    • John G Doench
    • Meagan Sullender
    • Mudra Hegde
    • Emma W Vaimberg
    • Katherine F Donovan
    • Ian Smith
    • Zuzana Tothova
    • David E Root
  2. Microsoft Research New England, Cambridge, Massachusetts, USA.

    • Nicolo Fusi
    • Jennifer Listgarten
  3. Dana-Farber Cancer Institute, Division of Hematologic Malignancies, Boston, Massachusetts, USA.

    • Zuzana Tothova
  4. Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri, USA.

    • Craig Wilen
    • Robert Orchard
    • Herbert W Virgin

Authors

  1. John G Doench

  2. Nicolo Fusi

  3. Meagan Sullender

  4. Mudra Hegde

  5. Emma W Vaimberg

  6. Katherine F Donovan

  7. Ian Smith

  8. Zuzana Tothova

  9. Craig Wilen

  10. Robert Orchard

  11. Herbert W Virgin

  12. Jennifer Listgarten

  13. David E Root

Contributions

J.G.D., M.S., E.W.V., Z.T., C.W. and R.O. designed experiments; M.S., E.W.V., K.F.D., Z.T., C.W. and R.O. performed experiments; J.G.D., M.H. and I.S. analyzed experiments; N.F. and J.L. performed the computational modeling; J.G.D., N.F., J.L. and D.E.R. wrote the manuscript with assistance from other authors; J.G.D., H.W.V. and D.E.R. supervised the research.

Competing interests

N.F. and J.L. are employed by Microsoft Research.

Corresponding authors

Correspondence to John G Doench, Nicolo Fusi, Jennifer Listgarten or David E Root.

Supplementary information

PDF files

  1.

    Supplementary Figures

    Supplementary Figures 1–22

Zip files

  1.

    Supplementary Tables 1–23

    Supplementary Table 1. Rounds of selection used to design Avana and Asiago libraries
    Supplementary Table 2. sgRNAs in the six subpools of Avana library
    Supplementary Table 3. sgRNAs in the six subpools of Asiago library
    Supplementary Table 4. Screening data for vemurafenib in A375 cells for all biological replicates screened with Avana libraries (divided by subpools) as well as GeCKOv1 and GeCKOv2 libraries
    Supplementary Table 5. RIGER analysis of vemurafenib screens using weighted-sum option
    Supplementary Table 6. STARS analysis of vemurafenib screens
    Supplementary Table 7. List of PanCancer genes
    Supplementary Table 8. Screening data for selumetinib in A375 cells for all biological replicates screened with Avana library
    Supplementary Table 9. STARS analysis of selumetinib screens
    Supplementary Table 10. Negative selection screening data in HT29 and A375 cells with GeCKO libraries
    Supplementary Table 11. Negative selection screening data in HT29 and A375 cells with GeCKO libraries and the set of 291 core essential genes annotated by Hart and colleagues
    Supplementary Table 12. STARS analysis of the negative selection screening data for GeCKO and Avana libraries individually
    Supplementary Table 13. STARS analysis of the negative selection screening data for GeCKO and Avana libraries merged
    Supplementary Table 14. Screening data for 6-thioguanine screen in 293T, A375 and HT29 cells
    Supplementary Table 15. Screening data for interferon-gamma treatment of BV2 cells and output of STARS analysis
    Supplementary Table 16. Screening data for the tiling of resistance genes
    Supplementary Table 17. Gini importance of individual features in the gradient-boosted regression trees model, Rule Set 2
    Supplementary Table 18. Screening data for off-target analysis of CD33 in MOLM13 cells
    Supplementary Table 19. Percent-active, delta-log-fold-change, and one-sided Welch's t-test p-value calculations for the CD33 off-target dataset that is used to calculate the CFD score
    Supplementary Table 20. Activity of sgRNAs designed against H2-D1 that have up to 6 mismatches to H2-K
    Supplementary Table 21. sgRNAs in the Brunello library
    Supplementary Table 22. sgRNAs in the Brie library
    Supplementary Table 23. sgRNA sequences and primers used for individual follow-up experiments

  2.

    Supplementary Code