Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs

Abstract

Off-target effects of the CRISPR–Cas9 system can lead to suboptimal gene-editing outcomes and are a bottleneck in its development. Here, we introduce two interdependent machine-learning models for the prediction of off-target effects of CRISPR–Cas9. The approach, which we named Elevation, scores individual guide–target pairs, and also aggregates them into a single, overall summary guide score. We demonstrate that Elevation consistently outperforms competing approaches on both tasks. We also introduce an evaluation method that balances errors between active and inactive guides, thereby encapsulating a range of practical use cases. Because of the large scale and computational demands of predicting off-target activities, we have developed a fast cloud-based service (https://crispr.ml) for end-to-end guide-RNA design. The service makes use of pre-computed on-target and off-target activity predictions for every genic region in the human genome.
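The two-layer idea described above (score each guide–target pair, then aggregate the pair scores into one guide-level summary) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `pair_score` is a hypothetical mismatch-counting stand-in for a learned pair model, and `aggregate` is a hypothetical stand-in that returns simple summary statistics. Neither reproduces Elevation's trained models or features.

```python
from statistics import mean


def pair_score(guide, target):
    """Hypothetical first layer: a toy activity score for one guide-target pair.

    Assumption for illustration only: each mismatch halves the score,
    so a perfect match scores 1.0. A real model would be learned from data.
    """
    if len(guide) != len(target):
        raise ValueError("guide and target must have equal length")
    mismatches = sum(g != t for g, t in zip(guide, target))
    return 0.5 ** mismatches


def aggregate(scores):
    """Hypothetical second layer: summarise per-site scores for one guide.

    Returns plain summary statistics; a learned aggregator could consume
    features like these, but that step is not shown here.
    """
    if not scores:
        return {"n_sites": 0, "total": 0.0, "max": 0.0, "mean": 0.0}
    return {
        "n_sites": len(scores),
        "total": sum(scores),
        "max": max(scores),
        "mean": mean(scores),
    }


def guide_summary(guide, off_targets):
    """Score every candidate off-target site, then aggregate the scores."""
    return aggregate([pair_score(guide, site) for site in off_targets])


summary = guide_summary("ACGT", ["ACGA", "TCGA"])
print(summary)
```

In this toy example the first site has one mismatch and the second has two, so the pair scores are 0.5 and 0.25. The point of the sketch is only the data flow: many pair-level scores per guide feed a single guide-level summary.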

Fig. 1: Schematic of Elevation off-target predictive modelling.
Fig. 2: gRNA–target pair scoring.
Fig. 3: First-layer gRNA–target scoring feature importances.
Fig. 4: Validation of the Elevation gRNA–target scoring model.
Fig. 5: Joint scoring and aggregation on viability screens.
Fig. 6: Aggregator feature importances.

Acknowledgements

We thank A. Annavajhala for Azure cloud support, C. Kadie for use and support of his HPC cluster code, J. Jernigan, O. Losinets and the HPC team for cluster support, M. Hegde for help with the data, J. Lopez and M. Aryee for assistance with GUIDE-seq data analysis, M. Haeussler for help accessing the data from his paper, and J.-P. Concordet for feedback on the manuscript. M.W. is supported by a UCLA Collaboratory Fellowship. This work used computational and storage services associated with the Hoffman2 Shared Cluster provided by the UCLA Institute for Digital Research and Education’s Research Technology Group, and also an Azure-for-Research grant to UCLA. We acknowledge the ENCODE Consortium, the UW ENCODE group for generating these data, and UCSC for processing these data and making them available for download.

Author information

Contributions

J.L. and N.F. designed, implemented and evaluated the machine learning and statistical methods (Elevation-score and Elevation-aggregate). M.W. designed and implemented the Elevation-search infrastructure, also known as dsNickFury. J.G.D. provided biological expertise. B.P.K., J.K.J., J.L., N.F. and J.G.D. selected validation gRNAs. B.P.K., A.A.S. and J.K.J. assayed the validation gRNAs for off-target activity. J.L., N.F., M.W. and J.G.D. designed the web interface. L.H. and K.G. created the front-end webpage for the cloud service. M.E. and J.C. helped run the experiments and populated the cloud server. J.L., M.W., J.G.D., N.F., B.P.K. and J.K.J. wrote the paper.

Corresponding authors

Correspondence to Jennifer Listgarten, Michael Weinstein, John G. Doench or Nicolo Fusi.

Ethics declarations

Competing interests

J.L., L.H., M.E., J.C. and N.F. performed research related to this manuscript while employed by Microsoft. J.K.J. has financial interests in Beam Therapeutics, Editas Medicine, Monitor Biotechnologies, Pairwise Plants, Poseida Therapeutics and Transposagen Biopharmaceuticals. J.K.J.’s interests were reviewed and are managed by Massachusetts General Hospital and Partners HealthCare in accordance with their conflict of interest policies.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary tables and figures.

Life Sciences Reporting Summary

Supplementary Table 1

GUIDE-seq details for validation dataset 2.

About this article

Cite this article

Listgarten, J., Weinstein, M., Kleinstiver, B.P. et al. Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs. Nat Biomed Eng 2, 38–47 (2018). https://doi.org/10.1038/s41551-017-0178-6

  • DOI: https://doi.org/10.1038/s41551-017-0178-6
