Abstract
Off-target effects of the CRISPR–Cas9 system can lead to suboptimal gene-editing outcomes and are a bottleneck in its development. Here, we introduce two interdependent machine-learning models for the prediction of off-target effects of CRISPR–Cas9. The approach, which we named Elevation, scores individual guide–target pairs and also aggregates these scores into a single overall summary score for each guide. We demonstrate that Elevation consistently outperforms competing approaches on both tasks. We also introduce an evaluation method that balances errors between active and inactive guides, thereby encapsulating a range of practical use cases. Because of the large scale and computational demands of predicting off-target activities, we have developed a fast cloud-based service (https://crispr.ml) for end-to-end guide-RNA design. The service makes use of pre-computed on-target and off-target activity predictions for every genic region in the human genome.
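The two tasks named in the abstract (scoring individual guide–target pairs, then aggregating those scores into one summary score per guide) can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the function names, the mismatch-halving pair score and the summation used for aggregation are placeholders and are not the Elevation models, whose details are not given in this section.

```python
from typing import Callable, Sequence


def count_mismatches(guide: str, target: str) -> int:
    """Count positions at which the guide and a candidate target differ."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have equal length")
    return sum(g != t for g, t in zip(guide, target))


def toy_pair_score(guide: str, target: str) -> float:
    """Hypothetical pair score in (0, 1]: halves with every mismatch.

    A stand-in for a learned guide-target model, NOT the published one.
    """
    return 0.5 ** count_mismatches(guide, target)


def toy_guide_score(
    guide: str,
    targets: Sequence[str],
    pair_score: Callable[[str, str], float] = toy_pair_score,
) -> float:
    """Hypothetical aggregate: sum the pair scores over all candidate targets.

    A stand-in for a learned aggregation model, NOT the published one.
    """
    return sum(pair_score(guide, t) for t in targets)


# Short made-up sequences; real guides are longer.
guide = "ACGT"
candidate_targets = ["ACGA", "AGGA"]  # one and two mismatches respectively
summary = toy_guide_score(guide, candidate_targets)
```

The point of the sketch is only the shape of the pipeline: a per-pair function whose outputs feed a per-guide summary. Either stage could be swapped for a trained model without changing the calling code.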
Acknowledgements
We thank A. Annavajhala for Azure cloud support, C. Kadie for use and support of his HPC cluster code, J. Jernigan, O. Losinets and the HPC team for cluster support, M. Hegde for help with the data, J. Lopez and M. Aryee for assistance with GUIDE-seq data analysis, M. Haeussler for help accessing the data from his paper, and J.-P. Concordet for feedback on the manuscript. M.W. is supported by a UCLA Collaboratory Fellowship. This work used computational and storage services associated with the Hoffman2 Shared Cluster provided by the UCLA Institute for Digital Research and Education’s Research Technology Group, and also an Azure-for-Research grant to UCLA. We acknowledge the ENCODE Consortium, the UW ENCODE group for generating these data, and UCSC for processing these data and making them available for download.
Author information
Contributions
J.L. and N.F. designed, implemented and evaluated the machine learning and statistical methods (Elevation-score and Elevation-aggregate). M.W. designed and implemented the Elevation-search infrastructure, also known as dsNickFury. J.G.D. provided biological expertise. B.P.K., J.K.J., J.L., N.F. and J.G.D. selected validation gRNAs. B.P.K., A.A.S. and J.K.J. assayed the validation gRNAs for off-target activity. J.L., N.F., M.W. and J.G.D. designed the web interface. L.H. and K.G. created the front-end webpage for the cloud service. M.E. and J.C. helped run the experiments and populated the cloud server. J.L., M.W., J.G.D., N.F., B.P.K. and J.K.J. wrote the paper.
Ethics declarations
Competing interests
J.L., L.H., M.E., J.C. and N.F. performed research related to this manuscript while employed by Microsoft. J.K.J. has financial interests in Beam Therapeutics, Editas Medicine, Monitor Biotechnologies, Pairwise Plants, Poseida Therapeutics and Transposagen Biopharmaceuticals. J.K.J.’s interests were reviewed and are managed by Massachusetts General Hospital and Partners HealthCare in accordance with their conflict of interest policies.
Supplementary information
Supplementary Information
Supplementary tables and figures.
Supplementary Table 1
GUIDE-seq details for validation dataset 2.
About this article
Cite this article
Listgarten, J., Weinstein, M., Kleinstiver, B.P. et al. Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs. Nat Biomed Eng 2, 38–47 (2018). https://doi.org/10.1038/s41551-017-0178-6
DOI: https://doi.org/10.1038/s41551-017-0178-6