Prediction of on-target and off-target activity of CRISPR–Cas13d guide RNAs using deep learning

Abstract

Transcriptome engineering applications in living cells with RNA-targeting CRISPR effectors depend on accurate prediction of on-target activity and off-target avoidance. Here we design and test ~200,000 RfxCas13d guide RNAs targeting essential genes in human cells with systematically designed mismatches and insertions and deletions (indels). We find that mismatches and indels have a position- and context-dependent impact on Cas13d activity, and mismatches that result in G–U wobble pairings are better tolerated than other single-base mismatches. Using this large-scale dataset, we train a convolutional neural network that we term targeted inhibition of gene expression via gRNA design (TIGER) to predict efficacy from guide sequence and context. TIGER outperforms the existing models at predicting on-target and off-target activity on our dataset and published datasets. We show that TIGER scoring combined with specific mismatches yields the first general framework to modulate transcript expression, enabling the use of RNA-targeting CRISPRs to precisely control gene dosage.
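The distinction the abstract draws between G–U wobble pairings and other single-base mismatches can be illustrated with a minimal Python sketch. It is not taken from the TIGER code base: the function names are hypothetical, and it assumes the guide and target are equal-length sequences written 5′→3′ that pair antiparallel, with T treated as U.

```python
# Illustrative sketch only; helper names are hypothetical, not TIGER's API.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def to_rna(seq):
    """Upper-case a sequence and convert DNA-style T to RNA U."""
    return seq.upper().replace("T", "U")


def classify_pair(guide_base, target_base):
    """Label one guide:target base pair as match, wobble or mismatch."""
    pair = (guide_base, target_base)
    if pair in WATSON_CRICK:
        return "match"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def classify_guide(guide, target):
    """Classify every position of a guide against its target site.

    Both sequences are assumed to be written 5'->3', so guide position i
    pairs with the target read from its 3' end (antiparallel pairing).
    """
    guide, target = to_rna(guide), to_rna(target)
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return [classify_pair(g, t) for g, t in zip(guide, reversed(target))]


if __name__ == "__main__":
    target = "AAGC"                         # toy 4-nt target site
    print(classify_guide("GCUU", target))   # perfect reverse complement
    print(classify_guide("GUUU", target))   # C>U at guide position 2 gives a U:G wobble
```

A per-position labelling like this is one simple way to separate wobble mismatches from other mismatches before comparing their measured activity; the model described in the abstract learns such effects from sequence rather than from hand-written rules.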

Fig. 1: Pooled CRISPR–Cas13 essentiality screen assaying Cas13d gRNA efficacy.
Fig. 2: Large-scale mapping of Cas13d gRNA mismatch activity.
Fig. 3: A deep learning model to predict optimal Cas13d gRNAs.
Fig. 4: Training TIGER using gRNAs with mismatches enables prediction of off-target activity and transcript modulation using gRNAs with single mismatches (SMs).

Data availability

All data generated in this study have been deposited at NCBI Gene Expression Omnibus (GEO) with the accession number GSE232228. Flow cytometry screen data from ref. 7 are available under the accession number GSE142675.

Code availability

Code to run Cas13d on-target and off-target TIGER models has been deposited on GitHub (https://github.com/daklab/tiger). A web-accessible version of TIGER is available at https://tiger.nygenome.org/.

References

  1. Abudayyeh, O. O. et al. C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector. Science 353, aaf5573 (2016).

  2. Abudayyeh, O. O. et al. RNA targeting with CRISPR–Cas13. Nature 550, 280–284 (2017).

  3. Smargon, A. A. et al. Cas13b is a type VI-B CRISPR-associated RNA-guided RNase differentially regulated by accessory proteins Csx27 and Csx28. Mol. Cell 65, 618–630 (2017).

  4. Konermann, S. et al. Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors. Cell 173, 665–676 (2018).

  5. Yan, W. X. et al. Cas13d is a compact RNA-targeting type VI CRISPR effector positively modulated by a WYL-domain-containing accessory protein. Mol. Cell 70, 327–339 (2018).

  6. Smargon, A. A., Shi, Y. J. & Yeo, G. W. RNA-targeting CRISPR systems from metagenomic discovery to transcriptomic engineering. Nat. Cell Biol. 22, 143–150 (2020).

  7. Wessels, H. H. et al. Massively parallel Cas13 screens reveal principles for guide RNA design. Nat. Biotechnol. 38, 722–727 (2020).

  8. Wei, J. et al. Deep learning and CRISPR–Cas13d ortholog discovery for optimized RNA targeting. Preprint at bioRxiv https://doi.org/10.1101/2021.09.14.460134 (2022).

  9. Cheng, X. et al. Modeling CRISPR–Cas13d on-target and off-target effects using machine learning approaches. Nat. Commun. 14, 752 (2023).

  10. Metsky, H. C. et al. Designing sensitive viral diagnostics with machine learning. Nat. Biotechnol. 40, 1123–1131 (2022).

  11. Tambe, A., East-Seletsky, A., Knott, G. J., O’Connell, M. R. & Doudna, J. A. RNA binding and HEPN-nuclease activation are decoupled in CRISPR–Cas13a. Cell Rep. 24, 1025–1036 (2018).

  12. Powell, J. E. et al. Targeted gene silencing in the nervous system with CRISPR–Cas13. Sci. Adv. 8, eabk2485 (2022).

  13. Morelli, K. H. et al. An RNA-targeting CRISPR–Cas13d system alleviates disease-related phenotypes in Huntington’s disease models. Nat. Neurosci. 26, 27–38 (2023).

  14. Méndez-Mancilla, A. et al. Chemically modified guide RNAs enhance CRISPR–Cas13 knockdown in human cells. Cell Chem. Biol. 29, 321–327 (2022).

  15. Rotolo, L. et al. Species-agnostic polymeric formulations for inhalable messenger RNA delivery to the lung. Nat. Mater. 22, 369–379 (2023).

  16. Fan, N. et al. Hierarchical self-uncloaking CRISPR–Cas13a-customized RNA nanococoons for spatial-controlled genome editing and precise cancer therapy. Sci. Adv. 8, eabn7382 (2022).

  17. Guo, Y. et al. Specific knockdown of Htra2 by CRISPR–CasRx prevents acquired sensorineural hearing loss in mice. Mol. Ther. Nucleic Acids 28, 643–655 (2022).

  18. Nasim, M. T. et al. Stoichiometric imbalance in the receptor complex contributes to dysfunctional BMPR-II mediated signalling in pulmonary arterial hypertension. Hum. Mol. Genet. 17, 1683–1694 (2008).

  19. Gurdon, J. B. & Bourillot, P. Y. Morphogen gradient interpretation. Nature 413, 797–803 (2001).

  20. McHugh, C. A. et al. The Xist lncRNA interacts directly with SHARP to silence transcription through HDAC3. Nature 521, 232–236 (2015).

  21. Fehrmann, R. S. N. et al. Gene expression analysis identifies global gene dosage sensitivity in cancer. Nat. Genet. 47, 115–125 (2015).

  22. Collins, R. L. et al. A cross-disorder dosage sensitivity map of the human genome. Cell 185, 3041–3055 (2022).

  23. Patwardhan, R. P. et al. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat. Biotechnol. 27, 1173–1175 (2009).

  24. Gossen, M. & Bujard, H. Tight control of gene expression in mammalian cells by tetracycline-responsive promoters. Proc. Natl Acad. Sci. USA 89, 5547–5551 (1992).

  25. Michaels, Y. S. et al. Precise tuning of gene expression levels in mammalian cells. Nat. Commun. 10, 818 (2019).

  26. Jost, M. et al. Titrating gene expression using libraries of systematically attenuated CRISPR guide RNAs. Nat. Biotechnol. 38, 355–364 (2020).

  27. Bintu, L. et al. Dynamics of epigenetic regulation at the single-cell level. Science 351, 720–724 (2016).

  28. Zhang, C. et al. Structural basis for the RNA-guided ribonuclease activity of CRISPR–Cas13d. Cell 175, 212–223 (2018).

  29. Charlier, J., Nadon, R. & Makarenkov, V. Accurate deep learning off-target prediction with novel sgRNA–DNA sequence encoding in CRISPR–Cas9 gene editing. Bioinformatics 37, 2299–2307 (2021).

  30. Kim, H. K. et al. SpCas9 activity prediction by DeepSpCas9, a deep learning-based model with high generalization performance. Sci. Adv. 5, eaax9249 (2019).

  31. Lin, J. & Wong, K. C. Off-target predictions in CRISPR–Cas9 gene editing using deep learning. Bioinformatics 34, i656–i663 (2018).

  32. Lin, J., Zhang, Z., Zhang, S., Chen, J. & Wong, K. C. CRISPR-Net: a recurrent convolutional network quantifies CRISPR off-target activities with mismatches and indels. Adv. Sci. 7, 1903562 (2020).

  33. Liu, Q., Cheng, X., Liu, G., Li, B. & Liu, X. Deep learning improves the ability of sgRNA off-target propensity prediction. BMC Bioinformatics 21, 51 (2020).

  34. Luo, J., Chen, W., Xue, L. & Tang, B. Prediction of activity and specificity of CRISPR-Cpf1 using convolutional deep learning neural networks. BMC Bioinformatics 20, 332 (2019).

  35. Niu, R., Peng, J., Zhang, Z. & Shang, X. R-CRISPR: a deep learning network to predict off-target activities with mismatch, insertion and deletion in CRISPR-Cas9 system. Genes (Basel) 12, 1878 (2021).

  36. Zhang, G., Zeng, T., Dai, Z. & Dai, X. Prediction of CRISPR/Cas9 single guide RNA cleavage efficiency and specificity by attention-based convolutional neural networks. Comput. Struct. Biotechnol. J. 19, 1445–1457 (2021).

  37. LeCun, Y. et al. Backpropagation applied to digit recognition. Neural Comput. 1, 541–551 (1989).

  38. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012).

  39. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).

  40. Shi, P. et al. Collateral activity of the CRISPR/RfxCas13d system in human cells. Commun. Biol. 6, 334 (2023).

  41. Kelley, C. P., Haerle, M. C. & Wang, E. T. Negative autoregulation mitigates collateral RNase activity of repeat-targeting CRISPR–Cas13d in mammalian cells. Cell Rep. 40, 111226 (2022).

  42. Wang, T. et al. Identification and characterization of essential genes in the human genome. Science 350, 1096–1101 (2015).

  43. Hart, T., Brown, K. R., Sircoulomb, F., Rottapel, R. & Moffat, J. Measuring error rates in genomic perturbation screens: gold standards for human functional genomics. Mol. Syst. Biol. 10, 733 (2014).

  44. Kim, H. K. et al. High-throughput analysis of the activities of xCas9, SpCas9-NG and SpCas9 at matched and mismatched target sequences in human cells. Nat. Biomed. Eng. 4, 111–124 (2020).

  45. Kim, N. et al. Prediction of the sequence-specific cleavage activity of Cas9 variants. Nat. Biotechnol. 38, 1328–1336 (2020).

  46. Xiang, X. et al. Enhancing CRISPR–Cas9 gRNA efficiency prediction by data integration and deep learning. Nat. Commun. 12, 3238 (2021).

  47. Hu, W. et al. Single-base precision design of CRISPR–Cas13b enables systematic silencing of oncogenic fusions. Preprint at bioRxiv https://doi.org/10.1101/2022.06.22.497105 (2022).

  48. Raj, A. & van Oudenaarden, A. Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216–226 (2008).

  49. Lai, E. C., Tomancak, P., Williams, R. W. & Rubin, G. M. Computational identification of Drosophila microRNA genes. Genome Biol. 4, R42 (2003).

  50. Stoeger, T., Battich, N. & Pelkmans, L. Passive noise filtering by cellular compartmentalization. Cell 164, 1151–1161 (2016).

  51. Noviello, G., Gjaltema, R. A. F. & Schulz, E. G. CasTuner is a degron and CRISPR/Cas-based toolkit for analog tuning of endogenous gene expression. Nat. Commun. 14, 3225 (2023).

  52. Lensch, S. et al. Dynamic spreading of chromatin-mediated gene silencing and reactivation between neighboring genes in single cells. eLife 11, e75115 (2022).

  53. Steiger, J. H. Tests for comparing elements of a correlation matrix. Psychol. Bull. 87, 245–251 (1980).

  54. DeLong, E. R., DeLong, D. M. & Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–845 (1988).

  55. Sun, X. & Xu, W. Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process. Lett. 21, 1389–1393 (2014).

  56. Massey, F. J. Jr. The Kolmogorov-Smirnov test for goodness of fit. J. Am. Stat. Assoc. 46, 68–78 (1951).

  57. Chen, S. et al. Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160, 1246–1260 (2015).

  58. Hart, T. et al. High-resolution CRISPR screens reveal fitness genes and genotype-specific cancer liabilities. Cell 163, 1515–1526 (2015).

  59. Gerstberger, S., Hafner, M. & Tuschl, T. A census of human RNA-binding proteins. Nat. Rev. Genet. 15, 829–845 (2014).

  60. Vaquerizas, J. M., Kummerfeld, S. K., Teichmann, S. A. & Luscombe, N. M. A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 10, 252–263 (2009).

  61. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  62. Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).

  63. Sack, L. M., Davoli, T., Xu, Q., Li, M. Z. & Elledge, S. J. Sources of error in mammalian genetic screens. G3 (Bethesda) 6, 2781–2790 (2016).

  64. Kolde, R., Laur, S., Adler, P. & Vilo, J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics 28, 573–580 (2012).

  65. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).

  66. Agarwal, V., Subtelny, A. O., Thiru, P., Ulitsky, I. & Bartel, D. P. Predicting microRNA targeting efficacy in Drosophila. Genome Biol. 19, 152 (2018).

  67. Agarwal, V., Bell, G. W., Nam, J.-W. & Bartel, D. P. Predicting effective microRNA target sites in mammalian mRNAs. eLife 4, e05005 (2015).

  68. Krueger, J. & Rehmsmeier, M. RNAhybrid: microRNA target prediction easy, fast and flexible. Nucleic Acids Res. 34, 451–454 (2006).

  69. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).

  70. Cho, K., van Merriënboer, B., Bahdanau, D. & Bengio, Y. On the properties of neural machine translation: encoder–decoder approaches. In Proc. SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (eds Wu, D. et al.) 103–111 (Association for Computational Linguistics, 2014).

Acknowledgements

We thank the entire Sanjana and Knowles Labs for their support and advice. D.A.K. is supported by Columbia and NYGC startup funds, NIH/NCI (R21CA272345) and an NSF CAREER (DBI2146398). N.E.S. is supported by NYU and NYGC startup funds, NIH/NHGRI (DP2HG010099), NIH/NCI (R01CA218668), NIH/NIGMS (R01GM138635), DARPA (D18AP00053), Cancer Research Institute and the Simons Foundation for Autism Research Initiative.

Author information

Contributions

H.W. and N.E.S. conceived the study. H.W., A.S., D.A.K. and N.E.S. designed the experiments. H.W. and A.M. cloned libraries and performed the CRISPR screens. S.K.H. assisted with cell culture for pooled screens. H.W., A.S., D.A.K. and N.E.S. analyzed the data and developed the deep learning model. A.S. and E.J.K. implemented the web-based online TIGER tool. H.W., A.S., D.A.K. and N.E.S. wrote the paper with input from all authors.

Corresponding authors

Correspondence to David A. Knowles or Neville E. Sanjana.

Ethics declarations

Competing interests

The New York Genome Center and New York University have applied for patents relating to the work in this article. H.W. is a cofounder of Neptune Biotech. N.E.S. is an advisor to Qiagen and is a cofounder of OverT Bio. The other authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–6 and Supplementary Note.

Reporting Summary

Supplementary Data 1

Off-target screen target genes.

Supplementary Data 2

Off-target screen gRNA annotation.

Supplementary Data 3

Off-target screen raw counts.

Supplementary Data 4

Off-target screen gRNA depletion.

Supplementary Data 5

TIGER on-target screen gRNA annotation.

Supplementary Data 6

TIGER on-target screen raw counts.

Supplementary Data 7

TIGER on-target gRNA depletion.

Supplementary Data 8

TIGER on-target gene depletion.

Supplementary Data 9

Off-target screen relative SM gRNA activities.

Supplementary Data 10

TIGER titration screen gRNA annotation.

Supplementary Data 11

TIGER titration screen raw gRNA counts.

Supplementary Data 12

TIGER titration screen gRNA depletion.

Supplementary Data 13

TIGER titration screen relative SM gRNA activities.

Supplementary Data 14

Oligonucleotides used in this study.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wessels, HH., Stirn, A., Méndez-Mancilla, A. et al. Prediction of on-target and off-target activity of CRISPR–Cas13d guide RNAs using deep learning. Nat Biotechnol 42, 628–637 (2024). https://doi.org/10.1038/s41587-023-01830-8

  • DOI: https://doi.org/10.1038/s41587-023-01830-8
