Abstract
Transcriptome engineering applications in living cells with RNA-targeting CRISPR effectors depend on accurate prediction of on-target activity and off-target avoidance. Here we design and test ~200,000 RfxCas13d guide RNAs targeting essential genes in human cells with systematically designed mismatches and insertions and deletions (indels). We find that mismatches and indels have a position- and context-dependent impact on Cas13d activity, and mismatches that result in G–U wobble pairings are better tolerated than other single-base mismatches. Using this large-scale dataset, we train a convolutional neural network that we term targeted inhibition of gene expression via gRNA design (TIGER) to predict efficacy from guide sequence and context. TIGER outperforms the existing models at predicting on-target and off-target activity on our dataset and published datasets. We show that TIGER scoring combined with specific mismatches yields the first general framework to modulate transcript expression, enabling the use of RNA-targeting CRISPRs to precisely control gene dosage.
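As an illustration of the G–U wobble distinction drawn above, the following minimal sketch labels each position of a guide–target duplex as a Watson–Crick match, a G–U wobble or another mismatch. It assumes equal-length RNA sequences written 5′→3′ with no indels, and the function and constant names are hypothetical; it is not taken from the TIGER code base and does not predict activity.

```python
# Illustrative only: names below are hypothetical, not part of TIGER.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
GU_WOBBLE = {("G", "U"), ("U", "G")}


def _to_rna(seq):
    """Upper-case the sequence and write T as U."""
    return seq.upper().replace("T", "U")


def classify_pairs(guide, target):
    """Label each guide position as 'match', 'wobble' or 'mismatch'.

    Both sequences are given 5'->3'. The duplex is antiparallel, so
    guide[i] is paired with target[-1 - i]. Indels are not handled.
    """
    guide, target = _to_rna(guide), _to_rna(target)
    if len(guide) != len(target):
        raise ValueError("guide and target must have equal length (no indels)")
    labels = []
    for g, t in zip(guide, reversed(target)):
        if (g, t) in WATSON_CRICK:
            labels.append("match")
        elif (g, t) in GU_WOBBLE:
            labels.append("wobble")
        else:
            labels.append("mismatch")
    return labels


# Guide GGAU is fully complementary to target AUCC.
perfect = classify_pairs("GGAU", "AUCC")
# Changing the target to AUUC pairs guide G (position 2) with U: a wobble.
wobble = classify_pairs("GGAU", "AUUC")
# Changing it to AUAC pairs guide G with A: an ordinary mismatch.
other = classify_pairs("GGAU", "AUAC")
```

A per-position labelling of this kind is one simple way to separate wobble-forming substitutions from other single-base mismatches before comparing their measured activities.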
Code availability
Code to run Cas13d on-target and off-target TIGER models has been deposited on GitHub (https://github.com/daklab/tiger). A web-accessible version of TIGER is available at https://tiger.nygenome.org/.
Acknowledgements
We thank the entire Sanjana and Knowles Labs for their support and advice. D.A.K. is supported by Columbia and NYGC startup funds, NIH/NCI (R21CA272345) and an NSF CAREER (DBI2146398). N.E.S. is supported by NYU and NYGC startup funds, NIH/NHGRI (DP2HG010099), NIH/NCI (R01CA218668), NIH/NIGMS (R01GM138635), DARPA (D18AP00053), Cancer Research Institute and the Simons Foundation for Autism Research Initiative.
Author information
Contributions
H.W. and N.E.S. conceived the study. H.W., A.S., D.A.K. and N.E.S. designed the experiments. H.W. and A.M. cloned libraries and performed the CRISPR screens. S.K.H. assisted with cell culture for pooled screens. H.W., A.S., D.A.K. and N.E.S. analyzed the data and developed the deep learning model. A.S. and E.J.K. implemented the web-based online TIGER tool. H.W., A.S., D.A.K. and N.E.S. wrote the paper with input from all authors.
Ethics declarations
Competing interests
The New York Genome Center and New York University have applied for patents relating to the work in this article. H.W. is a cofounder of Neptune Biotech. N.E.S. is an advisor to Qiagen and is a cofounder of OverT Bio. The other authors declare no competing interests.
Peer review
Peer review information
Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information: Supplementary Figs. 1–6 and Supplementary Note.
Supplementary Data 1: Off-target screen target genes.
Supplementary Data 2: Off-target screen gRNA annotation.
Supplementary Data 3: Off-target screen raw counts.
Supplementary Data 4: Off-target screen gRNA depletion.
Supplementary Data 5: TIGER on-target screen gRNA annotation.
Supplementary Data 6: TIGER on-target screen raw counts.
Supplementary Data 7: TIGER on-target gRNA depletion.
Supplementary Data 8: TIGER on-target gene depletion.
Supplementary Data 9: Off-target screen relative SM gRNA activities.
Supplementary Data 10: TIGER titration screen gRNA annotation.
Supplementary Data 11: TIGER titration screen raw gRNA counts.
Supplementary Data 12: TIGER titration screen gRNA depletion.
Supplementary Data 13: TIGER titration screen relative SM gRNA activities.
Supplementary Data 14: Oligonucleotides used in this study.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wessels, HH., Stirn, A., Méndez-Mancilla, A. et al. Prediction of on-target and off-target activity of CRISPR–Cas13d guide RNAs using deep learning. Nat Biotechnol 42, 628–637 (2024). https://doi.org/10.1038/s41587-023-01830-8
DOI: https://doi.org/10.1038/s41587-023-01830-8