Direct identification of A-to-I editing sites with nanopore native RNA sequencing

Nguyen, Tram Anh; Heng, Jia Wei Joel; Kaewsapsak, Pornchai; Kok, Eng Piew Louis; Stanojević, Dominik; Liu, Hao; Cardilla, Angelysia; Praditya, Albert; Yi, Zirong; Lin, Mingwan; Aw, Jong Ghut Ashley; Ho, Yin Ying; Peh, Kai Lay Esther; Wang, Yuanming; Zhong, Qixing; Heraud-Farlow, Jacki; Xue, Shifeng; Reversade, Bruno; Walkley, Carl; Ho, Ying Swan; Šikić, Mile; Wan, Yue; Tan, Meng How

doi:10.1038/s41592-022-01513-3

Article
Published: 13 June 2022

Nature Methods volume 19, pages 833–844 (2022)

Abstract

Inosine is a prevalent RNA modification in animals and is formed when an adenosine is deaminated by the ADAR family of enzymes. Traditionally, inosines are identified indirectly as variants from Illumina RNA-sequencing data because they are interpreted as guanosines by cellular machineries. However, this indirect method performs poorly in protein-coding regions where exons are typically short, in non-model organisms with sparsely annotated single-nucleotide polymorphisms, or in disease contexts where unknown DNA mutations are pervasive. Here, we show that Oxford Nanopore direct RNA sequencing can be used to identify inosine-containing sites in native transcriptomes with high accuracy. We trained convolutional neural network models to distinguish inosine from adenosine and guanosine, and to estimate the modification rate at each editing site. Furthermore, we demonstrated their utility on the transcriptomes of human, mouse and Xenopus. Our approach expands the toolkit for studying adenosine-to-inosine editing and can be further extended to investigate other RNA modifications.

**Fig. 1: Dinopore exploits deviations in ionic current signal and base-calling errors to predict inosines in RNA transcripts directly.**

**Fig. 2: Development and benchmarking of Dinopore for inosine detection.**

**Fig. 3: Evaluation of Dinopore on previously unseen organisms and cell types.**

**Fig. 4: Multi-class predictions by Dinopore.**

**Fig. 5: Further evaluation of Dinopore.**

**Fig. 6: Estimation of editing levels with a regression model.**

Data availability

Raw nanopore sequencing data have been deposited in the NCBI Sequence Read Archive under accession number SRP363295.

Genome references (GRCh37, mm10 and xenlae2) are publicly available for download.

Code availability

The computational code used in all the analysis is hosted on GitHub (https://github.com/darelab2014/Dinopore). A pre-built computing environment as well as the source code and source data are also available in a Code Ocean capsule (https://doi.org/10.24433/CO.2180901.v1).

Acknowledgements

We thank members of the DaRE laboratory for helpful discussions. M.H.T. is supported by a National Research Foundation Singapore grant (NRF2017-NRF-ISF002–2673), an Open Fund - Individual Research Grant from the National Medical Research Council (NMRC/OFIRG/0017/2016), an EMBO Global Investigatorship, an ASPIRE League seed grant from Nanyang Technological University, core funds from the Genome Institute of Singapore, and funds for the Final Year Project (FYP) and the International Genetically Engineered Machine (iGEM) competition from the School of Chemical and Biomedical Engineering. J.W.J.H. is supported by a Ph.D. research scholarship from the School of Chemical and Biomedical Engineering. Y.S.H. is supported by core funds from the Bioprocessing Technology Institute. We also acknowledge the funding support for this project from Nanyang Technological University under the URECA Undergraduate Research Programme.

Author information

These authors contributed equally: Jia Wei Joel Heng, Pornchai Kaewsapsak.

Authors and Affiliations

School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore, Singapore
Tram Anh Nguyen, Jia Wei Joel Heng, Hao Liu, Angelysia Cardilla, Albert Praditya, Zirong Yi, Yuanming Wang & Meng How Tan
Genome Institute of Singapore, Agency for Science Technology and Research, Singapore, Singapore
Tram Anh Nguyen, Jia Wei Joel Heng, Pornchai Kaewsapsak, Eng Piew Louis Kok, Dominik Stanojević, Hao Liu, Angelysia Cardilla, Albert Praditya, Zirong Yi, Mingwan Lin, Jong Ghut Ashley Aw, Yuanming Wang, Qixing Zhong, Bruno Reversade, Mile Šikić, Yue Wan & Meng How Tan
Department of Biochemistry, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
Pornchai Kaewsapsak
University of Zagreb, Faculty of Electrical Engineering and Computing, Zagreb, Croatia
Dominik Stanojević & Mile Šikić
National Junior College, Singapore, Singapore
Mingwan Lin
School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
Jong Ghut Ashley Aw & Yue Wan
Bioprocessing Technology Institute, Agency for Science Technology and Research, Singapore, Singapore
Yin Ying Ho, Kai Lay Esther Peh & Ying Swan Ho
St. Vincent’s Institute of Medical Research and Department of Medicine, University of Melbourne, Fitzroy, Victoria, Australia
Jacki Heraud-Farlow & Carl Walkley
Institute of Molecular and Cell Biology, Agency for Science Technology and Research, Singapore, Singapore
Shifeng Xue & Bruno Reversade
Department of Biological Sciences, National University of Singapore, Singapore, Singapore
Shifeng Xue
Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Bruno Reversade & Yue Wan
Department of Medical Genetics, School of Medicine (KUSoM), Koç University, Istanbul, Turkey
Bruno Reversade
HP-NTU Digital Manufacturing Corporate Lab, Nanyang Technological University, Singapore, Singapore
Meng How Tan

Contributions

M.H.T. conceived the project and designed the study. T.A.N. led the computational analysis, with active participation from J.W.J.H., P.K. and M.H.T. E.P.L.K., D.S., J.G.A.A., M.S. and Y.W. contributed to the analysis. J.W.J.H. and P.K. performed the sequencing experiments, with help from H.L., A.C., A.P., Z.Y. and M.L. Y.Y.H., K.L.E.P. and Y.S.H. performed the mass spectrometry experiments. Y.M.W., Q.Z., J.H.-F., S.X., B.R. and C.W. provided samples. T.A.N. and M.H.T. organized and wrote the manuscript, with help from J.W.J.H. and P.K. All authors read and approved the paper.

Corresponding author

Correspondence to Meng How Tan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Lei Tang, in collaboration with the Nature Methods team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Percentages of nanopore direct RNA sequencing reads that could be aligned to the reference synthetic sequences.

a, The library preparation protocol from Oxford Nanopore Technologies (ONT) contains an optional reverse transcription (RT) step to generate a second cDNA strand, which is not sequenced but improves the throughput. We found that while the extra RT step did not affect the mapping rate of sequencing reads containing only canonical nucleotides, it enhanced the mappability of inosine-containing reads, although statistical significance was not reached. P-values were calculated using two-tailed Student’s t-test (N = 3 [no RT] or 4 [with RT]). b, Reaction of inosines with acrylonitrile (ACN) results in the introduction of a chemical adduct, which blocks the progression of a reverse transcriptase. The altered base, N¹-cyanoethylinosine, is bulkier and is predicted to perturb the ionic current more dramatically than inosine, potentially rendering detection by direct RNA sequencing easier. However, we found that ACN treatment greatly reduced the throughput as numerous strands appeared to be ejected from the pores and the obtained reads were also significantly harder to align to the reference sequences than untreated inosine-containing reads. P-values were calculated using two-tailed Student’s t-test (N = 6). All box plots: Box, first to last quartiles; whiskers, 1.5× interquartile range; center line, median; points, outliers.
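The box plot convention stated in this legend can be sketched in a few lines. This is a minimal illustration, not the authors' plotting code: it assumes the common Tukey reading in which each whisker ends at the most extreme data point within 1.5× the interquartile range of the box, and the function name `box_plot_summary` and the example values are made up.

```python
import statistics

def box_plot_summary(values):
    """Return quartiles, whisker ends and outliers under a 1.5x IQR convention."""
    data = sorted(values)
    # Quartiles computed with the "inclusive" method (an assumption; plotting
    # libraries differ slightly in how they interpolate quartiles).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points that still lie inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical mapping rates, one value per sequencing run.
mapping_rates = [0.62, 0.65, 0.66, 0.68, 0.70, 0.95]
summary = box_plot_summary(mapping_rates)
print(summary)
```

With these invented values, the run at 0.95 falls outside the upper fence and would be drawn as an individual outlier point.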

Extended Data Fig. 2 Inosines in the H9 transcriptome.

a, Histogram showing the distribution of editing levels in H9 human embryonic stem cells (hESCs), as calculated from Illumina RNA-seq data. Although thousands of A-to-I editing events could be detected, most of them occurred at low frequencies. b, Signal-level features of adenosine (A), inosine (I), and guanosine (G) in nanopore direct RNA sequencing data generated from H9 cells. c, Frequency of base-calling errors in nanopore data generated from H9 cells. The mismatch frequency was high at SNP positions as the reads were mapped against the reference genome. d, Base qualities of adenosine (A), inosine (I), and guanosine (G) in nanopore data generated from H9 cells. (In b-d, N = 2410 [A], 5613 [I] or 1297 [G].) All box plots: Box, first to last quartiles; whiskers, 1.5× interquartile range; center line, median; points, outliers.
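As a rough illustration of how an editing-level histogram such as panel a could be derived from short-read data, the sketch below treats the editing level at a site as the fraction of reads that show G at a reference A. The read counts, the function names and the ten-bin layout are assumptions made for the example, not values or code from the study.

```python
def editing_level(a_reads, g_reads):
    """Fraction of reads supporting G (read as inosine) at a reference A."""
    coverage = a_reads + g_reads
    if coverage == 0:
        return None  # no informative reads at this site
    return g_reads / coverage

def editing_level_histogram(site_counts, n_bins=10):
    """Count sites per editing-level bin; bins are [0-10%), [10-20%), ..., [90-100%]."""
    histogram = [0] * n_bins
    for a_reads, g_reads in site_counts.values():
        coverage = a_reads + g_reads
        if coverage == 0:
            continue
        # Integer arithmetic avoids floating-point trouble at bin edges.
        index = min(g_reads * n_bins // coverage, n_bins - 1)
        histogram[index] += 1
    return histogram

# Hypothetical (A reads, G reads) per site.
site_counts = {
    "site1": (95, 5),
    "site2": (80, 20),
    "site3": (97, 3),
    "site4": (10, 30),
}
histogram = editing_level_histogram(site_counts)
print(histogram)
```

In this toy input, two of the four sites land in the lowest bin, mirroring the qualitative observation that most events occur at low frequencies.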

Extended Data Fig. 3 Reproducibility of features in H9 nanopore data.

a, Scatterplots showing the reproducibility of event parameters (mean, standard deviation, and length) across replicates. The Pearson correlation coefficients (R) were all above 0.5. b, Scatterplots showing the reproducibility of base-calling errors (insertion, deletion, and mismatch) across replicates. There was more variability in the base-calling errors than in the event parameters. While the Pearson correlation coefficients for deletion and mismatch were moderate (between 0.4 and 0.5), they were appreciably lower for insertion (less than 0.3). c, Scatterplots showing the reproducibility of base quality across replicates. Like those for the event parameters, the Pearson correlation coefficients for base quality were also above 0.5.
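The replicate comparison described here rests on the Pearson correlation coefficient. A minimal sketch of that calculation follows; the per-site feature values are invented, and `pearson_r` is a hypothetical helper rather than part of the published pipeline.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(spread_x * spread_y)

# Hypothetical per-site mismatch frequencies in two replicates.
replicate_1 = [1.0, 2.0, 3.0]
replicate_2 = [1.0, 3.0, 2.0]
r = pearson_r(replicate_1, replicate_2)
print(round(r, 3))
```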

Extended Data Fig. 4 Evaluation of different CNN architectures.

a, A plain architecture with no shortcut connections. b, Comparing the performance of the plain architecture with a ResNet-based architecture shown in Fig. 2b using the same set of training and test data generated from wild type and ADAR1-null human H9 cells.
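The difference between a plain block and a block with a shortcut connection can be shown with a toy one-dimensional example. This is only a sketch of the general residual idea (output = activation of transformed input plus the input itself); the kernel, the layer sizes and the function names are assumptions and do not reproduce the architectures in Fig. 2b or panel a.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; assumes an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(weight * padded[i + j] for j, weight in enumerate(kernel))
        for i in range(len(signal))
    ]

def relu(values):
    return [max(0.0, v) for v in values]

def plain_block(x, kernel):
    """Plain block: the input only reaches the output through the convolution."""
    return relu(conv1d_same(x, kernel))

def residual_block(x, kernel):
    """Residual block: the shortcut adds the input back before the activation."""
    transformed = conv1d_same(x, kernel)
    return relu([t + xi for t, xi in zip(transformed, x)])

x = [1.0, 2.0, 3.0]
zero_kernel = [0.0, 0.0, 0.0]
plain_out = plain_block(x, zero_kernel)
residual_out = residual_block(x, zero_kernel)
print(plain_out, residual_out)
```

With an all-zero kernel the plain block loses the input entirely, whereas the residual block still passes it through, which is the property shortcut connections are meant to provide.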

Extended Data Fig. 5 De novo discovery of RNA editing sites in Xenopus embryos.

a-c, Stranded RNA-seq libraries were constructed from (a) Stage 1, (b) Stage 9, and (c) Stage 28 Xenopus laevis embryos and sequenced on the Illumina platform. There were three biological replicates for each developmental stage. REDItools was then used to identify RNA editing sites sample by sample. In every sample, A-to-G variants represented the dominant mismatch type, as expected. The specificity of detection was also higher in repetitive regions than in non-repetitive regions, as indicated by the higher percentages of A-to-G mismatches in all samples. d, Locations of A-to-I RNA editing sites in the Xenopus transcriptome. We examined the genomic locations of editing sites identified from Illumina RNA-seq data using GTF annotation files from NCBI. Consistent with previous studies in other vertebrates, only a small fraction of the Xenopus editing sites was found in protein-coding regions. The majority of the sites also appear to be intergenic, possibly because the frog transcriptome is not fully annotated.
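The "percentage of A-to-G mismatches" used here as a proxy for specificity can be computed by tallying mismatch types. The sketch below assumes that each variant has already been oriented to the transcript strand and is given as a (reference base, observed base) pair; the variant list is invented and the function is not part of REDItools.

```python
from collections import Counter

def mismatch_type_percentages(variants):
    """Percentage of each mismatch type among (reference, observed) base pairs."""
    counts = Counter(f"{ref}-to-{alt}" for ref, alt in variants if ref != alt)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100.0 * n / total for kind, n in counts.items()}

# Hypothetical variant calls from one sample.
variants = [("A", "G")] * 8 + [("C", "T"), ("T", "C")]
percentages = mismatch_type_percentages(variants)
print(percentages)
```

A sample in which A-to-G dominates the mismatch spectrum, as in this toy input, is the pattern expected when most called variants reflect genuine A-to-I editing.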

Extended Data Fig. 6 Reproducibility of features in Xenopus nanopore data.

a, Scatterplots showing the reproducibility of event parameters (mean, standard deviation, and length) across replicates. The Pearson correlation coefficients (R) were all above 0.5. b, Scatterplots showing the reproducibility of base-calling errors (insertion, deletion, and mismatch) across replicates. There was more variability in the base-calling errors than in the event parameters. While the Pearson correlation coefficients for deletion and mismatch were moderate (between 0.4 and 0.5), they were appreciably lower for insertion (less than 0.3). c, Scatterplots showing the reproducibility of base quality across replicates. Like those for the event parameters, the Pearson correlation coefficients for base quality were also above 0.5.

Extended Data Fig. 7 Classification of SNPs using a two-class model.

We tested how Dinopore, when trained only on two classes (A and I), would handle A/G SNPs. If it had labelled the SNPs primarily as unmodified, then the two-class model would be sufficient for inosine detection. However, when we evaluated the model on known A/G SNPs in human (H9 and HCT116), mouse, and Xenopus, we found that it predicted most of the SNPs to be inosines instead, possibly because the genetic variants gave a high mismatch frequency. Hence, the result suggested that a three-class model would be required to discriminate between the reference A, I (which was base-called by Guppy as a mixture of A and G), and A/G SNPs.

Extended Data Fig. 8 Detection sensitivity of Dinopore.

a, We stratified the test sites based on their editing levels and examined how accurately our method could identify the sites in each bin. Here, we required a minimum coverage of 20 nanopore reads. Unsurprisingly, the detection sensitivity was poorer for sites with low editing levels (0–10%) in all the biological systems studied. b, Motif sequence logos of A-to-I editing sites. We examined the upstream and downstream nucleotides surrounding each editing site in the test data from various biological systems. In human and mouse, the motif resembled the known ADAR sequence preference, whereby a guanosine is depleted 5’ of and enriched 3’ of the target adenosine. However, we did not observe as strong an enrichment for guanosine 3’ of the editing sites in Xenopus. c, Motifs obtained from the set of sites that were missed by Dinopore. There were very few false negatives in H9, so the leftmost motif is probably not meaningful. Interestingly, for Xenopus, our CNN model appeared to be more likely to miss bona fide editing sites with a downstream uracil and more particularly sites in a UAU sequence context.
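The stratification in panel a can be sketched as follows: sites are filtered by a minimum nanopore coverage of 20 reads, binned by editing level, and the fraction of detected sites is reported per bin. The record layout (`editing_level`, `coverage`, `detected`) and the example sites are hypothetical.

```python
def sensitivity_by_editing_level(sites, min_coverage=20, n_bins=10):
    """Fraction of true editing sites detected in each editing-level bin.

    Each site is a dict with 'editing_level' (0 to 1), 'coverage' (read count)
    and 'detected' (bool). Bins without any qualifying site yield None.
    """
    detected = [0] * n_bins
    total = [0] * n_bins
    for site in sites:
        if site["coverage"] < min_coverage:
            continue  # too few nanopore reads to evaluate this site
        index = min(int(site["editing_level"] * n_bins), n_bins - 1)
        total[index] += 1
        detected[index] += int(site["detected"])
    return [d / t if t else None for d, t in zip(detected, total)]

# Hypothetical test sites.
sites = [
    {"editing_level": 0.05, "coverage": 30, "detected": False},
    {"editing_level": 0.07, "coverage": 25, "detected": True},
    {"editing_level": 0.55, "coverage": 40, "detected": True},
    {"editing_level": 0.95, "coverage": 10, "detected": True},  # filtered out
    {"editing_level": 1.00, "coverage": 50, "detected": True},
]
sensitivity = sensitivity_by_editing_level(sites)
print(sensitivity)
```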

Extended Data Fig. 9 Performance of Dinopore in repetitive and non-repetitive regions.

ROC and PR curves for (a) H9, (b) Xenopus, (c) HCT116, and (d) mouse test data. For each biological system, the various CNN models were evaluated on all the test sites (red curves), on only the sites in non-repetitive genomic regions (green curves), or on only the sites in repeats (blue curves). The training data used to develop the models were completely separate from the test datasets and were derived from H9 and Xenopus only. Strikingly, in HCT116 and the mouse, which the models had not previously encountered, the test sites in repeat regions always yielded appreciably lower AUC values.
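For readers who want to reproduce this kind of stratified comparison, the area under the ROC curve can be computed directly as the probability that a randomly chosen true site scores higher than a randomly chosen negative site. The sketch below evaluates that quantity on all sites and on repeat and non-repeat subsets; the labels, scores and repeat flags are invented, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative site")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def auc_for(sites, keep):
    """AUC restricted to the sites for which keep(in_repeat) is true."""
    subset = [(label, score) for label, score, in_repeat in sites if keep(in_repeat)]
    return roc_auc([l for l, _ in subset], [s for _, s in subset])

# Hypothetical (label, predicted score, in_repeat) per test site.
sites = [
    (1, 0.9, True), (0, 0.2, True), (1, 0.4, True), (0, 0.5, True),
    (1, 0.8, False), (0, 0.3, False),
]
auc_all = auc_for(sites, lambda in_repeat: True)
auc_repeat = auc_for(sites, lambda in_repeat: in_repeat)
auc_nonrepeat = auc_for(sites, lambda in_repeat: not in_repeat)
print(auc_all, auc_repeat, auc_nonrepeat)
```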

Extended Data Fig. 10 Quantification of editing levels.

a, We wondered if A-to-I editing levels could be quantified on the ONT platform by cDNA-PCR sequencing. In this method, the libraries are made by reverse transcription, strand-switching and second-strand synthesis, and PCR amplification before attachment of sequencing adapters. We generated these libraries from H9 hESC RNA and sequenced them on the MinION device. Subsequently, we quantified the editing frequencies of known sites and compared the values obtained from nanopore sequencing with those obtained from Illumina sequencing. Overall, we observed a good correlation (R > 0.8) in editing levels between the two methods. Hence, editing may be quantified on the ONT platform by cDNA-PCR sequencing. b, Architecture of regression model to predict editing levels. We utilized CNN for regression analysis of our nanopore direct RNA sequencing data to estimate the modification rate of each inosine-containing site. As before, the input was a two-dimensional matrix with each row corresponding to a different 5-mer. The features included event parameters, base-calling errors, and base quality.
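One way to picture the two-dimensional input described in panel b is a matrix with one row per 5-mer and one column per feature. The sketch below is an assumption-laden illustration: it takes the rows to be the 5-mers that overlap a candidate position, uses hypothetical feature names for the three feature groups mentioned in the legend, and fills missing values with zeros. The actual layout used by Dinopore may differ.

```python
# Hypothetical column names covering event parameters, base-calling errors and base quality.
FEATURES = [
    "event_mean", "event_sd", "event_length",
    "insertion", "deletion", "mismatch",
    "base_quality",
]

def overlapping_kmers(sequence, position, k=5):
    """Return (start, kmer) for every k-mer of `sequence` containing `position` (0-based)."""
    first = max(0, position - k + 1)
    last = min(position, len(sequence) - k)
    return [(start, sequence[start:start + k]) for start in range(first, last + 1)]

def build_input_matrix(sequence, position, feature_lookup):
    """One row per overlapping 5-mer, one column per feature (0.0 where missing)."""
    rows = []
    for start, _kmer in overlapping_kmers(sequence, position):
        values = feature_lookup.get(start, {})
        rows.append([values.get(name, 0.0) for name in FEATURES])
    return rows

sequence = "GCUAGAUCGAU"
kmers = overlapping_kmers(sequence, position=5)
# Made-up feature values for the 5-mer starting at index 3 only.
feature_lookup = {3: {"event_mean": 98.4, "mismatch": 0.31}}
matrix = build_input_matrix(sequence, 5, feature_lookup)
print(kmers)
print(len(matrix), "rows x", len(matrix[0]), "columns")
```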

Supplementary information

Supplementary Information

Supplementary Fig. 1 and Supplementary Tables 1–6

Reporting Summary

Peer Review File

Supplementary Data

Mass spectrometry analysis of inosine incorporation

About this article

Cite this article

Nguyen, T.A., Heng, J.W.J., Kaewsapsak, P. et al. Direct identification of A-to-I editing sites with nanopore native RNA sequencing. Nat Methods 19, 833–844 (2022). https://doi.org/10.1038/s41592-022-01513-3

Received: 02 August 2021
Accepted: 02 May 2022
Published: 13 June 2022
Issue Date: July 2022
DOI: https://doi.org/10.1038/s41592-022-01513-3
