Abstract
N6-methyladenosine (m6A) is the most abundant modified base in eukaryotic mRNA and has been linked to diverse effects on mRNA fate. Current mapping approaches localize m6A residues to transcript regions 100–200 nt long but cannot identify precise m6A positions on a transcriptome-wide level. Here we developed m6A individual-nucleotide-resolution cross-linking and immunoprecipitation (miCLIP) and used it to demonstrate that antibodies to m6A can induce specific mutational signatures at m6A residues after ultraviolet light–induced antibody-RNA cross-linking and reverse transcription. We found that these antibodies similarly induced mutational signatures at N6,2′-O-dimethyladenosine (m6Am), a modification found at the first nucleotide of certain mRNAs. Using these signatures, we mapped m6A and m6Am at single-nucleotide resolution in human and mouse mRNA and identified small nucleolar RNAs (snoRNAs) as a new class of m6A-containing non-coding RNAs (ncRNAs).
References
Meyer, K.D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell 149, 1635–1646 (2012).
Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).
Perry, R.P., Kelley, D.E., Friderici, K. & Rottman, F. The methylated constituents of L cell messenger RNA: evidence for an unusual cluster at the 5′ terminus. Cell 4, 387–394 (1975).
Desrosiers, R., Friderici, K. & Rottman, F. Identification of methylated nucleosides in messenger RNA from Novikoff hepatoma cells. Proc. Natl. Acad. Sci. USA 71, 3971–3975 (1974).
Schwartz, S. et al. Perturbation of m6A writers reveals two distinct classes of mRNA methylation at internal and 5′ sites. Cell Rep. 8, 284–296 (2014).
Squires, J.E. et al. Widespread occurrence of 5-methylcytosine in human coding and non-coding RNA. Nucleic Acids Res. 40, 5023–5033 (2012).
Ryvkin, P. et al. HAMR: high-throughput annotation of modified ribonucleotides. RNA 19, 1684–1692 (2013).
Sugimoto, Y. et al. Analysis of CLIP and iCLIP methods for nucleotide-resolution studies of protein-RNA interactions. Genome Biol. 13, R67 (2012).
König, J. et al. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat. Struct. Mol. Biol. 17, 909–915 (2010).
Hafner, M. et al. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141, 129–141 (2010).
Ule, J. et al. CLIP Identifies Nova-regulated RNA networks in the brain. Science 302, 1212–1215 (2003).
Schibler, U. & Perry, R.P. The 5′-termini of heterogeneous nuclear RNA: a comparison among molecules of different sizes and ages. Nucleic Acids Res. 4, 4133–4149 (1977).
Kramer, K. et al. Photo-cross-linking and high-resolution mass spectrometry for assignment of RNA-binding sites in RNA-binding proteins. Nat. Methods 11, 1064–1070 (2014).
Zhang, C. & Darnell, R.B. Mapping in vivo protein-RNA interactions at single-nucleotide resolution from HITS-CLIP data. Nat. Biotechnol. 29, 607–614 (2011).
Piekna-Przybylska, D., Decatur, W.A. & Fournier, M.J. The 3D rRNA modification maps database: with interactive tools for ribosome analysis. Nucleic Acids Res. 36, D178–D183 (2008).
Schibler, U., Kelley, D.E. & Perry, R.P. Comparison of methylated sequences in messenger RNA and heterogeneous nuclear RNA from mouse L cells. J. Mol. Biol. 115, 695–714 (1977).
Moore, M.J. et al. Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis. Nat. Protoc. 9, 263–293 (2014).
Weyn-Vanhentenryck, S.M. et al. HITS-CLIP and integrative modeling define the Rbfox splicing-regulatory network linked to brain development and autism. Cell Rep. 6, 1139–1152 (2014).
Liu, N. et al. Probing N6-methyladenosine RNA modification status at single nucleotide resolution in mRNA and long noncoding RNA. RNA 19, 1848–1856 (2013).
Carninci, P. et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nat. Genet. 38, 626–635 (2006).
Ni, T. et al. A paired-end sequencing strategy to map the complex landscape of transcription initiation. Nat. Methods 7, 521–527 (2010).
Plessy, C. et al. Linking promoters to functional transcripts in small samples with nanoCAGE and CAGEscan. Nat. Methods 7, 528–534 (2010).
Frith, M.C. et al. A code for transcription initiation in mammalian genomes. Genome Res. 18, 1–12 (2008).
Moss, B., Gershowitz, A., Weber, L.A. & Baglioni, C. Histone mRNAs contain blocked and methylated 5′ terminal sequences but lack methylated nucleosides at internal positions. Cell 10, 113–120 (1977).
Schwartz, S. et al. Transcriptome-wide mapping reveals widespread dynamic-regulated pseudouridylation of ncRNA and mRNA. Cell 159, 148–162 (2014).
Chen, K. et al. High-resolution N6-methyladenosine (m6A) map using photo-crosslinking-assisted m6A sequencing. Angew. Chem. Int. Ed. Engl. 54, 1587–1590 (2015).
Graham, J.M. Isolation of nuclei and nuclear membranes from animal tissues. Curr. Protoc. Cell Biol. Chapter 3, Unit 3.10 (2001).
Dodt, M., Roehr, J.T., Ahmed, R. & Dieterich, C. FLEXBAR—flexible barcode and adapter processing for next-generation sequencing platforms. Biology (Basel) 1, 895–905 (2012).
Webb, S., Hector, R.D., Kudla, G. & Granneman, S. PAR-CLIP data indicate that Nrd1-Nab3-dependent transcription termination regulates expression of hundreds of protein coding genes in yeast. Genome Biol. 15, R8 (2014).
Bailey, T.L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Acknowledgements
We thank K. Meyer and D. Patil for useful comments and suggestions. This work was supported by NIH grants NIDA DA037150 (to S.R.J.), NS076465 (to C.E.M.), T32 HD060600 (to A.V.G.) and T32 CA062948 (to A.O.O.-G.); a German Research Foundation (DFG) fellowship (to B.L.); the Irma T. Hirschl and Monique Weill-Caulier Charitable Trusts; the STARR Consortium (I7-A765 to C.M.); the Vallee Foundation (C.M.); and the WorldQuant Foundation (C.E.M.).
Author information
Contributions
B.L., A.V.G., A.O.O.-G. and S.R.J. conceived and designed the experiments and analyzed the data; C.M. and C.E.M. analyzed mutational profiles of initial miCLIP libraries; B.L., A.V.G., A.O.O.-G. and S.R.J. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Characterization of mutations induced by UV cross-linking of anti-m6A used in miCLIP.
To characterize the spectrum of single-nucleotide substitutions, deletions, and insertions in miCLIP cDNA libraries prepared with different antibodies, we analyzed the frequency of these types of mutations in the overall libraries and at RAC consensus motifs. (a) The in vitro studies presented in Figure 1 suggested that the Abcam antibody induces single-nucleotide substitutions at the +1 position of m6A at a higher frequency than the SySy antibody. To test whether this is also true in cellular RNA and to determine whether a specific type of substitution predominates, we determined the frequency of different types of substitutions in Abcam and SySy miCLIP libraries. As can be seen, in general the substitution profile was similar for both antibodies. Therefore, the SySy antibody can serve as a background model for determining the specific substitutions induced by the Abcam antibody. (b) To test whether the Abcam antibody induces a specific type of substitution, we examined different substitution types in the Abcam library and compared their frequencies to those in the background model (the SySy library). In the Abcam library, C→T transitions were clearly enriched. (c) Our experiments in b showed that the Abcam antibody predominantly induces C→T transitions. Our next question was whether these transitions were induced at a specific location within RNAs, particularly at potential m6A residues in the core m6A motif (RAC). To determine the local mutational profile at these sites, we analyzed the frequency of each substitution type, as well as deletions and insertions in a 41-nt window surrounding the adenosine (position 0) of these sites. The frequency of each mutation type is plotted in a different color. As can be seen in the analysis of GAC sites (left panel), the most prevalent mutation was a C→T transition (dark purple line) at the +1 position relative to the potential m6A. The same result was observed in analysis of AAC sites (right panel). 
We also observed a slight enrichment of C→A transversions (red line) at the +1 position relative to the A, but their frequency was too low to aid in mapping of m6A. Neither GAC nor AAC sites had deletions and/or insertions accumulating at a specific position relative to the m6A. This analysis shows that C→T transitions induced by the Abcam antibody are highly enriched at the invariant C of the minimal m6A consensus motif RAC. In contrast, no other type of mutation showed a similar enrichment at these sites. Hence, C→T transitions were used as the signature mutation for the Abcam antibody. (d) The Abcam antibody induced C→T transitions at a higher frequency than the SySy antibody; however, we reasoned that the SySy antibody might also induce such transitions. Therefore, we determined the local mutational profile of the SySy library at potential m6A sites. This analysis was performed as in c for the Abcam antibody. Indeed, in the SySy library, C→T transitions were enriched at the +1 position relative to the A at GAC and AAC sites (dark purple line). However, their enrichment was less pronounced than in the Abcam library (compare to c). Additionally, we observed other substitutions that were variably positioned relative to the potential m6A in this library. Therefore, the SySy antibody did not induce a reliable substitution as its mutational signature. Finally, no significant accumulation of insertions or deletions was found at a specific position relative to the consensus motif. This analysis shows that substitutions, insertions and deletions could not be used to unambiguously identify m6A residues with the SySy antibody. Thus, we used truncations induced by this antibody (Fig. 1b) to map m6A residues.
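The local mutational profile described in panels c and d can be illustrated with a minimal sketch. It is not the authors' script: the function names, the `(position, ref, alt)` mutation format and the toy sequence are assumptions made for illustration. It finds each RAC adenosine in a sequence and tallies mutations by offset within a 41-nt window (20 nt on either side of position 0).

```python
from collections import Counter

def rac_sites(seq):
    """Return 0-based indices of the A in every RAC trinucleotide (R = A or G)."""
    return [i for i in range(1, len(seq) - 1)
            if seq[i] == "A" and seq[i - 1] in "AG" and seq[i + 1] == "C"]

def local_profile(seq, mutations, flank=20):
    """Tally mutations by (offset, type) around each RAC adenosine.

    mutations: iterable of (position, ref_base, alt_base), 0-based.
    This input format is hypothetical, chosen only for the example.
    A mutation lying in several overlapping windows is counted once per window.
    """
    by_pos = {}
    for pos, ref, alt in mutations:
        by_pos.setdefault(pos, []).append(f"{ref}>{alt}")
    profile = Counter()
    for site in rac_sites(seq):
        for offset in range(-flank, flank + 1):  # 2 * 20 + 1 = 41 positions
            for mut in by_pos.get(site + offset, []):
                profile[(offset, mut)] += 1
    return profile

# Toy data: one GAC site (A at index 3) and one AAC site (A at index 9).
seq = "TTGACTTTAACTT"
muts = [(4, "C", "T"), (10, "C", "T"), (6, "T", "G")]
profile = local_profile(seq, muts)
print(profile[(1, "C>T")])
```

In this toy input both C→T events fall at the +1 position of a RAC adenosine, which is the pattern the legend reports for the Abcam library.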
Supplementary Figure 2 miCLIP peaks provide higher resolution than conventional MeRIP-Seq peaks and PA-m6A-Seq clusters.
Although miCLIP does not use peaks to identify m6A residues and instead uses C→T transitions and cDNA truncations, looking at peaks in the coverage tracks is helpful for manual validation. Also, it enables a better comparison with conventional MeRIP-Seq data sets. This figure shows the coverage of the 3’ UTR of ACTB with unique miCLIP reads from the CITS and CIMS miCLIP libraries (light blue and orange, respectively). Called m6A residues in both data sets are shown as red circles beneath the tracks. The top two tracks show MeRIP-Seq coverage from the initial m6A mapping studies (red and green). Coverage as determined by Schwartz et al. (ref. 5) was generated with a version of MeRIP-Seq that uses a variety of technical improvements and greater sequencing depth, leading to smoother peaks. Regions determined to be methylated by the PAR-CLIP–based PA-m6A-Seq method (ref. 26) are shown as brown horizontal bars. The gene model of ACTB is depicted below the coverage tracks. Gray bars represent DRACH consensus sites. Comparison between MeRIP-Seq and miCLIP peaks revealed that they occur at the same positions in the transcript. However, whereas MeRIP-Seq peaks are ~100–200 nt wide, miCLIP peaks are ~40 nt wide.
Supplementary Figure 3 General features of miCLIP libraries and comparison with existing data sets.
Here we wanted to systematically evaluate the performance of miCLIP against other published methods that identify m6A throughout the transcriptome (refs. 1, 5 and 26). To do this, we (1) analyzed the genomic distribution of m6A residues called by miCLIP, (2) determined whether deeper sequencing of miCLIP libraries might result in a larger number of identified m6A sites, and (3) compared the performance of miCLIP to that of previously described methods to map m6A in the transcriptome. (a) To determine whether the distribution of m6A residues called by CIMS and CITS miCLIP follows the previously identified genomic distribution of m6A, we compared the genomic coordinates of miCLIP-called m6A residues to those of conventional MeRIP-Seq peaks described by Meyer et al. (ref. 1). This comparison revealed that m6A residues called by CIMS and CITS miCLIP follow a distribution similar to that of MeRIP-Seq–identified regions, including typical enrichment in coding exons and the 3’ UTR. In the CITS miCLIP library, we additionally observed an enrichment of sites called in intronic and intergenic regions, presumably because this library was created from total RNA rather than poly(A)+ RNA. Additionally, many sites in this library were called in the promoter region of transcripts, consistent with the ability of CITS miCLIP to identify m6Am residues at transcription start sites (Fig. 4). (b) It has been shown previously that CLIP libraries are often not sequenced to saturation and that the number of called sites underrepresents the actual number of cross-linking events present in the sample. Thus, when random subsets of the library are analyzed, a linear increase in the number of called sites is observed. To test whether this phenomenon is observed in miCLIP, we created random subsamples of the mRNA CIMS miCLIP library (20%, 40%, 60% and 80%). 
The number of called m6A residues relative to the number of calls in the total library (9,536) increased linearly as the subsample size increased, suggesting that this library was not sequenced to saturation. This demonstrates that additional m6A residues can be identified with deeper sequencing of miCLIP libraries. (c) To estimate the rate of false negative and false positive calls made by miCLIP, we used a previously published set of eight biochemically validated m6A residues (SCARLET m6A). Importantly, this study also identified 15 adenosines in a DRACH consensus that underlie MeRIP-Seq peaks but are not methylated (termed “SCARLET non-m6A” here). Although the SCARLET dataset is too small to allow for exact calculations of specificity and sensitivity, we used this set of sites to obtain approximate performance metrics of miCLIP. This dataset additionally allows objective comparison of miCLIP with other m6A mapping approaches. We thus determined the number of SCARLET-positive and SCARLET-negative sites that were called in CIMS and CITS miCLIP, as well as by the bioinformatic prediction method described previously. To compare miCLIP with techniques that identify regions of methylation rather than m6A residues, we additionally evaluated whether SCARLET sites are present in methylated regions called by MeRIP-Seq (ref. 1) and PA-m6A-Seq (ref. 26). CIMS and CITS miCLIP showed the highest number of correctly identified known m6A residues. In addition, CIMS and CITS miCLIP called zero and one of the false positive sites, respectively.
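As a rough illustration of the approximate metrics discussed in panel c, the sketch below computes specificity from the counts given in the legend (15 SCARLET non-m6A sites, of which CIMS and CITS miCLIP called zero and one, respectively). The legend does not state how many of the eight validated m6A residues each method recovered, so the sensitivity call uses a placeholder value that is purely hypothetical. With so few sites, these numbers are indicative only.

```python
def sensitivity(true_pos, total_pos):
    """Fraction of validated m6A residues that were called."""
    return true_pos / total_pos

def specificity(false_pos, total_neg):
    """Fraction of validated non-m6A adenosines that were correctly not called."""
    return (total_neg - false_pos) / total_neg

SCARLET_M6A = 8        # validated m6A residues (from the legend)
SCARLET_NON_M6A = 15   # validated unmethylated DRACH adenosines (from the legend)

spec_cims = specificity(0, SCARLET_NON_M6A)  # CIMS miCLIP called none of the negatives
spec_cits = specificity(1, SCARLET_NON_M6A)  # CITS miCLIP called one of the negatives

# Placeholder only: the number of recovered positives is NOT given in the legend.
hypothetical_true_pos = 4
sens_example = sensitivity(hypothetical_true_pos, SCARLET_M6A)

print(f"CIMS specificity: {spec_cims:.3f}")
print(f"CITS specificity: {spec_cits:.3f}")
```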
Supplementary Figure 4 Characterization of m6A clusters throughout the transcriptome.
Although m6As have been predicted to occur in clusters, these have never been previously characterized on a transcriptome-wide scale. Identifying individual m6A residues within a cluster is challenging with existing methods because of the inherent low resolution of these approaches. As CIMS miCLIP provides a map of m6A at individual-nucleotide resolution, we used this data set to identify m6A clusters. (a) To determine the size of m6A clusters, we analyzed 100-nt sliding windows (25-nt step size) for accumulated m6A sites (n ≥ 2) called by CIMS miCLIP. Directly adjacent and/or overlapping windows with multiple called sites were then joined and designated as m6A clusters. These clusters were then analyzed for their number of CIMS miCLIP–called m6A residues. This analysis revealed 958 m6A clusters, with average cluster sizes ranging from 138 nt (two m6A sites) to 488 nt (more than six m6A sites). (b) Methylation of m6A clusters is underestimated by bioinformatic prediction. One of the biggest challenges in bioinformatic prediction is to identify m6A sites in large “mega-peaks” that comprise multiple overlapping m6A peaks. To test this idea, we examined the number of m6A sites predicted bioinformatically in different m6A clusters. As can be seen, bioinformatic approaches significantly underestimated the number of m6A sites. Clusters with six or more m6A sites are shown. (c) Example of the exceptional spatial resolution of CIMS miCLIP. CIMS miCLIP identified clustered m6A residues in the coding sequence of ZNF503 mRNA. Orange and dark blue tracks represent unique CIMS miCLIP read coverage and the number of C→T transitions at each position, respectively. The coverage from a MeRIP-Seq experiment that was used to bioinformatically predict m6A residues is shown in purple, and the predicted m6A is marked with an arrow. Regions of methylation as determined by PA-m6A-Seq are indicated with brown horizontal bars.
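The window-and-merge procedure in panel a can be sketched as follows. This is one plausible reading of the description, not the authors' implementation: the function name, the choice to start windows at coordinate 0 of a single transcript, and the half-open coordinates are all assumptions. Windows of 100 nt are stepped by 25 nt, windows holding at least two called sites are kept, and adjacent or overlapping kept windows are joined.

```python
def find_clusters(sites, window=100, step=25, min_sites=2):
    """Return clusters as (start, end, n_sites) with half-open [start, end) coordinates.

    sites: 0-based positions of called m6A residues on one transcript.
    Cluster boundaries are window boundaries (an assumption of this sketch),
    so reported sizes are multiples of the step rather than site-to-site spans.
    """
    sites = sorted(sites)
    if not sites:
        return []
    # Keep every window that contains at least min_sites called residues.
    hits = []
    for start in range(0, sites[-1] + 1, step):
        end = start + window
        if sum(start <= s < end for s in sites) >= min_sites:
            hits.append((start, end))
    # Join windows that overlap or directly abut the previous kept window.
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e, sum(s <= x < e for x in sites)) for s, e in merged]

# Toy example: three nearby sites form one cluster; the isolated site does not.
clusters = find_clusters([110, 130, 160, 900])
print(clusters)
```

An isolated site never reaches the two-site threshold in any window, so it is left out of the cluster list, which matches the n ≥ 2 criterion in the legend.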
Supplementary Figure 5 The sequence landscape of m6A.
Although it has been demonstrated that m6A occurs in a degenerate, pentameric sequence motif that follows a DRACH consensus, the frequency at which individual pentamers are methylated has never previously been measured. This in turn has limited bioinformatic prediction of m6A residues from existing MeRIP-Seq datasets. Our single-nucleotide-resolution map of m6A enabled us to determine the sequence landscape in which m6A occurs in the transcriptome. (a) Because bioinformatic prediction of m6A is limited to searching for one of seven DRACH motifs, m6A residues may be missed if they fall outside one of these consensus sites. As a result, a nonmethylated adenosine far from the midpoint of the m6A peak may be incorrectly called as an m6A. Here we show an example of an mRNA where the bioinformatic prediction of an m6A peak called a GGACA site (indicated with an arrow) more than 150 nt away from the midpoint of the m6A peak. Similarly, PA-m6A-Seq showed read coverage over a region ~100 nt wide and did not assist in identifying the specific m6A residues in this region. In contrast, residues called by CIMS miCLIP (red circles) underlie the midpoint of the m6A peak, likely reflecting the true m6A residues giving rise to this peak. Each of these m6A residues is in a less common DRACH motif (TGACA, AAACT, and TAACT), precluding their calling via bioinformatic prediction. These data underscore the ability of miCLIP to identify m6A residues in all possible DRACH motifs. (b) To generate a census of m6A pentamers identified by CIMS miCLIP, we analyzed the sequence environment of the 9,536 called m6A residues. We determined all possible pentamers in which m6A residues were called and found that 77.29% followed the DRACH motif (pie chart). Additionally, the number of m6A residues occurring in different subtypes of the DRACH sequence was determined (left table; motifs previously used for the bioinformatic prediction of m6A are shown in bold). 
This analysis confirmed that methylation is not equally distributed among all possible DRACH pentamers, with 40% of m6A residues occurring in just 3 (GGACT, GGACA, and AGACT) of 18 possible pentamers. However, all possible DRACH motifs are clearly methylated. Interestingly, sites called in non-DRACH motifs often occur in pentamers that differ from the DRACH sequence by only one nucleotide (right table; non-matching nucleotides are indicated in blue). Together, these data suggest that a considerable portion of m6As occur outside of the most frequently used DRACH motifs, and that rare m6As may occur in DRACH-like motifs. (c) To generate a census of m6A pentamers identified by CITS miCLIP, we analyzed the sequence environment of the 6,543 called m6A residues. 80.13% occurred in pentamers following the DRACH motif (pie chart). An additional 8.12% occurred in the BCA consensus motif that is characteristic for m6Am residues that were detected at the TSS by CITS miCLIP (Fig. 4). As in CIMS miCLIP, further analysis revealed that m6A is not distributed equally among all possible DRACH pentamers: 51% of CITS miCLIP–called m6A residues were found in just 3 (GGACT, GAACT, and GGACA) out of 18 possible pentamers (left table). In contrast, the distribution of m6Am residues among subtypes of the BCA motif revealed similar frequencies of methylation in all three possible sequence variants (with a slight enrichment for CCA; middle table). Finally, CITS miCLIP–called sites that localized to non-DRACH, non-BCA motifs were found in slight variations of the DRACH motif (right table; non-matching nucleotides are indicated in blue). Together, these data confirm that methylation occurs in all 18 possible DRACH motifs, and that rare m6As may occur in DRACH-like motifs.
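The pentamer census in panels b and c rests on two simple operations: enumerating the 18 DRACH pentamers and checking whether a non-matching pentamer differs from DRACH by a single nucleotide. The sketch below illustrates both using the standard IUPAC codes (D = A, G or U; R = A or G; H = A, C or U), written in DNA letters as in the legend. The function name and the "DRACH-like" label are illustrative choices, and allowing the single mismatch at any of the five positions is an assumption.

```python
from itertools import product

# IUPAC degenerate codes, with T standing in for U as in the legend.
D, R, H = "AGT", "AG", "ACT"
DRACH = {"".join(p) for p in product(D, R, "A", "C", H)}  # 3 * 2 * 1 * 1 * 3 = 18

def classify(pentamer):
    """Label a pentamer as 'DRACH', 'DRACH-like' (one mismatch) or 'other'."""
    pentamer = pentamer.upper().replace("U", "T")
    if pentamer in DRACH:
        return "DRACH"
    # Hamming distance of exactly 1 to any DRACH pentamer (assumed definition).
    if any(sum(a != b for a, b in zip(pentamer, motif)) == 1 for motif in DRACH):
        return "DRACH-like"
    return "other"

for p in ("GGACT", "TAACT", "GGACG", "CCCCC"):
    print(p, classify(p))
```

Counting the labels over a list of called sites would give the kind of breakdown shown in the pie charts, although the percentages reported in the legend come from the authors' data and are not reproduced here.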
Supplementary Figure 6 CITS miCLIP identifies novel m6Am residues.
CITS miCLIP identified novel m6Am residues in the 5’ UTRs of ATP5B and FTH1. m6Am residues have been previously characterized in histone mRNAs; however, it has remained largely unknown which transcripts have m6Am at their 5’ termini. Here we present examples of two transcripts with novel m6Am residues identified by CITS miCLIP. For a full list of sites identified in 5’ UTRs of transcripts, see Supplementary Table 2.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–6 (PDF 1149 kb)
Supplementary Table 1
CIMS-based miCLIP (CIMS miCLIP) identified 9,536 putative m6A residues in the transcriptome. (XLSX 1177 kb)
Supplementary Table 2
CITS-based miCLIP (CITS miCLIP) identified 6,543 putative m6A sites. (XLSX 895 kb)
Supplementary Software
The custom script used in the manuscript and its instructions. (ZIP 1 kb)
Cite this article
Linder, B., Grozhik, A., Olarerin-George, A. et al. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat Methods 12, 767–772 (2015). https://doi.org/10.1038/nmeth.3453
DOI: https://doi.org/10.1038/nmeth.3453