The presence of N1 methyl groups on adenine bases was thought to be widespread in messenger RNAs. It now seems that these modifications are much less prevalent, and occur on mRNAs that structurally mimic transfer RNA. See Letter p.251
The fate of messenger RNAs is determined by the sequence of their four main molecular building blocks: adenosine, guanosine, cytidine and uridine. However, these components can be chemically modified in ways that impart additional information to mRNAs. The identity and location of such modifications in the transcriptome — the complete set of mRNA molecules found in a cell — comprise the epitranscriptomic code. In 2016, two studies1,2 expanded this code by reporting that there are more than 7,000 N1-methyladenosine (m1A) modifications spread across the diverse mRNA transcripts in the cell. On page 251, Safra et al.3 overturn the concept that m1A is an abundant epitranscriptomic regulatory mark, and reveal principles that guide the formation of these RNA modifications in mRNA.
The m1A modification differs from the nucleoside adenosine by the presence of a single methyl group at the N1 position of the base. This group adds a positive charge to the base and prevents it from forming standard pairing interactions with other bases in nucleic acids. By contrast, the common N6-methyladenosine (m6A) and pseudouridine modifications can still form standard base pairs. The m1A modification was initially mapped in the transcriptome using an antibody that binds to it1,2, so that RNA fragments that contain m1A could be isolated and sequenced using 'next-generation' sequencing methods. Thousands of mRNA regions were reported to contain m1A in this way, but the method could not pinpoint the exact modified nucleotides within those regions.
It was expected that m1A modifications would disrupt the translation process, in which the cell's ribosome machinery uses mRNA as a template for protein synthesis. It was therefore remarkable that the m1A modifications were mapped to coding regions and start codons (the short sequences that are translated first in mRNAs). Ribosomes typically survey and then degrade mRNAs containing modified nucleotides that cannot form base pairs with transfer RNAs4. Nevertheless, one of the initial m1A-mapping studies1 proposed that m1A promotes translation, counter to its expected inhibitory effect on ribosomes.
Enter Safra et al., who studied human cell lines to understand how m1A influences mRNA biology. The authors used a reverse transcriptase (an enzyme that synthesizes a complementary DNA strand from RNA) that incorporates mutations precisely at m1A locations. By combining this enzyme with the same antibody used in the previous studies, they could assess whether the RNA fragments bound by the antibody indeed contained m1A and, if so, where. The authors identified only seven m1A nucleotides at internal sites of cytosolic mRNAs (that is, at sites in the main body of the mRNAs, where coding regions are found), rather than 7,000. They also found another five m1A-containing mRNAs encoded by the DNA of mitochondria, the organelles that act as cellular powerhouses.
Among the small number of mRNAs identified by Safra and colleagues as being susceptible to m1A modification, very few copies actually contained the modification — indicating that the modified forms of these transcripts are infrequent in cells. However, the mRNA that encodes the enzyme NADH dehydrogenase-5 (ND5) in mitochondria was frequently modified, such that most transcripts contained m1A. Safra et al. studied the prevalence of the m1A form of the ND5 mRNA in developing human embryos, and observed that, in egg cells and up to the four-cell stage, nearly all ND5 mRNAs contain m1A, but few do so at the late-blastocyst stage (which occurs at five days of development).
The authors also showed that m1A impairs mRNA translation, revealing a potential function for the modification. Moreover, they found that ND5 contains a single-nucleotide polymorphism (a sequence variation involving a change of one nucleotide, found in some individuals) that is linked to a disease known as Leber's hereditary optic neuropathy. Notably, this variant prevents the formation of m1A in the ND5 mRNA. Defects in m1A formation might therefore be linked to disease.
Most m1A in the cell is formed in tRNA, where it has a role in tRNA folding5. Safra and colleagues found that most m1A residues in mRNA are formed in sequences that look remarkably similar to the T-loop of tRNAs — the tRNA arm that is modified with m1A (ref. 6). The authors found that the enzyme complex that makes m1A in tRNA makes the same modification in these mRNAs.
Mimicry of tRNA structures seems to be a general mechanism by which nucleotide modifications can be introduced in mRNA (Fig. 1). For example, pseudouridine-forming enzymes that modify tRNA also modify similar structures in mRNAs7. Notably, mRNAs can contain mimics of other modified non-coding RNAs. An mRNA was recently identified8 containing a sequence that mimics the U6 small nuclear RNA (snRNA), a non-coding RNA involved in the splicing process by which RNA transcripts are processed to form mature mRNAs. This mimic is modified to form m6A by the same enzyme that modifies U6. Because the enzymatic machinery that modifies non-coding RNA can be co-opted by mRNAs that contain structural mimics, searching for other tRNA- or snRNA-like structures in mRNA might reveal previously unknown modification sites.
What explains the discrepancy between the number of initially mapped m1A sites and Safra and co-workers' results? The m1A antibody might bind to unmodified regions of mRNA, as has been observed for other antibodies that bind modified nucleotides9. Distinguishing RNA fragments that bind to antibodies through a modification from fragments that bind nonspecifically is challenging. In the earlier work1, this was done by using a reverse transcriptase that is blocked by m1A — the enzyme can make complementary DNA only on either side of m1A, leading to the absence of complementary DNA opposite the m1A in RNA. However, such absences are a common, nonspecific feature of next-generation sequencing. By contrast, Safra et al. screened RNA fragments for m1A-induced misincorporations, a highly sensitive and specific method.
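To make the logic of misincorporation-based detection concrete, here is a minimal Python sketch. It is not the pipeline used by Safra et al.; the function names, the input layout (one dictionary of observed base counts per reference position) and the thresholds `min_coverage` and `min_rate` are all hypothetical choices made for illustration. The idea is simply that a candidate site is a reference adenosine at which a sizeable fraction of sequenced reads carry a base other than A.

```python
from typing import Dict, List


def misincorporation_rate(ref_base: str, base_counts: Dict[str, int]) -> float:
    """Fraction of reads whose base differs from the reference base."""
    total = sum(base_counts.values())
    if total == 0:
        return 0.0
    mismatches = total - base_counts.get(ref_base, 0)
    return mismatches / total


def call_candidate_sites(
    reference: str,
    pileup: List[Dict[str, int]],
    min_coverage: int = 20,   # hypothetical threshold, not from the study
    min_rate: float = 0.1,    # hypothetical threshold, not from the study
) -> List[int]:
    """Return 0-based positions of adenosines with a high mismatch rate.

    `pileup[i]` holds the counts of each base observed in reads that
    cover position i of `reference`.
    """
    sites = []
    for pos, (ref_base, counts) in enumerate(zip(reference, pileup)):
        if ref_base != "A":
            continue  # m1A can occur only at adenosines
        if sum(counts.values()) < min_coverage:
            continue  # too few reads to estimate a rate
        if misincorporation_rate(ref_base, counts) >= min_rate:
            sites.append(pos)
    return sites


# Toy data: position 1 shows many A-to-T mismatches, position 3 only
# a single mismatch of the kind expected from sequencing error.
reference = "GACA"
pileup = [
    {"G": 50},
    {"A": 30, "T": 20},
    {"C": 50},
    {"A": 49, "G": 1},
]
candidates = call_candidate_sites(reference, pileup)
```

In this toy example only position 1 passes both thresholds. A real analysis would also need to rule out genetic variants and compare samples in which the modifying enzyme has been depleted, which a sketch this small does not attempt.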
The new study demonstrates that next-generation sequencing data can, in some cases, erroneously give the impression of widespread internal modifications in mRNA. Although m6A and pseudouridine are well-documented epitranscriptomic modifications, and their mapping can be performed reliably10,11, the validity of newer modification maps is unclear, because the modifications were mapped from sequencing data but not biochemically validated. A recent study12 of the mammalian transcriptome mapped approximately 3,500 nucleotides that contain methyl groups on the ribose part of the molecules. However, subsequent reanalysis showed that the key sequence motifs discovered in this study matched those of 'primer' sequences used to generate complementary DNA, a common sequencing artefact13.
Rigorous criteria are therefore needed to validate the results of modified-nucleotide mapping studies. Foremost among these is direct biochemical validation of modifications in target mRNAs. Additional criteria should include confirmation that a modification site is seen in separate mapping studies using independent, modification-specific antibodies. If the modified nucleotide causes mutations during reverse transcription, then these mutations should also be used to verify mapping data. Lastly, experiments in which the modification-synthesizing enzyme is deliberately depleted in cells can further demonstrate the specificity of a mapping experiment, as Safra et al. have shown.