Main

The mRNA cap structure is a critical component of eukaryotic mRNAs but varies between transcripts based on its methylation state. Each cap structure comprises an N7-methylguanosine (m7G) cap at the 5′ end of mRNA, adjacent to the first transcribed nucleotide3,4,5. During mRNA biogenesis, the first nucleotide of all mRNAs becomes 2′-O-methylated (Nm) by cap methyltransferase 1 (CMTR1) to form a Cap1-modified mRNA terminus (m7G-ppp-Nm)6,7. Upon mRNA export into the cytosol, a subset of Cap1 mRNAs is subjected to 2′-O-methylation of the ribose on the second nucleotide by cap methyltransferase 2 (CMTR2) to form Cap2-modified mRNA 5′ ends (m7G-ppp-Nm-Nm)7,8,9 (Fig. 1a). However, the basis for this selectivity in mRNA cap methylation and the function of Cap2 are unclear.

Fig. 1: Cap2 is a variable mRNA modification found in diverse sequence contexts.

a, Structure of Cap1-modified and Cap2-modified 5′ mRNA termini. b, RNase T2 (scissors) cleaves RNA only after non-2′-O-methylated nucleotides. RNase T2 selectively releases Cap1 and Cap2 from mRNA in the form of two-nucleotide-long and three-nucleotide-long cap tags. c, CapTag-seq. mRNAs are decapped and a biotinylated 5′ end adapter composed of 2′-O-methylated nucleotides (Nm-RA5) is ligated to the first mRNA nucleotide. The ligated mRNAs are digested with RNase T2, resulting in the formation of the 5′ adapter-linked cap tags. The released 5′ adapter-linked cap tags are captured on streptavidin beads and ligated to the 3′ end DNA adapter containing 20 randomized nucleotides (N20-DA3). After RT–PCR and next-generation sequencing, read length identifies Cap1 (22 base pair (bp)) and Cap2 (23 bp) cDNA origins. Reads that are 21 bp long reflect 5′ adapter ligations to uncapped, sheared mRNA. Reads that are 20 bp long represent adapter dimers. d, CapTag-seq specifically detects Cap1 and Cap2 mRNA species in bulk mRNA. CapTag-seq was conducted on mRNA extracted from CMTR2 WT and KO HEK293T cells with (+) or without (–) the m7GDP removal step. A heatmap of the length-based tag frequency is shown for each sample. Average of n = 2 biological replicates. e, Cap2 levels in bulk mRNA from different organisms and mammalian cell types. Average of n = 2 biological replicates. f, Cap2 levels in bulk mRNA from different mouse tissues. Average of n = 2 biological replicates. g, A heatmap showing Cap2 levels on each of the 16 possible m7G-proximal dinucleotides in CMTR2 WT and KO HEK293T cell transcriptomes. Average of n = 2 biological replicates.

Cap2 appears to have important regulatory roles in higher organisms, as deletion of the Cmtr2 gene causes preweaning lethality in mice10 and affects growth and proliferation of KRAS-driven lung cancer cells11. As Cap2 methylation is an abundant non-constitutive mRNA modification2, these data suggest that Cap2 exerts important cellular functions via controlling a subset of mRNAs.

The identity of Cap2-modified mRNAs and the rules that govern the specificity of Cap2 methylation have remained unknown since the original discovery of Cap2 nearly 50 years ago. This is largely due to the lack of methods for transcriptome-wide mapping of Cap2 methylation.

Using a set of new methods to quantify and map Cap2, here we reveal unique features and function of this epitranscriptomic modification. Our quantitative transcriptome-wide map of Cap2 reveals an unexpected topology of the Cap2 methylome in mammalian transcriptomes, with essentially all mRNAs susceptible to Cap2 methylation. Cap2 methylation specificity is largely guided by mRNA age. We found that this age-directed mechanism for Cap2 methylation provides a novel strategy to distinguish viral RNA from self RNA and to control the activation of the innate immune response.

Cap2 levels vary between cell types

As Cap2 is a non-constitutive mRNA modification1,2, it has a potential for dynamic regulation in different cellular contexts and conditions. However, current methods for measuring Cap2 dynamics are cumbersome, as they rely on metabolic mRNA radiolabelling and chromatographic separation of Cap1 and Cap2 from the nuclease digests of mRNA1. These methods utilize RNase T2 to selectively release the cap structures from mRNA. RNase T2 liberates the cap structures from mRNA as it cleaves all phosphodiester bonds in RNA except those after Nm (ref. 12) (Fig. 1b). Thus, RNase T2 digestion of Cap1 mRNA releases m7G-ppp-Nm-N, whereas the digestion of Cap2 mRNA releases m7G-ppp-Nm-Nm-N (Fig. 1b). We refer to these m7G-linked two-nucleotide-long or three-nucleotide-long fragments as ‘cap tags’, as they directly indicate the Cap1 or Cap2 status of an mRNA, respectively.

To quantify the levels and dynamics of Cap2 in any cellular sample, we developed CapTag-seq. In CapTag-seq, mRNAs are first subjected to enzymatic m7GDP removal (decapping), and the resulting 5′-monophosphorylated mRNA ends are ligated to a 5′ adapter composed of Nm, thus rendering it resistant to RNase T2 (Fig. 1c). The resulting 5′ adapter-linked mRNAs are fully digested with RNase T2, leaving the cap tags attached to the 5′ adapter. A conversion of the 5′ adapter-linked cap tags into cDNA libraries allows quantification of the cap tags by next-generation sequencing (Fig. 1c; see Methods). In addition, the identity of nucleotides in the cap tags can reveal sequence contexts associated with Cap2 methylation.
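The length-based readout of CapTag-seq can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes reads have already been trimmed so that their length equals the cap tag plus the 20 randomized nucleotides of N20-DA3, giving the classes listed in Fig. 1c (20, adapter dimer; 21, uncapped; 22, Cap1; 23, Cap2). The function names and the definition of the Cap2 fraction as Cap2/(Cap1 + Cap2) are assumptions made for illustration.

```python
from collections import Counter

# Read-length classes as listed in Fig. 1c (cap tag length + 20 randomized nt).
LENGTH_TO_CLASS = {20: "adapter_dimer", 21: "uncapped", 22: "Cap1", 23: "Cap2"}


def classify_read_lengths(lengths):
    """Count reads per class; lengths outside 20-23 are tallied as 'other'."""
    return Counter(LENGTH_TO_CLASS.get(n, "other") for n in lengths)


def cap2_fraction(counts):
    """Assumed summary statistic: Cap2 reads / (Cap1 + Cap2 reads)."""
    capped = counts["Cap1"] + counts["Cap2"]
    if capped == 0:
        raise ValueError("no Cap1 or Cap2 reads to compare")
    return counts["Cap2"] / capped
```

Uncapped and adapter-dimer reads are excluded from the denominator in this sketch, so the fraction describes capped mRNA only.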

We first assessed whether CapTag-seq specifically detects Cap1 and Cap2 mRNAs. CapTag-seq libraries prepared from HEK293T cells comprised almost exclusively three-nucleotide-long, two-nucleotide-long and one-nucleotide-long cap tags (Fig. 1d). The lack of longer cap tags demonstrates the specificity and efficiency of the RNase T2 cleavage step. In control mRNA samples that were not decapped, only the one-nucleotide tags remained (Fig. 1d), indicating that these tags probably reflect contamination from sheared mRNA with uncapped 5′ ends that can be subjected to 5′ adapter ligation. They are unlikely to derive from Cap0 mRNAs (that is, no 2′-O-methylation at the first and second nucleotides), as Cap0 mRNAs are not detected in orthogonal biochemical assays1,2.

To further confirm that the three-nucleotide-long cap tags derive from Cap2 mRNAs, we performed CapTag-seq on CMTR2 knockout (KO) HEK293T cells (Extended Data Fig. 1a). Here we found a near-complete loss of the three-nucleotide-long cap tags (Fig. 1d), thus confirming their Cap2 origin and simultaneously validating Cap2 depletion in CMTR2 KO HEK293T cells. CapTag-seq was quantitative, as mixtures of mRNA from CMTR2-overexpressing wild-type and CMTR2 KO HEK293T cells at predefined ratios produced the expected Cap2 stoichiometries (Extended Data Fig. 1b).

We found that relative Cap2 abundance varied considerably between mammalian cell lines, with as low as approximately 25% in mouse embryonic stem (mES) cells and 40% in HEK293T cells, and as high as 54% in A549 cells and 56% in MCF-7 cells (Fig. 1e and Extended Data Fig. 1c). Of note, Cap2 was nearly absent in the Caenorhabditis elegans transcriptome (0.9%) and present at low levels in fruitfly and zebrafish (approximately 12%) mRNA (Fig. 1e). Cap2 was also highly variable in mouse tissues, ranging from approximately 8% in the mouse brain to 30% in the mouse spleen (Fig. 1f and Extended Data Fig. 1d). These results show that Cap2 levels vary widely between organisms, tissues and cell types, suggesting regulation of Cap2 methylation in different cellular contexts.

Cap2 resides within diverse sequences

We next used CapTag-seq to determine whether Cap2 is associated with specific m7G-proximal nucleotides. To test whether certain sequences are selectively methylated by CMTR2, we examined Cap2 levels on all 16 possible m7G-proximal dinucleotides. In all cell types, we found that all 16 dinucleotides can be subjected to Cap2 methylation (Fig. 1g and Extended Data Fig. 1e). However, the extent of Cap2 methylation differed between individual dinucleotides (Fig. 1g and Extended Data Fig. 1e). Unexpectedly, we found that the sequences with the highest Cap2 methylation differed slightly among mouse tissues (Extended Data Fig. 1f). This was surprising as the sequence preferences of CMTR2 would be expected to be identical regardless of the tissue. Overall, these data suggest that the pattern of Cap2 in the transcriptome is not driven simply by the sequence preferences of CMTR2 but, as shown in later sections, is dictated by these preferences in combination with other mechanisms.
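The per-dinucleotide comparison can be sketched as follows. This is an illustrative reconstruction, not the published analysis: it assumes cap tags have already been extracted as DNA-alphabet strings, that two-nucleotide tags are Cap1 and three-nucleotide tags are Cap2, and that the first two nucleotides of each tag are the m7G-proximal dinucleotide. The function name is hypothetical.

```python
from collections import defaultdict
from itertools import product


def cap2_by_dinucleotide(cap_tags):
    """Return the Cap2 fraction for each of the 16 m7G-proximal dinucleotides.

    Dinucleotides with no Cap1 or Cap2 tags are reported as None.
    """
    cap1 = defaultdict(int)
    cap2 = defaultdict(int)
    for tag in cap_tags:
        if len(tag) == 2:        # Cap1 tag: Nm-N
            cap1[tag] += 1
        elif len(tag) == 3:      # Cap2 tag: Nm-Nm-N, keyed by its first two nt
            cap2[tag[:2]] += 1
    fractions = {}
    for first, second in product("ACGT", repeat=2):
        dinuc = first + second
        total = cap1[dinuc] + cap2[dinuc]
        fractions[dinuc] = cap2[dinuc] / total if total else None
    return fractions
```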

Together, these data indicate that Cap2 is an abundant and promiscuous mRNA modification. The tissue-specific enrichment of Cap2 in different sequence contexts suggests that some mRNAs may have higher levels of Cap2 than other mRNAs, raising the possibility that Cap2 may have specific regulatory roles or functions.

CLAM-Cap-seq creates cDNA–mRNA chimeras

To understand pathways and processes that may be regulated by Cap2 methylation, we wanted to identify Cap2-modified mRNAs. Nm has been mapped at internal sites within ribosomal RNA based on the ability of Nm to stochastically stall certain reverse transcriptases and induce termination of cDNA synthesis13. However, Nm-induced terminations are often inconsistent and influenced by sequence contexts surrounding modified nucleotides14. We therefore sought to develop a novel method for detection and quantification of Cap2 in individual mRNAs.

Although RNase T2 selectively releases Cap1 and Cap2 cap tags from mRNA, it destroys the sequence information of the mRNA from which the cap tag originated (Fig. 1b). To overcome this, we developed CLAM-Cap-seq (CircLigase-assisted mapping of caps by sequencing), a method which entails generating cDNA that is physically attached to the cap tag of its template mRNA. By creating a chimera between the cDNA and the cap tag, each cDNA sequence contains a record of the cap status of the original mRNA.

In CLAM-Cap-seq, decapped mRNA is first reverse transcribed to generate a cDNA–mRNA hybrid. Next, the cDNA 3′ end is ligated to the first 5′ mRNA nucleotide to create a cDNA–mRNA chimera (Fig. 2a). Subsequently, RNase T2 removes the entire mRNA except the cap tag, which remains covalently attached to the cDNA. Finally, a DNA adapter is ligated to the cap tag to enable the conversion of cDNA–cap tags into a sequencing library (Fig. 2a; see Methods). The resulting sequencing reads begin with ‘palindromes’ that reflect the cap tag of the mRNA followed by the first cDNA nucleotides, which are the reverse complement of the cap tag (Fig. 2a). Overall, CLAM-Cap-seq physically couples a remnant of the mRNA in the form of the cap tag to the cDNA, thus revealing the Cap1 or Cap2 status for each mRNA in the transcriptome.

Fig. 2: CLAM-Cap-seq identifies Cap2 methylation enrichment on specific mRNAs.

a, In CLAM-Cap-seq, mRNAs are decapped and reverse transcribed using a 5′-biotinylated primer. Non-templated nucleotides added to the 3′ cDNA ends by reverse transcriptase are indicated. cDNA–mRNA hybrids are captured on streptavidin beads. mRNA and cDNA ends are ligated by CircLigase to form a cDNA–mRNA chimera. The mRNA is degraded by KOH and RNase T2, leaving two-nucleotide-long and three-nucleotide-long cap tags attached to the cDNA. A DNA adapter is ligated to the cap tags, enabling the conversion of the cDNA–cap tag chimera into sequencing libraries. Sequencing reads begin with a palindrome, comprising the reverse complement of the cap tag followed by the transcript sequence. b,c, U1 snRNA CLAM-Cap-seq PCR sequences obtained from CMTR2 WT (b) and KO (c) HEK293T cells. The red asterisk indicates non-templated nucleotides, and the black asterisk denotes a mutation in the U1 snRNA. d, CLAM-Cap-seq-predicted Cap2 stoichiometry plotted against the expected Cap2 stoichiometry of a luciferase mRNA standard starting with AmU(m)A. Pearson’s r is shown. n = 2 technical replicates. e, Cumulative distribution plot of Cap2 stoichiometry in TSN isoforms from mES cells (n = 12,773), CMTR2 WT (n = 12,776) and KO (n = 16,102) HEK293T cells, and MCF-7 cells (n = 10,630). Each box shows the first quartile, median and third quartile, and whiskers represent 1.5× interquartile ranges. Two-sided Wilcoxon signed-rank test, ****P < 0.00001. f, Cap2 stoichiometry is correlated between human cell lines. Cap2 stoichiometry for a shared set of TSN isoforms in HEK293T and MCF-7 cells was plotted. Pearson’s (r) and Spearman’s (ρ) are shown. Average of n = 4 (HEK293T) and n = 2 (MCF-7) biological replicates. n = 7,187 TSN isoforms. g, Gene enrichment analysis performed for highly Cap2-modified mRNAs from HEK293T cells. The bar graph shows the enrichment ratio of the KEGG pathways associated with highly Cap2-modified mRNAs relative to all mRNAs with measurable Cap2 stoichiometry. FDR < 5%.

Although the ligation of cDNA to mRNA has not been described, we considered the possibility that this could be achieved with CircLigase, which has previously been used to ligate annealed DNA strands15. We tested CircLigase-assisted formation of chimeric cDNA–cap tags on U1 small nuclear RNA (snRNA), which is nearly 100% Cap2-modified in various cell types16. First, we confirmed successful generation of cDNA–cap tags by performing PCR across the cDNA–cap tag ligation junction (Extended Data Fig. 2a). Sequencing of individual PCR products revealed the expected Cap2 palindromes at the beginning of the reads derived from U1 snRNA extracted from wild-type cells (Fig. 2b). By contrast, only Cap1 palindromes were observed in CMTR2 KO cells, demonstrating the specificity of this method (Fig. 2c). Overall, these data show that CLAM-Cap-seq selectively generates cDNA–cap tag chimeras that report the RNA identity as well as its Cap1 or Cap2 status.

In the centre of the palindromes, we routinely observed 0, 1 or 2 nucleotides (Fig. 2b,c), probably added to the cDNA ends by the terminal nucleotidyl-transferase activity of the reverse transcriptase17. Despite the presence of non-templated nucleotides, the complementary sequences within the palindrome are easily identified and demarcate the exact cap tag region (Fig. 2b,c). Thus, the cap tag can be extracted from the palindrome to determine whether the original mRNA was Cap1 or Cap2.
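The palindrome logic can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' read-processing code: it assumes the adapter has been trimmed so that the read starts at the palindrome, that the first k nucleotides (k = 3 for Cap2, k = 2 for Cap1) are mirrored as their reverse complement after a spacer of 0 to 2 non-templated nucleotides, and that perfect matches are required. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (other characters are left as-is)."""
    return seq.translate(_COMPLEMENT)[::-1]


def call_cap_from_palindrome(read, max_spacer=2):
    """Return (cap, tag_length, spacer_length), or None if no palindrome is found.

    The three-nucleotide (Cap2) pattern is tried before the two-nucleotide
    (Cap1) pattern; this ordering is a heuristic chosen for the sketch.
    """
    for tag_len, cap in ((3, "Cap2"), (2, "Cap1")):
        head = read[:tag_len]
        if len(head) < tag_len:
            continue
        target = reverse_complement(head)
        for spacer in range(max_spacer + 1):
            start = tag_len + spacer
            if read[start:start + tag_len] == target:
                return cap, tag_len, spacer
    return None
```

Some short sequences satisfy both patterns (for example, a three-nucleotide palindrome with no spacer also matches a two-nucleotide palindrome with a two-nucleotide spacer), so a real pipeline would probably need additional information, such as the known transcript start sequence, to resolve them.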

We next tested the ability of CLAM-Cap-seq to accurately predict Cap2 stoichiometry in mRNA. To test this, we mixed Cap1-modified and Cap2-modified luciferase mRNAs in specific molar ratios and spiked them into cellular poly(A)+ RNA. We found that CLAM-Cap-seq accurately predicted Cap2 stoichiometry present in the mixture of synthetic luciferase mRNA standards (Fig. 2d and Extended Data Fig. 2b). Collectively, these data demonstrate that CLAM-Cap-seq can accurately measure Cap2 stoichiometry in mRNA.

CLAM-Cap-seq reveals the Cap2 methylome

We next identified Cap2-modified mRNAs in mES, HEK293T and MCF-7 cells. Gene transcription can initiate at different locations within a gene promoter, giving rise to transcript isoforms that differ in the start nucleotides (transcript-start nucleotide (TSN) isoforms)18,19. To identify Cap2 stoichiometry for each TSN isoform, rather than genes, we first identified TSN isoforms in the transcriptome by transcription start site-sequencing (TSS-seq)20,21 (Extended Data Fig. 2c–e).

Next, we performed CLAM-Cap-seq on mRNA extracted from mES, HEK293T and MCF-7 cells and calculated Cap2 stoichiometry for each TSN isoform identified by TSS-seq. CLAM-Cap-seq produced sufficient read coverage for prediction of Cap2 stoichiometry in 10,630–12,776 TSN isoforms, spread across 3,745–4,267 genes in the examined cell lines. CLAM-Cap-seq datasets showed low variability of Cap2 measurements across replicates (Extended Data Fig. 2f,g).

The specificity of CLAM-Cap-seq is supported by the lack of Cap2 reads in CLAM-Cap-seq datasets from CMTR2 KO HEK293T cells (Fig. 2e). Furthermore, the Cap2 stoichiometry was low (1–11%) in nuclear poly(A)+ RNAs compared with cytosolic mRNAs (2–84%), consistent with the formation of Cap2 in the cytosol8 (Extended Data Fig. 2h). The overall levels of Cap2 methylation and the Cap2 dinucleotide preferences identified by CLAM-Cap-seq were highly similar to those identified by CapTag-seq (Extended Data Fig. 2i,j).

To biochemically validate the levels of Cap2 measured using CLAM-Cap-seq, we developed an approach to measure Cap2 levels on specific cellular mRNAs. In this method, named CapOligo-PAGE, a self-splinting DNA oligo is selectively ligated to the radiolabelled 5′ end of an mRNA of interest. Following RNase T2-mediated removal of the mRNA, the self-splinting oligo remains attached to the radiolabelled cap tags. Subsequent PAGE resolves DNA oligo-attached cap tags to establish the Cap1 or Cap2 status of the mRNA of interest (Extended Data Fig. 3a–c; see Methods). Using CapOligo-PAGE, we confirmed that Cap2 was present in OAT and RPS12 mRNAs, with markedly higher Cap2 stoichiometry in RPS12 mRNA (Extended Data Fig. 3d,e), as predicted by CLAM-Cap-seq (Extended Data Fig. 2h). Thus, the Cap2 stoichiometry measurements obtained with CLAM-Cap-seq are consistent with our orthogonal assay for determination of Cap1 and Cap2 status in individual mRNAs.

We next assessed the overall stoichiometry of Cap2 throughout transcriptomes. In all three cell lines, we detected transcripts with Cap2 stoichiometries that ranged from 0% to approximately 100% (Fig. 2e). Of note, the overall level of Cap2 varied substantially between cell types, with MCF-7 cell transcripts exhibiting threefold higher Cap2 methylation than mRNAs of the mES cell transcriptome (Fig. 2e). To understand the basis for the differences in Cap2 levels in different cell types, we compared Cap2 stoichiometry at the shared set of TSN isoforms between HEK293T and MCF-7 cells (Fig. 2f). We found that TSN isoforms with low Cap2 stoichiometry in HEK293T cells also exhibited low levels of Cap2 in MCF-7 cells. Similarly, high Cap2 stoichiometry TSN isoforms from HEK293T cells were also highly Cap2-modified in MCF-7 cells (Fig. 2f). The major difference was that the Cap2 stoichiometry of each TSN isoform was proportionally higher in MCF-7 cells (Fig. 2f). Thus, rather than distinct Cap2 epitranscriptomes in each cell type, we found that Cap2 levels at all TSN isoforms are largely correlated in these two human cell lines.

We next used gene enrichment analysis to determine whether Cap2-modified mRNAs are linked to specific cellular pathways. These analyses showed that transcripts with high Cap2 stoichiometry were significantly enriched in gene sets associated with general metabolic pathways and other housekeeping functions in all three cell lines (Fig. 2g and Extended Data Fig. 4). Thus, Cap2 enrichment on specific mRNA cohorts appears to be a conserved phenomenon in mammalian cells.

Overall, these data suggest that Cap2 could influence the function of specific groups of genes and thus regulate cell functions.

Cap2 is enriched on long-lived mRNAs

As Cap2 is added in the cytosol, we asked whether Cap2 is associated with cytoplasmic mRNA processing events, such as translation or mRNA stability. Current approaches for measuring mRNA translation and stability generate sequencing reads that cannot be linked to specific TSN isoforms. To measure translation and stability of every Cap2-modified TSN isoform, we first developed methods for TSN-specific translation and stability analyses.

To specifically assess translation of TSN isoforms, we combined polysome profiling with TSS-seq (polysome–TSS-seq). The abundance of each TSN isoform along the sucrose gradient can be used to calculate the average number of actively translating ribosomes bound to each TSN isoform (mean ribosome load (MRL))22 (Fig. 3a).
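As a rough illustration of the MRL calculation, the sketch below treats MRL as an abundance-weighted average of the ribosome number assigned to each gradient fraction. The ribosome number per fraction and the function name are assumptions for this example; the weighting actually used is defined in the Methods and in ref. 22, not in this section.

```python
def mean_ribosome_load(fraction_abundance, ribosomes_per_fraction):
    """Abundance-weighted mean number of ribosomes for one TSN isoform.

    fraction_abundance: isoform abundance in each polysome fraction.
    ribosomes_per_fraction: assumed ribosome number represented by each fraction.
    """
    if len(fraction_abundance) != len(ribosomes_per_fraction):
        raise ValueError("one ribosome number is needed per fraction")
    total = sum(fraction_abundance)
    if total <= 0:
        raise ValueError("isoform has no signal in any fraction")
    weighted = sum(a * r for a, r in zip(fraction_abundance, ribosomes_per_fraction))
    return weighted / total
```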

Fig. 3: Cap2 methylation is highly enriched on long-lived mRNAs.

a, Polysome–TSS-seq. Translated mRNAs were isolated from different polysome fractions (F1–F5) and subjected to TSS-seq. The average number of ribosomes bound to each TSN isoform was calculated, termed mean ribosome load (MRL). b, Cumulative distribution plot of the ribosome density (MRL per kb) for TSN isoforms binned into equally sized quartiles of increasing Cap2 stoichiometry (Q1–Q4) in CMTR2 WT HEK293T cells. Qn = 1,782. c, ActD-TSS-seq. mRNA was extracted from cells treated with actinomycin D (actD) for 0, 2, 8 and 16 h. Purified mRNA was subjected to TSS-seq for half-life measurements of each TSN isoform. d, Cumulative distribution plot of the half-lives for TSN isoforms binned into equally sized quartiles of increasing Cap2 stoichiometry in CMTR2 WT HEK293T cells. Qn = 2,115. e, Analysis of mRNA translation, as in b, using CMTR2 KO HEK293T cells. Qn = 1,938. f, Analysis of mRNA half-lives, as in d, using CMTR2 KO HEK293T cells. Qn = 2,150. In b,d–f, average of n = 2 biological replicates. Two-sided Wilcoxon signed-rank test, *P < 0.01, **P < 0.001, ***P < 0.0001 and ****P < 0.00001. Each box shows the first quartile, median and third quartile, and whiskers represent 1.5× interquartile ranges.

We quantified TSN isoforms in five polysome fractions of HEK293T cells using TSS-seq. As a control, we showed that mRNAs known to be highly and lowly translated were found in the expected positions within the sucrose gradient (Extended Data Fig. 5a). Next, we calculated the MRL for each TSN isoform in two replicates (Extended Data Fig. 5b,c).

To determine whether Cap2 is associated with translation efficiency, we stratified all TSN isoforms into equally sized quartiles based on their Cap2 stoichiometry. When we examined the distribution of the ribosome density (MRL per kilobase of the open reading frame (MRL per kb)) for each Cap2 stoichiometry quartile, we observed slightly higher ribosome density in highly Cap2-modified transcripts relative to the mRNAs with the lowest levels of Cap2 (median Δ ribosome density (Q4 − Q1) = ~1 MRL per kb; Fig. 3b and Extended Data Fig. 5d–f). Overall, these data suggest that Cap2 shows a slight enrichment on mRNAs with increased translation efficiency.
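The quartile comparison can be sketched as follows, assuming each TSN isoform is represented as a (Cap2 stoichiometry, value) pair. The helper names are hypothetical, ties are broken by sort order, and any remainder when the isoform count is not divisible by four is distributed by integer division; the published analysis may handle these details differently.

```python
from statistics import median


def ribosome_density(mrl, orf_length_nt):
    """MRL per kilobase of open reading frame."""
    return mrl / (orf_length_nt / 1000.0)


def quartile_bins(isoforms):
    """Split (cap2_stoichiometry, value) pairs into four bins of increasing Cap2."""
    if len(isoforms) < 4:
        raise ValueError("at least four isoforms are needed")
    ordered = sorted(isoforms, key=lambda pair: pair[0])
    n = len(ordered)
    return [ordered[i * n // 4:(i + 1) * n // 4] for i in range(4)]


def median_delta_q4_q1(isoforms):
    """Median value in the highest Cap2 quartile minus that in the lowest."""
    bins = quartile_bins(isoforms)
    q1_values = [value for _, value in bins[0]]
    q4_values = [value for _, value in bins[3]]
    return median(q4_values) - median(q1_values)
```

The same binning applies to any per-isoform value, so it could equally be used with half-lives in place of ribosome densities.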

We next asked whether Cap2 is associated with mRNA stability. To calculate the half-lives of each TSN isoform, we quantified TSN isoform abundance by TSS-seq at different time points after transcriptional shut off with actinomycin D (actD-TSS-seq) (Fig. 3c and Extended Data Fig. 5g). TSN isoform half-lives were quantified in two replicates (Extended Data Fig. 5h,i).
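A half-life estimate from such a time course can be sketched under the assumption of first-order decay, fitting ln(abundance) against time by ordinary least squares. This is a generic approach rather than a description of the authors' fitting procedure, and it assumes abundances are already normalized and strictly positive. The function name is hypothetical.

```python
import math


def half_life_from_timecourse(times_h, abundances):
    """Fit ln(abundance) = ln(A0) - k * t and return the half-life ln(2) / k in hours."""
    if len(times_h) != len(abundances) or len(times_h) < 2:
        raise ValueError("need matching time points and abundances (at least two)")
    log_values = [math.log(a) for a in abundances]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_values))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("abundance does not decay over the time course")
    return math.log(2) / -slope
```

With the 0, 2, 8 and 16 h time points described in Fig. 3c, the function takes four abundance values per TSN isoform.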

When we compared the distribution of half-lives between mRNAs of different Cap2 stoichiometry, we observed a large difference in the half-lives and abundance of mRNAs with the highest and lowest Cap2 levels (median Δ t1/2 (Q4 − Q1) = 2.5 h; Fig. 3d and Extended Data Fig. 5j–m).

A similarly strong positive relationship between Cap2 methylation and mRNA stability was also observed in mES and MCF-7 cells (Extended Data Fig. 5n–t), suggesting a conserved relationship between Cap2 and mRNA stability across different cell types and species.

Overall, these data show that Cap2 methylation is enriched on mRNAs with high translation and stability, with a much more prominent association between Cap2 methylation and mRNA stability.

Cap2 does not confer high mRNA stability

We next wanted to determine whether Cap2 directly influences mRNA translation. We therefore measured mRNA translation in CMTR2 KO HEK293T cells by polysome–TSS-seq (Extended Data Fig. 6a). We noticed that CMTR2 KO HEK293T cells exhibited slower growth and overall reduced translation based on the puromycin incorporation and polysome profiling (Extended Data Fig. 6b–d). Despite the global reduction in translation, we asked whether CMTR2 depletion causes a selective decrease in the translation efficiency of mRNAs with the highest Cap2 stoichiometry. Consistent with the overall impairment of growth and translation (Extended Data Fig. 6b–d), all transcripts exhibited a reduction in translation after CMTR2 depletion independent of their Cap2 status (median Δ ribosome density = −0.8 MRL per kb; Extended Data Fig. 6e). However, the slight difference in translation efficiency between the high and low Cap2-modified mRNAs observed in CMTR2 WT cells remained largely unchanged in the CMTR2 KO cells (Fig. 3e). These data suggest that CMTR2 depletion does not regulate the translation capacities of transcripts based on their Cap2 stoichiometry.

We then wanted to determine whether Cap2 confers long half-lives to mRNA. To test this, we measured changes in mRNA half-lives (Δ t1/2) in CMTR2 KO HEK293T cells with actD-TSS-seq (Extended Data Fig. 6f,g). Here we observed that highly Cap2-modified transcripts exhibited a subtle decrease in half-lives in CMTR2 KO cells (Extended Data Fig. 6h). However, the large difference in mRNA stability observed between the high and low Cap2-modified mRNAs in CMTR2 WT cells remained in the CMTR2 KO cells (Fig. 3f and Extended Data Fig. 6i). These data suggest that the mild stabilizing effect of Cap2 on mRNA does not explain the unusual longevity of Cap2-marked mRNAs.

Cap2 levels increase with mRNA age

As Cap2 methylation does not confer long half-life to an mRNA, we considered the possibility that long half-life might cause high Cap2 methylation. Long-lived mRNAs persist longer in cells and therefore may have more time to acquire high Cap2 methylation during their cytoplasmic lifetime.

To test this, we used BruChase to capture mRNAs of increasing age23. In this approach, cells were pulsed with 5-bromouridine (5-BrU) for 3 h to label newly synthesized RNA. Then, the cells were chased in uridine-rich media to allow for ageing of BrU-labelled transcripts. BrU-containing RNA was immunopurified from mRNA with an anti-5-BrU antibody at 0 and 8 h (Fig. 4a).

Fig. 4: Cap2 is an epitranscriptomic mark of mRNA age.

a, Transcripts were metabolically labelled using a 3 h 5-BrU pulse and chased in uridine-containing media for 0 h and 8 h to allow isolation of young and old RNA, respectively. BrU-containing mRNAs of different age were immunopurified using an anti-5-BrU antibody (Ab) and Cap2 levels were measured using CapTag-seq. IP, immunoprecipitation. b, Cap2 levels in young and old mRNA. Average of n = 2 biological replicates. c, Cap2 levels on each of the 16 possible m7G-proximal dinucleotides in young and old transcriptomes. Average of n = 2 biological replicates. d, CLAM-Cap–qPCR, a transcript-specific Cap2 methylation measurement method. Following generation of cDNA–cap tags attached to the 3′ end DNA adapter, Cap2 levels were measured by qPCR using a gene-specific reverse (Rv-mRNA X) and Cap2 tag-specific forward (Fw-Cap2) primer. Total (Cap1 and Cap2) cDNA–cap tags were amplified in parallel by qPCR with a primer pair comprising Rv-mRNA X and a primer that anneals to the DNA adapter (Fw-total). e–h, Fold change in Cap2 methylation as a function of mRNA age for YBX1 (e), TUBA1B (f) and RPS9 (g) mRNAs and for the nuclear non-coding RNA XIST (h). The level of Cap2 methylation at 0 h of the uridine chase was set to 1 (t0 = 1, dashed line). n = 3 biological replicates. Data shown as mean ± s.d. Coloured data points denote measurements from the same experiment.

We next performed CapTag-seq on young (0 h) and old (8 h) mRNA. We found that young mRNA was 11% Cap2-modified, whereas old mRNA exhibited approximately 42% Cap2 methylation (Fig. 4b), suggesting that Cap2 methylation increases throughout the transcriptome as transcripts age in the cytosol. All 16 m7G-proximal dinucleotides exhibited an increase in Cap2 methylation over time (Fig. 4c), indicating the generality of age-dependent increases in Cap2 levels across all mRNA sequence contexts.

To examine the levels of Cap2 on individual transcripts as they age, we developed CLAM-Cap–quantitative PCR (CLAM-Cap–qPCR), an amplification-based method for measurement of Cap2 levels on low-abundance mRNAs. CLAM-Cap–qPCR involves preparation of cDNA–cap tag chimeras with a DNA adapter ligated to the 3′ end of the cap tag according to the CLAM-Cap-seq protocol. Next, the levels of Cap2 are measured in an mRNA of interest by qPCR using a transcript-specific primer and a primer that hybridizes to the DNA adapter and the first nucleotide of the three-nucleotide-long cap tag that is unique to the Cap2 form of the mRNA (Fig. 4d). To determine the total abundance of the mRNA, a parallel qPCR is performed using a primer that hybridizes only to the 3′ adapter, but not to any portion of the cap tag (Fig. 4d). We confirmed the accuracy of this method using luciferase mRNA standards with known Cap2 stoichiometries (Extended Data Fig. 6j). Overall, this approach allowed us to calculate the Cap2 levels in an mRNA of interest.
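One way to turn the two qPCR reactions into a Cap2 level is a ΔCt-style calculation, sketched below. This is an assumption-laden illustration rather than the authors' formula: it assumes both primer pairs amplify with the same efficiency (2.0 meaning perfect doubling per cycle), which would need to be checked against standards of known Cap2 stoichiometry. The function names are hypothetical.

```python
def cap2_fraction_from_ct(ct_cap2, ct_total, efficiency=2.0):
    """Estimate the Cap2 fraction as efficiency ** -(Ct_Cap2 - Ct_total)."""
    return efficiency ** -(ct_cap2 - ct_total)


def cap1_fraction(cap2_fraction):
    """Cap1 fraction taken as 1 - Cap2 fraction, as stated in the Fig. 5g legend."""
    return 1.0 - cap2_fraction


def fold_change_vs_t0(cap2_levels):
    """Express each Cap2 level relative to the first time point (t0 = 1)."""
    baseline = cap2_levels[0]
    if baseline <= 0:
        raise ValueError("baseline Cap2 level must be positive")
    return [level / baseline for level in cap2_levels]
```

For the chase experiments, `fold_change_vs_t0` mirrors the normalization described in the Fig. 4e–h legend, where the Cap2 level at 0 h is set to 1.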

We performed CLAM-Cap–qPCR on three long-lived mRNAs: YBX1, TUBA1B and RPS9. In each case, we observed that the Cap2 levels continuously increased as each mRNA aged (Fig. 4e–g). As a control, we examined the nuclear RNA XIST24, which is not expected to encounter CMTR2. Indeed, we found largely unchanged Cap2 levels on XIST for the entire duration of the uridine chase (Fig. 4h). Overall, these data suggest that Cap2 methylation represents a dynamic mRNA modification that continuously increases throughout the mRNA lifetime.

CMTR2 depletion induces antiviral genes

We next wanted to understand the functional implication of Cap2 methylation. To test this, we analysed gene expression changes in CMTR2 KO HEK293T cells using RNA-seq. We found that downregulated genes were enriched in translation and RNA processing pathways, whereas markedly upregulated genes were related to the innate immune response and inflammatory pathways (Fig. 5a and Extended Data Fig. 7a). Among upregulated genes, we noticed numerous interferon-stimulated genes (ISGs), a gene group that is transcriptionally induced after cell infection with viruses, as well as other pathogens25. We confirmed the induction of ISGs at the protein and RNA level (Fig. 5b and Extended Data Fig. 7b–g). Similar effects were seen in CMTR2 KO A549 cells (Fig. 5c and Extended Data Fig. 7h–j).

Fig. 5: Cap2 methylation suppresses RIG-I activation and virus-induced innate immune response.

a, RNA-seq-based measurements of gene expression changes upon CMTR2 depletion. n = 4 biological replicates. n = 9,507 genes. P < 0.05. CPM, counts per million. b, Expression of antiviral proteins in CMTR2 WT and KO HEK293T cells. GAPDH was the loading control. c, Expression of innate immune response genes in CMTR2 KO A549 cells requires RIG-I. Transcript levels were quantified by RT–qPCR. n = 3 biological replicates. d, HEK293T lysates expressing RIG-I or the RIG-I K858A/K861A cap-binding mutant were incubated with immobilized Cap1 and Cap2 double-stranded RNAs (dsRNAs). The pulled-down RIG-I protein was detected by western blot. The bars show quantification from n = 5 pulldowns. Two-tailed unpaired Student’s t-test with Welch’s correction, ****P < 0.00001. e, CMTR2 WT and KO HEK293T cells were transfected with GFP, RIG-I WT or RIG-I mutant (mut) from d. IP10 expression was measured 48 h post-transfection by RT–qPCR. n = 4 biological replicates. f, CMTR2 WT HEK293T cells were transfected with control, CMTR2 (CMTR2 WT) and CMTR2 W85A (CMTR2 mut) constructs for 24 h. Each sample was further transfected with GFP or RIG-I WT. IP10 expression was measured by RT–qPCR 48 h after the second transfection. n = 3 biological replicates. g, HEK293T CMTR2 KO cells expressing control and CMTR2 WT cells expressing control, CMTR2 WT or CMTR2 mut for 24 h were either mock-infected or VSV-infected. Cap2 methylation on viral N mRNA was assessed with CLAM-Cap–qPCR 24 h post-infection. The stacked bar plot shows the fraction of Cap1-modified and Cap2-modified VSV N mRNA. The Cap1 fraction was calculated as 1 − Cap2 fraction. n = 4 biological replicates. h, CMTR2 WT HEK293T cells were treated as in g. IP10 expression was assessed 24 h post-infection by RT–qPCR. n = 5 biological replicates. In c–h, data shown as mean ± s.d. In c,f,h, one-way ANOVA with Bonferroni multiple comparison test was used; **P < 0.01, ***P < 0.001 and ****P < 0.0001.

The magnitude of the ISG induction was comparable to that of the induction seen after wild-type HEK293T cells were treated with interferon (750 U ml−1) for 6 h (Extended Data Fig. 7k). The induction of ISGs was sufficient to sensitize cells to diverse inflammatory signals (Extended Data Fig. 7l–p).

Overall, these data suggest that the lack of Cap2, and thus the increase in Cap1 mRNA levels, leads to the activation of pathways that are normally triggered in response to viral RNA. As CMTR2 KO cells are not exposed to an exogenous virus, the induction of the innate immune response may be mediated by the change in cap methylation of endogenous RNAs.

Cap2 suppresses RIG-I activation

The effects of Cap2 loss may be mediated by a specific RNA-binding protein whose RNA-binding activity is affected by Cap2 methylation. RIG-I is a well-established sensor of foreign RNA that becomes activated upon binding to the triphosphate bridge of m7G-capped RNAs26,27. RIG-I is markedly activated by Cap0 RNA due to the low nanomolar affinity (2 nM) of RIG-I for Cap0 RNA27,28.

Methyl modifications of RNA impair the binding of capped RNA to RIG-I. A 2′-O-methylation at the first nucleotide in Cap1 RNA leads to markedly reduced binding affinity (425 nM)27. 2′-O-methylation at the second nucleotide also reduces RIG-I activation26. In this study, the RNA only contained a single methyl modification at the second position, and not a dual methylation as seen in Cap2. Therefore, it remains unclear whether dual methylation in Cap2 would further reduce the binding affinity of Cap1 RNA to RIG-I. Overall, these studies raise the possibility that some of the cellular effects of CMTR2 depletion may be mediated by RIG-I.

To address this, we first asked whether RIG-I activity contributes to the induction of ISGs in CMTR2 KO cells. We generated CMTR2, RIG-I double KO A549 cells and monitored the expression of several ISGs by qPCR with reverse transcription (RT–qPCR) (Fig. 5c and Extended Data Fig. 7h,j). We found that RIG-I depletion markedly reduced the expression of ISGs in CMTR2 KO cells, demonstrating RIG-I involvement in the induction of the innate immune response in these cells.

Next, we directly measured the effect of Cap2 on the RNA-binding capacity of RIG-I. We incubated equal amounts of streptavidin-immobilized Cap1 and Cap2 double-stranded RNAs with cell lysates expressing FLAG-tagged RIG-I. Consistent with previous studies27, RIG-I readily bound to Cap0 RNA (Extended Data Fig. 8a). RIG-I also showed lower, but clear binding to Cap1 RNA. However, RIG-I exhibited markedly lower interaction with Cap2 RNA (Fig. 5d and Extended Data Fig. 8a,b). Neither Cap1 nor Cap2 RNA bound to RIG-I K858A/K861A, a RIG-I mutant that cannot interact with the triphosphate bridge of the cap structure29 (Fig. 5d). These results confirm the cap-dependent nature of the observed interactions. Overall, these data demonstrate that the dual methylation in Cap2 RNAs further reduces the ability of Cap1 RNAs to bind RIG-I.

Although Cap1 is not thought to be an activating ligand for RIG-I, we wanted to know whether high levels of Cap1 could activate RIG-I in cells. To measure RIG-I activation, we overexpressed RIG-I in HEK293T cells, which led to the induction of IP10 (Fig. 5e), a frequently used marker for RIG-I activation28,30. However, overexpression of RIG-I in CMTR2 KO cells, whose transcriptome is solely composed of Cap1 RNA, resulted in markedly increased levels of IP10 (Fig. 5e). These data suggest that Cap1 RNA can activate RIG-I if it is present at high levels.

Of note, the observed activation of RIG-I depends on the RNA cap structure as overexpression of RIG-I K858A/K861A failed to induce IP10 (Fig. 5e). Expression of RIG-I constructs remained similar in all tested samples (Extended Data Fig. 9a).

To determine whether the low level of activation of RIG-I in wild-type HEK293T cells was due to Cap1 mRNAs (approximately 60% of mRNAs; see Fig. 1e), we overexpressed CMTR2 to convert Cap1 to Cap2 mRNAs before RIG-I overexpression. Here we found that RIG-I expression led to markedly reduced IP10 levels relative to control RIG-I-expressing cells (Fig. 5f). Conversely, RIG-I activation was not reduced in cells expressing CMTR2 W85A, a mutant that cannot methylate the cap structure7 (Fig. 5f). RIG-I expression remained similar across all tested samples (Extended Data Fig. 9b). Overall, these data suggest that the function of the dual methylation in Cap2 is to further reduce the binding of Cap1 to RIG-I, thus decreasing the immunostimulatory effects seen at high concentrations of Cap1.

We confirmed previous studies showing that transfected Cap1 RNA is unable to activate RIG-I probably due to the low level of transfected RNA and its weak binding affinity (Extended Data Fig. 9c). However, the transcriptome-wide increase in Cap1 levels due to CMTR2 depletion may be sufficient to achieve Cap1 concentrations needed to activate RIG-I.

Cap2 in viral RNA impairs host response

Although our data demonstrate that Cap2 suppresses the ability of endogenous Cap1 RNA to induce the innate immune response, the purpose of a slow, time-dependent methylation of Cap1 to Cap2 is unclear. We considered the possibility that slow Cap2 methylation allows the host cell to detect and respond to rapidly replicating viral Cap1 RNAs before the viral RNA acquires high levels of Cap2 over time.

To test this model, we used vesicular stomatitis virus (VSV), an RNA virus that triggers the expression of antiviral genes, in part through RIG-I activation31. Although VSV RNA acquires Cap1 by virally encoded enzymes, it utilizes host CMTR2 to achieve low Cap2 levels32. Therefore, we used CMTR2 overexpression to enhance Cap2 methylation efficiency in VSV RNA. In this way, we could determine whether the normally low Cap2 methylation efficiency enables efficient induction of the innate immune response by VSV.

To test this, we first infected control HEK293T cells with propagation-incompetent VSV33 (Extended Data Fig. 9d). Consistent with previous studies34,35, VSV infection resulted in the rapid accumulation of viral RNA to an amount comparable to the entire mRNA transcriptome of mock-infected cells (Extended Data Fig. 9e,f). Viral transcripts were predominantly Cap1-modified (Fig. 5g and Extended Data Fig. 9g,h). This increase in Cap1 levels was associated with the expected induction of IP10 mRNA (Fig. 5h). Thus, the VSV-induced increase in Cap1 may contribute to the activation of the innate immune response.

We next attempted to increase Cap2 methylation efficiency of viral RNA by overexpressing CMTR2. Overexpression of CMTR2 markedly increased Cap2 stoichiometry on viral RNA, resulting in the concomitant reduction in Cap1 levels (Fig. 5g and Extended Data Fig. 9g,h). In CMTR2-overexpressing cells, VSV infection failed to induce IP10 mRNA to the levels seen in control infected cells (Fig. 5h). This was due to cap-mediated methylation, as CMTR2 W85A failed to suppress IP10 induction (Fig. 5g,h and Extended Data Fig. 9g,h). Overall, these data suggest that cells maintain low efficiency of Cap2 methylation to prevent viral RNA from acquiring Cap2 and evading host defence mechanisms.

As a control, we asked whether CMTR2 overexpression nonspecifically suppresses the innate immune response. To test this, we activated the innate immune response using noncapped RIG-I ligands, such as triphosphorylated double-stranded RNA and poly(I:C)36. Although HEK293T cells express RIG-I protein at low levels (Fig. 5b), they have previously been shown to respond to double-stranded RNA ligands in a RIG-I-dependent manner37. Consistent with this, we found that both stimuli readily induced IP10 mRNA expression (Extended Data Fig. 9i). However, CMTR2 overexpression did not suppress their effects on IP10 mRNA levels (Extended Data Fig. 9i). These data are consistent with a model in which increased expression of CMTR2 selectively suppresses antiviral responses induced by Cap1 RNAs.

Discussion

Cap2 was discovered nearly 50 years ago as one of the five major methyl modifications that decorate mRNA along with m7G, N6,2′-O-dimethyladenosine (m6Am), Cap1 and N6-methyladenosine (m6A). Despite the high prevalence of Cap2 in the transcriptome, Cap2 is the last major unmapped nucleotide modification. Using CLAM-Cap-seq, we generated a transcriptome-wide map of Cap2, which revealed strong Cap2 enrichment on long-lived mRNAs, occurring as a result of mRNA age-guided Cap2 deposition in the transcriptome. Rather than controlling mRNA processing events such as translation or stability, a major function of Cap2 is to further suppress the ability of endogenous RNAs to activate the innate immune response. Mechanistically, the dual methylation in Cap2 acts to prevent Cap1 from binding to RIG-I, thus suppressing the autoimmune potential of endogenous Cap1 RNA. We also show that slow, time-dependent accumulation of Cap2 in mRNAs represents a cellular adaptation used to cloak host RNAs from activating RIG-I. Simultaneously, slow Cap2 methylation reduces the likelihood that rapidly replicating viral RNA acquires Cap2 and thus evades recognition by host cell defences.

The cell requires mechanisms to distinguish self from non-self RNA. Methylation of mRNA caps is one of the major mechanisms to mark self mRNAs. Cap0 RNAs that lack ribose methylation are high-affinity ligands and potent activators of RIG-I27. Small amounts of Cap0 can therefore activate the innate immune response. The presence of a single methyl group in Cap1 is sufficient to reduce the ability of RNA to activate RIG-I. Although the binding affinity of Cap0 to RIG-I is 2 nM, Cap1 still binds at an affinity of 425 nM (ref. 27). We found that large amounts of Cap1 RNA, achieved by depletion of CMTR2 or by viral infection, can provide sufficient levels of Cap1 to activate RIG-I. Thus, even though Cap1 only activates RIG-I weakly, the large amount of Cap1 in the cell, coupled with the induction of RIG-I expression in response to viral infection, can make Cap1 RNA an important agonist of RIG-I signalling.
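The affinity argument above can be made concrete with a small calculation. The following is a minimal sketch that assumes a simple single-site equilibrium binding model in which free ligand approximates total ligand; this model, and the function names, are illustrative assumptions and were not used in the study. Only the two affinities (2 nM and 425 nM) come from the text.

```python
# Illustrative sketch only: single-site equilibrium binding (an assumption,
# not a model fitted in this study). Affinities are those cited in the text.
KD_CAP0_NM = 2.0    # reported affinity of RIG-I for Cap0 RNA
KD_CAP1_NM = 425.0  # reported affinity of RIG-I for Cap1 RNA


def fraction_bound(ligand_nm: float, kd_nm: float) -> float:
    """Fraction of RIG-I occupied at a given free RNA concentration."""
    if ligand_nm < 0 or kd_nm <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nm / (ligand_nm + kd_nm)


def ligand_for_occupancy(occupancy: float, kd_nm: float) -> float:
    """RNA concentration needed to reach a target fractional occupancy."""
    if not 0.0 < occupancy < 1.0:
        raise ValueError("occupancy must be strictly between 0 and 1")
    return kd_nm * occupancy / (1.0 - occupancy)


# Fold excess of Cap1 over Cap0 RNA needed for the same occupancy.
fold_excess = ligand_for_occupancy(0.5, KD_CAP1_NM) / ligand_for_occupancy(0.5, KD_CAP0_NM)
```

Under this simplified model, the fold excess equals the ratio of the two affinities, so Cap1 RNA would need to be roughly two hundred times more abundant than Cap0 RNA to occupy RIG-I to the same extent, which is in line with the idea that only large amounts of Cap1 become stimulatory.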

Previous studies have shown that methylation at either the first or second position of RNA is sufficient to reduce activation of RIG-I26. We showed that the dual methylation in Cap2 functions to reduce the ability of Cap1 RNA to bind to and activate RIG-I. Although our study identifies RIG-I as an ‘anti-reader’ of Cap2, RIG-I KO did not completely reduce the induction of the ISGs after CMTR2 depletion. Other proteins, such as IFITs38,39 or other foreign RNA sensors such as MDA5 (ref. 40), may also be sensitive to Cap2 methylation and therefore regulate aspects of mRNA biology.

Of note, some viruses have acquired CMTR2 homologues, such as Mimivirus and African swine fever virus9. The CMTR homologue in vaccinia virus has been proposed to methylate the first, second and possibly other nucleotides41. Thus, viral CMTR homologues may have broader methylation functions than previously recognized, including Cap2 methylation, which may help them to evade host responses.

The levels of Cap1 and Cap2 vary in different cells and tissues, which correlate in part with CMTR2 levels (Extended Data Fig. 1c,d). Although higher Cap1 levels may be deleterious for cells, we found that these cells typically exhibit lower RIG-I expression, which may reduce the autoinflammatory potential of high levels of Cap1 RNA (Extended Data Fig. 10a–c). Cells may adjust the levels of CMTR2 to influence their cellular responses to either host or viral RNAs, and aberrant levels of Cap1 may contribute to diseases that are linked to excessive activation of the innate immune response (Extended Data Fig. 10d).

In addition to suppressing the immunostimulatory effects of cellular RNAs, Cap2 may affect mRNAs in other ways. Our global analyses showed small but clear effects of Cap2 on mRNA translation or stability. Cap2 may thus shape gene expression, particularly for long-lived transcripts, to further enhance their stability in cells and to increase their protein expression. In addition, some RNAs may have a stronger dependence on Cap2 for preventing their interaction with RIG-I, and thus may have a more important role in suppressing the innate response when Cap2-modified. Other RNAs may contribute to the biology of Cap2, including snRNAs16. Finally, Cap2 has been linked to neuronal functions in Drosophila41, suggesting different roles of Cap2 in lower organisms.

We developed a suite of tools to profile and measure Cap2 in transcriptomes and on specific mRNAs of interest. Measurements of Cap1 and Cap2 cap tags using CapTag-seq reveal the overall Cap2 prevalence in bulk mRNA, whereas CLAM-Cap-seq involves creating cDNA–cap tag chimeras that directly link the Cap2 status of an mRNA to the cDNA that is generated from it. Along with newly developed biochemical and PCR-based methods, the Cap1 or Cap2 state of any mRNA of interest can be readily measured.

Our data show that Cap2 is fundamentally different from other epitranscriptomic mRNA modifications. First, Cap2 methylation is a dynamic process, as it continuously accumulates throughout the lifetime of an mRNA in the cytoplasm. This contrasts with other mRNA modifications such as m6A, which are largely ‘written’ in the nucleus and thus exhibit little or no potential for dynamics once the mRNA has left the nucleus. Second, although CMTR2 exhibits slight preferences for methylation of some m7G-proximal sequences, it does not require strictly defined sequence motifs. Furthermore, the primary purpose of Cap2 is not to regulate mRNA fates, as is seen with m6A42, but is instead to lower the overall burden of Cap1 RNAs on the intracellular defence mechanisms designed to fight against invading pathogens.

Methods

Cell culture

HEK293T, A549 and MCF-7 cells were purchased from the American Type Culture Collection. mES cells were obtained as a kind gift from J. Hanna’s laboratory. HEK293T and MCF-7 cells were maintained in DMEM (11995065, Gibco) supplemented with 10% FBS and 100 U penicillin–streptomycin (15140148, Gibco). A549 cells were grown in Ham’s F-12K medium (21127022, Gibco) supplemented with 10% FBS and 100 U penicillin–streptomycin. mES cells were cultured in Knockout DMEM (10829018, Gibco) supplemented with 15% heat-inactivated FBS, 100 U penicillin–streptomycin, 1× GlutaMAX (35050061, Gibco), 55 µM β-mercaptoethanol, 1× MEM non-essential amino acid solution (11140076, Gibco), 10³ U ml−1 LIF (ESG1107, ESGRO), 3 µM CHIR99021 (72052, STEMCELL Technologies) and 1 µM PD0325901 (72182, STEMCELL Technologies). All cell types were grown in sterile cell culture incubators at 37 °C and 5% CO2. Cell lines were not authenticated. All cell types tested negative for mycoplasma contamination. Mycoplasma contamination was routinely tested with Hoechst staining.

Generation of KO cell lines

To generate CMTR2 KO HEK293T cells, 6 × 10⁵ HEK293T cells were seeded in a single well of a six-well cell culture plate. The next day, cells were transfected with 1 µg FTSJD1 double nickase plasmid (SC-412604-NIC, Santa Cruz Biotechnology) using 2 µl LipoD293 transfection reagent (SL100668, SignaGen Laboratories). After 36 h, GFP-positive cells were sorted by flow cytometry and seeded into a single well of a 12-well cell culture plate. After 24 h, GFP-positive cells were treated with 5 µg ml−1 puromycin for 36 h. The remaining viable cells were washed twice with PBS and grown until 30–50% confluency was reached. Single-cell clones were isolated and screened for CMTR2 depletion using western blot. To generate CMTR2 KO and CMTR2 RIG-I double KO A549 cells, 2.5 × 10⁵ A549 cells were seeded in a single well of a six-well plate. The following day, cells were transfected with either 2.5 µg FTSJD1 double nickase plasmid or a mixture of 1.5 µg FTSJD1 and 1.5 µg RIG-I (SC-400812-NIC, Santa Cruz Biotechnology) double nickase plasmids using Lipofectamine 3000 transfection reagent (L3000001, Invitrogen) according to the manufacturer’s instructions. The procedure for isolation of single-cell KO clones was performed as described for HEK293T cells.

Animal maintenance and procedures

All animals used in this study were maintained in compliance with Weill Cornell Medicine Institutional Animal Care and Use Committee (IACUC) protocols. Eight-week-old C57BL/6 wild-type female mice were purchased from Charles River Laboratory and housed in standard cages with unrestricted supplies of water and food with 14 h light–10 h dark cycle at 18–23 °C and 40–60% humidity. Sixteen-week-old female mice were dissected for isolation of the brain, liver, kidney, lung, heart and spleen. Upon isolation, organs were stored in TRIzol reagent at −80 °C until further use. The zebrafish wild-type AB line was used in this study. Embryos were maintained in E3 medium at 28 °C and staged as previously described43. Forty-eight-hour-old zebrafish embryos were collected in TRIzol reagent and stored at −80 °C until further use.

Total RNA isolation

Total RNA was isolated from cells using TRIzol reagent according to the manufacturer’s instructions unless otherwise stated. For isolation of total RNA from mouse tissues, TRIzol-submerged tissues were homogenized with high-impact zirconium 1.5-mm beads (D1032-15, Benchmark Scientific) twice at 50 Hz for 3 min using TissueLyser II (Qiagen). For isolation of total RNA from zebrafish samples, embryos were dissolved in TRIzol reagent and passed three times through a 21-gauge needle.

mRNA extraction

Total RNA (20 µg) was diluted in 75 μl nuclease-free water. Oligo(dT)25 magnetic beads (50 μl; bed volume; S1419S, NEB) were resuspended in 75 μl 2× mRNA binding buffer (40 mM Tris-HCl pH 7.5, 1 M LiCl, 2 mM EDTA and 0.1% Triton X-100) and mixed with total RNA. Samples were heated at 65 °C for 5 min, placed on ice for 3 min and incubated at room temperature for 10 min with constant rotation. Beads were washed twice with 150 μl mRNA wash buffer (20 mM Tris-HCl pH 7.5, 150 mM LiCl, 1 mM EDTA and 0.01% Triton X-100) and resuspended in 75 μl mRNA elution buffer (10 mM Tris-HCl pH 7.5). Samples were heated at 75 °C for 2.5 min and placed on ice for 2 min. To ensure pure mRNA isolation, a second round of poly(A)+ RNA purification was conducted. 75 μl 2× mRNA binding buffer was mixed with the beads from the previous step, followed by incubation for 10 min at room temperature with constant rotation. Beads were then washed once with 150 μl mRNA wash buffer and resuspended in 25 μl water. To allow for the final mRNA elution from the beads, samples were heated at 75 °C for 3 min and the supernatant was collected. The extracted mRNA was further purified using a Zymo Research Clean and Concentrator-5 (RCC-5) column (R1013, Zymo Research). In the experiments in which a higher or lower amount of total RNA was used, mRNA was isolated by upscaling or downscaling of the mRNA isolation reagents.

CapTag-seq

mRNA (2 µg) was treated with 25 U Quick CIP (M0525L, NEB) in a 30-μl reaction for 30 min at 37 °C. The reaction was cleaned using a Zymo RCC-5 column and mRNA was eluted with 20 μl water. m7GDP was removed from mRNA 5′ termini (decapping) using 5 U Cap-Clip acid pyrophosphatase (C-CC15011H, CELLSCRIPT) in a 20-μl reaction for 1 h at 37 °C. The mRNA decapping reaction was cleaned with Zymo RCC-5 column and mRNA was eluted in 11.5 μl water. A biotinylated 5′ adapter composed of 2′-O-methylated nucleotides (biotin-Nm-RA5) was ligated to the 5′-monophosphorylated mRNA ends using 60 U T4 RNA ligase 1 (M0437M, NEB) in the following 30 μl reaction mixture: 3 μl 10× RNA ligase buffer, 10 μl decapped mRNA, 3 μl 10 mM ATP, 1 μl 30 μM biotin-Nm-RA5, 10.5 μl 50% PEG-8000, 0.5 μl 40 U μl−1 RNaseOUT and 2 μl 30 U μl−1 T4 RNA ligase 1 for 4 h at 25 °C. To successfully remove non-ligated biotin-Nm-RA5, the ligation reaction was cleaned three times with Zymo RCC-5 column following the manufacturer’s protocol for isolation of RNA longer than 200 nucleotides. Ligated mRNA was eluted in the third cleanup step in 40 μl water. Next, ligated mRNA was digested to completion with 5 U RNase T2 (GE-NUC00400-02, MoBiTec) in 30 mM Na-acetate pH 4.5 overnight at 37 °C. The resulting adapter-linked cap tags were captured on M-280 streptavidin Dynabeads (20 μl bed volume, Invitrogen) in 300 μl binding buffer (10 mM Tris-HCl pH 7.5, 300 mM NaCl and 0.05% Triton X-100) for 30 min at room temperature. Beads were washed twice with 500 μl high-salt buffer (10 mM Tris-HCl pH 7.5, 2 M NaCl and 0.1% Triton X-100), twice with 500 μl binding buffer, and twice with 500 μl low-salt buffer (10 mM Tris-HCl pH 7.5, 50 mM NaCl and 0.025% Triton X-100). The removal of the 2′,3′-cyclic phosphate from the 3′ end of cap tags was conducted using 10 U T4 PNK (M0201L, NEB) in a 20 μl dephosphorylation buffer (100 mM Na-acetate pH 6.0, 10 mM MgCl2 and 5 mM DTT) for 20 min at 23 °C. 
Beads were washed twice with 500 μl binding buffer, and twice with 500 μl low-salt buffer. Next, the preadenylated DNA adapter with a 20-nucleotide randomized region (N20-DA3) was ligated to the dephosphorylated 3′ ends of cap tags with 200 U T4 RNA ligase 2, truncated KQ (M0373L, NEB) in a 20 μl ligation mix (2 μl 10× RNA ligase buffer, 1 μl 20 μM DA3, 7 μl water, 1 μl 200 U μl−1 T4 RNA ligase 2, truncated KQ, 8.5 μl 50% PEG-8000 and 0.5 μl 40 U μl−1 RNaseOUT) for 4 h at 27 °C. The non-ligated N20-DA3 was removed with 25 U yeast 5′-deadenylase (M0331S, NEB) and 15 U RecJf (M0264S, NEB) for 45 min at 30 °C. Beads were washed twice with 500 μl high-salt buffer, twice with 500 μl binding buffer and twice with 500 μl low-salt buffer. For cDNA preparation, beads were first resuspended in 12 μl reverse transcription annealing mix (1 μl 10 μM reverse transcription primer (RTP), 4 μl 250 mM Tris-HCl pH 8.3, 5 μl 300 mM KCl and 2 μl water), heated at 90 °C for 2 min, and slowly cooled down to 25 °C at a rate of 0.1 °C s−1. Following the RTP annealing, 7.25 μl RT mixture (4 μl water, 1 μl 0.1 M DTT, 1 μl 10 mM dNTPs (each), 1 μl 60 mM MgCl2 and 0.25 μl 40 U μl−1 RNaseOUT) and 0.75 μl SuperScript III (18080051, Invitrogen) were added to the annealing mix. The reaction was incubated for 30 min at 55 °C, and heat inactivated for 10 min at 75 °C. PCR amplification of cDNA was performed with 2× Phusion High Fidelity PCR master mix with HF buffer (M0531L, NEB). The resulting PCR products were purified twice with 1.8× PCR volume AMPure XP beads (A63881, Beckman Coulter). Amplified cDNA libraries were sequenced on Illumina instruments in single-end or paired-end mode.

CapTag-seq data analysis

Low-quality sequencing reads were filtered out and the 3′ end adapter (DA3) was trimmed off using Flexbar v2.5 (ref. 44). Duplicated reads were removed using the pyFastqDuplicateRemover.py script within the pyCRAC package45. Next, the 20-bp-long randomized region was removed from the 3′ end of sequencing reads using the UNIX cut command. The remaining portion of the reads represents RNase T2-released tags. Tag length distribution, quantification of the two-nucleotide and three-nucleotide-long cap tags from Cap1 and Cap2 mRNAs, respectively, and Cap1 and Cap2 m7G-proximal sequences of cap tags were obtained using basic UNIX commands.
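The tag-length classification described above can be sketched as follows. This is a minimal illustration written in Python rather than the UNIX commands used in the study; it assumes that reads have already been quality-filtered, adapter-trimmed and deduplicated, and the function names and the definition of the Cap2 fraction as Cap2 / (Cap1 + Cap2) are assumptions made for this example.

```python
from collections import Counter

RANDOM_REGION_LEN = 20  # length of the randomized region of N20-DA3


def cap_tag(read: str) -> str:
    """Remove the 20-nt randomized region from the 3' end of a read."""
    return read[: len(read) - RANDOM_REGION_LEN]


def summarize_tags(reads):
    """Count tag lengths and m7G-proximal dinucleotides of Cap1/Cap2 tags."""
    # Reads shorter than the randomized region cannot contain a valid tag.
    tags = [cap_tag(r) for r in reads if len(r) >= RANDOM_REGION_LEN]
    length_counts = Counter(len(t) for t in tags)

    cap1 = length_counts[2]  # two-nucleotide tags originate from Cap1 mRNA
    cap2 = length_counts[3]  # three-nucleotide tags originate from Cap2 mRNA
    capped = cap1 + cap2
    # Assumed definition: Cap2 share among capped (Cap1 + Cap2) tags.
    cap2_fraction = cap2 / capped if capped else float("nan")

    # The first two nucleotides of a tag are the m7G-proximal dinucleotide.
    cap1_dinucleotides = Counter(t[:2] for t in tags if len(t) == 2)
    cap2_dinucleotides = Counter(t[:2] for t in tags if len(t) == 3)
    return length_counts, cap2_fraction, cap1_dinucleotides, cap2_dinucleotides
```

Tags of length 0 and 1 correspond to adapter dimers and to ligations onto uncapped, sheared mRNA, respectively, so they are counted in the length distribution but excluded from the Cap2 fraction.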

TSS-seq

mRNA (200 ng) was treated with 25 U Quick CIP in a 30-μl reaction for 30 min at 37 °C. The reaction was cleaned using a Zymo RCC-5 column and mRNA was eluted in 17 μl water. m7GDP was removed from the mRNA 5′-termini using 5 U Cap-Clip acid pyrophosphatase in a 20-μl reaction for 1 h at 37 °C. The decapped mRNA was isolated by mixing the decapping reaction with 44 μl RNAClean XP beads (A63987, Beckman Coulter). After 15 min of incubation at room temperature, the beads were washed twice with 80% ethanol and mRNA was eluted from the beads in 10 μl water for 5 min. Next, a biotinylated 5′ RNA adapter with the eight-nucleotide-long unique molecular identifiers (biotin-RA5-UMIs) was ligated to the 5′-monophosphorylated mRNA ends with 60 U RNA ligase 1 in the 30-μl reaction as described in CapTag-seq. Ligated mRNA was purified by mixing the ligation reaction with 44 μl 1.5 M NaCl and 31 μl RNAClean XP beads. After 10 min of incubation at room temperature, beads were washed twice with 80% ethanol and RNA was eluted from the beads in 20 μl water for 5 min. Eluted RNA was further mixed with 28 μl RNAClean XP beads and incubated for 10 min at room temperature. Beads were washed twice with 80% ethanol and RNA was eluted as before in 20 μl water. Next, the ligated mRNA was subjected to a partial alkaline-based fragmentation by mixing 20 μl mRNA with 5 μl 240 mM NaHCO3 and 5 μl 360 mM Na2CO3. The fragmentation mixture was incubated for 11 min at 60 °C and immediately placed on ice. Fragmented RNA was extracted using a Zymo RCC-5 column, eluted in 20 μl water and stored at −80 °C until further use. The following day, ligated mRNA 5′ ends were captured on M-280 streptavidin Dynabeads (20 μl bed volume) in 100 μl binding buffer for 30 min at room temperature. Next, the beads were washed twice with 500 μl high-salt buffer, twice with 500 μl binding buffer and twice with 500 μl low-salt buffer. 
Dephosphorylation of the 3′ ends of RNA fragments, ligation of the preadenylated DNA adapter (DA3) and reverse transcription steps were conducted as described in the CapTag-seq protocol. For the reverse transcription reaction, samples were incubated for 45 min at 55 °C, followed by heat inactivation for 10 min at 75 °C. cDNA was PCR amplified using 2× Phusion High Fidelity PCR master mix with HF buffer and the resulting PCR products were purified twice with 0.9× PCR volume AMPure XP beads. Amplified cDNA libraries were sequenced on the Illumina instrument in a paired-end mode.

TSS-seq data analysis

Low-quality sequencing reads were filtered out and the 3′ end adapter (DA3) was trimmed off using Flexbar v2.5 (ref. 44). Duplicated reads were removed using the pyFastqDuplicateRemover.py script within the pyCRAC package v1.3.2 (ref. 45). Only the read R1 was considered for further analysis. The eight-nucleotide-long randomized region (UMI) was removed from the 5′ end of sequencing reads using seqtk. UMI-free reads were then shortened from the 3′ end to the universal length of 25 bp using seqtk. The first nucleotide of the processed, 25-bp-long reads represents the TSN of an mRNA. The processed reads were first aligned to the Drosophila melanogaster genome (dm6) using Bowtie v1.2.3 (ref. 46). The remaining, unmapped reads were then aligned to the human (hg38) or mouse (mm10) genome using Bowtie v1.2.3. The 5′ end read coverage (representing a TSN coverage) at each genomic position was obtained using BEDTools v2.28.0 (ref. 47). The Ensembl gene annotation file was obtained from Ensembl48 and annotated gene starts were extended by 250 bp. The aligned reads were annotated with BEDTools using the modified Ensembl gene annotation file. For each sample, 5′ end read coverage at each genomic position was normalized to the total number of mapped reads (mapped reads per million (RPM)). Genomic positions with the normalized 5′ end coverage < 2 RPM were discarded. A genomic position with the maximum coverage within an annotated gene was identified as the major TSN isoform of a gene. All other genomic positions within that gene with normalized 5′ end coverage more than 10% of the maximum (major) TSN were considered as alternative TSN isoforms for that gene.
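The TSN isoform calling rules at the end of this procedure can be sketched as below. This is an illustrative Python outline, not the BEDTools-based pipeline itself; the input layout (a dictionary of per-gene 5′ end read counts) and the function name are hypothetical, whereas the 2 RPM cutoff and the 10% threshold come from the text.

```python
def call_tsn_isoforms(counts, total_mapped_reads, min_rpm=2.0, alt_fraction=0.10):
    """Call major and alternative TSN isoforms per gene.

    counts: dict mapping gene -> {genomic position: raw 5' end read count}
            (hypothetical input layout chosen for this sketch).
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")

    isoforms = {}
    for gene, position_counts in counts.items():
        # Normalize 5' end coverage to mapped reads per million (RPM).
        rpm = {pos: n * 1e6 / total_mapped_reads for pos, n in position_counts.items()}
        # Discard positions with normalized coverage below 2 RPM.
        kept = {pos: value for pos, value in rpm.items() if value >= min_rpm}
        if not kept:
            continue
        # The position with maximum coverage is the major TSN isoform.
        major = max(kept, key=kept.get)
        cutoff = alt_fraction * kept[major]
        # Other positions above 10% of the major TSN are alternative isoforms.
        alternatives = sorted(
            pos for pos, value in kept.items() if pos != major and value > cutoff
        )
        isoforms[gene] = {"major": major, "alternatives": alternatives, "rpm": kept}
    return isoforms
```

For a gene with counts of 50, 6, 4 and 1 RPM at four positions, this sketch would report the first position as the major TSN, the second as an alternative TSN, and discard the remaining two.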

ActD-TSS-seq

CMTR2 WT or CMTR2 KO HEK293T cells (1.8 × 10⁶) were seeded in a 6-mm cell culture dish. The day after seeding, cells were treated with 5 μg ml−1 actD to block synthesis of new transcripts. Total RNA was extracted from cells after 0, 2, 8 and 16 h of actD treatment. Total RNA of each sample was spiked-in with 40 ng D. melanogaster poly(A)+ RNA (636222, Takara). Cellular poly(A)+ RNA was extracted using oligo(dT)25 magnetic beads. mRNA (200 ng) from each time point of actD treatment was subjected to the TSS-seq protocol as described above.

ActD-TSS-seq data analysis

Sequencing reads were processed as described above with minor modifications.

The annotated TSN isoforms with expression more than 4 RPM at 0 h of actD treatment were considered for calculations of mRNA half-lives. The TSN isoform expression at each time point of actD treatment was first normalized to the total number of reads mapping to the D. melanogaster genome in each sample. The normalized TSN expression was then transformed with a sample-specific scale factor determined from the general mRNA decay rate in HEK293T cells (see Extended Data Figs. 5g and 6f). The scaling factors were as follows: for 0 h actD = 1, for 2 h actD = 0.77, for 8 h actD = 0.56 and for 16 h actD = 0.41. The fully normalized TSN isoform expression values at each time point after actD treatment were then used for the calculation of the mRNA decay rates with a one-phase decay model using the drm function in the drc R package49. TSN isoform half-lives were calculated as: t1/2 = ln2/k, where k represents the TSN isoform-specific decay constant derived from the drm function.
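The normalization and half-life calculation can be illustrated with the sketch below. It is not the drc-based fit used in the study: for simplicity, the one-phase decay constant is estimated here by least squares on log-transformed values, and the scale factor is assumed to be applied by multiplication. Function names are hypothetical; the scale factors are those listed above.

```python
import math

# Sample-specific scale factors for each actD time point (hours), from the text.
SCALE_FACTORS = {0: 1.0, 2: 0.77, 8: 0.56, 16: 0.41}


def normalize_expression(rpm: float, spike_in_reads: float, hours: int) -> float:
    """Spike-in normalize a TSN isoform and apply the time-point scale factor.

    Multiplying by the scale factor is an assumption of this sketch.
    """
    if spike_in_reads <= 0:
        raise ValueError("spike_in_reads must be positive")
    return rpm / spike_in_reads * SCALE_FACTORS[hours]


def decay_constant(times, values) -> float:
    """Estimate k in N(t) = N0 * exp(-k * t) by a log-linear least-squares fit."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(v) for v in values]  # requires strictly positive values
    mean_t = sum(times) / len(times)
    mean_log = sum(logs) / len(logs)
    slope = sum((t - mean_t) * (y - mean_log) for t, y in zip(times, logs)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    return -slope


def half_life(times, values) -> float:
    """Half-life t1/2 = ln2 / k; returns infinity if no decay is detected."""
    k = decay_constant(times, values)
    return math.log(2) / k if k > 0 else math.inf
```

A nonlinear fit such as the one provided by drm weights the time points differently from a log-linear fit, so the two approaches can give slightly different decay constants on noisy data.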

Polysome–TSS-seq

CMTR2 WT or CMTR2 KO HEK293T cells (8 × 10⁶) were seeded onto a 150-mm cell culture dish. The following day, cells were treated with 100 µg ml−1 cycloheximide (CHX; C7698, Sigma) for 5 min at 37 °C. Cell dishes were placed on ice, washed once with 10 ml ice-cold PBS supplemented with 100 µg ml−1 CHX (PBS + CHX), and scraped in 8 ml PBS + CHX. Cells were pelleted at 300g for 3 min at 4 °C, the supernatant was aspirated, and the cell pellet was resuspended in 0.5 ml ice-cold polysome extraction buffer (20 mM Tris-HCl pH 7.5, 100 mM KCl, 1% Triton X-100, 5 mM MgCl2, 2 mM DTT, 100 µg ml−1 CHX and 1× Halt protease and phosphatase inhibitor cocktail (78440, Thermo Scientific)). Cells were left on ice for 5 min to lyse. The cell lysis process was facilitated by passing cells through a 21-gauge needle. Cell lysate was centrifuged for 5 min at 16,000g at 4 °C, the supernatant was transferred to a new 1.5-ml tube and snap frozen in liquid nitrogen. Frozen cell lysates were stored at −80 °C until further use. Frozen cell lysates were thawed on ice. Lysate (500 µl) was layered on top of the 10–50% linear sucrose gradient prepared with polysome extraction buffer. The gradient was centrifuged for 2 h at 36,000g in a SW-41 Ti swinging bucket rotor at 4 °C. Polysome fractionation was performed using an automated fraction collector (BioComp) with continuous monitoring of the 254-nm absorbance. Fractions corresponding to one, two–three, four–five, six–seven and more than eight ribosomes were collected manually. Drosophila poly(A)+ RNA (3 µl of 2.5 ng µl−1) was spiked-in to each fraction to account for differences in RNA extraction between fractions. Next, an equal volume of TRIzol LS was added to each isolated polysome fraction along with 15 mM EDTA (final concentration). Tubes were briefly vortexed and left at room temperature for 30 min. Total RNA was isolated as instructed by the TRIzol LS manufacturer’s protocol and precipitated using isopropanol.
mRNA was extracted from the total RNA of each polysome fraction with oligo(dT)25 magnetic beads. Poly(A)+ RNA (200 ng) from each fraction was subjected to the TSS-seq protocol as described above.

Polysome–TSS-seq data analysis

Sequencing reads were processed as described in the actD-TSS-seq data analysis with minor modifications. The genomic positions with 5′ end read coverage of more than 1 RPM in each polysome fraction were considered. Genomic positions that passed this expression threshold were then annotated to the TSN isoforms identified in the input sample (0 h actD-TSS-seq libraries) using BEDTools. The expression (RPM) of the annotated TSN isoforms was further normalized to the total number of reads mapping to the D. melanogaster genome in each polysome fraction. Following the normalization, an average number of ribosomes bound to each TSN isoform was calculated as follows: total abundance of a TSN isoform (N_total) was calculated as the sum of the TSN isoform expression levels (N) in each of the five isolated polysome fractions (F_n): N_total = N_F1 + N_F2 + N_F3 + N_F4 + N_F5, where F denotes polysome fraction number. Total number of ribosomes (R_total) associated with each TSN isoform was calculated as R_total = f_1 × N_F1 + f_2 × N_F2 + f_3 × N_F3 + f_4 × N_F4 + f_5 × N_F5, where f denotes the number of ribosomes present in each polysome fraction (f_1 = 1, f_2 = 2.5, f_3 = 4.5, f_4 = 6.5 and f_5 = 12). The average number of ribosomes bound to each TSN isoform (MRL) was then calculated as MRL = R_total/N_total. Translation efficiency of each TSN isoform was defined as the average ribosome density on each TSN isoform. Ribosome density (MRL per kb) was calculated as MRL per kilobase of the mRNA open reading frame.
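The mean ribosome load calculation above can be written compactly as follows. This is a minimal sketch in which the function names and the input layout (a list of five spike-in-normalized expression values, one per polysome fraction) are assumptions made for illustration; the ribosome numbers per fraction are those given in the text.

```python
# Number of ribosomes assigned to each of the five polysome fractions
# (1, 2-3, 4-5, 6-7 and more than 8 ribosomes), as stated in the text.
RIBOSOMES_PER_FRACTION = (1.0, 2.5, 4.5, 6.5, 12.0)


def mean_ribosome_load(fraction_expression) -> float:
    """Average number of ribosomes bound to a TSN isoform (MRL).

    fraction_expression: five spike-in-normalized expression values,
    ordered from the lightest to the heaviest polysome fraction.
    """
    if len(fraction_expression) != len(RIBOSOMES_PER_FRACTION):
        raise ValueError("expected one expression value per polysome fraction")
    n_total = sum(fraction_expression)
    if n_total <= 0:
        raise ValueError("isoform is not detected in any fraction")
    r_total = sum(f * n for f, n in zip(RIBOSOMES_PER_FRACTION, fraction_expression))
    return r_total / n_total


def ribosome_density(fraction_expression, orf_length_nt: int) -> float:
    """MRL per kilobase of the mRNA open reading frame."""
    if orf_length_nt <= 0:
        raise ValueError("ORF length must be positive")
    return mean_ribosome_load(fraction_expression) / (orf_length_nt / 1000.0)
```

For example, an isoform with normalized expression values of 10, 20, 30, 20 and 20 across the five fractions has a total abundance of 100 and a total ribosome count of 565, which corresponds to an MRL of 5.65.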

CLAM-Cap-seq

mRNA (1 µg) was partially fragmented in a 60-μl reaction containing 40 mM NaHCO3 and 60 mM Na2CO3 for 8.5 min at 60 °C. Seven independent fragmentation reactions were combined (7 μg poly(A)+ RNA in total) and collectively purified using a Zymo RCC-5 column. Fragmented mRNA was eluted from the column in 31 μl water. Next, the internal mRNA fragments were 5′ end phosphorylated with 10 U T4 PNK in two separate 30-μl reactions (15 μl mRNA, each) for 30 min at 37 °C. Two phosphorylation reactions were combined and cleaned as described before. mRNA was eluted from the column in 21 μl water. Next, 5′-phosphorylated internal mRNA fragments were removed with 1 U Terminator 5′ phosphate-dependent exonuclease (TER51020, Lucigen) in two separate 15-μl reactions (10 μl mRNA, each) for 1 h at 30 °C. The remaining m7G-protected 5′ mRNA fragments were purified by mixing two combined Terminator reactions (30 μl total) with 45 μl RNAClean XP beads. The mixtures were left at room temperature for 15 min, followed by two bead washes with 80% ethanol. RNA was eluted from the beads in 16 μl water for 5 min. Enriched 5′ mRNA fragments were subjected to m7GDP removal with 7.5 U Cap-Clip acid pyrophosphatase in a 20-μl reaction for 1 h at 37 °C. The decapping reaction was mixed with 36 μl RNAClean XP beads and 84 μl 100% ethanol, and left at room temperature for 15 min. The beads were washed twice with 80% ethanol, and decapped mRNA fragments were eluted from the beads in 11 μl water for 5 min. Next, the decapped mRNA fragments were incubated in the annealing mix (10 μl decapped mRNA fragments, 1 μl 10 μM biotin-N6-DA3-RTP, 2 μl 500 mM Tris-HCl pH 8.3, and 2 μl 750 mM KCl) at 90 °C for 1.5 min and cooled down quickly to 4 °C. 
Following N6-DA3-RTP annealing, 4.25 μl RT mix (2 μl 0.1 M DTT, 1 μl 10 mM dNTPs (each), 1 μl 60 mM MgCl2 and 0.25 μl 40 U μl−1 RNaseOUT) was added to the annealing mix along with 0.75 μl 200 U μl−1 Maxima reverse transcriptase RNase H minus (EP0752, Thermo Scientific). The reverse transcription reaction was incubated at 25 °C for 7.5 min, followed by a 10-min incubation at 52 °C and a 30-min incubation at 57 °C. Reactions were cooled down to 4 °C, mixed with M-280 streptavidin Dynabeads (10-μl bed volume) in 100 μl binding buffer and incubated at room temperature for 30 min. The beads were washed twice with 500 μl binding buffer and twice with 500 μl low-salt buffer. Next, the beads were resuspended in 18.5 μl 1× CircLigase mix (2 μl 10× CircLigase buffer, 4 μl 5 M betaine, 1 μl 50 mM MnCl2, 1 μl 1 mM ATP and 10.5 μl water), and 1 μl 100 U μl−1 CircLigase II was added. The CircLigase reaction was incubated for 8 h at 60 °C with occasional pipetting of the beads. The reaction was then resuspended in 100 μl binding buffer containing fresh M-280 streptavidin Dynabeads (5-μl bed volume) and incubated for 30 min at room temperature. Beads were then washed once with 500 μl high-salt buffer, twice with 500 μl binding buffer and once with 500 μl low-salt buffer. Next, the beads were resuspended in 50 μl 0.2 M KOH and incubated overnight at 37 °C. The following day, HCl was added to the solution to adjust the pH to 7.5, and new M-280 streptavidin Dynabeads (10-μl bed volume) were added. The sample was incubated for 30 min at room temperature, and the beads were then washed once with 500 μl high-salt buffer, twice with 500 μl binding buffer and once with 500 μl low-salt buffer. Beads were then incubated for 2.5 h at 37 °C in a 50-μl reaction containing 30 mM Na-acetate pH 4.5 and 5 U RNase T2. Beads were then washed once with 500 μl high-salt buffer, twice with 500 μl binding buffer and twice with 500 μl low-salt buffer. 
The removal of the 2′,3′-cyclic phosphate from the 3′ ends of cap tags and the ligation of the preadenylated DNA adapter with UMIs (DA5-UMIs) were performed as described in the CapTag-seq protocol. After ligation, the beads were washed twice with 500 μl high-salt buffer, twice with 500 μl binding buffer and twice with 500 μl low-salt buffer. Beads were then resuspended in 20 μl cDNA buffer (10 mM Tris-HCl pH 7.5) and used for subsequent PCR amplification of cDNA–cap tags with 2× Phusion High Fidelity PCR master mix with HF buffer. The resulting PCR products were purified twice with 0.9× PCR volume AMPure XP beads. Purified cDNA libraries were sequenced on an Illumina instrument in paired-end mode.

CLAM-Cap-seq data analysis

Low-quality sequencing reads were filtered out and the 3′ end DNA adapter (DA3) was trimmed off using Flexbar v2.5 (ref. 44). Duplicated reads were removed using the pyFastqDuplicateRemover.py script within the pyCRAC package v1.3.2 (ref. 45). Following the removal of PCR duplicates, only read R1 was considered for further analysis. The six-nucleotide-long UMI was first removed from the 5′ end of sequencing reads using seqtk. Next, we determined Cap1 and Cap2 origins of the sequencing reads. To achieve that, 5′ ends of processed reads were screened for the presence of Cap1 and Cap2 palindromes containing zero or one non-templated nucleotide using the UNIX grep command. The palindrome search was focused on all 64 possible trinucleotide sequences that an mRNA may begin with (for example, AGA, CTT, and so on), except for the homopolymeric trinucleotide stretches (AAA, GGG, CCC and UUU), as these can generate palindromes with ambiguous Cap1 and Cap2 assignment. Of note, as U-starting mRNAs are very rare in mammalian transcriptomes (see Extended Data Fig. 2d), they were also excluded from the Cap1 and Cap2 palindrome search. Next, the sequencing reads containing Cap1 and Cap2 palindromes were separated, followed by the removal of cap tags and a non-templated nucleotide with seqtk to obtain mappable reads whose starts represent the first mRNA nucleotide. Cap tag-free Cap1-derived and Cap2-derived reads were then trimmed at their 3′ end to obtain a uniform read length of 25 bp. After processing, Cap1 and Cap2 reads were separately mapped to the human (hg38) or mouse (mm10) genome using Bowtie v1.2.3. The 5′ end read coverage at each genomic position was assessed using BEDTools. The aligned reads were then annotated to the TSN isoforms identified via TSS-seq using BEDTools. Cap2 stoichiometry for each TSN isoform was calculated as the Cap2 read fraction of the total (Cap1 + Cap2) reads. 
Only TSN isoforms whose total (Cap1 + Cap2) CLAM-Cap-seq read coverage was greater than or equal to 30 were considered for Cap2 stoichiometry calculations. TSN isoforms showing low variability in Cap2 stoichiometry between replicates (standard deviation of less than 5%) were used for all downstream analyses.
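
The stoichiometry calculation and the two filters described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' pipeline: the function names are hypothetical, and it assumes that the coverage threshold is applied to each replicate separately, that "5%" means 0.05 on the 0 to 1 stoichiometry scale, and that the sample standard deviation is used.

```python
from statistics import stdev

MIN_TOTAL_READS = 30     # minimum Cap1 + Cap2 coverage (from the text)
MAX_REPLICATE_SD = 0.05  # 5%, assumed to be on the 0-1 stoichiometry scale


def cap2_stoichiometry(cap1_reads, cap2_reads):
    """Cap2 read fraction of total reads, or None below the coverage cutoff."""
    total = cap1_reads + cap2_reads
    if total < MIN_TOTAL_READS:
        return None
    return cap2_reads / total


def filter_isoform(replicate_counts):
    """Return the mean Cap2 stoichiometry of a TSN isoform, or None.

    replicate_counts: list of (cap1_reads, cap2_reads), one pair per replicate.
    The isoform is rejected if any replicate is below the coverage cutoff
    (assumption) or if the replicate standard deviation is 5% or more.
    """
    values = [cap2_stoichiometry(c1, c2) for c1, c2 in replicate_counts]
    if len(values) < 2 or any(v is None for v in values):
        return None
    if stdev(values) >= MAX_REPLICATE_SD:
        return None
    return sum(values) / len(values)
```

Returning the replicate mean for isoforms that pass both filters is a design choice of this sketch; the text does not state how replicate values were combined downstream.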

Gene enrichment analysis

For gene enrichment analysis (WebGestalt (http://www.webgestalt.org/)), Cap2 stoichiometry of a gene was calculated as a weighted Cap2 stoichiometry average of all TSN isoforms of that gene. In these calculations, the contribution of each TSN isoform to the average Cap2 stoichiometry for a gene was based on the relative expression level of each Cap2-modified TSN isoform. Genes with the highest and lowest Cap2 stoichiometry (top 20%) were identified. The overrepresentation analysis50 for KEGG pathways was performed using all genes with measured Cap2 stoichiometry as a reference gene set. In the RNA-seq dataset, genes were ranked based on the level of the changes in their expression between CMTR2 KO and CMTR2 WT cells (log2 fold change) and subjected to the gene set enrichment analysis50 for identification of gene sets enriched in specific Reactome pathways.
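
As a minimal sketch of the gene-level weighting described above, the following Python function computes an expression-weighted average of isoform stoichiometries. The function name and input layout are hypothetical, and the expression measure used as the weight (for example, a per-isoform read count or RPM) is an assumption, since the text specifies only that weights reflect relative isoform expression.

```python
def gene_cap2_stoichiometry(isoforms):
    """Expression-weighted mean Cap2 stoichiometry of one gene.

    isoforms: list of (expression, cap2_stoichiometry) pairs, one per
    TSN isoform of the gene. Expression can be in any consistent unit,
    because only the relative contribution of each isoform matters.
    """
    total_expression = sum(expr for expr, _ in isoforms)
    if total_expression <= 0:
        raise ValueError("gene has no expressed TSN isoforms")
    weighted_sum = sum(expr * stoich for expr, stoich in isoforms)
    return weighted_sum / total_expression
```

With this weighting, a highly expressed isoform with low Cap2 stoichiometry pulls the gene-level value down more strongly than a weakly expressed isoform with high stoichiometry pulls it up.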

CapOligo-PAGE

Small or poly(A)+ RNA (3 µg) was subjected to m7GDP removal with 5 U Cap-Clip acid pyrophosphatase in a 20-µl reaction for 1 h at 37 °C. The decapped RNA was purified using a Zymo RCC-5 column and eluted in 20 µl water. RNA 5′ ends were dephosphorylated using 25 U Quick CIP in a 30-µl reaction at 37 °C. RNA was purified as described above and eluted in 10 µl water. RNA (7 µl) from the previous step was radiolabelled in a 10-µl reaction using 10 U T4 PNK and 1 µl 10 mCi ml−1 [32P]-ATP for 30 min at 37 °C. The reaction was heat inactivated at 85 °C for 2 min. Next, 1 µl 10 µM 5′-biotinylated self-splinting DNA oligonucleotide was added to the reaction mix. The sample was heated at 85 °C for 4 min and left at room temperature for 20 min to allow DNA oligo annealing to the 5′ end of the target RNA. Ligation mix (3.7 µl; 0.56× PNK buffer, 0.15 mM ATP, 22.8% DMSO and 1.9 U µl−1 T4 DNA ligase (EL0011, Thermo Scientific)) was added to the reaction, mixed and incubated for 4 h at 37 °C. M-280 streptavidin Dynabeads (7-µl bed volume) in 100 µl binding buffer were added to the ligation mix and incubated for 30 min at room temperature with constant shaking. Next, the beads were washed twice with 500 µl high-salt buffer, twice with 500 µl binding buffer and twice with 500 µl low-salt buffer. After the last wash, beads were resuspended in 45 µl 30 mM Na-acetate pH 4.5, and 4 µg Monarch RNase A (T3018L, NEB) and 5 U RNase T2 were added. Samples were incubated overnight at 37 °C to completely degrade mRNA, leaving the cap tags attached to the self-splinting DNA oligonucleotide. The following day, the beads were washed twice with 500 µl binding buffer and twice with low-salt buffer. The beads were then resuspended in 50 µl 10 mM HCl and incubated for 30 min at 37 °C to open up the 2′,3′-cyclic phosphate at the 3′ end of RNase-derived cap tags. Binding buffer (150 µl) was added, and the pH of the reaction was adjusted with KOH to 7.5. 
M-280 streptavidin Dynabeads (5-µl bed volume) were added to the mixture and incubated for 30 min at room temperature. Next, the beads were washed once with 500 µl binding buffer and once with low-salt buffer. Beads were then resuspended in 10 µl 1.1× CutSmart buffer and digested with 20 U BamHI HF (3136L, NEB) for 1.25 h at 37 °C. Beads were gently washed once with 150 µl binding buffer and once with 150 µl low-salt buffer. Beads were then resuspended in 10 mM Tris-HCl pH 7.5, heated at 85 °C for 7 min and supernatants were quickly collected into a fresh tube. The eluted samples containing DNA oligo-linked cap tags were mixed with 2× Novex TBE-urea sample buffer (LC6876, Invitrogen) and loaded onto a 20-cm-long 15% TBE-urea sequencing gel. Samples were run on the gel until the bromophenol blue dye reached the end of the gel. The resolved DNA oligo-linked cap tags were transferred onto a nylon membrane, UV crosslinked and exposed to a phosphor screen. Autoradiographs were developed using the FLA7000IP Typhoon Phosphorimager.

BruChase

HEK293T cells (5 × 10⁶) were seeded in a 10-cm cell culture dish. The following day, cells were incubated in growing media containing 200 μg ml−1 5-BrU (850187, Sigma) for 3 h to allow for metabolic labelling of newly synthesized transcripts. Then, BrU-containing media were withdrawn, and cells were chased in growing media containing 2 mM uridine (U3750, Sigma) for 0, 4, 8 and 16 h. Total RNA was extracted from cells at each time point of the uridine chase using TRIzol reagent and isopropanol precipitation. Next, 100 μg total RNA for each time point was incubated with 10 μg anti-bromodeoxyuridine antibody (MI-11-3, MBL Life Science) in 500 μl IP buffer (5 mM Tris-HCl pH 7.5, 0.5× PBS, 0.05% Triton X-100, 1 mM EDTA and 40 U ml−1 RNaseOUT) for 2 h at 4 °C with constant rotation. Pierce Protein A/G magnetic beads (30-μl bed volume; 88802, Thermo Scientific) in 100 μl IP buffer were added to the RNA–antibody mix and incubated for an additional 1 h at 4 °C with constant rotation. Beads were then washed four times with 700 μl IP buffer. The immunopurified RNA was eluted from the beads via proteinase K treatment. Beads were mixed in 300 μl proteinase K buffer (100 mM Tris-HCl pH 7.5, 50 mM NaCl, 0.5% SDS and 10 mM EDTA) and 10 μl 20 mg ml−1 proteinase K solution (RNA grade; 25530049, Invitrogen), and incubated for 45 min at 50 °C. Next, the supernatant was collected and mixed with 300 μl phenol:chloroform:IAA, 25:24:1, pH 6.6 (AM9730, Invitrogen) in a Phase Lock Gel Heavy tube (2302830, QuantaBio). The mixture was incubated for 5 min at 30 °C and centrifuged at 13,000g for 5 min. Following centrifugation, the aqueous phase was collected, and BrU-containing RNA was precipitated with 0.3 M Na-acetate pH 5.5, 2.5× sample volume 100% ethanol and 2 μl 15 mg ml−1 Glycoblue (AM9515, Invitrogen). Two independent immunopurifications were combined and subjected to the CLAM-Cap-seq protocol. 
DNA adapter-linked cDNA–cap tags were resuspended in 40 μl water and subjected to the RT–qPCR analysis.

BruChase-CapTag-seq

HEK293T cells (9 × 10⁶) were seeded in a 15-cm cell culture dish. The following day, cells were incubated in growing media containing 200 μg ml−1 5-BrU (850187, Sigma) for 3 h to allow for metabolic labelling of newly synthesized transcripts. Then, BrU-containing media were withdrawn, and cells were chased in growing media containing 2 mM uridine (U3750, Sigma) for 0 h and 8 h. Total RNA was extracted from cells at both time points of the uridine chase using TRIzol reagent and isopropanol precipitation. mRNA was extracted from total RNA using oligo(dT) capture as described above. Isolated mRNA from three 15-cm cell culture dishes constituted a single biological replicate. BrU-labelled mRNA was immunopurified as described above. BrU-labelled mRNA (200 ng) from each time point of the uridine chase was subjected to the CapTag-seq procedure.

RNA-seq

Total RNA was extracted from 2 × 10⁶ CMTR2 WT or CMTR2 KO HEK293T cells using TRIzol reagent. The RNA quality was assessed by Bioanalyzer analysis. Total RNA was spiked with ERCC RNA spike-in mix 1 according to the manufacturer’s instructions (4456740, Invitrogen). Total RNA (1 µg) was used for RNA-seq library preparation with the NEBNext Ultra II RNA Library Prep kit for Illumina (E7770S, NEB). Ribosomal RNA was removed using the NEBNext rRNA Depletion kit (E6310L, NEB). The libraries were sequenced on an Illumina instrument in paired-end mode. Four independent biological replicates were sequenced for each condition.

RNA-seq data analysis

Sequencing reads of low quality were discarded and reads shorter than 18 bp were removed. Ribosomal RNA reads were removed using STAR aligner51. The remaining reads were then mapped to the human (hg38) protein-coding transcriptome using STAR aligner. Trimmed mean of M values (TMM) normalization, empirical Bayes estimation of the negative binomial dispersion, and measurement of the changes in gene expression (log2 fold change) were performed for all samples and replicates at the same time using edgeR52.

Western blotting and antibodies

Cell pellets were collected by centrifugation at 300g for 5 min. After a single wash in ice-cold PBS, cell pellets were resuspended in cell lysis buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% NP-40, 0.1% SDS and 1× Halt protease and phosphatase inhibitor cocktail) and lysed on ice for 10 min. Cell lysis was facilitated by sonication with the following parameters: four times 5 s on, 10 s off at 10% amplitude (Branson). Cell lysates were cleared by 5 min of centrifugation at 20,000g at 4 °C. Protein concentration in isolated cell lysates was determined using the Pierce BCA Protein Assay kit (23225, Thermo Scientific). Protein (20–25 µg) was loaded per lane onto a NuPAGE 4–12% Bis-Tris gel (NP0322BOX, Invitrogen). Resolved proteins were transferred onto a PVDF membrane and probed with an antibody recognizing a protein of interest. The following antibodies were used in this study: rabbit anti-FTSJD1 polyclonal antibody (1:500 dilution; PA5-61696, Invitrogen), rabbit anti-RIG-I (D14G6) monoclonal antibody (1:1,000 dilution; 3743T, CST), rabbit anti-RIG-I/DDX58 (EPR18629) monoclonal antibody (1:2,000 dilution; ab180675, Abcam), rabbit anti-MDA5 (D74E4) monoclonal antibody (1:1,000 dilution; 5321T, CST), rabbit anti-IFIT1 (D2X9Z) monoclonal antibody (1:1,000 dilution; 14769S, CST), rabbit anti-IFITM3 (D8E8G) XP monoclonal antibody (1:1,000 dilution; 59212T, CST), mouse anti-NLRP1 monoclonal antibody (1:1,000 dilution; 679802, BioLegend), rabbit anti-MAVS polyclonal antibody (1:1,000 dilution; 3993T, CST), rabbit anti-OAS3 polyclonal antibody (1:1,000 dilution; ab154270, Abcam), mouse anti-puromycin (12D10) monoclonal antibody (1:5,000 dilution; MABE343, Millipore Sigma), mouse anti-GAPDH monoclonal antibody (1:10,000; GT239, GeneTex), mouse anti-β-actin monoclonal antibody (1:4,000; AM4302, Thermo), horseradish peroxidase-conjugated donkey anti-rabbit IgG (1:5,000; NA934, Cytiva) and horseradish peroxidase-conjugated sheep anti-mouse IgG (1:5,000; NA931, Cytiva). 
For anti-RIG-I (D14G6), anti-NLRP1 and anti-MDA5 (D74E4) antibodies, membranes were incubated with 3% BSA in TBS supplemented with 0.1% Tween-20 (TBST). All other antibodies were used with 3% milk in TBST.

RT–qPCR

To remove potential DNA contamination, total RNA was first treated with 1 U DNase I (EN0521, Thermo Scientific) for 20 min at 37 °C and purified using a Zymo RCC-5 column. Total RNA (1–3 μg) was reverse transcribed to cDNA with the SuperScript III First Strand synthesis system (18080051, Invitrogen) using either random hexamers (N8080127, Invitrogen) or oligo(dT)20 (18418020, Invitrogen) as reverse transcription primers. The same amount of total RNA was used for directly compared conditions. qPCR was performed using the iQ SYBR Green Supermix (1708880, Bio-Rad) with 150 nM primers in a 10-μl reaction. The amplifications were conducted with the following protocol in all experiments: 95 °C for 10 min, 40 cycles of 95 °C for 15 s, 58 °C for 15 s and 68 °C for 20 s. The specificity of primer pairs was tested with melting curves at the end of the 40th amplification cycle. GAPDH-normalized gene expression was calculated and presented. Normalized gene expression values were set to 1 for control conditions with a propagation of variability across all samples and replicates.
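
The text does not state the formula used for GAPDH normalization. One common approach is the comparative Ct (2^−ΔΔCt) method, sketched below in Python purely as an illustration. The choice of this method, the function name and the assumption of perfect doubling per cycle (100% primer efficiency) are assumptions of this sketch, not details confirmed by the text.

```python
def relative_expression(ct_target, ct_gapdh, ct_target_ctrl, ct_gapdh_ctrl):
    """GAPDH-normalized expression relative to the control condition.

    Uses the comparative Ct method (an assumption; the text does not name
    the method) and assumes the product doubles every cycle.
    """
    # Normalize the target gene to GAPDH within each condition.
    delta_ct_sample = ct_target - ct_gapdh
    delta_ct_control = ct_target_ctrl - ct_gapdh_ctrl
    # Express the sample relative to the control, which is thereby set to 1.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

Under this scheme, the control condition evaluates to 1 by construction, matching the presentation described above, and a target that crosses threshold two cycles earlier than in the control (after GAPDH normalization) is reported as fourfold higher.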

Puromycin incorporation assay

CMTR2 WT or CMTR2 KO HEK293T cells (1.8 × 10⁶) were seeded in a 6-cm cell culture dish. The following day, cells were incubated in growing media containing 0.8 µg ml−1 puromycin for 0, 30, 60 and 90 min. Cells were washed twice with ice-cold PBS and collected by centrifugation at 300g for 5 min. Cell lysate preparation and western blotting were performed as described above. The PVDF membrane with transferred proteins was probed with anti-puromycin antibody for the detection of nascent proteins. Following the anti-puromycin western blot, the membrane was washed and stained with Amido Black Staining solution (A8181, Sigma-Aldrich) to ensure equal protein loading in all lanes.

MTT cell proliferation assay

Changes in cell growth upon CMTR2 depletion were tested using the MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) cell proliferation assay. CMTR2 WT or CMTR2 KO HEK293T cells (2 × 10⁴) were seeded in a single well of a 12-well plate. The MTT assay was performed on cells at 1, 2 and 3 days following the initial seeding. For each time point, cells were washed once with pre-warmed PBS, followed by incubation in a solution containing a 1:1 mixture of phenol red-free DMEM and MTT reagent (5 mg ml−1 MTT; ab146345, Abcam) for 3 h at 37 °C. MTT solvent (1.5× volume; 4 mM HCl and 0.1% NP-40 in isopropanol) was added to the cells, and formazan crystals were dissolved by pipetting. Samples were incubated for 15 min at room temperature and the 570-nm absorbance was read to estimate the number of viable cells. A well without seeded cells was used for background subtraction.

VSV infections

HEK293T cells (3.5 × 10⁵) were seeded in a poly-d-lysine (A3890401, Gibco) coated well of a six-well plate. The following day, cells were transfected with 800 ng pcDNA4/TO-NeonGreen, pcDNA4/TO-NeonGreen-CMTR2 or pcDNA4/TO-NeonGreen-CMTR2 W85A plasmids using 1.2 μl LipoD293 transfection reagent. Twenty-four hours after transfection, cells were infected with 1.25 × 10⁸ propagation-incompetent VSV particles. After 24 h of the VSV infection, cells were washed once with PBS and collected in 1 ml TRIzol reagent for total RNA extraction. Isolated total RNA was treated with DNase I, purified and subjected to CLAM-Cap–qPCR and RT–qPCR analysis.

Poly(I:C) and 3p-hpRNA cell treatments

HEK293T cells (3.5 × 10⁵) were seeded in a well of a six-well plate. The following day, cells were transfected with 800 ng pcDNA4/TO-NeonGreen, pcDNA4/TO-NeonGreen-CMTR2 or pcDNA4/TO-NeonGreen-CMTR2 W85A plasmids using 1.2 μl LipoD293 transfection reagent. Twenty-four hours after transfection, cells were transfected with either 500 ng LMW poly(I:C) (InvivoGen) or 750 ng 3p-hpRNA (InvivoGen) using LyoVec transfection reagent (InvivoGen) according to the manufacturer’s instructions. After 24 h (for poly(I:C)) or 8 h (for 3p-hpRNA), cells were collected in 1 ml TRIzol reagent for total RNA extraction. Isolated total RNA was treated with DNase I, purified and subjected to RT–qPCR analysis.

Plasmids

The CMTR2 open reading frame was obtained by RT–PCR on mRNA extracted from HEK293T cells. The CMTR2 open reading frame was cloned into the pcDNA4/TO-NeonGreen plasmid between the KpnI and XbaI restriction sites to obtain the N-terminal NeonGreen-tagged CMTR2 expression construct. The pcDNA4/TO-NeonGreen-CMTR2 W85A mutant construct was generated by PCR-based site-directed mutagenesis of the pcDNA4/TO-NeonGreen-CMTR2 plasmid. The pcDNA3.1(+)-FLAG-RIG-I plasmid was purchased from OriGene (OHu25414). The pcDNA3.1(+)-FLAG-RIG-I K858A/K861A, H830A and C829A mutant constructs were generated by PCR-based site-directed mutagenesis of the original pcDNA3.1(+)-FLAG-RIG-I plasmid. All generated plasmids are available from the lead contact on request.

Preparation of the biotinylated dsRNA oligos for RIG-I pulldown

Single-stranded Cap0-modified, Cap1-modified and Cap2-modified RNA oligonucleotides were provided by TriLink Biotechnologies. To remove any residual non-capped RNA, the oligos were treated with Terminator exonuclease for 1 h at 30 °C and purified using RNAClean XP beads. A complementary 5′-biotinylated RNA oligo was annealed to Cap0-modified, Cap1-modified and Cap2-modified RNA oligos in a 1.1:1 ratio in 50 μl annealing buffer (10 mM Tris-HCl pH 7.5, 100 mM NaCl and 1 mM EDTA). Biotinylated Cap0-terminated, Cap1-terminated or Cap2-terminated dsRNA (700 ng) was incubated with M-280 streptavidin Dynabeads (40-μl bed volume) in 300 μl binding buffer supplemented with 1 U μl−1 RNaseOUT at room temperature for 30 min with constant rotation. The RNA-bound beads were washed three times with 500 μl binding buffer, resuspended in 150 μl cell lysis buffer (20 mM Tris-HCl pH 7.5, 100 mM KCl, 5 mM MgCl2, 0.5% NP-40, 40 U ml−1 RNaseOUT and 1× Halt protease and phosphatase inhibitor cocktail), and kept on ice.

RIG-I pulldown assay

HEK293T cells (4.8 × 10⁶) were seeded on a 10-cm Petri dish. The following day, cells were transfected with 4.8 μg pcDNA3.1(+)-FLAG-RIG-I, pcDNA3.1(+)-FLAG-RIG-I K858A/K861A, pcDNA3.1(+)-FLAG-RIG-I H830A or pcDNA3.1(+)-FLAG-RIG-I C829A plasmids using 10 μl LipoD293 transfection reagent. Forty-eight hours after transfection, cells were washed once with 5 ml ice-cold PBS and scraped in 8 ml ice-cold PBS. Cell pellets were collected by 5 min of centrifugation at 300g at 4 °C, supernatant was removed by aspiration and cells were resuspended in 500 μl cell lysis buffer (20 mM Tris-HCl pH 7.5, 100 mM KCl, 5 mM MgCl2, 0.5% NP-40, 40 U ml−1 RNaseOUT and 1× Halt protease and phosphatase inhibitor cocktail). Cells were left to lyse on ice for 10 min. To facilitate cell lysis, cells were passed three times through a 21-gauge needle and incubated on ice for an additional 10 min. The lysates were cleared by centrifugation at 16,000g for 10 min and the supernatants were saved. Protein concentration was measured by the Pierce BCA Protein Assay kit according to the manufacturer’s instructions. Protein lysate (0.75 mg) at the concentration of 1 mg ml−1 was incubated with streptavidin-immobilized Cap0, Cap1 and Cap2 dsRNA for 45 min at room temperature with agitation. Following the incubation, the beads were washed four times with 750 μl cell lysis buffer and split into two equal parts. RNA-bound proteins were eluted in 25 μl 1.5× NuPAGE LDS sample buffer (NP0007, Invitrogen) from one bead fraction. The eluted proteins were loaded onto the NuPAGE 4–12% Bis-Tris gel and run at 180 V for 60 min. The resolved proteins were transferred onto the PVDF membrane and western blot was performed using anti-RIG-I antibody. To show equal amounts of RNA bait across different samples, the remaining part of the beads was subjected to the proteinase K treatment for RNA isolation. Isolated RNA was loaded onto the 20% TBE non-denaturing PAGE and run at 150 V for 2.5 h. 
RNA was visualized after the gel was stained with 1× SYBR Gold nucleic acid gel stain (S11494, Invitrogen).

CMTR2 and RIG-I co-overexpression in CMTR2 WT HEK293T cells

HEK293T cells (3.5 × 10⁵) were seeded in a single poly-d-lysine (A3890401, Gibco) coated well of a six-well plate. The following day, cells were transfected with 800 ng pcDNA4/TO-NeonGreen, pcDNA4/TO-NeonGreen-CMTR2 or pcDNA4/TO-NeonGreen-CMTR2 W85A plasmids using 1.2 μl LipoD293 transfection reagent. After 24 h, cells were transfected with 1 μg of either pcDNA3.1(+)-FLAG-GFP or pcDNA3.1(+)-FLAG-RIG-I. After 48 h, cells were washed once with PBS, scraped and collected by 5 min of centrifugation at 300g at 4 °C. Cell pellets were divided into equal parts for protein and total RNA extraction.

RIG-I overexpression in CMTR2 WT and CMTR2 KO HEK293T cells

CMTR2 WT or CMTR2 KO HEK293T cells (3.5 × 10⁵) were seeded in a single well of a six-well plate. The following day, cells were transfected with 1 μg pcDNA3.1(+)-FLAG-GFP, pcDNA3.1(+)-FLAG-RIG-I or pcDNA3.1(+)-FLAG-RIG-I K858A/K861A constructs. After 48 h, cells were washed once with PBS, scraped and collected by 5 min of centrifugation at 300g at 4 °C. Cell pellets were divided into equal parts for protein and total RNA extraction.

Oligonucleotides

The sequences of the oligonucleotides used in this study are provided in Supplementary Table 3.

Quantification and statistical analysis

Quantitative and statistical methods are described above and in the figure legends. R v4.0.1, GraphPad Prism v9.0.1 and ImageJ 1.53a were used for all statistical analysis and data visualization. Figures were prepared using Graphic v3.1 and Adobe Illustrator v27.0.1. All statistical tests and P values are provided in Supplementary Table 2. Experimental results shown as representative blots were successfully replicated two or more times to ensure the reproducibility of the reported findings.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.