Date: 2022-10-27

A new BS condition quantitatively converts Ψ to Ψ-BS adduct

In a recent effort to map m5C in RNA, Khoddami et al. made a surprising observation that BS treatment could lead to modest base deletions during RT at Ψ sites in RNAs (RBS-seq) (Fig. 1a)15,16. The formation of a Ψ-BS adduct was shown to be the key intermediate that leads to deletion readout upon reverse transcription15. In total, 15 and 20 Ψ sites (deletion rate >5%) were detected in human 18S and 28S rRNA, respectively, using the RBS-seq protocol, but the signals on human mRNA were weak, with only 78 sites detected with a deletion rate of greater than 5%16. The conventional BS reaction condition in RBS-seq inevitably converted all the cytosines into uracils and thus reduced read complexity, resulting in a notable proportion of reads that could not be aligned to mRNA exons. Nevertheless, the discovery of Ψ-BS-adduct-induced deletion during RT provided a completely new principle for potential Ψ detection.

Fig. 1: BID-seq quantitatively detects Ψ sites as deletion signatures. a, Chemical structure of the Ψ-BS adduct after bisulfite treatment. b, BID-seq BS selectively reacts with Ψ and completely converts it into the Ψ-BS adduct under optimized conditions, without affecting normal C or U bases in RNA. c, The deletion ratio at the 100% modified Ψ site within the AGΨGA motif (synthesized RNA oligo) after BID-seq treatment versus that in the input. d, The average C to U mutation ratio at normal cytidine bases in synthesized RNA oligo after BID-seq treatment versus that in input. For c–d, n = 2 biologically independent samples. e, Heatmap plot for deletion ratios on 256 motifs (NNΨNN) after BS treatment in BID-seq, which contain one 100% modified Ψ within each motif.

Following these intriguing observations15,16, we tested two commercial bisulfite kits (Zymo and Epigentek) used for conventional BS treatment on synthetic 5-mer RNA oligonucleotides AGXGA (X = C or Ψ). In both cases, we observed quantitative C-to-U conversion, but no formation of Ψ-BS adducts (Supplementary Fig. 1a). We then examined the published RBS-seq condition to measure the conversion efficiency of Ψ to Ψ-BS adduct16. Although matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) MS showed quantitative C-to-U conversion, the efficiency of Ψ-BS adduct formation varied and was less than 30% among four replicates (Fig. 1a and Supplementary Fig. 1b), explaining the low sensitivity in detection of Ψ using the previous protocol.

It is known that the protonation of N3 in cytosine at acidic pH (around 5.1) is critical to BS-mediated deamination, whereas a neutral pH is more suitable for the BS reaction with uracil20. We reasoned that neutral pH would inhibit C-to-U conversion but promote Ψ reaction with BS to yield higher levels of Ψ-BS (Fig. 1a). Indeed, BS treatment of the model RNA probes at neutral pH followed by MALDI-TOF MS revealed quantitative conversion of Ψ to Ψ-BS adduct without any detectable C-to-U conversion (Fig. 1b).

To optimize Ψ detection, we treated a 30-mer Ψ-containing RNA probe (with an AGΨGA motif) with BS at neutral pH (2.4 M Na2SO3 and 0.36 M NaHSO3) and screened commercial reverse transcriptases. We found that SuperScript IV generated a high deletion rate (~70%) at the fully modified Ψ site after the new bisulfite reaction followed by RT, amplification and sequencing, whereas the deletion ratio was almost undetectable (<1%) in the untreated ‘input’ (Fig. 1c). Note that deletion rates of unmodified bases (A, C, G, U) and the C-to-U conversion at C bases were undetectable in both treated and untreated samples (Fig. 1d), indicating very low background and no reduction in read complexity caused by potential cytosine deamination. To examine the dependency of the deletion rate on sequence context, we built libraries with 30-mer RNA oligonucleotides containing NNΨNN (N = A or C or G or U) as spike-in and performed BID-seq. We found that 232 out of 256 motifs gave deletion rates over 50% at the Ψ site, with 252 out of 256 motifs displaying deletion rates above 25% (Fig. 1e). After BID-seq, the unmodified probes containing 0% Ψ (NNUNN) displayed deletion ratios of less than 5% for most sequence motifs; high background (around 10–25% deletion ratio) was observed in only a few motifs containing ACΨ-, CUΨ-, GCΨ-, GUΨ- or -ΨUC, -ΨUG (Supplementary Fig. 1c). When calling Ψ candidate sites in biological samples, our analysis pipeline required the deletion rate at each candidate site to be greater than 1.5-fold over the background, to eliminate potential false positives arising from that background.
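As a quick check on the size of the spike-in library described above, the NNΨNN contexts can be enumerated directly. This is only an illustrative sketch: the function name `enumerate_motifs` is hypothetical and not part of the BID-seq pipeline.

```python
from itertools import product

BASES = "ACGU"  # N = A or C or G or U, as in the spike-in design


def enumerate_motifs(center="Ψ"):
    """Return every 5-mer of the form NN<center>NN (hypothetical helper)."""
    return [f"{a}{b}{center}{c}{d}" for a, b, c, d in product(BASES, repeat=4)]


motifs = enumerate_motifs()            # modified probes, NNΨNN
controls = enumerate_motifs("U")       # unmodified probes, NNUNN
print(len(motifs), len(controls))      # four free positions give 4**4 = 256 each
```

Four variable positions with four possible bases each account for the 256 motifs in Fig. 1e.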

Together, we show that BID-seq quantitatively converts Ψ to the Ψ-BS adduct without detectable C-to-U conversion, and that SuperScript IV generates high deletion rates at the BS-modified Ψ sites in most sequence contexts during RT, confirming that BID-seq is highly sensitive and specific for Ψ detection. With spike-in probes containing varied Ψ levels to calibrate sequence-context-dependent deletion rate21, we can further calculate the stoichiometry at the Ψ-modified sites.

Validation of BID-seq

To validate BID-seq in biological samples, we developed a BID-seq protocol to map Ψ in various RNA species from biological samples (Fig. 2a). We first applied BID-seq to validate Ψ detection in rRNA from HeLa cells. To identify notable Ψ deletion signatures, we set the Ψ detection criteria as follows: (1) deletion rate above 5% (with deletion count above five in BID-seq libraries); (2) deletion rate below 1% in ‘Input’ libraries; (3) total reads coverage depth above 20 in both BID-seq and ‘Input’ libraries; (4) deletion rate above 1.5-fold over background in any given sequence motif (defined as the deletion rates detected from RNA probes containing 0% Ψ, as in Supplementary Fig. 1c). In addition, we excluded sites that tend to be false positives, specifically uracil sites at the neighboring nucleotide 3′ or 5′ to the known Ψ sites.
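The four numerical criteria above can be written as a single per-site filter. The following is a minimal sketch rather than the authors' pipeline: the `SiteCounts` record and its field names are assumptions introduced for illustration, and the exclusion of uracils neighboring known Ψ sites is not implemented.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Per-site counts (hypothetical record; field names are illustrative)."""
    treated_deletions: int    # reads with a deletion in the BID-seq library
    treated_depth: int        # total read coverage in the BID-seq library
    input_deletions: int      # reads with a deletion in the 'Input' library
    input_depth: int          # total read coverage in the 'Input' library
    motif_background: float   # deletion rate of the 0% Ψ probe for this motif


def passes_psi_criteria(site: SiteCounts) -> bool:
    # (3) coverage depth above 20 in both libraries
    if site.treated_depth <= 20 or site.input_depth <= 20:
        return False
    treated_rate = site.treated_deletions / site.treated_depth
    input_rate = site.input_deletions / site.input_depth
    return (
        treated_rate > 0.05                                 # (1) deletion rate above 5%
        and site.treated_deletions > 5                      # (1) deletion count above five
        and input_rate < 0.01                               # (2) below 1% in 'Input'
        and treated_rate > 1.5 * site.motif_background      # (4) 1.5-fold over background
    )
```

For example, a site with 30 deletions in 100 treated reads, none in 100 input reads and a 2% motif background passes, whereas the same site with a 25% motif background does not.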

Fig. 2: BID-seq detects known Ψ sites in human ribosomal RNA with modification stoichiometry. a, Flowchart of library construction pipeline for BID-seq, revealing Ψ modification fraction by deletion ratio signature. b, Two-dimensional (2D) plot for deletion ratios of known Ψ sites in HeLa 18S ribosomal RNA, in BID-seq treated library versus input. c, 2D plot for deletion ratios of known Ψ sites in HeLa 28S ribosomal RNA, in BID-seq treated library versus input. d, 2D plot for deletion ratios of known Ψ sites in HeLa 5.8S ribosomal RNA, in BID-seq treated library versus input. e, An example IGV plot of the highly modified Ψ site at position 1,081 of HeLa 18S ribosomal RNA, within a CAΨAA motif. f–h, Deletion and Ψ fraction detected by BID-seq in HeLa 18S rRNA (f), 28S rRNA (g), and 5.8S rRNA (h), respectively. After BS treatment in BID-seq, the deletion rates and Ψ fractions are marked in blue and pink, respectively.

Applying all these criteria for Ψ detection, we identified 42, 53 and 2 known Ψ sites in HeLa 18S, 28S and 5.8S rRNAs22, respectively, without any false positives; these known Ψ sites all exhibited notable deletion rates ranging from 5% to 95% in BID-seq (Fig. 2b–d). A representative highly modified Ψ1,081 site in HeLa 18S rRNA is visualized in an original Integrative Genomics Viewer (IGV) plot (Fig. 2e). Notably, the deletion rates at these Ψ sites in untreated ‘input’ were less than 1%, except for a couple of known modifications such as m1acp3Ψ 1,248 in 18S rRNA23, m3U 4,500 in 28S rRNA and an interesting uncharacterized U 2,176 site in 28S rRNA (Fig. 2b,c).

To quantify the modification fraction at each Ψ site by deletion rate, we mixed oligo probes containing NNΨNN and NNUNN (with different stoichiometry of Ψ) as controls to plot calibration curves for these sequence contexts (Supplementary Fig. 2a and Table 1). The high deletion rates on 232 motifs, the low background for most of these motif contexts and the approximately hyperbolic calibration curves in BID-seq enabled sensitive detection of Ψ as well as estimation of Ψ stoichiometry. Based on the calibration curves, the fractions of these Ψ sites in HeLa 18S, 28S and 5.8S rRNAs were calculated to be around 20–100%, generally consistent with those measured by mass spectrometry22 (Fig. 2f–h and Supplementary Tables 2–4). Among 43 and 61 known Ψ sites uncovered by mass spectrometry in HeLa 18S and 28S rRNAs22, respectively, 9 sites were not detected by BID-seq for three reasons: (1) low modification fraction: 18S rRNA Ψ1,136 and 28S rRNA Ψ4,463 (Supplementary Fig. 2b–e); (2) no reads coverage at the Ψ site because of dramatic RT stop caused by multiple highly modified Ψ sites within a narrow region: 28S rRNA Ψ3,741/Ψ3,743/Ψ3,747/Ψ3,749 and 28S rRNA Ψ4,266/Ψ4,269 (Supplementary Fig. 2d–f); (3) m3U adjacent to a Ψ site that seems to interfere with the BS reaction on the Ψ base or the subsequent RT: 28S rRNA Ψ4,501 (Supplementary Fig. 2d–f). These represent potential limitations of BID-seq in mapping Ψ sites.
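To illustrate how a motif-specific calibration curve turns an observed deletion rate into a Ψ fraction, the sketch below inverts a set of spike-in points by piecewise-linear interpolation. This is a deliberately simple stand-in and not the fitted curve used in the study; the function name and the calibration values are hypothetical.

```python
def psi_fraction_from_deletion(deletion_rate, calibration):
    """Estimate the Ψ fraction for one motif from its calibration points.

    calibration: (psi_fraction, deletion_rate) pairs measured on spike-in
    mixes for a single motif; deletion rate must increase with Ψ fraction.
    Piecewise-linear interpolation is an assumption made for illustration.
    """
    points = sorted(calibration)
    rates = [d for _, d in points]
    if any(later <= earlier for earlier, later in zip(rates, rates[1:])):
        raise ValueError("deletion rate must increase strictly with Ψ fraction")
    # Clamp observations that fall outside the calibrated range.
    if deletion_rate <= rates[0]:
        return points[0][0]
    if deletion_rate >= rates[-1]:
        return points[-1][0]
    for (f0, d0), (f1, d1) in zip(points, points[1:]):
        if d0 <= deletion_rate <= d1:
            return f0 + (f1 - f0) * (deletion_rate - d0) / (d1 - d0)


# Hypothetical calibration for one motif (values are illustrative only).
example_calibration = [(0.0, 0.02), (0.25, 0.10), (0.5, 0.25), (0.75, 0.45), (1.0, 0.70)]
estimate = psi_fraction_from_deletion(0.35, example_calibration)
```

Because the curve is nonlinear, equal steps in deletion rate do not correspond to equal steps in Ψ fraction, which is why each sequence context needs its own calibration.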

By comparison, RBS-seq detected only 15 and 20 Ψ sites in 18S and 28S rRNA, respectively, because of its low deletion rates, which were close to zero at the other known Ψ sites (Supplementary Fig. 2g,h). We also applied BID-seq to small RNAs (<200 nt) from HeLa cells, and validated highly modified Ψ sites in both H/ACA box and C/D box snoRNAs (Supplementary Fig. 2i,j), including snoRNA Ψ sites previously revealed by Ψ-seq5.

BID-seq maps Ψ in mRNA from human cell lines

We optimized BID-seq to be compatible with low RNA input21,24, and then applied it to 10–20 ng polyA-tailed RNA from HeLa, HEK293T and A549 cells. In addition to the aforementioned criteria for Ψ detection, we added a Ψ modification fraction cutoff and treated only mRNA sites with >10% Ψ stoichiometry as candidate sites. We identified 575, 543 and 922 Ψ sites in mRNA from HeLa, HEK293T and A549 cells, respectively (Fig. 3a), which all showed clear internal deletion signatures (Supplementary Fig. 3a and Tables 5–7). We also set an additional cutoff requiring a deletion count of more than ten, which assigned hundreds of ‘confident’ Ψ sites in human mRNA (Supplementary Fig. 3b and Tables 5–7). Most of these mRNA Ψ sites display modification fractions of 10–40% (Supplementary Fig. 3c), but we also identified 152, 169 and 110 highly modified mRNA Ψ sites (>50% Ψ fraction) in the three human cell lines (Fig. 3a), with a continuous distribution of Ψ fraction from 50% all the way to close to 100% (Fig. 3b). The mRNA Ψ sites are distributed mostly in the coding sequence (CDS) and 3′-UTR (Fig. 3c), similar to the distribution pattern observed previously using CeU-seq6. In the metagene profile, an example of the mRNA Ψ candidate sites in A549 cells shows accumulation in the CDS region (Fig. 3d). The gene ontology (GO) clusters common to HeLa and A549 cells are enriched for functions such as microtubule/cytoskeleton, ribosome, membrane, actin binding, ATP binding, translation and mRNA processing (Supplementary Fig. 3d). Note that Ψ sites can be either shared or cell-line specific. We uncovered 386 mRNA Ψ sites (>10% Ψ fraction) shared among 2–3 human cell lines (Supplementary Fig. 3e). For highly modified Ψ (>50% Ψ fraction), we identified 127 cell-line-specific sites (Supplementary Fig. 3f) and 78 sites that are highly modified in at least one human cell line and detectable (>10% Ψ fraction) in all three cell lines (Fig. 3e).

Fig. 3: BID-seq detects Ψ sites at base resolution in human mRNA and characterizes the ‘writer’ protein for individual Ψ sites. a, BID-seq reveals 575, 543 and 922 Ψ sites (modification fraction above 10%) in HeLa, HEK293T and A549 cells, respectively. b, Modification level distribution of Ψ sites in mRNA from HeLa, HEK293T and A549 cells, with the definition of highly modified Ψ sites as those above 50% Ψ-fraction. c, Pie chart showing the distribution of mRNA Ψ sites in HeLa, HEK293T and A549 cells, with stoichiometry ≥10% in three mRNA segments. d, Metagene plot of 922 Ψ sites (modification fraction >10%) in A549 mRNA. e, Heatmap plot of Ψ-fraction for 78 overlapped Ψ sites with above 50% Ψ-fraction in at least one human cell line and above 10% Ψ-fraction in three cell lines, in a matrix of the corresponding gene name versus each cell line. f, Distribution of motifs for 575 Ψ sites in HeLa mRNA, with ‘x axis’ as the motif frequency and ‘y axis’ showing the average Ψ modification fraction of each motif. g, Example IGV plot to show raw reads coverage at the highly modified Ψ site in HeLa ERH mRNA. The deletion signatures reflect the modification level change in shTRUB1 versus shControl, but not depletion of other PUS enzymes. h, Among 133 Ψ sites (above 10% Ψ-fraction) in shControl HeLa mRNA, scatter plot of BID-seq data shows the reduced Ψ-fraction at 70 Ψ sites in TRUB1-depleted cells. i, Pie chart of TRUB1 hypo-regulated, hyper-regulated and TRUB1-independent Ψ sites. j, Pie chart of PUS7 hypo-regulated, hyper-regulated and PUS7-independent Ψ sites. k, Heatmap plot of Ψ-fraction for 104 Ψ sites that show reduced modification level under the depletion of a specific PUS enzyme or multiple PUS enzymes, in a matrix of the corresponding gene name versus the knockdown of each PUS enzyme.

We next analyzed the motif frequency and modification fraction of all mRNA Ψ sites in all three cell lines. In HeLa cells, the most frequent motifs are GUΨCN (N = A or C or G or U), USΨAG (S = C or G), poly-U (UUUUU or more), NGΨGG (N = A or C or G or U) and GSΨGA (S = C or G) (Fig. 3f and Supplementary Fig. 3g). HEK293T and A549 cells display similar patterns in motif frequency (Supplementary Fig. 3h). Previously, GUΨC and UVΨAG (V = A or C or G) were reported as the potential TRUB1 motif and PUS7 motif, respectively13, which is consistent with our findings here. We also plotted the deletion ratio at each Ψ site versus the RPKM value of the corresponding mRNA (Supplementary Fig. 3i), which gives an estimated RPKM of 1.5 as the expression limit for mRNA Ψ detection under the current sequencing depth of ~80 M reads per library.

Pseudouridine writers for Ψ deposition in HeLa mRNA

Thirteen putative PUS enzymes have been annotated in the human genome7,8,9, with dyskerin pseudouridine synthase 1 (DKC1) known to rely on H/ACA snoRNAs to guide pseudouridine deposition25,26,27. Most other PUS enzymes are thought to be stand-alone enzymes that function without snoRNAs9,28. To identify PUS enzymes that catalyze Ψ deposition at individual sites in mRNA, we performed shRNA knockdown of eight known PUS enzymes in HeLa cells followed by BID-seq. We noticed substantially reduced Ψ modification in shControl versus wild-type HeLa cells, most probably due to either cellular stress or immune stimulation caused by lentivirus transfection. We were still able to detect 133 mRNA Ψ sites with Ψ fractions above 10% in shControl HeLa cells and used these 133 sites to study Ψ deposition by writers under the same lentivirus infection conditions (Supplementary Table 8). We compared the deletion rate at each site among shControl and each PUS knockdown. For example, the highly modified Ψ site in ERH mRNA 3′-UTR displayed a Ψ fraction reduction from 96% to 8% upon TRUB1 knockdown, whereas knockdown of other PUS enzymes did not affect this site (Fig. 3g), revealing that this Ψ site is installed by TRUB1 (ref. 13). TRUB1 regulated 70 sites out of 133 (Fig. 3h,i), including 15 highly modified sites (>50% fraction) in transcripts such as ERH, ZNF664, DKC1, M6PR, AGPAT5, SCP2, CDC6, INTS1, FKBP4, AMFR, etc. (Supplementary Fig. 4a), out of which ERH, ZNF664, DKC1, M6PR, AGPAT5, SCP2, INTS1, FKBP4, AMFR and HEXA were also identified by Ψ-seq5. We then analyzed the motif frequency and modification fraction of the 70 TRUB1-regulated mRNA Ψ sites. The most frequent motifs are GUΨCN (N = A or C or G or U) and poly-U (UUUUU or more Us) (Supplementary Fig. 4b), consistent with the same main motif contexts revealed by BID-seq (Fig. 3f and Supplementary Fig. 3g).

PUS7 (refs. 29,30,31,32), PUS1 (refs. 4,32), PUS3, PUS7L, PUSL1, TRUB2 and DKC1 (refs. 25,26,27) also deposited 40, 28, 30, 24, 28, 28 and 33 Ψ sites in HeLa transcripts, respectively (Fig. 3j and Supplementary Fig. 4c). Overall, we found that 104 (out of 133) Ψ sites (in shControl HeLa cells) responded to knockdown of these eight PUS enzymes, with some sites regulated by one specific PUS enzyme and others affected by multiple PUS enzymes (Fig. 3k). The remaining 29 (out of 133) HeLa mRNA Ψ sites might be regulated by other PUS enzymes such as PUS10 (ref. 33), RPUSD1, RPUSD2, RPUSD3 and RPUSD4 (ref. 32). Note that more effective knockdown or knockout with deeper sequencing may help confidently assign ‘writer’ proteins for all 133 mRNA Ψ sites in shControl cells.

BID-seq detects abundant Ψ sites in mRNA from mouse tissues

To further investigate mRNA pseudouridylation in real tissues, we performed BID-seq with polyA-tailed RNA isolated from 12 mouse tissues. We detected many more Ψ candidate sites in mouse tissue mRNA than in HeLa mRNA, consistent with the trend shown in our LC-MS/MS measurements (Supplementary Fig. 5a) and a previous analysis of mouse brain and lung tissues6. Specifically, we identified 1,043, 2,001, 1,835, 2,782, 508, 6,617, 1,862, 1,454, 2,610, 3,212, 2,384 and 1,811 Ψ sites (>10% fraction) in mRNA from mouse B cell, bone marrow, CD4 T cell, CD8 T cell, cerebral cortex, cerebellum, heart, kidney, liver, small intestine, testis and thymus, respectively (Fig. 4a and Supplementary Tables 9–20). We observed a number of highly modified sites (>50% Ψ fraction) in all 12 tissues, mostly with fractions between 50% and 80% (Fig. 4b). Similar to human cell lines, mRNA Ψ sites in mouse tissues also accumulate in the CDS and 3′-UTR (Fig. 4c). In metagene profiles, the mRNA Ψ sites in mouse liver, kidney, thymus and CD8 T cells, shown as examples, are distributed in the CDS and 3′-UTR, with accumulation around the stop codon (Supplementary Fig. 5b).

Fig. 4: Mouse tissue mRNAs are heavily modified with Ψ. a, BID-seq reveals a large number of Ψ sites (modification fraction >10%) in 12 mouse tissues, with the Ψ site number in three human cell lines shown for comparison. b, Modification level distribution of mRNA Ψ sites in 12 mouse tissues, in which a number of Ψ sites are highly modified (modification fractions above 50%). The modification level distribution of Ψ sites in three human cell lines is shown for comparison. c, Pie chart showing the distribution of mRNA Ψ sites in CD4 T and CD8 T cells, with stoichiometry ≥10% in three mRNA segments. d, The number of Ψ-modified genes (with Ψ sites >10% fraction) that contain one or two Ψ versus at least three Ψ sites per mRNA, in 12 mouse tissues. e, 2D plot of Ψ-modified genes (Ψ-fraction above 10% for each Ψ site) in mouse cerebellum, with ‘x axis’ as the mRNA abundance normalized to Rps8 (abundant nontarget gene, without any Ψ on mRNA) and ‘y axis’ showing the Ψ-strength of each gene, defined as the sum of Ψ fraction at all the Ψ sites within one mRNA. The cutoff of Ψ-strength 1.0 is marked by a red line. f, Among tissue-specific genes in each tissue type, the gene number distribution of non-Ψ-modified genes versus Ψ-modified genes. g, Top 25 enriched GO clusters from nontissue-specific Ψ-modified genes, in mouse liver and cerebellum, respectively. One-sided Fisher’s exact test. Adjusted P values using the linear step-up method.

In total we identified 4,008 highly modified mRNA Ψ sites (>50%) from all 12 tissues (Fig. 4a,b). We next asked whether some of these Ψ sites could be tissue specific and potentially distinguish tissue type. In all, 2,595 out of 4,008 Ψ sites were indeed tissue-specific and can serve as tissue-specific mRNA markers (Supplementary Fig. 5c,d). Particularly, we observed many tissue-specific Ψ sites in cerebellum, CD8 T cell, small intestine and testis mRNA. Whereas Ψ sites may serve as cell-type specific markers, highly modified Ψ sites are also shared among multiple tissues, suggesting common functions. Out of 4,008 sites, 462 display a Ψ fraction of over 50% in at least one tissue and are detectable (above 10% Ψ fraction) in at least four tissues (Supplementary Fig. 5e). It will be interesting to explore the functional roles of these shared mRNA Ψ sites in tissues in the future.
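The rule used above for the 462 shared sites (over 50% Ψ fraction in at least one tissue, and above 10% in at least four tissues) can be expressed as a short predicate. This is a sketch under the assumption that per-tissue fractions are available as a mapping; the function and argument names are hypothetical.

```python
def is_shared_highly_modified(fractions_by_tissue, high=0.5, detectable=0.1, min_tissues=4):
    """Apply the shared-site rule to one Ψ site.

    fractions_by_tissue: mapping of tissue name -> Ψ fraction (0 to 1) for
    tissues where the site was measured (hypothetical input layout).
    """
    values = list(fractions_by_tissue.values())
    has_high = any(v > high for v in values)            # >50% in at least one tissue
    n_detectable = sum(v > detectable for v in values)  # tissues above 10%
    return has_high and n_detectable >= min_tissues


# Illustrative values only, not taken from the dataset.
site = {"liver": 0.62, "kidney": 0.20, "heart": 0.15, "testis": 0.11}
shared = is_shared_highly_modified(site)
```

The criterion for calling a site tissue-specific is not spelled out in this section, so it is intentionally left out of the sketch.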

Another interesting observation is the presence of multiple Ψ sites (≥3 Ψ) per mRNA in a portion of pseudouridylated mRNAs in mouse tissues (Fig. 4d), especially in cerebellum, liver, CD4 T cells and CD8 T cells. For instance, around 25% of Ψ-modified genes in cerebellum carry at least three Ψ per mRNA. We used ‘Ψ-strength’ (defined as the sum of Ψ fraction in all Ψ sites in one gene) to measure and describe the overall level of Ψ modification in one gene. We then plotted Ψ-strength versus normalized gene expression level (normalized to the abundant housekeeping gene Rps8, which lacks detectable Ψ sites) to group all Ψ-modified genes, with Ψ-strength of 1.0 as the cutoff (Fig. 4e). We then investigated gene expression levels and found, compared with the genes of lower Ψ-strength (<1.0), those with high Ψ-strength (>1.0) displayed a notably higher expression level in tissues such as cerebellum, CD4 T cells, CD8 T cells, thymus and testis (Supplementary Fig. 6a), suggesting that Ψ deposition on mouse tissue mRNA might contribute to mRNA stability.
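Ψ-strength as defined above is simply the sum of Ψ fractions over all sites in a gene, followed by a split at 1.0. A minimal sketch follows; the input layout, function names and gene labels are assumptions for illustration.

```python
from collections import defaultdict


def psi_strength(sites):
    """Sum the Ψ fraction over all Ψ sites of each gene.

    sites: iterable of (gene_name, psi_fraction) pairs, one per detected site.
    """
    totals = defaultdict(float)
    for gene, fraction in sites:
        totals[gene] += fraction
    return dict(totals)


def split_by_strength(strengths, cutoff=1.0):
    """Group genes into high (> cutoff) and low (< cutoff) Ψ-strength sets."""
    high = {gene for gene, s in strengths.items() if s > cutoff}
    low = {gene for gene, s in strengths.items() if s < cutoff}
    return high, low


# Illustrative input: GeneA carries three sites, GeneB one (placeholder names).
strengths = psi_strength([("GeneA", 0.6), ("GeneA", 0.5), ("GeneA", 0.2), ("GeneB", 0.3)])
high, low = split_by_strength(strengths)
```

A gene whose Ψ-strength equals the cutoff exactly falls into neither group here, mirroring the strict inequalities (<1.0 and >1.0) used in the text.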

To further study the features of Ψ-modified genes, we grouped tissue-specific genes in each tissue type (defined as genes that show a much higher expression in one tissue versus all other tissues), and analyzed how many of them are Ψ-modified in each tissue. Notably, 16%, 24%, 22%, 16%, 38% and 22% of tissue-specific transcripts are Ψ-modified in bone marrow, cerebellum, heart, kidney, liver and small intestine, respectively (Fig. 4f). Collectively, our data suggest that pseudouridylation occurs in many tissue-specific mRNAs in mouse and may affect tissue-specific biological functions.

We next investigated the potential functions of nontissue-specific genes in each tissue type. GO analysis of these genes in each tissue type showed common functional clusters on endoplasmic reticulum, ribosome, ATP binding, nucleotide/RNA binding, etc., which display similarity to those in human cell lines (Fig. 4g and Supplementary Figs. 3d and 6b). Overall, mouse tissues clearly show abundant Ψ modifications in nucleus-encoded mRNA; some of these are shared across tissues, suggesting common functions.

In addition, we investigated Ψ modification on mitochondrion-encoded mRNAs and detected five Ψ sites in ND1, CO1 and ND4, with Ψ stoichiometry at around 20–60%, from cultured human cell lines (Supplementary Fig. 6c). PUS1 seems to serve as the ‘writer’ protein for at least one Ψ site on ND4 mRNA in HeLa cells (Supplementary Fig. 6d). However, Ψ is more abundant on mitochondrial mRNAs from diverse mouse tissues; we detected 66 mt-mRNA Ψ sites in multiple mt-mRNAs, with around 20–65% Ψ fraction (Supplementary Fig. 6e). In some tissues, several mt-mRNAs, such as Nd1, Nd2, Nd4, Nd5, Co1 and Atp6, contain multiple Ψ modifications. Functional consequences of these mt-mRNA Ψ modifications require future investigations.

Ψ increases mRNA stability

In mouse tissues, mRNAs with high Ψ-strength tend to be more abundant (Supplementary Fig. 6a). Pseudouridylation of synthetic mRNA has been reported to increase its stability34; however, the extent and potential functions of pseudouridylation in native mRNA are poorly understood. As we show here that TRUB1 is a main enzyme that deposits Ψ on mRNA in HeLa cells, we investigated its potential role in transcript stability. Overexpression of yeast Pus4 (the homolog of human TRUB1) is known to increase cell size and proliferation35. Consistent with this, we found that TRUB1 depletion inhibited cell growth, arrested cells in G1 phase and reduced cell size (Supplementary Fig. 7a–d). We further validated the discovered Ψ sites in mRNA, and the function of TRUB1 as a ‘writer’ protein, using the CMC-treatment-based method4,5 at four highly modified mRNA Ψ sites installed by TRUB1, in ERH, SCP2, AMFR and CDC6 (Supplementary Fig. 7e,f); the CMC-based RT with quantitative PCR (RT-qPCR) assay worked well in single-site Ψ determination and displayed a notably reduced readthrough ratio at the Ψ sites on these four mRNAs after CMC treatment and normalization to control regions. We also verified an array of HeLa mRNA Ψ sites in different motif contexts using this orthogonal assay (Supplementary Fig. 7f,g). Furthermore, we employed the published ‘CMC-RT and ligation-assisted PCR analysis of Ψ modification’ (CLAP)36 method for visualization and quantification of mRNA Ψ sites by gel electrophoresis. We selected three Ψ sites with surrounding sequences suitable for the CLAP protocol and validated BID-seq in both Ψ site detection and Ψ stoichiometry estimation (Supplementary Fig. 7h,i).

We then performed TRUB1 knockdown and studied its effects on transcript half-life by RNA-seq. We noticed that TRUB1-targets, which carry TRUB1-modified Ψ in mRNA in shControl cells, displayed a shorter half-life upon TRUB1 knockdown, whereas the half-life of nontargets (without detectable Ψ) remained unchanged (Fig. 5a). We investigated the four representative genes containing TRUB1-regulated highly modified mRNA Ψ sites, ERH, SCP2, AMFR and CDC6 (Supplementary Fig. 4a). Three of the four targets displayed notably reduced mRNA levels after 72-h siTRUB1 knockdown compared with the control (Fig. 5b). By using RT-qPCR, we validated that TRUB1 depletion reduced the stability of all four representative TRUB1-targets but not a nontarget mRNA (Supplementary Fig. 7j), confirming that Ψ installed by TRUB1 stabilizes the target mRNA. To further validate the transcript stabilization role of TRUB1-regulated Ψ, we engineered a fused dCas13d-TRUB1 system37 and confirmed that site-specific Ψ deposition could notably prolong mRNA lifetime (Fig. 5c). Taken together, our data reveal a main functional role of pseudouridylation in stabilizing target mRNA.

Fig. 5: Ψ affects mRNA stability. a, Cumulative distribution showing the decreased mRNA half-life for TRUB1-targets in TRUB1-depleted HeLa cells versus the control, compared with nontargets. n = 7,881 nontargets, and n = 65 TRUB1-targets. Box, first and third quartiles; line in the middle of the box, median; short line, maximum and minimum; ***P = 0.0008; unpaired, two-tailed t-test. b, Relative mRNA levels of four representative transcripts carrying TRUB1-regulated highly modified Ψ, in siTRUB1 versus siControl. P = 0.0006, 0.0005, 0.0612 and <0.0001, respectively; unpaired, two-tailed t-test. c, Stable expression of dCas13d-TRUB1, by gRNA transfection, restored Ψ and increased half-life of the target mRNA in TRUB1-depleted HeLa cells. For ERH: P = 0.0104, 0.0002 and 0.0002, respectively; unpaired, two-tailed t-test. For SCP2: P = 0.0511, 0.0006 and 0.0006, respectively; unpaired, two-tailed t-test. For AMFR: P = 0.0025, <0.0001 and 0.0002, respectively; unpaired, two-tailed t-test. For CDC6: P = 0.0015, 0.0002 and 0.0002, respectively; unpaired, two-tailed t-test. For b–c, n = 3, biologically independent samples; data are presented as mean values ± s.d.; NS, P ≥ 0.05; *P < 0.05; **P < 0.01; ***P < 0.001 and ****P < 0.0001.

Pseudouridylation at mRNA stop codons

Using an in vitro translation assay, Karijolich et al. discovered a unique function that targeted pseudouridylation could convert nonsense codons into sense codons and promote readthrough (Supplementary Fig. 8a)17. More recently, it was demonstrated that Ψ can facilitate noncanonical base pairing in the ribosome decoding center to promote nonsense suppression18,19. Despite these important observations, whether Ψ naturally exists in stop codons of mRNA and promotes stop codon readthrough in vivo remains unclear. In HeLa, HEK293T and A549 cells, BID-seq revealed several pseudouridylation sites in stop codons (as ‘ΨGA’, ‘ΨAA’ and ‘ΨAG’) in NDUFS2, CTSC, PLP2, MDK, SMOX, CUL3 and C7orf50 mRNAs, with Ψ fraction ranging from 10% to 40% (Fig. 6a). The modification fraction of the ΨGA stop codon in NDUFS2 mRNA decreased dramatically upon PUS1 knockdown (Fig. 6b). Correspondingly, we observed decreased stop codon readthrough for NDUFS2 with PUS1 knockdown, whereas dCas13d-PUS1 coupled with guide RNA (gRNA) for NDUFS2 substantially increased stop codon readthrough from around 3% up to ~14% (Fig. 6c and Supplementary Fig. 8b,c).

Fig. 6: The presence of Ψ promotes stop codon readthrough in vivo. a, Heatmap plot of Ψ fraction for seven Ψ sites within mRNA stop codon in three human cell lines, in a matrix of the corresponding gene name versus each cell line. b, Ψ modification fraction of Ψ within stop codon of the NDUFS2 mRNA, in wild-type HeLa cells, shControl HeLa cells and PUS1-depleted HeLa cells, respectively. For wild-type HeLa cells, n = 3, biologically independent samples; data are presented as mean values ± s.d. For shControl and PUS1-depleted HeLa cells, n = 2, biologically independent samples. c, Stop codon readthrough for the NDUFS2 mRNA in HeLa cells investigated by immunoblotting assay. shControl or shPUS1 HeLa cells stably expressing dCas13d-PUS1 were transfected with control or NDUFS2 gRNA for 48 h. The percentage numbers of readthrough ratio are shown in blue. The readthrough protein bands are labeled by red arrow. d, Heatmap plot of Ψ-fraction for 106 Ψ sites within mRNA stop codons in mouse tissues, in a matrix of the corresponding gene name versus tissue type.

We also identified 106 Ψ-modified stop codons from 12 mouse tissues, with Ψ fraction ranging from 10% to 65% (Fig. 6d). In all cases, a nearby second stop codon without Ψ was found at downstream locations. Ψ-modified stop codons in Atp5a1, Dbi, Rpl4 and Tomm70a are conserved in 11 or 12 tissues whereas others are tissue specific (Fig. 6d). Taken together, our data reveal the existence of Ψ in stop codons in native mRNAs, suggesting their role in promoting stop codon readthrough in vivo as an alternative translation regulation mechanism.

Among Ψ-modified stop codons from mouse tissues (Fig. 6d), we examined corresponding proteins whose potential readthrough peptide would increase the protein molecular weight by more than 10% (Supplementary Table 23), to allow for confident detection of the shifted protein band. We selected ten proteins with available commercial antibodies, and tested these proteins in seven different mouse tissues. Among these ten targets containing Ψ-modified stop codons, we observed notable band shifts for potential readthrough in Selenof, Ppp1r2, Nt5c3, Szrd1 and Cd52 (Supplementary Fig. 8d). These band shifts could be observed in four replicate mice (Supplementary Fig. 8e), with some individual variation. The highest estimated readthrough is around 35% for Selenof in kidney, with a stop codon modified by Ψ at around 42% stoichiometry (Supplementary Fig. 8d,e). Note that for Selenof and Ube2e3, there were no observable band shifts in some tissues, though BID-seq indicates the presence of a Ψ-modified stop codon (Supplementary Fig. 8d). Interestingly, although BID-seq data reveal an approximately 12% Ψ-modified stop codon of Cd52 mRNA in bone marrow but not in any other tissue, we saw a strong band shift for the Cd52 readthrough peptide in bone marrow (Supplementary Fig. 8d,e), likely driven by this stop codon with low Ψ modification. These observations suggest that Ψ-mediated stop codon readthrough may depend on sequence context and is regulated by unknown tissue-specific mechanisms. Considerably more research is required to understand and potentially take advantage of this intriguing translation regulation mechanism.