Coupling of structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structure studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduce biases and necessitate population-average assessments of RNA structure. Here we present dimethyl sulfate (DMS) mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase. DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low-abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in noncanonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs with their mature isoforms. These applications illustrate DMS-MaPseq's capacity to dramatically expand in vivo analysis of RNA structure.
Mortimer, S.A., Kidwell, M.A. & Doudna, J.A. Insights into RNA structure and function from genome-wide studies. Nat. Rev. Genet. 15, 469–479 (2014).
Deigan, K.E., Li, T.W., Mathews, D.H. & Weeks, K.M. Accurate SHAPE-directed RNA structure determination. Proc. Natl. Acad. Sci. USA 106, 97–102 (2009).
Ouyang, Z., Snyder, M.P. & Chang, H.Y. SeqFold: genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data. Genome Res. 23, 377–387 (2013).
Rouskin, S., Zubradt, M., Washietl, S., Kellis, M. & Weissman, J.S. Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo. Nature 505, 701–705 (2014).
Wells, S.E., Hughes, J.M., Igel, A.H. & Ares, M. Jr. Use of dimethyl sulfate to probe RNA structure in vivo. Methods Enzymol. 318, 479–493 (2000).
Mortimer, S.A. & Weeks, K.M. A fast-acting reagent for accurate analysis of RNA secondary and tertiary structure by SHAPE chemistry. J. Am. Chem. Soc. 129, 4144–4145 (2007).
Smola, M.J., Rice, G.M., Busan, S., Siegfried, N.A. & Weeks, K.M. Selective 2′-hydroxyl acylation analyzed by primer extension and mutational profiling (SHAPE-MaP) for direct, versatile and accurate RNA structure analysis. Nat. Protoc. 10, 1643–1669 (2015).
Ding, Y. et al. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features. Nature 505, 696–700 (2014).
Lucks, J.B. et al. Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc. Natl. Acad. Sci. USA 108, 11063–11068 (2011).
Spitale, R.C. et al. Structural imprints in vivo decode RNA regulatory mechanisms. Nature 519, 486–490 (2015).
Poulsen, L.D., Kielpinski, L.J., Salama, S.R., Krogh, A. & Vinther, J. SHAPE Selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data. RNA 21, 1042–1052 (2015).
Kwok, C.K., Tang, Y., Assmann, S.M. & Bevilacqua, P.C. The RNA structurome: transcriptome-wide structure probing with next-generation sequencing. Trends Biochem. Sci. 40, 221–232 (2015).
Strobel, E.J., Watters, K.E., Loughrey, D. & Lucks, J.B. RNA systems biology: uniting functional discoveries and structural tools to understand global roles of RNAs. Curr. Opin. Biotechnol. 39, 182–191 (2016).
Homan, P.J. et al. Single-molecule correlated chemical probing of RNA. Proc. Natl. Acad. Sci. USA 111, 13858–13863 (2014).
Siegfried, N.A., Busan, S., Rice, G.M., Nelson, J.A.E. & Weeks, K.M. RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP). Nat. Methods 11, 959–965 (2014).
Smola, M.J., Calabrese, J.M. & Weeks, K.M. Detection of RNA–protein interactions in living cells with SHAPE. Biochemistry 54, 6867–6875 (2015).
Inoue, T. & Cech, T.R. Secondary structure of the circular form of the Tetrahymena rRNA intervening sequence: a technique for RNA structure analysis using chemical probes and reverse transcriptase. Proc. Natl. Acad. Sci. USA 82, 648–652 (1985).
Mohr, S. et al. Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing. RNA 19, 958–970 (2013).
Katibah, G.E. et al. Broad and adaptable RNA structure recognition by the human interferon-induced tetratricopeptide repeat protein IFIT5. Proc. Natl. Acad. Sci. USA 111, 12025–12030 (2014).
Beckman, R.A., Mildvan, A.S. & Loeb, L.A. On the fidelity of DNA replication: manganese mutagenesis in vitro. Biochemistry 24, 5810–5817 (1985).
Badis, G., Saveanu, C., Fromont-Racine, M. & Jacquier, A. Targeted mRNA degradation by deadenylation-independent decapping. Mol. Cell 15, 5–15 (2004).
Aviran, S. & Pachter, L. Rational experiment design for sequencing-based RNA structure mapping. RNA 20, 1864–1877 (2014).
Hooks, K.B. & Griffiths-Jones, S. Conserved RNA structures in the non-canonical Hac1/Xbp1 intron. RNA Biol. 8, 552–556 (2011).
Ben-Shem, A. et al. The structure of the eukaryotic ribosome at 3.0 Å resolution. Science 334, 1524–1529 (2011).
Aragón, T. et al. Messenger RNA targeting to endoplasmic reticulum stress signaling sites. Nature 457, 736–740 (2009).
Latrèche, L., Jean-Jean, O., Driscoll, D.M. & Chavatte, L. Novel structural determinants in human SECIS elements modulate the translational recoding of UGA as selenocysteine. Nucleic Acids Res. 37, 5868–5880 (2009).
Chartrand, P., Meng, X.H., Singer, R.H. & Long, R.M. Structural elements required for the localization of ASH1 mRNA and of a green fluorescent protein reporter particle in vivo. Curr. Biol. 9, 333–338 (1999).
Jambor, H. et al. Systematic imaging reveals features and changing localization of mRNAs in Drosophila development. eLife 4, e05003 (2015).
MacDonald, P.M. bicoid mRNA localization signal: phylogenetic conservation of function and RNA secondary structure. Development 110, 161–171 (1990).
Bullock, S.L., Ringel, I., Ish-Horowicz, D. & Lukavsky, P.J. A′-form RNA helices are required for cytoplasmic mRNA transport in Drosophila. Nat. Struct. Mol. Biol. 17, 703–709 (2010).
Jambor, H., Brunel, C. & Ephrussi, A. Dimerization of oskar 3′ UTRs promotes hitchhiking for RNA localization in the Drosophila oocyte. RNA 17, 2049–2057 (2011).
Van De Bor, V., Hartswood, E., Jones, C., Finnegan, D. & Davis, I. gurken and the I factor retrotransposon RNAs share common localization signals and machinery. Dev. Cell 9, 51–62 (2005).
Fields, A.P. et al. A regression-based analysis of ribosome-profiling data reveals a conserved complexity to mammalian translation. Mol. Cell 60, 816–827 (2015).
Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
Wan, Y. et al. Landscape and variation of RNA secondary structure across the human transcriptome. Nature 505, 706–709 (2014).
Meyer, M., Plass, M., Pérez-Valle, J., Eyras, E. & Vilardell, J. Deciphering 3′ss selection in the yeast genome reveals an RNA thermosensor that mediates alternative splicing. Mol. Cell 43, 1033–1039 (2011).
Babendure, J.R., Babendure, J.L., Ding, J.-H. & Tsien, R.Y. Control of mammalian translation by mRNA structure near caps. RNA 12, 851–861 (2006).
Kudla, G., Murray, A.W., Tollervey, D. & Plotkin, J.B. Coding-sequence determinants of gene expression in Escherichia coli. Science 324, 255–258 (2009).
Carlile, T.M. et al. Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells. Nature 515, 143–146 (2014).
Schwartz, S. et al. Transcriptome-wide mapping reveals widespread dynamic-regulated pseudouridylation of ncRNA and mRNA. Cell 159, 148–162 (2014).
Meyer, K.D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell 149, 1635–1646 (2012).
Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).
Zubradt, M. et al. Genome-wide DMS-MaPseq for in vivo RNA structure determination. Protocol Exchange http://dx.doi.org/10.1038/protex.2016.068 (2016).
Zubradt, M. et al. Target-specific DMS-MaPseq for in vivo RNA structure determination. Protocol Exchange http://dx.doi.org/10.1038/protex.2016.069 (2016).
Ingolia, N.T., Ghaemmaghami, S., Newman, J.R.S. & Weissman, J.S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009).
Adiconis, X. et al. Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat. Methods 10, 623–629 (2013).
Darty, K., Denise, A. & Ponty, Y. VARNA: interactive drawing and editing of the RNA secondary structure. Bioinformatics 25, 1974–1975 (2009).
We thank A. Fields from UCSF for FXR2 reporter plasmids; T. Norman, A. Fields, and J. Quinn for insightful discussions and comments on the manuscript; A. Jaeger (Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, USA) for providing HEK 293T cells; and the Orr-Weaver lab at the Whitehead Institute for providing flies. We also thank Y. Chen, D. Bogdanoff, E. Chow, and J. Lund at the UCSF Center for Advanced Technology for sequencing assistance; J. Love and S. Levine in the Whitehead Core and MIT BioMicro Center for library preparation; and C. Reiger, M. DeVera, J. Kanter, and G. McCauley for administrative support. This research was supported by the CRSB (Center for RNA Systems Biology; grant P50 GM102706 to J.S.W.), the Howard Hughes Medical Institute (J.S.W.), the National Science Foundation grant 1144247 (M.Z.), and the Genentech Foundation (M.Z.). Research on TGIRTs and their modes of use was supported by NIH R01 grants GM37949 and GM37951 (A.M.L.).
Thermostable group II intron reverse transcriptase (TGIRT) enzymes and methods for their use are the subject of patents and patent applications that have been licensed by the University of Texas and East Tennessee State University to InGex, LLC. A.M.L. and the University of Texas are minority equity holders in InGex, LLC and receive royalty payments from the sale of TGIRT enzymes and the licensing of intellectual property. The other authors declare no competing financial interests.
Integrated supplementary information
a, Correlation of Gini index for 202 yeast mRNA regions with 15x coverage at 2.5% or 5% v/v DMS concentrations reveals high reproducibility. b, Correlation of DMS-MaPseq signal for each of the 1800 nucleotides in the yeast 18S rRNA under varying DMS concentrations reveals high fidelity of RNA structure results at 5% v/v DMS. Pearson’s r values are shown.
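The Gini index used in panel a summarizes how unevenly DMS signal is spread across the positions of a region. The following is a minimal sketch of the standard Gini formula; the function name and the choice of input (one non-negative signal value per position) are assumptions for illustration, not the authors' pipeline.

```python
def gini_index(values):
    """Gini index of non-negative values: 0 means perfectly even signal,
    values approaching 1 mean signal concentrated on few positions."""
    xs = sorted(float(v) for v in values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a non-zero total")
    # Standard formula on sorted data:
    # G = sum_i (2i - n - 1) * x_i / (n * sum(x)), with i = 1..n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)
```

Under this definition, a region with identical signal at every position scores 0, and a region with all signal on one of four positions scores 0.75.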
Supplementary Figure 2 Mutations produced by reverse transcription on in vivo DMS-treated and untreated templates.
a, Total mismatch percentage on each nucleotide from in vivo DMS-MaPseq with TGIRT on yeast mRNA. b, Nucleotide composition of mismatches in DMS-seq from Rouskin et al. for in vivo DMS-treated yeast mRNA, revealing a preference to generate mismatches on cytosines. c, Nucleotide composition of mismatches as detected by existing RNA structure probing approaches for untreated yeast mRNA, revealing no strong mismatch biases independent of DMS modification. d, Mutation frequency from DMS-treated and untreated yeast mRNA templates, derived from the same RNA source for TGIRT and SSII/Mn2+ data. Mutation frequency was calculated as the number of mismatches or indels detected via sequencing divided by the total number of bases sequenced.
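The mutation frequency defined in panel d can be written as a one-line calculation. This is a sketch only; the function and argument names are hypothetical, and how mismatches and indels are counted from alignments is not specified here.

```python
def mutation_frequency(mismatches, indels, bases_sequenced):
    """(mismatches + indels) divided by the total number of bases sequenced."""
    if bases_sequenced <= 0:
        raise ValueError("bases_sequenced must be positive")
    return (mismatches + indels) / bases_sequenced
```

For example, 30 mismatches and 10 indels over 2,000 sequenced bases give a mutation frequency of 0.02.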
Supplementary Figure 3 The variable background signal in DMS-MaPseq data can produce additional noise when used for correction.
a, b, RPS28B 3′ UTR positive control region shown with average ratiometric DMS-MaPseq signal from two untreated and in vivo DMS-treated biological replicates prepared with TGIRT (a) or SSII/Mn2+ (b). Error bars represent one standard deviation. c, d, Background correction of TGIRT (c) or SSII/Mn2+ (d) RPS28B structure signal as (S_in vivo − S_untreated) / S_denatured produces false positives (solid lines) and false negatives (dashed lines) at certain positions. DMS reactivity calculated as the ratiometric DMS signal per position normalized to the highest number of reads in displayed region, which is set to 1.0. e, f, Histogram of R² values for DMS-MaPseq data from yeast mRNA regions for untreated (e) and in vivo treated (f) genome-wide biological replicates prepared with TGIRT or SSII/Mn2+ reveals high variability in background signal.
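The background correction in panels c and d subtracts the untreated signal from the in vivo signal and divides by the denatured signal at each position. A minimal sketch follows; the function name, the list-based inputs, and returning None where the denatured signal is zero are assumptions made for illustration.

```python
def background_corrected(s_in_vivo, s_untreated, s_denatured):
    """Per-position (S_in_vivo - S_untreated) / S_denatured.

    Positions with no denatured signal return None (an illustrative choice)."""
    if not (len(s_in_vivo) == len(s_untreated) == len(s_denatured)):
        raise ValueError("all three signal tracks must have the same length")
    corrected = []
    for in_vivo, untreated, denatured in zip(s_in_vivo, s_untreated, s_denatured):
        if denatured > 0:
            corrected.append((in_vivo - untreated) / denatured)
        else:
            corrected.append(None)
    return corrected
```

When the untreated signal at a position exceeds the in vivo signal, the corrected value is negative, which is one way a variable background can mask a genuinely reactive position.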
Supplementary Figure 4 XBP1 mRNA positive control structure with nucleotides colored by DMS reactivity from in vivo genome-wide DMS-MaPseq in HEK 293Ts.
Note that the sequencing coverage for this region (~10x) is lower than suggested for high data reproducibility (~20x). DMS reactivity calculated as the ratiometric DMS signal per position normalized to the highest number of reads in displayed region, which is set to 1.0.
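The normalization used throughout these captions can be sketched as follows, assuming that the ratiometric signal at a position is the mismatch count divided by the read coverage at that position, and that the region is then scaled so its most reactive position equals 1.0. Both the function name and this reading of "ratiometric" are assumptions.

```python
def dms_reactivity(mismatch_counts, coverage):
    """Ratiometric signal (mismatches / coverage) scaled so the region maximum is 1.0."""
    if len(mismatch_counts) != len(coverage):
        raise ValueError("mismatch_counts and coverage must have the same length")
    # Positions without coverage are given zero signal (illustrative choice).
    ratios = [m / c if c > 0 else 0.0 for m, c in zip(mismatch_counts, coverage)]
    top = max(ratios, default=0.0)
    if top == 0:
        return ratios
    return [r / top for r in ratios]
```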
a, Raw ratiometric data from yeast 18S rRNA prepared with the target-specific tagmentation or genome-wide DMS-MaPseq approach. Data is the same as used for the ROC curve in Figure 4d. b, c, XBP1 and MSRB1 mRNA positive control structures with nucleotides colored by DMS reactivity from target-specific DMS-MaPseq in HEK 293Ts. DMS reactivity calculated as the ratiometric DMS signal per position normalized to the highest number of reads in displayed region, which is set to 1.0. d, e, Raw ratiometric data for XBP1 and MSRB1 positive control regions in (b) and (c), respectively. XBP1 region spans nt 520-592 from the NCBI NM_005080.3 transcript, and the MSRB1 region spans nt 966-1036 from the NCBI NM_016332.2 transcript.
Supplementary Figure 6 Target-specific DMS-MaPseq data show no signal drop-off and low background signal.
a, Raw ratiometric data from HEK 293Ts for an in vivo DMS-treated XBP1 mRNA region reveals no loss of signal across amplicon until the sequence bound by the PCR primer. (Position 1 corresponds to nt 340 in the NCBI NM_005080.3 transcript annotation.) b, Ratiometric data plotted for RPS28B yeast mRNA positive control region for in vivo DMS-treated or untreated RNA reveals low background relative to in vivo signal. In vivo data is the same used for the structure model overlay in Figure 4f. Position 1 corresponds to chr XII, position 673546.
Supplementary Figure 7 Targeted amplification of low-abundance RNA targets using a unique molecular index.
a, Schematic for targeted RNA structure probing via gene-specific RT-PCR using a unique molecular index (UMI) on the RT primer. Using a gene-specific RT primer with a 5′ overhang comprising an N10 random index and defined PCR primer binding site, each cDNA is labeled with a UMI. After gene-specific PCR amplification and limited-cycle second PCR to add sequencing adaptors and indexes, the PCR amplicon is sequenced on a MiSeq for a read length specified by the size of the region of interest. b, c, Yeast ASH1 (b) and SFT2 (c) mRNA positive control structures from target-specific UMI approach with nucleotides colored by DMS reactivity in vivo. Data are presented after collapsing to unique reads based on UMI and internal DMS-induced mismatches; data presented without collapsing give nearly identical results. DMS reactivity calculated as the ratiometric DMS signal per position normalized to the highest number of reads in displayed region, which is set to 1.0. Uncolored nucleotides had no data collected.
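Collapsing to unique reads based on the UMI and the internal DMS-induced mismatches can be sketched as a deduplication keyed on both features. The read representation below, a (UMI string, list of mismatch positions) pair, and the function name are hypothetical.

```python
def collapse_reads(reads):
    """Keep one read per unique (UMI, mismatch pattern) combination.

    reads: iterable of (umi, mismatch_positions) pairs."""
    seen = set()
    unique = []
    for umi, mismatch_positions in reads:
        # Sort positions so the same pattern in a different order matches.
        key = (umi, tuple(sorted(mismatch_positions)))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Reads that share a UMI but carry different mismatch patterns are kept as separate molecules, while exact repeats of both are treated as PCR duplicates.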
a, Total mismatch percentage observed on each nucleotide from in vivo DMS-treated and untreated D. melanogaster RNA. b, Correlation of DMS-MaPseq signal for each nucleotide in the oskar PCR amplicon under varied DMS treatment times reveals high fidelity of RNA structure results under both conditions. c, Raw ratiometric data for the oskar positive control structure in Figure 5a reveals low background signal in an untreated versus in vivo treated sample at 50% v/v DMS for 10 min. Nucleotide positions are noted relative to transcription start site of FlyBase transcript annotation FBtr0081956. d, gurken mRNA positive control structure with nucleotides colored by in vivo DMS reactivity, after 25% v/v DMS treatment for 5 min. DMS reactivity calculated as the ratiometric DMS signal per position normalized to the highest number of reads in displayed region, which is set to 1.0. e, Raw ratiometric data for the gurken positive control region shown in (d) reveals low background signal in an untreated versus in vivo treated sample.
Supplementary Figure 9 In vitro and in vivo DMS-MaPseq reactivity of FXR2 stem1 and stem2 regions within the human FXR2 5′ UTR and first exon.
a, b, Stem 1 (a) and stem 2 (b) structure predictions from 0.04 constraint reactivity threshold, with nucleotides colored by in vitro DMS reactivity. c, d, Stem 1 (c) and stem 2 (d) structure predictions with nucleotides colored by in vivo DMS reactivity. DMS reactivity calculated as the ratiometric DMS signal normalized to the highest reactive base in the 82–263 nt region. e, Raw ratiometric signal from the FXR2 5′ UTR and first exon in vivo, displaying the mean between two technical replicates and error bars representing one standard deviation. Position 1 corresponds to chrXVII: 7614897.
Supplementary Figure 10 Fluorescent reporter constructs with RNA structure mutations confirm function of a highly stable structure in FXR2 translation.
a, b, Alternative stem 1 structure prediction from 0.06 constraint reactivity threshold from the human FXR2 5′ UTR and first exon, with nucleotides colored by DMS reactivity (a) or with the mutated sequence from (d) overlaid in green or blue text (b). DMS reactivity calculated as the ratiometric DMS signal normalized to the highest reactive base in the displayed region. c, Stem 2 structure model shown with mutated sequence used in (d) overlaid in green or blue text. d, Top, FXR2 reporter construct design. The 5′ UTR and first exon of human FXR2 ∆ATG is fused to a T2A and in-frame eGFP lacking its initial AUG, such that mutations to the coding region of FXR2 will not affect stability of the eGFP protein. To internally control for transfection and transcription efficiency, mCherry driven by an internal ribosome entry site was included downstream. Bottom, fluorescence measurements following transient transfection of FXR2 reporter constructs into HEK 293T cells. The eGFP/mCherry ratio was calculated for transfection replicates of each construct and scaled relative to the wildtype construct, which was set to 1.0. Error bars represent one standard deviation. This analysis reveals a drop in eGFP levels upon mutating the predicted FXR2 structure and a full recovery of eGFP levels after compensatory mutation. Basal levels of protein expression in ∆GTG mutant likely reflect translation initiation at other NUG codons.
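The reporter quantification in panel d can be sketched as a per-replicate eGFP/mCherry ratio scaled to the wildtype mean. The function name, the argument layout, and the use of the sample standard deviation are assumptions; the caption does not state which standard deviation was used.

```python
from statistics import mean, stdev


def scaled_reporter_ratio(egfp, mcherry, wt_egfp, wt_mcherry):
    """Mean and standard deviation of eGFP/mCherry ratios, scaled so that
    the wildtype mean ratio equals 1.0. Needs at least two replicates."""
    wt_mean = mean(g / c for g, c in zip(wt_egfp, wt_mcherry))
    ratios = [(g / c) / wt_mean for g, c in zip(egfp, mcherry)]
    # Sample standard deviation across transfection replicates (assumption).
    return mean(ratios), stdev(ratios)
```

Passing the wildtype measurements as both the construct and the reference returns a mean of 1.0 by construction.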
Supplementary Figure 11 RNA structure does not vary between the pre-mRNA and spliced mRNA isoforms of yeast ribosomal protein genes.
a, Targeted DMS-MaPseq data specific for the yeast RPL31B pre-mRNA and spliced mRNA isoforms reveal minimal structure difference in the common exon1 sequence. Ratiometric DMS-MaPseq data is plotted with isoform-specific RT primer locations noted with arrows. b, Exon1 DMS-MaPseq structure signal correlation (Pearson’s r value) across pre-mRNA and spliced mRNA isoforms and between isoform-specific replicates.
Supplementary Figure 12 Investigation of RNA structure data in ribosomal protein exon 1 regions reveals nucleotide resolution consistency in structure signal between pre-mRNA and spliced isoforms.
a, Ratiometric DMS-MaPseq data for the RPL14A exon 1 from its pre-mRNA versus spliced isoform reveals highly correlated signal for each A/C nucleotide. b, Top, Average ratiometric DMS-MaPseq signal from two RPL14A pre-mRNA technical replicates. Bottom, Average signal across RPL14A pre-mRNA and spliced isoforms reveals tight error distribution. c, Ratiometric DMS-MaPseq data for the RPL31B exon 1 from its pre-mRNA versus spliced isoform reveals highly correlated signal for each A/C nucleotide. d, Top, Average ratiometric DMS-MaPseq signal from two RPL31B pre-mRNA technical replicates. Bottom, Average signal across RPL31B pre-mRNA and spliced isoforms reveals tight error distribution. Error bars represent one standard deviation.
Cite this article
Zubradt, M., Gupta, P., Persad, S. et al. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat Methods 14, 75–82 (2017). https://doi.org/10.1038/nmeth.4057