Genome-wide mapping of RNA structure using nuclease digestion and high-throughput sequencing

Journal name: Nature Protocols
Volume: 8
Pages: 849–869
Year published:
DOI: doi:10.1038/nprot.2013.045
Published online

Abstract

RNA structure is important for RNA function and regulation, and there is growing interest in determining the RNA structure of many transcripts. Here we provide a detailed protocol for the parallel analysis of RNA structure (PARS) for probing RNA secondary structures genome-wide. In this method, enzymatic footprinting is coupled to high-throughput sequencing to provide secondary structure data for thousands of RNAs simultaneously. The entire experimental protocol takes ∼5 d to complete, and sequencing and data analysis take an additional 6–8 d. PARS was developed using the yeast genome as proof of principle, but its approach should be applicable to probing RNA structures from different transcriptomes and structural dynamics under diverse solution conditions.

Figures

  1. Figure 1: Detailed experimental schematic of PARS.

    Total RNA is isolated from cells, enriched for poly(A)+ transcripts and renatured in vitro. The folded RNA is then cleaved separately by RNase V1 and S1 nuclease, resulting in 5′P overhangs. Each of the two pools of cleaved RNA is then fragmented and made into a cDNA library. However, only the sites cleaved by RNase V1 or S1 nuclease contain 5′Ps that are ligation-competent (the ends of RNA fragments where these overhangs are located are highlighted by gray rectangles). The RNA is then size-selected, followed by 5′ adapter ligation. Fragmentation products with 3′P groups are converted to 3′OH groups by Antarctic phosphatase, enabling these products to be ligated to 3′ adapters. This step is followed by reverse transcription, size selection and PCR to produce a cDNA library that is suitable for high-throughput sequencing. This figure is partially reproduced from Kertesz et al. (ref. 10).

  2. Figure 2: Detailed analysis pipeline of PARS.

    Deep sequencing reads from SOLiD or Illumina sequencing can be mapped to the transcriptome using the software Bowtie. (Optional) For Illumina reads, the quality of the sequenced reads can be determined using the program FastQC. Bases with poor-quality scores at the ends of the reads (shown in blue bars) can be trimmed before aligning the reads to the transcriptome to enable more accurate mapping. For every base along a transcript, the number of double- and single-stranded reads that start at that base is counted. Structural information is then inferred for the base 1 nt upstream of the mapped base, because RNase V1 cleaves RNA after a paired base, whereas S1 nuclease cleaves it after an unpaired base. The PARS score, the log ratio of V1 to S1 reads, provides information on whether a base is paired or unpaired. The larger the PARS score, the more likely a base is to be paired (and part of a nucleotide that 'terminates' a double-stranded stretch of RNA). Information on double- and single-stranded regions of the RNA can also be integrated into SeqFold, which is a structure prediction program that uses experimental PARS data to guide computationally predicted structure ensembles to provide more accurate RNA secondary structure models. This figure is partially reproduced from Kertesz et al. (ref. 10).

  3. Figure 3: PARS correctly recapitulates results of RNA footprinting for the P9-9.2 domain of the Tetrahymena ribozyme (ref. 10).

    (a–c) RNase V1 cleaves the folded P9-9.2 domain of the Tetrahymena ribozyme at two distinct sites, which are accurately captured by PARS (ref. 10). (a) The double-stranded signal of PARS obtained using the double-stranded cutter RNase V1 (red bars) is shown as the number of sequence reads mapped along each nucleotide of the P9-9.2 domain. Also shown is the signal obtained on the P9-9.2 domain using traditional footprinting (black line) and a semiautomated quantification of the RNase V1 lane shown in panel c. Red arrows indicate cleavages that are seen in c. (b) Single-stranded signal of PARS obtained using the single-stranded cutter S1 nuclease (green bars), compared with the signal obtained using traditional footprinting (black line). Green arrows indicate cleavages that are seen in c. (c) The gel resulting from RNase V1 (lane 9) and S1 nuclease (lanes 6, 7 and 8 at pH 7 and lanes 4 and 5 at pH 4.5) digestions. Alkaline hydrolysis (lanes 1 and 2), RNase T1 ladder (lane 3) and no RNase treatment (lane 10) are also shown. Arrows mark nucleotides that were identified by PARS as double stranded (red arrows) or single stranded (green arrows). (d) The known secondary structure of the P9-9.2 domain. Arrows mark nucleotides that were identified by both PARS and enzymatic probing as double stranded (red arrows) or single stranded (green arrows). This figure is reproduced from Kertesz et al. (ref. 10).

  4. Figure 4: Probing melting dynamics of the RNA structure across different temperatures.

    (a) Raw RNase V1 sequencing reads for the first 150 nt of the SCR1 RNA at 23 °C, 30 °C, 37 °C, 55 °C and 75 °C. Bases that melt at the respective temperatures are shown as colored bars at the bottom of the graph, and the color indicates the highest temperature at which the structure was found to be stable. (b) RNA secondary structure model of the full-length SCR1 RNA (ref. 35). A tertiary interaction is indicated by gray dotted lines. The melting transitions obtained from PARTE are indicated as colored dots. This figure is reproduced from Wan et al. (ref. 16).
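
The scoring step summarized in the Figure 2 legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name pars_scores, the base-2 logarithm and the pseudocount of 1 (used to avoid dividing by zero) are assumptions made for illustration, and any normalization for sequencing depth between the V1 and S1 libraries is left out. The input lists hold, for each position of one transcript, the number of reads whose first base maps to that position.

```python
import math
from typing import List, Optional


def pars_scores(v1_starts: List[int],
                s1_starts: List[int],
                pseudocount: float = 1.0) -> List[Optional[float]]:
    """Return one illustrative PARS-like score per base of a transcript.

    v1_starts[i] / s1_starts[i]: number of RNase V1 / S1 nuclease reads
    whose first base maps to position i (0-based).

    A read starting at position i reports on the base 1 nt upstream
    (position i - 1), because the enzyme cut after that base.
    The pseudocount and log base are assumptions of this sketch.
    """
    if len(v1_starts) != len(s1_starts):
        raise ValueError("V1 and S1 count vectors must have equal length")

    n = len(v1_starts)
    # The last base has no downstream read start, so it stays unscored.
    scores: List[Optional[float]] = [None] * n
    for i in range(1, n):
        v1 = v1_starts[i] + pseudocount
        s1 = s1_starts[i] + pseudocount
        # Positive score: more V1 reads, base i - 1 more likely paired.
        # Negative score: more S1 reads, base i - 1 more likely unpaired.
        scores[i - 1] = math.log2(v1 / s1)
    return scores


if __name__ == "__main__":
    v1 = [0, 7, 0, 1]  # hypothetical read-start counts, V1 library
    s1 = [0, 0, 3, 1]  # hypothetical read-start counts, S1 library
    print(pars_scores(v1, s1))
```

With these hypothetical counts, the first base receives a positive score (V1 reads dominate at the next position) and the second base a negative score (S1 reads dominate), matching the interpretation that larger scores indicate paired bases.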

References

  1. Sharp, P.A. The centrality of RNA. Cell 136, 577–580 (2009).
  2. Wan, Y., Kertesz, M., Spitale, R.C., Segal, E. & Chang, H.Y. Understanding the transcriptome through RNA structure. Nat. Rev. Genet. 12, 641–655 (2011).
  3. Breaker, R.R. Prospects for riboswitch discovery and analysis. Mol. Cell 43, 867–879 (2011).
  4. Guo, F., Gooding, A.R. & Cech, T.R. Structure of the Tetrahymena ribozyme: base triple sandwich and metal ion at the active site. Mol. Cell 16, 351–362 (2004).
  5. Deigan, K.E., Li, T.W., Mathews, D.H. & Weeks, K.M. Accurate SHAPE-directed RNA structure determination. Proc. Natl. Acad. Sci. USA 106, 97–102 (2009).
  6. Gornicki, P. et al. Use of lead(II) to probe the structure of large RNA's. Conformation of the 3′ terminal domain of E. coli 16S rRNA and its involvement in building the tRNA binding sites. J. Biomol. Struct. Dyn. 6, 971–984 (1989).
  7. Auron, P.E., Weber, L.D. & Rich, A. Comparison of transfer ribonucleic acid structures using cobra venom and S1 endonucleases. Biochemistry 21, 4700–4706 (1982).
  8. Watts, J.M. et al. Architecture and secondary structure of an entire HIV-1 RNA genome. Nature 460, 711–716 (2009).
  9. Wilkinson, K.A. et al. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states. PLoS Biol. 6, e96 (2008).
  10. Kertesz, M. et al. Genome-wide measurement of RNA secondary structure in yeast. Nature 467, 103–107 (2010).
  11. Zheng, Q. et al. Genome-wide double-stranded RNA sequencing reveals the functional significance of base-paired RNAs in Arabidopsis. PLoS Genet. 6, e1001141 (2010).
  12. Underwood, J.G. et al. FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing. Nat. Methods 7, 995–1001 (2010).
  13. Lucks, J.B. et al. Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc. Natl. Acad. Sci. USA 108, 11063–11068 (2011).
  14. Li, F. et al. Global analysis of RNA secondary structure in two metazoans. Cell Rep. 1, 69–82 (2012).
  15. Ouyang, Z., Snyder, M.P. & Chang, H.Y. SeqFold: Genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data. Genome Res. (2012).
  16. Wan, Y. et al. Genome-wide measurement of RNA folding energies. Mol. Cell 48, 169–181 (2012).
  17. Merino, E.J., Wilkinson, K.A., Coughlan, J.L. & Weeks, K.M. RNA structure analysis at single nucleotide resolution by selective 2′-hydroxyl acylation and primer extension (SHAPE). J. Am. Chem. Soc. 127, 4223–4231 (2005).
  18. Low, J.T. & Weeks, K.M. SHAPE-directed RNA secondary structure prediction. Methods 52, 150–158 (2010).
  19. Wurst, R.M., Vournakis, J.N. & Maxam, A.M. Structure mapping of 5′-32P-labeled RNA with S1 nuclease. Biochemistry 17, 4493–4499 (1978).
  20. Lowman, H.B. & Draper, D.E. On the recognition of helical RNA by cobra venom V1 nuclease. J. Biol. Chem. 261, 5396–5403 (1986).
  21. Ehresmann, C. et al. Probing the structure of RNAs in solution. Nucleic Acids Res. 15, 9109–9128 (1987).
  22. Lee, A., Hansen, K.D., Bullard, J., Dudoit, S. & Sherlock, G. Novel low abundance and transient RNAs in yeast revealed by tiling microarrays and ultra high-throughput sequencing are not conserved across closely related yeast species. PLoS Genet. 4, e1000299 (2008).
  23. Neil, H. et al. Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature 457, 1038–1042 (2009).
  24. Xu, Z. et al. Bidirectional promoters generate pervasive transcription in yeast. Nature 457, 1033–1037 (2009).
  25. ENCODE Project Consortium et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799–816 (2007).
  26. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
  27. Trapnell, C., Pachter, L. & Salzberg, S.L. TopHat: discovering splice junctions with RNA-seq. Bioinformatics 25, 1105–1111 (2009).
  28. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
  29. Rumble, S.M. et al. SHRiMP: accurate mapping of short color-space reads. PLoS Comput. Biol. 5, e1000386 (2009).
  30. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat. Methods 5, 621–628 (2008).
  31. Ding, Y., Chan, C.Y. & Lawrence, C.E. RNA secondary structure prediction by centroids in a Boltzmann weighted ensemble. RNA 11, 1157–1166 (2005).
  32. Saldanha, A.J. Java Treeview--extensible visualization of microarray data. Bioinformatics 20, 3246–3248 (2004).
  33. Darty, K., Denise, A. & Ponty, Y. VARNA: Interactive drawing and editing of the RNA secondary structure. Bioinformatics 25, 1974–1975 (2009).
  34. Quail, M.A. et al. A large genome center's improvements to the Illumina sequencing system. Nat. Methods 5, 1005–1010 (2008).
  35. Zwieb, C., van Nues, R.W., Rosenblad, M.A., Brown, J.D. & Samuelsson, T. A nomenclature for all signal recognition particle RNAs. RNA 11, 7–13 (2005).

Author information

Affiliations

  1. Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, USA.

    • Yue Wan,
    • Kun Qu &
    • Howard Y Chang
  2. Program in Epithelial Biology, Stanford University School of Medicine, Stanford, California, USA.

    • Yue Wan,
    • Kun Qu &
    • Howard Y Chang
  3. The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA.

    • Zhengqing Ouyang
  4. Department of Biomedical Engineering, University of Connecticut, Storrs, Connecticut, USA.

    • Zhengqing Ouyang

Contributions

Y.W. and H.Y.C. developed the protocol and designed the experiments; Y.W. performed the experiments; K.Q. analyzed the data; Z.O. developed the SeqFold pipeline; Y.W. and H.Y.C. wrote the paper with contributions from all authors.

Competing financial interests

Y.W. and H.Y.C. are named as inventors on a patent application filed on the PARS method by Weizmann Institute and Stanford University.
