FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing


Classical approaches to determine structures of noncoding RNA (ncRNA) probed only one RNA at a time with enzymes and chemicals, using gel electrophoresis to identify reactive positions. To accelerate RNA structure inference, we developed fragmentation sequencing (FragSeq), a high-throughput RNA structure probing method that uses high-throughput RNA sequencing of fragments generated by digestion with nuclease P1, which specifically cleaves single-stranded nucleic acids. In experiments probing the entire mouse nuclear transcriptome, we accurately and simultaneously mapped single-stranded RNA regions in multiple ncRNAs with known structure. We probed in two cell types to verify reproducibility. We also identified and experimentally validated structured regions in ncRNAs with, to our knowledge, no previously reported probing data.
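One way to picture the read-out described above: because nuclease P1 cuts single-stranded nucleotides, positions where sequenced fragments begin more often in a nuclease-treated sample than in an untreated control are candidates for unpaired regions. The Python sketch below is an illustration only and is not the published FragSeq algorithm (that implementation is in the Supplementary Software). The function names, the use of read 5′ ends as a proxy for cleavage sites, the untreated control and the pseudocount are all assumptions made for this example.

```python
import math
from collections import Counter


def end_counts(read_starts, length):
    """Count how many reads begin at each 0-based transcript position."""
    tally = Counter(read_starts)
    return [tally.get(i, 0) for i in range(length)]


def cut_scores(nuclease_starts, control_starts, length, pseudocount=1.0):
    """Per-position log2 enrichment of read starts, nuclease vs. control.

    Hypothetical scoring scheme: counts are smoothed with a pseudocount and
    normalized to each sample's total so that sequencing depth cancels out.
    """
    nuc = end_counts(nuclease_starts, length)
    ctl = end_counts(control_starts, length)
    nuc_total = sum(nuc) + pseudocount * length
    ctl_total = sum(ctl) + pseudocount * length
    return [
        math.log2(((n + pseudocount) / nuc_total) / ((c + pseudocount) / ctl_total))
        for n, c in zip(nuc, ctl)
    ]


# Made-up read start positions on a 6-nt toy transcript.
scores = cut_scores([2, 2, 2, 5], [0, 2, 5, 5], length=6)
```

Positive scores mark positions cut more often in the nuclease sample (position 2 in the toy data), while scores near or below zero suggest no enrichment. A real analysis would also need to handle mapping ambiguity and transcript abundance, which this sketch ignores.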


Gene Expression Omnibus



A.V.U. was supported in part by US National Institutes of Health (NIH) bioinformatics training grant 1 T32 GM070386-01 and by a US National Science Foundation Graduate Research fellowship. S.K. was supported in part by NIH National Human Genome Research Institute grant U41 HG004568-01. C.S.O. was supported by California Institute for Regenerative Medicine training grant T3-00006. This study was funded in part by NIH R01HG004002 to D.H.M. and NIH 1R03DA026061-01 to S.R.S. We thank D. Bernick, S. Kuersten and O. Uhlenbeck for helpful discussions; Y. Ponty for adding the feature to display enzymatic/chemical modifications to VARNA, the program used to visualize our probing data; E. Farias-Hesson and N. Pourmand of the University of California Santa Cruz Genome Sequencing Center for preparing samples; workers at ABI for carrying out the sequencing; and M. Storm and F. Ng of ABI for facilitating that sequencing run.

Author information

Author notes

    • Jason G Underwood & Andrew V Uzilov

    These authors contributed equally to this work.


  1. Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, California, USA.

    • Jason G Underwood, Sofie R Salama & David Haussler

  2. Center for Biomolecular Science and Engineering, Baskin School of Engineering, University of California Santa Cruz, Santa Cruz, California, USA.

    • Jason G Underwood, Sofie R Salama & David Haussler

  3. Department of Biomolecular Engineering, Baskin School of Engineering, University of California Santa Cruz, Santa Cruz, California, USA.

    • Andrew V Uzilov, Sol Katzman, Courtney S Onodera, Todd M Lowe, Sofie R Salama & David Haussler

  4. Department of Physics and Astronomy, University of Rochester, Rochester, New York, USA.

    • Jacob E Mainzer

  5. Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, New York, USA.

    • David H Mathews

  6. Present affiliations: Pacific Biosciences, Inc., Menlo Park, California, USA (J.G.U.) and Center for Biomolecular Science and Engineering, Baskin School of Engineering, University of California Santa Cruz, Santa Cruz, California, USA (S.K.).

    • Jason G Underwood & Sol Katzman



J.G.U. designed and carried out the experiments. A.V.U. designed and carried out the bioinformatics analysis, except for preparing the read mappings, which S.K. did, with C.S.O. contributing data. J.E.M. programmed additional features in the RNAstructure software. J.G.U., A.V.U. and S.R.S. wrote the manuscript. S.R.S., D.H.M., T.M.L. and D.H. directed the research.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Sofie R Salama.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–12, Supplementary Table 1, Supplementary Notes 1–3, Supplementary Discussion

Zip files

  1. Supplementary Data 1

    Stockholm-format (machine-readable) multiple alignment of U15b C/D box snoRNA homologs, containing structure models that were evaluated. See file for detailed comments.

  2. Supplementary Data 2

    Stockholm-format (machine-readable) multiple alignment of U22 C/D box snoRNA homologs, containing structure models that were evaluated. See file for detailed comments.

  3. Supplementary Data 3

    Stockholm-format (machine-readable) multiple alignment of U97 C/D box snoRNA homologs, containing structure models that were evaluated. See file for detailed comments.

  4. Supplementary Data 4

    FASTA-format file of sequences used for filtering out sequencing reads prior to mapping to genome (see Methods).

  5. Supplementary Data 5

    Six-column BED-format file containing genomic coordinates (mm9 genome assembly) of all RNAs examined in this study. This can be uploaded to the UCSC Genome Browser as a custom track.

  6. Supplementary Software

    FragSeq algorithm implementation, configuration files and Readme. All FragSeq algorithm software, scripts and configuration files needed to reproduce the analysis in this paper are provided. The Readme file contains complete instructions on how to rerun our analysis. However, read mappings are not provided owing to their large size and have to be downloaded from the GEO (accession number is listed in the paper; see the Readme file). The script is also provided (Supplementary Note 3).
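
For readers who want to inspect a six-column BED file such as Supplementary Data 5 outside the UCSC Genome Browser, the following Python sketch shows one way to read the standard BED6 columns (chrom, chromStart, chromEnd, name, score, strand; starts are 0-based and ends are exclusive). It is not part of the Supplementary Software. The names `Bed6` and `parse_bed6` are hypothetical, and the example records are made up for illustration rather than taken from the actual file.

```python
from typing import Iterable, List, NamedTuple


class Bed6(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    name: str
    score: str   # kept as text; some files use "." or non-integer values
    strand: str

    @property
    def length(self) -> int:
        """Number of bases covered by the interval."""
        return self.end - self.start


def parse_bed6(lines: Iterable[str]) -> List[Bed6]:
    """Parse BED6 records, skipping blank, comment, track and browser lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        chrom, start, end, name, score, strand = fields[:6]
        records.append(Bed6(chrom, int(start), int(end), name, score, strand))
    return records


# Made-up records; real coordinates would come from the supplementary file.
example = [
    'track name="example"',
    "chr1\t100\t250\texample_rna_1\t0\t+",
    "chr2\t5000\t5080\texample_rna_2\t0\t-",
]
records = parse_bed6(example)
```

With a real file, the same function can be called on an open file handle, for example `parse_bed6(open(path))`, where `path` is wherever the unzipped file was saved.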