Identification of important, functional small RNA (sRNA) species is currently hampered by the lack of reliable and sensitive methods to isolate and characterize them. We have developed a method, termed target-enrichment of sRNAs (TEsR), that enables targeted sequencing of rare sRNAs and diverse precursor and mature forms of sRNAs not detectable by current standard sRNA sequencing methods. It is based on the amplification of full-length sRNA molecules, production of biotinylated RNA probes, hybridization to one or multiple targeted RNAs, removal of nontargeted sRNAs and sequencing. By this approach, target sRNAs can be enriched by a factor of 500–30,000 while maintaining strand specificity. TEsR enriches for sRNAs irrespective of length or different molecular features, such as the presence or absence of a 5′ cap or of secondary structures or abundance levels. Moreover, TEsR allows the detection of the complete sequence (including sequence variants, and 5′ and 3′ ends) of precursors, as well as intermediate and mature forms, in a quantitative manner. A well-trained molecular biologist can complete the TEsR procedure, from RNA extraction to sequencing library preparation, within 4–6 d.
We thank all P.C. and F.d.d.F. laboratory members, especially G. Pascarella, K. Hashimoto and A. Bonetti, for insightful discussions. F.d.d.F.'s laboratory is supported by Associazione Italiana per la Ricerca sul Cancro (AIRC; application 12971), AriSLA (project 'DDRNA and ALS'), Worldwide Cancer Research (Association for International Cancer Research (AICR), Research Infrastructure Fund (RIF) 14-1331), Research EPIGEN, Fondazione Cariplo (grants 2014-1215 and 2014-0812), Progetti di Ricerca di Interesse Nazionale (PRIN) 2010–2011, Fondazione Telethon (GGP12059), the Human Frontier Science Program (contract RGP 0014/2012) and a European Research Council advanced grant (322726). P.C.'s laboratory is supported by a Research Grant from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) to the RIKEN Center for Life Science Technologies. J.A. is supported by Marie Curie Initial Training Networks (FP7 PEOPLE 2012 ITN (CodeAge project no: 316354)). F.R. is supported by Fondazione Italiana per la Ricerca sul Cancro (FIRC, application number 12476).
The authors declare no competing financial interests.
Integrated supplementary information
(a) Overview of the approach. The sRNA sample is ligated to a 3′ DNA linker, which then acts as the priming site for the cDNA synthesis primer. A 5′ RNA adapter is ligated to the 3′ linker-ligated RNA sample before cDNA synthesis. After cDNA synthesis, two options are available for specific qPCR of the target sRNA. Each option pairs a primer specific to the sRNA sequence with a second primer that binds either the 3′ adapter or the 5′ adapter, allowing flexibility in primer design. (b) Example application of this qPCR method to miR29b1 as a quality check during the TEsR procedure. A universal reverse primer and a forward primer specific to miR29b1 were used. The presence of qRT-PCR products was confirmed on an agarose gel, and quantitative enrichment of the library was confirmed by qPCR of miR29b1 in enriched and unenriched libraries.
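The comparison of enriched and unenriched libraries in (b) can be turned into a fold-enrichment estimate from the two qPCR threshold cycles (Ct). The following is a minimal sketch, not the authors' analysis: it assumes equal library input in both reactions and an amplification factor of 2 per cycle (perfect efficiency), and the function name and the Ct values in the example are hypothetical.

```python
def fold_enrichment(ct_unenriched, ct_enriched, amplification_factor=2.0):
    """Estimate target fold enrichment from two qPCR Ct values.

    Assumes the same amount of library went into both reactions and that
    the target is multiplied by `amplification_factor` in every cycle
    (2.0 corresponds to 100% PCR efficiency). Both are assumptions.
    """
    if amplification_factor <= 1.0:
        raise ValueError("amplification_factor must be greater than 1")
    # A lower Ct in the enriched library means more starting template.
    delta_ct = ct_unenriched - ct_enriched
    return amplification_factor ** delta_ct


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    print(fold_enrichment(ct_unenriched=30.0, ct_enriched=20.0))
```

A Ct difference of 10 cycles corresponds to roughly a 1,000-fold enrichment under these assumptions; a lower real PCR efficiency would reduce the estimate.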
Specific probes (RNA baits) were designed for three small RNA targets, including miR29b1, Snord68, and Snord70. Each sRNA had one probe, complementary to the full length of the annotated mature sRNA gene. The exact locations on the mouse mm10 reference genome are shown.
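A probe complementary to the full length of a target sRNA is the reverse complement of the target sequence. The sketch below illustrates that step only; the function name and the example sequence are hypothetical and are not the probe sequences used in the protocol.

```python
# Watson-Crick complement table for RNA (upper- and lowercase).
_RNA_COMPLEMENT = str.maketrans("ACGUacgu", "UGCAugca")


def antisense_rna_probe(target):
    """Return the RNA reverse complement of `target`, written 5' to 3'.

    `target` may be given as RNA or DNA; any T is read as U.
    """
    rna = target.replace("T", "U").replace("t", "u")
    invalid = set(rna) - set("ACGUacgu")
    if invalid:
        raise ValueError(f"unexpected characters in target: {sorted(invalid)}")
    # Complement every base, then reverse so the probe reads 5' to 3'.
    return rna.translate(_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Made-up 10-nt sequence, for illustration only.
    print(antisense_rna_probe("UAGCACCAUU"))
```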
The secondary structures of Snord68, Snord69, and miR29b1 were predicted by the Vienna RNA Fold webserver (V2.3.1, Institute for Theoretical Chemistry, University of Vienna; available at http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi) based on their primary sequences and estimated minimum free energy structures. The color bar shows the base-pair probability of each nucleotide. miR29b1 is shown as the precursor form, with both the 3′ and 5′ mature miRNAs and the stem-loop.
Supplementary Figure 4 Strategy for designing and applying multiple probes in a single capture reaction.
(a) An example of designing tiling probes to be pooled in one capture tube to detect multiple RNA products from a 1.2-kb region. Initially, 36 tiling probes were selected and scored with the OligoWiz probe design software (ref. 1). Each probe was scored on potential cross-hybridization (self-dimer; ref. 2), melting temperature, potential folding (stem-loop), position relative to the target, and sequence complexity. The top probes selected by OligoWiz were filtered with AutoDimer to remove probes with high potential for cross-reactivity. The final set of probes can be pooled to capture multiple RNA targets. (b) Results of capture with the probes described in (a). Both blank regions (i.e., with probes but no RNA captured) and an ultra-deep region (over 2.2 million mapped reads) were detected, suggesting that real transcription of novel RNA can be reliably detected by the TEsR approach. (c) Capture of GAPDH from the total RNA sample of the human T19 cell line. The RNA was prepared with the mirVana microRNA Isolation Kit (Ambion) according to the manufacturer's instructions. Before TEsR capture, 2.5 μg of purified total RNA was depleted of ribosomal RNAs using the Ribo-Zero Gold rRNA removal method (Epicentre). Both probes shown in panel 5 were applied in one capture reaction. Probe 1 was designed to span an exon–exon junction. All of the captured mapped reads shown in panel 4 lie within the exons of the sense GAPDH mRNA and the antisense lncRNA. The numbers of reads mapped to the sense (green) and antisense (purple) RNAs are shown separately in panel 6, with 48,512 and 7,943 tags per million, respectively. Panel 1 shows the chromosome 12 ideogram with the GAPDH locus marked by a red bar. Panels 2 and 3 show, respectively, reference genes and transcripts from the Entrez Gene hg19 annotation and the Gencode v19 annotation, with the antisense long noncoding RNA (lncRNA NST00000602946.1) in purple and the sense messenger RNA for GAPDH in green.
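The first step in (a), laying candidate tiling probes across a region before scoring them, can be sketched as below. This is only an illustration of tiling and of one simple sequence property (GC fraction); it does not reproduce OligoWiz or AutoDimer scoring. The legend does not state the probe length or spacing, so the values in the example (60 nt, 30-nt step) are hypothetical, as are the function names.

```python
def tile_windows(region_length, probe_length, step):
    """Return 0-based, half-open (start, end) windows tiled across a region."""
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    if region_length < probe_length:
        return []
    last_start = region_length - probe_length
    return [(s, s + probe_length) for s in range(0, last_start + 1, step)]


def gc_fraction(sequence):
    """Fraction of G or C bases in a sequence (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Hypothetical design: 60-nt probes every 30 nt over a 1,200-nt region.
    windows = tile_windows(region_length=1200, probe_length=60, step=30)
    print(len(windows), windows[0], windows[-1])
```

Each window would then be extracted from the region sequence and passed to whatever scoring and dimer-screening tools are in use.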
(a) Validation of RNA products post reverse transcription. Baits (Bait) were run on a 1% agarose gel. RNase-treated baits (PCR temp) were used as a positive control, proving that the signal obtained was mainly coming from RNA molecules. (b) Testing the production of the adenylated 3′ adapter. The 'Linker' lane is the control with template only, without adenylation, and the 'App Linker' lane shows the adenylated adapter. The quality of the adenylated adapter product can be judged from the proportion of adenylated versus nonadenylated adapter. (c) The upper and lower panels show the Bioanalyzer profiles of precaptured libraries before and after the PCR step, respectively. Green boxes indicate the lower Bioanalyzer marker (1) and other oligos (2). In the upper panel, (2) shows the oligos used for ligation to RNA as well as those used for reverse transcription. After PCR amplification of the cDNA, contamination by undesired oligos (such as adapter dimers) is markedly reduced, as seen in (2) in the lower panel. Blue boxes show library size distributions, and purple boxes show the upper Bioanalyzer marker. The lower Bioanalyzer profile represents a typical profile described at Step 76. The peak at around 100 bp corresponds to the precaptured pool of RNAs of microRNA-like length (the 3′ and 5′ linkers together account for approximately 70 bp). The other, less abundant peak at around 150 bp corresponds to other sRNAs present in the starting RNA samples.
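The size reasoning in (c) is a subtraction: the approximate insert length is the library peak size minus the combined linker length. A minimal sketch, using the approximately 70 bp linker contribution stated in the legend; the function name is hypothetical and Bioanalyzer peak sizes are themselves only approximate.

```python
def approximate_insert_length(library_peak_bp, linkers_bp=70):
    """Estimate the sRNA insert length from a library peak size.

    `linkers_bp` is the combined length of the 3' and 5' linkers
    (approximately 70 bp according to the legend).
    """
    insert = library_peak_bp - linkers_bp
    if insert < 0:
        raise ValueError("peak is shorter than the linkers alone")
    return insert


if __name__ == "__main__":
    for peak in (100, 150):
        print(peak, "bp peak ->", approximate_insert_length(peak), "nt insert")
```

Under this estimate, the 100 bp peak corresponds to inserts of about 30 nt and the 150 bp peak to inserts of about 80 nt.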
Supplementary Figure 6 Protocol steps: high-sensitivity Bioanalyzer profiles of post-captured final libraries.
(a) High Sensitivity Bioanalyzer profiles of post-captured libraries from 4 different biological replicates (lanes 1–4). On the left, 1 (in green) marks the lower Bioanalyzer marker and 2 (in purple) marks the upper Bioanalyzer marker. On the right, lane 5 shows the negative control PCR, in which no template was used. L represents the dsDNA ladder used. (b) Single-nucleotide resolution scheme of post-captured libraries. N represents a given nucleotide according to the sequence of the enriched transcriptome. X represents a given nucleotide according to the complete list of index sequences provided by ScriptSeq™ Index PCR Primers.
(a) Mapping of 7 captured libraries: 2 biological replicates for each of two different treatments of mouse embryonic fibroblast (MEF) cell lines (MEFmmS1 and MEFmmS2; MEFpmS1 and MEFpmS2), one pair of technical replicates for mouse embryonic stem cell lines (MESrep1 and MESrep2), and a negative control library without biotinylated probes (MESnobait). Mapping data are shown for each library: red shows the number of mapped reads, blue shows unmapped reads, yellow shows reads mapped to ribosomal RNA genes, and purple shows low-quality reads (i.e., reads with ambiguous N bases). The mapping data show that the capture protocol removed most ribosomal RNAs and that most sequencing reads are mappable to the reference genome. (b) The base composition at each sequencing round (75 rounds in total for the first read, shown on the horizontal axis from 1 to 75). Sequencing was performed using a MiSeq 150-cycle kit. Colors represent the four nucleotides A, C, G, and T (in that order from bottom to top) and an ambiguous class for N nucleotides (black). The vertical axis shows the count of each nucleotide type in each of the 75 sequencing rounds. In every sequencing round, all four nucleotide classes were detected.
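The per-round base composition in (b) amounts to counting, for each read position, how many reads carry A, C, G, T or an ambiguous base. The sketch below shows that tally on plain read strings; it is not the tool used to produce the figure, and the function name is hypothetical.

```python
from collections import Counter


def base_composition_per_cycle(reads, n_cycles):
    """Count A, C, G, T and N at each of the first `n_cycles` read positions.

    Any character other than A, C, G or T is counted as N. Reads shorter
    than `n_cycles` simply contribute nothing to the later positions.
    """
    counts = [Counter() for _ in range(n_cycles)]
    for read in reads:
        for cycle, base in enumerate(read[:n_cycles]):
            base = base.upper()
            counts[cycle][base if base in "ACGT" else "N"] += 1
    return counts


if __name__ == "__main__":
    # Toy reads, for illustration only.
    composition = base_composition_per_cycle(["ACGT", "AAGN", "CC"], n_cycles=4)
    for cycle, tally in enumerate(composition, start=1):
        print(cycle, dict(tally))
```

A round in which one nucleotide class is absent or dominant across all reads would point to low library complexity or adapter-derived sequence at that position.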
Supplementary Figure 8 Detection of functional sRNA by TEsR—the rare telomeric DNA damage response RNA (tDDRNA).
(a) TEsR generated deep sequencing information for the rare telomeric DNA damage response RNA (tDDRNA) at a depth of thousands of reads. To avoid contamination by spurious reads, only reads with an exact sequence match to the expected telomeric repeats (TTAGGG) were counted; telomeric reads with mismatches or soft-clipped portions were removed with ad hoc R scripts. The majority of reads were 18–22 nt in length. (b) In contrast, tDDRNAs were hardly detected (4 reads out of a total of 207 million reads) by deep HiSeq sequencing of small RNA samples or total RNA samples.
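The exact-match filter described in (a) was implemented by the authors with ad hoc R scripts that are not shown. The following Python sketch illustrates one way to express the same idea, under stated assumptions: a read counts as telomeric only if it is a perfect substring of a tandem repeat array, in any phase. The repeat unit is a parameter (the default, TTAGGG, is the canonical vertebrate repeat), accepting the reverse-complement strand is an optional assumption, and all function names are hypothetical.

```python
from collections import Counter

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_exact_telomeric(read, repeat="TTAGGG", both_strands=True):
    """True if `read` matches a perfect tandem repeat array with no mismatch."""
    read = read.upper()
    if not read:
        return False
    units = [repeat]
    if both_strands:
        # Also accept the reverse-complement strand (CCCTAA for TTAGGG).
        units.append(repeat.translate(_DNA_COMPLEMENT)[::-1])
    for unit in units:
        # Long enough to contain the read starting at any repeat phase.
        array = unit * (len(read) // len(unit) + 2)
        if read in array:
            return True
    return False


def telomeric_length_distribution(reads, repeat="TTAGGG", both_strands=True):
    """Tally read lengths for the reads that pass the exact-match filter."""
    return Counter(
        len(r) for r in reads if is_exact_telomeric(r, repeat, both_strands)
    )


if __name__ == "__main__":
    # Toy reads, for illustration only; the second has a single mismatch.
    reads = ["GGGTTAGGGTTAGGGTTAG", "TTAGGGTTAGCGTTAGGG", "CCCTAACCCTAACCCTAA"]
    print(telomeric_length_distribution(reads))
```

Reads with soft-clipped portions would need to be excluded upstream from the alignment records, which this sketch does not handle.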
1. Wernersson, R., Juncker, A.S. & Nielsen, H.B. Probe selection for DNA microarrays using OligoWiz. Nat. Protoc. 2, 2677–2691 (2007).
2. Vallone, P.M. & Butler, J.M. AutoDimer: a screening tool for primer-dimer and hairpin structures. BioTechniques 37, 226–231 (2004).
About this article
Cite this article
Nguyen, Q., Aguado, J., Iannelli, F. et al. Target-enrichment sequencing for detailed characterization of small RNAs. Nat Protoc 13, 768–786 (2018). https://doi.org/10.1038/nprot.2018.001