
Hidden specificity in an apparently nonspecific RNA-binding protein


Nucleic-acid-binding proteins are generally viewed as either specific or nonspecific, depending on characteristics of their binding sites in DNA or RNA1,2. Most studies have focused on specific proteins, which identify cognate sites by binding with highest affinities to regions with defined signatures in sequence, structure or both1,2,3,4. Proteins that bind to sites devoid of defined sequence or structure signatures are considered nonspecific1,2,5. Substrate binding by these proteins is poorly understood, and it is not known to what extent seemingly nonspecific proteins discriminate between different binding sites, aside from those sequestered by nucleic acid structures6. Here we systematically examine substrate binding by the apparently nonspecific RNA-binding protein C5, and find clear discrimination between different binding site variants. C5 is the protein subunit of the transfer RNA processing ribonucleoprotein enzyme RNase P from Escherichia coli. The protein binds 5′ leaders of precursor tRNAs at a site without sequence or structure signatures. We measure functional binding of C5 to all possible sequence variants in its substrate binding site, using a high-throughput sequencing kinetics approach (HITS-KIN) that simultaneously follows processing of thousands of RNA species. C5 binds different substrate variants with affinities varying by orders of magnitude. The distribution of functional affinities of C5 for all substrate variants resembles affinity distributions of highly specific nucleic acid binding proteins. Unlike these specific proteins, C5 does not bind its physiological RNA targets with the highest affinity, but with affinities near the median of the distribution, a region that is not associated with a sequence signature. We delineate defined rules governing substrate recognition by C5, which reveal specificity that is hidden in cellular substrates for RNase P. 
Our findings suggest that apparently nonspecific and specific RNA-binding modes may not differ fundamentally, but represent distinct parts of common affinity distributions.


Figure 1: Processing of precursor tRNA with randomized leader sequences.
Figure 2: Discrimination of C5 between different precursor tRNAMet leader sequences.
Figure 3: Rules for sequence discrimination by C5.


  1. Gupta, A. & Gribskov, M. The role of RNA sequence and structure in RNA–protein interactions. J. Mol. Biol. 409, 574–587 (2011)


  2. von Hippel, P. H. & Berg, O. G. On the specificity of DNA–protein interactions. Proc. Natl Acad. Sci. USA 83, 1608–1612 (1986)


  3. Ray, D. et al. Rapid and systematic analysis of the RNA recognition specificities of RNA-binding proteins. Nature Biotechnol. 27, 667–670 (2009)


  4. Campbell, Z. T. et al. Cooperativity in RNA–protein interactions: global analysis of RNA binding specificity. Cell Rep. 1, 570–581 (2012)


  5. Singh, R. & Valcárcel, J. Building specificity with nonspecific RNA-binding proteins. Nature Struct. Mol. Biol. 12, 645–653 (2005)


  6. Zhuang, F., Fuchs, R. T., Sun, Z., Zheng, Y. & Robb, G. B. Structural bias in T4 RNA ligase-mediated 3′-adapter ligation. Nucleic Acids Res. 40, e54 (2012)


  7. Kurz, J. C. & Fierke, C. A. Ribonuclease P: a ribonucleoprotein enzyme. Curr. Opin. Chem. Biol. 4, 553–558 (2000)


  8. Smith, J. K., Hsieh, J. & Fierke, C. A. Importance of RNA–protein interactions in bacterial ribonuclease P structure and catalysis. Biopolymers 87, 329–338 (2007)


  9. Reiter, N. J. et al. Structure of a bacterial ribonuclease P holoenzyme in complex with tRNA. Nature 468, 784–789 (2010)


  10. Rueda, D., Hsieh, J., Day-Storms, J. J., Fierke, C. A. & Walter, N. G. The 5′ leader of precursor tRNAAsp bound to the Bacillus subtilis RNase P holoenzyme has an extended conformation. Biochemistry 44, 16130–16139 (2005)


  11. Herschlag, D. The role of induced fit and conformational changes of enzymes in specificity and catalysis. Bioorg. Chem. 16, 62–96 (1988)


  12. Fersht, A. R. Enzyme Structure and Mechanism (Freeman, 1985)


  13. Cornish-Bowden, A. Enzyme specificity: its meaning in the general case. J. Theor. Biol. 108, 451–457 (1984)


  14. Cleland, W. W. in Isotope Effects in Chemistry and Biology (eds Kohen, A. & Limbach, H. H.) 915–930 (CRC Press, 2006)

  15. Schellenberger, V., Siegel, R. A. & Rutter, W. J. Analysis of enzyme specificity by multiple substrate kinetics. Biochemistry 32, 4344–4348 (1993)


  16. Lorenz, C. et al. Genomic SELEX for Hfq-binding RNAs identifies genomic aptamers predominantly in antisense transcripts. Nucleic Acids Res. 38, 3794–3808 (2010)


  17. Pitt, J. N. & Ferré-D'Amaré, A. R. Rapid construction of empirical RNA fitness landscapes. Science 330, 376–379 (2010)


  18. Badis, G. et al. Diversity and complexity in DNA recognition by transcription factors. Science 324, 1720–1723 (2009)


  19. Rowe, W. et al. Analysis of a complete DNA–protein affinity landscape. J. R. Soc. Interface 7, 397–408 (2010)


  20. Nutiu, R. et al. Direct measurement of DNA affinity landscapes on a high-throughput sequencing instrument. Nature Biotechnol. 29, 659–664 (2011)


  21. Stormo, G. D. & Zhao, Y. Determining the specificity of protein–DNA interactions. Nature Rev. Genet. 11, 751–760 (2010)


  22. SantaLucia, J. Jr & Turner, D. H. Measuring the thermodynamics of RNA secondary structure formation. Biopolymers 44, 309–319 (1997)

  23. Forsdyke, D. R. Calculation of folding energies of single-stranded nucleic acid sequences: conceptual issues. J. Theor. Biol. 248, 745–753 (2007)


  24. Maerkl, S. J. & Quake, S. R. A systems approach to measuring the binding energy landscapes of transcription factors. Science 315, 233–237 (2007)


  25. Zhao, Y. & Stormo, G. Quantitative analysis demonstrates most transcription factors require only simple models of specificity. Nature Biotechnol. 29, 480–483 (2011)


  26. Sun, L., Campbell, F. E., Yandek, L. E. & Harris, M. E. Binding of C5 protein to P RNA enhances the rate constant for catalysis for P RNA processing of pre-tRNAs lacking a consensus (+ 1)/C(+ 72) pair. J. Mol. Biol. 395, 1019–1037 (2010)


  27. Leontis, N. B., Lescoute, A. & Westhof, E. The building blocks and motifs of RNA architecture. Curr. Opin. Struct. Biol. 16, 279–287 (2006)


  28. Snoussi, K. & Leroy, J. L. Imino proton exchange and base-pair kinetics in RNA duplexes. Biochemistry 40, 8898–8904 (2001)


  29. LaRiviere, F. J., Wolfson, A. D. & Uhlenbeck, O. C. Uniform binding of aminoacyl-tRNAs to elongation factor Tu by thermodynamic compensation. Science 294, 165–168 (2001)


  30. Stormo, G. D., Schneider, T. D. & Gold, L. Quantitative analysis of the relationship between nucleotide sequence and functional activity. Nucleic Acids Res. 14, 6661–6679 (1986)


  31. Guo, X. et al. RNA-dependent folding and stabilization of C5 protein during assembly of the E. coli RNase P holoenzyme. J. Mol. Biol. 360, 190–203 (2006)


  32. Christian, E. L., McPheeters, D. S. & Harris, M. E. Identification of individual nucleotides in the bacterial ribonuclease P ribozyme adjacent to the pre-tRNA cleavage site by short-range photo-cross-linking. Biochemistry 37, 17618–17628 (1998)


  33. Cha, S. Kinetics of enzyme reactions with competing alternative substrates. Mol. Pharmacol. 4, 621–629 (1968)


  34. Chan, P. P. & Lowe, T. M. GtRNAdb: a database of transfer RNA genes detected in genomic sequence. Nucleic Acids Res. 37, D93–D97 (2009)


  35. Northrop, D. B. Fitting enzyme-kinetic data to V/K. Anal. Biochem. 132, 457–461 (1983)


  36. Northrop, D. B. Rethinking fundamentals of enzyme action. Adv. Enzymol. 73, 25–55 (1999)


  37. Theil, H. Economic Forecasts and Policy (North Holland Publishing, 1961)


  38. Bendel, R. B. & Afifi, A. A. Comparison of stopping rules in forward “stepwise” regression. J. Am. Stat. Assoc. 72, 46–53 (1977)



We particularly thank T. Nilsen for comments on the manuscript. We are grateful to G. Stormo for discussion, M. Adams for support with the Illumina sequencing, and H.-C. Lin for technical assistance. This work was supported by the US National Institutes of Health (NIH) (GM067700, GM099720 and CSTA UL1RR024989 to E.J.; GM056740 and GM096000 to M.E.H.; T32 GM008056 to C.N.N.). U.-P.G. received a DFG fellowship.

Author information



U.-P.G., M.E.H. and E.J. designed the study. U.-P.G., L.E.Y., C.N.N. and F.E.C. performed the experiments. V.E.A. contributed to the development of the data analysis framework. D.A. developed and performed the modelling for binding models. U.-P.G., D.A., M.E.H. and E.J. analysed the data. U.-P.G., M.E.H. and E.J. wrote the paper.

Corresponding authors

Correspondence to Michael E. Harris or Eckhard Jankowsky.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 C5 binding site in the 87 ptRNA leaders in E. coli.

a–c, Alignment and sequence logos for the C5 binding site in all 87 ptRNA leaders encoded by E. coli. Binding of C5 to the consecutive ptRNA positions −3 to −8 is well established, based on a crystal structure9 and biochemical evidence10; that is, the looping of bases seen for certain RNA- and DNA-binding proteins does not occur with C5. Consistent with this idea, we did not detect any sequence motif with the MEME software when including positions −1 to −10. a, Sequence alignment. Sequences were aligned with CLUSTAL. Coloured squares indicate the bases (C, blue; A, green; U, red; G, black). Anticodon, the anticodon recognized by the tRNA; tRNA#, the tRNA identification number; tRNA type, the amino acid. b, Sequence logo depicting the probability of any base at a given position, based on the alignment in a. The logo was generated with Weblogo. c, Sequence logo for the information content of the alignment in a. The logo was generated with Weblogo.
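The information content plotted in a logo such as panel c can be illustrated with a short calculation. The sketch below is not the Weblogo implementation; it assumes the plain Shannon definition (maximum entropy of 2 bits for four bases minus the observed entropy at each position) and omits any small-sample correction that a logo tool may apply. The function name `information_content` is hypothetical.

```python
import math
from collections import Counter


def information_content(sequences, alphabet="ACGU"):
    """Per-position information content (bits) for equal-length sequences.

    Assumption: plain Shannon definition, no small-sample correction.
    """
    length = len(sequences[0])
    max_bits = math.log2(len(alphabet))  # 2 bits for a four-letter alphabet
    result = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts[base] for base in alphabet)
        if total == 0:
            raise ValueError("no bases from the alphabet at position %d" % pos)
        entropy = 0.0
        for base in alphabet:
            p = counts[base] / total
            if p > 0:
                entropy -= p * math.log2(p)
        result.append(max_bits - entropy)
    return result
```

A position where all four bases are equally frequent scores 0 bits, which is the behaviour expected for a binding site without a sequence signature; a fully conserved position scores 2 bits.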

Extended Data Figure 2 Preparation of DNA libraries for Illumina sequencing.

a, BAR, the indexing barcode; NN, the degenerated barcode. For primer sequences see Methods. RT, reverse transcription. b, DNA libraries (PCR products, a) for samples at the time points indicated. Controls: lane 5, no RNA; lane 6, no reverse transcriptase. c, Read structure. Nucleotides 1 and 2 are degenerated barcode; nucleotides 3–5 are sample barcode (index tag); nucleotides 6–29 are additional leader sequence; nucleotides 30–35 are randomized leader sequence; nucleotides 38 onwards are tRNA.
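The read structure in panel c maps directly onto string slices. The following is a minimal sketch that splits one read into the fields named in the caption, using the 1-based positions given there; the names `ParsedRead` and `parse_read` are hypothetical, and nucleotides 36 and 37, which the caption does not assign, are simply skipped.

```python
from typing import NamedTuple


class ParsedRead(NamedTuple):
    degenerate_barcode: str  # nucleotides 1-2
    sample_barcode: str      # nucleotides 3-5 (index tag)
    leader_extension: str    # nucleotides 6-29 (additional leader sequence)
    randomized_leader: str   # nucleotides 30-35
    trna: str                # nucleotides 38 onwards


def parse_read(read: str) -> ParsedRead:
    """Split a read into fields using the 1-based positions from the caption."""
    if len(read) < 38:
        raise ValueError("read is too short to contain tRNA sequence")
    return ParsedRead(
        degenerate_barcode=read[0:2],
        sample_barcode=read[2:5],
        leader_extension=read[5:29],
        randomized_leader=read[29:35],
        # Positions 36-37 are not assigned in the caption and are ignored here.
        trna=read[37:],
    )
```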

Extended Data Figure 3 Multiple turnover reaction scheme.

E, enzyme; ES1...i, individual enzyme–substrate complexes; K1...i, individual functional binding constants; S1...i, individual substrate variants; V1...i, individual reaction rate constants.
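When many substrates compete for one enzyme, as in this scheme, the standard treatment of competing alternative substrates says that each substrate is consumed at a rate proportional to its specificity constant times its concentration. Integrating the ratio of two such rates gives ln(fraction of variant remaining) = krel × ln(fraction of reference remaining). The sketch below applies that textbook relation; it assumes the fractions remaining are already known and is not necessarily the exact expression used in the HITS-KIN analysis. The function name is hypothetical.

```python
import math


def relative_rate_constant(frac_remaining_variant, frac_remaining_reference):
    """krel of a variant relative to a reference substrate under competition.

    Assumption: textbook internal-competition relation,
    ln(S_i / S_i0) = krel * ln(S_ref / S_ref0).
    """
    for frac in (frac_remaining_variant, frac_remaining_reference):
        if not 0.0 < frac < 1.0:
            raise ValueError("fractions remaining must lie strictly between 0 and 1")
    return math.log(frac_remaining_variant) / math.log(frac_remaining_reference)
```

For example, a variant that is 75% consumed when the reference is 50% consumed has a krel of 2, because ln(0.25) is twice ln(0.5).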

Extended Data Figure 4 Effect of the 21-nucleotide extension on ptRNA processing by RNase P.

a, Relative processing rate constants were measured for three sequence variants from different parts of the affinity distribution by PAGE. Reactions for each sequence variant were conducted in the presence of the randomized population (unlabelled) with equal amounts of substrate with (S/21) and without the 21-nucleotide extension (S/nL). The asterisk marks the position of the radiolabel at the 5′ end of the substrate. Reactions were conducted under the conditions described in the Methods. b, PAGE for the reaction of the reference sequence variant. The time point at 5 min is marked for reference. c, The effects of the 21-nucleotide extension on relative processing rate constants of the three indicated sequence variants. The position of each sequence variant in the affinity distribution of all sequence variants (Fig. 2d) is given for reference by the vertical line above the plot. The number indicates the factor (S/nL)/(S/21) by which the 21-nucleotide extension decreases the relative rate constant of the given sequence variant, given as the average of three independent experiments. The horizontal line approximates the degree of the relative change. The 21-nucleotide extension decreases the krel observed for sequence variant CTCCTG by a factor of 2.3. For the genomically encoded leader sequence AAAAAG, the 21-nucleotide extension decreases krel by a factor of 0.95; that is, the substrate with the extension reacts slightly faster than the substrate without extension. The fast-reacting substrate (TTATAT) is also only minimally affected by the extension (0.92). Together, the data show only minor effects of the 21-nucleotide extension on the position of a given sequence variant in the affinity distribution.

Extended Data Figure 5 Processing of ptRNAMet(-3-8) by RNase P without C5.

Distribution of krel values for processing of ptRNAMet(-3-8) by RNase P without C5 (black line). Data were obtained analogously to those with C5. For comparison, the distribution of krel values with C5 is shown (red line).

Extended Data Figure 6 Sequence logos are only associated with the high-affinity tail of the distribution.

a, Plot of sequence variants ranked from weakest to tightest binder to the specific transcription factor Arid3a (Fig. 2d), based on data published previously18. To facilitate direct comparison to the six-nucleotide binding site of C5, only approximately half of all sequences are shown in the plot, and only six positions (positions two to seven, as indicated) of the eight-nucleotide binding site are shown. The position in the binding site is marked on the right. The brackets mark 0.1% of sequence variants (33 sequences) that bind tightest, fall into the medium, and bind weakest. Sequence logos show the information content in these sequences. The logos were generated with Weblogo. Sequence signatures of the tightest binding variants are highly enriched in physiological substrates of Arid3a18. b, Plot of sequence variants ranked from weakest to tightest binder to another specific transcription factor, Hnf4a, based on data published previously18. Approximately half of all sequences are shown in the plot, and six positions (positions two to seven, as indicated) of the eight-nucleotide binding site. Sequence signatures of the tightest binding variants are highly enriched in physiological substrates of Hnf4a18. c, Plot of sequence variants ranked from slowest to fastest reacting for C5 (Fig. 2e). The brackets mark 1% of sequence variants that react fastest, fall into the medium and react slowest. Sequence logos were generated as in a.
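The ranking and bracketing described in this caption can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code: it assumes a dictionary mapping each sequence variant to its krel, and the function name `rank_and_bracket`, its `fraction` parameter and the choice to centre the middle bracket on the ranked list are all assumptions.

```python
def rank_and_bracket(krel_by_variant, fraction=0.01):
    """Rank variants from slowest to fastest and return three brackets.

    Each bracket holds `fraction` of all variants: the slowest, those in
    the middle of the ranking, and the fastest.
    """
    ranked = sorted(krel_by_variant, key=krel_by_variant.get)
    n = max(1, round(len(ranked) * fraction))
    mid_start = (len(ranked) - n) // 2  # centre the middle bracket
    return {
        "slowest": ranked[:n],
        "medium": ranked[mid_start:mid_start + n],
        "fastest": ranked[-n:],
    }
```

Each bracket could then be passed to a logo tool to check whether a sequence signature appears only in the fastest-reacting variants.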

Extended Data Figure 7 Sequence determinants for substrate recognition by C5.

a, Model considering identity, but not position of a given base in the C5 binding site. Ranking of the four bases according to their potential to promote (positive linear coefficient) or decrease (negative linear coefficient) functional C5 binding. For calculation of linear coefficients, see the Methods. b, Position weight matrix (PWM) model considering both base identity and position in the binding site, but assuming independent contributions of each position. The plot shows the ranking of the bases according to their potential to promote (positive linear coefficient) or decrease (negative linear coefficient) functional C5 binding, relative to the reference sequence (AAAAAG, Fig. 1c). Bases are coloured as in a. For the calculation of linear coefficients, see the Methods.
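The position weight matrix model in panel b treats each position as contributing independently, relative to the reference sequence AAAAAG. A minimal sketch of such an additive model is shown below. It assumes the linear coefficients add on a log(krel) scale, which the caption does not specify, and the coefficient values in the usage example are invented for illustration only; they are not results from the paper. The function name `pwm_log_krel` is hypothetical.

```python
REFERENCE = "AAAAAG"  # reference leader sequence named in the caption


def pwm_log_krel(variant, coefficients, reference=REFERENCE):
    """Predict log(krel) relative to the reference with an additive model.

    `coefficients` maps (position index, base) to a linear coefficient;
    a base that matches the reference contributes zero.
    """
    if len(variant) != len(reference):
        raise ValueError("variant and reference must have the same length")
    total = 0.0
    for pos, (base, ref_base) in enumerate(zip(variant, reference)):
        if base != ref_base:
            total += coefficients.get((pos, base), 0.0)
    return total


# Hypothetical coefficients, for illustration only.
example_coefficients = {(0, "T"): 0.5, (5, "T"): -0.25}
example_prediction = pwm_log_krel("TAAAAT", example_coefficients)
```

Because the model is additive, it cannot capture coupling between positions; a comparison with observed krel values would show how much of the discrimination such a simple model explains.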

Extended Data Figure 8 Neural network analysis.

Correlation between observed krel and values calculated with the best model obtained by neural network analysis (Methods).

Extended Data Table 1 Sequencing data.

Supplementary information

Supplementary Table 1

This file contains the Read Number for each sequence variant at each timepoint. N/A indicates reads below quality threshold for a variant. (XLSX 310 kb)


About this article

Cite this article

Guenther, UP., Yandek, L., Niland, C. et al. Hidden specificity in an apparently nonspecific RNA-binding protein. Nature 502, 385–388 (2013).
