Hidden specificity in an apparently nonspecific RNA-binding protein

Guenther, Ulf-Peter; Yandek, Lindsay E.; Niland, Courtney N.; Campbell, Frank E.; Anderson, David; Anderson, Vernon E.; Harris, Michael E.; Jankowsky, Eckhard

doi:10.1038/nature12543

Letter
Published: 22 September 2013

Ulf-Peter Guenther^1,2,
Lindsay E. Yandek²,
Courtney N. Niland²,
Frank E. Campbell¹,
David Anderson³,
Vernon E. Anderson²,
Michael E. Harris² &
…
Eckhard Jankowsky^1,2

Nature volume 502, pages 385–388 (2013)

Abstract

Nucleic-acid-binding proteins are generally viewed as either specific or nonspecific, depending on characteristics of their binding sites in DNA or RNA^1,2. Most studies have focused on specific proteins, which identify cognate sites by binding with highest affinities to regions with defined signatures in sequence, structure or both^1,2,3,4. Proteins that bind to sites devoid of defined sequence or structure signatures are considered nonspecific^1,2,5. Substrate binding by these proteins is poorly understood, and it is not known to what extent seemingly nonspecific proteins discriminate between different binding sites, aside from those sequestered by nucleic acid structures⁶. Here we systematically examine substrate binding by the apparently nonspecific RNA-binding protein C5, and find clear discrimination between different binding site variants. C5 is the protein subunit of the transfer RNA processing ribonucleoprotein enzyme RNase P from Escherichia coli. The protein binds 5′ leaders of precursor tRNAs at a site without sequence or structure signatures. We measure functional binding of C5 to all possible sequence variants in its substrate binding site, using a high-throughput sequencing kinetics approach (HITS-KIN) that simultaneously follows processing of thousands of RNA species. C5 binds different substrate variants with affinities varying by orders of magnitude. The distribution of functional affinities of C5 for all substrate variants resembles affinity distributions of highly specific nucleic acid binding proteins. Unlike these specific proteins, C5 does not bind its physiological RNA targets with the highest affinity, but with affinities near the median of the distribution, a region that is not associated with a sequence signature. We delineate defined rules governing substrate recognition by C5, which reveal specificity that is hidden in cellular substrates for RNase P. Our findings suggest that apparently nonspecific and specific RNA-binding modes may not differ fundamentally, but represent distinct parts of common affinity distributions.
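
To give a sense of scale for "all possible sequence variants": the extended data captions describe the C5 binding site as six nucleotides long, so the full sequence space is 4^6 = 4,096 variants. The following is a minimal sketch that enumerates them; the function name `all_variants` is illustrative and does not come from the paper.

```python
from itertools import product

BASES = "ACGU"  # RNA alphabet


def all_variants(site_length=6):
    """Return every RNA sequence of the given length, in lexicographic order."""
    return ["".join(p) for p in product(BASES, repeat=site_length)]


variants = all_variants()
print(len(variants))  # 4096
```

The genomically encoded reference leader AAAAAG mentioned in the extended data is one of these 4,096 entries.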

**Figure 1: Processing of precursor tRNA with randomized leader sequences.**

**Figure 2: Discrimination of C5 between different precursor tRNA^Met leader sequences.**

**Figure 3: Rules for sequence discrimination by C5.**

References

1. Gupta, A. & Gribskov, M. The role of RNA sequence and structure in RNA–protein interactions. J. Mol. Biol. 409, 574–587 (2011)
2. von Hippel, P. H. & Berg, O. G. On the specificity of DNA–protein interactions. Proc. Natl Acad. Sci. USA 83, 1608–1612 (1986)
3. Ray, D. et al. Rapid and systematic analysis of the RNA recognition specificities of RNA-binding proteins. Nature Biotechnol. 27, 667–670 (2009)
4. Campbell, Z. T. et al. Cooperativity in RNA–protein interactions: global analysis of RNA binding specificity. Cell Rep. 1, 570–581 (2012)
5. Singh, R. & Valcárcel, J. Building specificity with nonspecific RNA-binding proteins. Nature Struct. Mol. Biol. 12, 645–653 (2005)
6. Zhuang, F., Fuchs, R. T., Sun, Z., Zheng, Y. & Robb, G. B. Structural bias in T4 RNA ligase-mediated 3′-adapter ligation. Nucleic Acids Res. 40, e54 (2012)
7. Kurz, J. C. & Fierke, C. A. Ribonuclease P: a ribonucleoprotein enzyme. Curr. Opin. Chem. Biol. 4, 553–558 (2000)
8. Smith, J. K., Hsieh, J. & Fierke, C. A. Importance of RNA–protein interactions in bacterial ribonuclease P structure and catalysis. Biopolymers 87, 329–338 (2007)
9. Reiter, N. J. et al. Structure of a bacterial ribonuclease P holoenzyme in complex with tRNA. Nature 468, 784–789 (2010)
10. Rueda, D., Hsieh, J., Day-Storms, J. J., Fierke, C. A. & Walter, N. G. The 5′ leader of precursor tRNA^Asp bound to the Bacillus subtilis RNase P holoenzyme has an extended conformation. Biochemistry 44, 16130–16139 (2005)
11. Herschlag, D. The role of induced fit and conformational changes of enzymes in specificity and catalysis. Bioorg. Chem. 16, 62–96 (1988)
12. Fersht, A. R. Enzyme Structure and Mechanism (Freeman, 1985)
13. Cornish-Bowden, A. Enzyme specificity: its meaning in the general case. J. Theor. Biol. 108, 451–457 (1984)
14. Cleland, W. W. in Isotope Effects in Chemistry and Biology (eds Kohen, A. & Limbach, H. H.) 915–930 (CRC Press, 2006)
15. Schellenberger, V., Siegel, R. A. & Rutter, W. J. Analysis of enzyme specificity by multiple substrate kinetics. Biochemistry 32, 4344–4348 (1993)
16. Lorenz, C. et al. Genomic SELEX for Hfq-binding RNAs identifies genomic aptamers predominantly in antisense transcripts. Nucleic Acids Res. 38, 3794–3808 (2010)
17. Pitt, J. N. & Ferré-D'Amaré, A. R. Rapid construction of empirical RNA fitness landscapes. Science 330, 376–379 (2010)
18. Badis, G. et al. Diversity and complexity in DNA recognition by transcription factors. Science 324, 1720–1723 (2009)
19. Rowe, W. et al. Analysis of a complete DNA–protein affinity landscape. J. R. Soc. Interface 7, 397–408 (2010)
20. Nutiu, R. et al. Direct measurement of DNA affinity landscapes on a high-throughput sequencing instrument. Nature Biotechnol. 29, 659–664 (2011)
21. Stormo, G. D. & Zhao, Y. Determining the specificity of protein–DNA interactions. Nature Rev. Genet. 11, 751–760 (2010)
22. SantaLucia, J. Jr & Turner, D. H. Measuring the thermodynamics of RNA secondary structure formation. Biopolymers 44, 309–319 (1997)
23. Forsdyke, D. R. Calculation of folding energies of single-stranded nucleic acid sequences: conceptual issues. J. Theor. Biol. 248, 745–753 (2007)
24. Maerkl, S. J. & Quake, S. R. A systems approach to measuring the binding energy landscapes of transcription factors. Science 315, 233–237 (2007)
25. Zhao, Y. & Stormo, G. Quantitative analysis demonstrates most transcription factors require only simple models of specificity. Nature Biotechnol. 29, 480–483 (2011)
26. Sun, L., Campbell, F. E., Yandek, L. E. & Harris, M. E. Binding of C5 protein to P RNA enhances the rate constant for catalysis for P RNA processing of pre-tRNAs lacking a consensus (+ 1)/C(+ 72) pair. J. Mol. Biol. 395, 1019–1037 (2010)
27. Leontis, N. B., Lescoute, A. & Westhof, E. The building blocks and motifs of RNA architecture. Curr. Opin. Struct. Biol. 16, 279–287 (2006)
28. Snoussi, K. & Leroy, J. L. Imino proton exchange and base-pair kinetics in RNA duplexes. Biochemistry 40, 8898–8904 (2001)
29. LaRiviere, F. J., Wolfson, A. D. & Uhlenbeck, O. C. Uniform binding of aminoacyl-tRNAs to elongation factor Tu by thermodynamic compensation. Science 294, 165–168 (2001)
30. Stormo, G. D., Schneider, T. D. & Gold, L. Quantitative analysis of the relationship between nucleotide sequence and functional activity. Nucleic Acids Res. 14, 6661–6679 (1986)
31. Guo, X. et al. RNA-dependent folding and stabilization of C5 protein during assembly of the E. coli RNase P holoenzyme. J. Mol. Biol. 360, 190–203 (2006)
32. Christian, E. L., McPheeters, D. S. & Harris, M. E. Identification of individual nucleotides in the bacterial ribonuclease P ribozyme adjacent to the pre-tRNA cleavage site by short-range photo-cross-linking. Biochemistry 37, 17618–17628 (1998)
33. Cha, S. Kinetics of enzyme reactions with competing alternative substrates. Mol. Pharmacol. 4, 621–629 (1968)
34. Chan, P. P. & Lowe, T. M. GtRNAdb: a database of transfer RNA genes detected in genomic sequence. Nucleic Acids Res. 37, D93–D97 (2009)
35. Northrop, D. B. Fitting enzyme-kinetic data to V/K. Anal. Biochem. 132, 457–461 (1983)
36. Northrop, D. B. Rethinking fundamentals of enzyme action. Adv. Enzymol. 73, 25–55 (1999)
37. Theil, H. Economic Forecasts and Policy (North Holland Publishing, 1961)
38. Bendel, R. B. & Afifi, A. A. Comparison of stopping rules in forward “stepwise” regression. J. Am. Stat. Assoc. 72, 46–53 (1977)

Acknowledgements

We particularly thank T. Nilsen for comments on the manuscript. We are grateful to G. Stormo for discussion, M. Adams for support with the Illumina sequencing, and H.-C. Lin for technical assistance. This work was supported by the US National Institutes of Health (NIH) (GM067700, GM099720 and CSTA UL1RR024989 to E.J.; GM056740 and GM096000 to M.E.H.; T32 GM008056 to C.N.N.). U.-P.G. received a DFG fellowship.

Author information

Authors and Affiliations

Center for RNA Molecular Biology, Case Western Reserve University, Cleveland, 44106, Ohio, USA
Ulf-Peter Guenther, Frank E. Campbell & Eckhard Jankowsky
Department of Biochemistry, School of Medicine, Case Western Reserve University, Cleveland, 44106, Ohio, USA
Ulf-Peter Guenther, Lindsay E. Yandek, Courtney N. Niland, Vernon E. Anderson, Michael E. Harris & Eckhard Jankowsky
Department of Management, Zicklin School of Business, Baruch College, The City University of New York, 10010, New York, USA
David Anderson

Contributions

U.-P.G., M.E.H. and E.J. designed the study. U.-P.G., L.E.Y., C.N.N. and F.E.C. performed the experiments. V.E.A. contributed to the development of the data analysis framework. D.A. developed and performed the modelling for binding models. U.-P.G., D.A., M.E.H. and E.J. analysed the data. U.-P.G., M.E.H. and E.J. wrote the paper.

Corresponding authors

Correspondence to Michael E. Harris or Eckhard Jankowsky.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 C5 binding site in the 87 ptRNA leaders in E. coli.

a–c, Alignment and sequence logos for the C5 binding site in all 87 ptRNA leaders encoded by E. coli. Binding of C5 to the consecutive ptRNA positions −3 to −8 is well established, based on a crystal structure⁹ and biochemical evidence¹⁰; that is, the looping of bases seen for certain RNA- and DNA-binding proteins does not occur with C5. Consistent with this idea, we did not detect any sequence motif with the MEME software when including positions −1 to −10. a, Sequence alignment. Sequences were aligned with CLUSTAL. Coloured squares indicate the bases (C, blue; A, green; U, red; G, black). Anticodon, the anticodon recognized by the tRNA; tRNA#, the tRNA identification number; tRNA type, the amino acid. b, Sequence logo depicting the probability of any base at a given position, based on the alignment in a. The logo was generated with Weblogo. c, Sequence logo for the information content of the alignment in a. The logo was generated with Weblogo.

Extended Data Figure 2 Preparation of DNA libraries for Illumina sequencing.

a, BAR, the indexing barcode; NN, the degenerate barcode. For primer sequences see Methods. RT, reverse transcription. b, DNA libraries (PCR products, a) for samples at the time points indicated. Controls: lane 5, no RNA; lane 6, no reverse transcriptase. c, Read structure. Nucleotides 1 and 2 are degenerate barcode; nucleotides 3–5 are sample barcode (index tag); nucleotides 6–29 are additional leader sequence; nucleotides 30–35 are randomized leader sequence; nucleotides 38 onwards are tRNA.
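
The read structure in panel c can be turned into slicing logic. The sketch below assumes reads are plain strings and that the caption's positions are 1-based and inclusive; the function name, the dictionary keys and the example read are hypothetical, and nucleotides 36 and 37, which the caption does not assign, are kept as a separate unlabelled segment.

```python
def split_read(read):
    """Split a sequencing read into the segments listed in the caption.

    The caption uses 1-based, inclusive positions; Python slices are 0-based
    and end-exclusive, so nucleotides 30-35 become read[29:35].
    """
    if len(read) < 38:
        raise ValueError("read is too short to contain all segments")
    return {
        "degenerate_barcode": read[0:2],    # nucleotides 1-2
        "sample_barcode": read[2:5],        # nucleotides 3-5
        "additional_leader": read[5:29],    # nucleotides 6-29
        "randomized_leader": read[29:35],   # nucleotides 30-35
        "unassigned_36_37": read[35:37],    # not described in the caption
        "trna": read[37:],                  # nucleotide 38 onwards
    }


# Made-up read, assembled only to exercise the slicing.
example = "GA" + "ACG" + "T" * 24 + "CTCCTG" + "AA" + "GGCTACGTAG"
print(split_read(example)["randomized_leader"])  # CTCCTG
```

Grouping reads by `sample_barcode` (time point) and counting each `randomized_leader` would then give per-variant read numbers of the kind tabulated in Supplementary Table 1.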

Extended Data Figure 3 Multiple turnover reaction scheme.

E, enzyme; ES_1...i, individual enzyme–substrate complexes; K_1...i, individual functional binding constants; S_1...i, individual substrate variants; V_1...i, individual reaction rate constants.
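
When many substrates compete for one enzyme, as in this scheme, the textbook internal-competition result for alternative substrates is that the ratio of logarithmic depletions of two substrates equals the ratio of their specificity constants. The sketch below applies that relation to obtain a relative rate constant; it is an illustration under that assumption, not necessarily the exact fitting procedure of the paper's Methods, and the function name and numbers are hypothetical.

```python
import math


def relative_rate_constant(frac_remaining, frac_remaining_ref):
    """k_rel = ln(f_i) / ln(f_ref) for two substrates competing for one enzyme.

    Each argument is the fraction of substrate still unprocessed (0 < f < 1)
    at the same time point in the same reaction.
    """
    for f in (frac_remaining, frac_remaining_ref):
        if not 0.0 < f < 1.0:
            raise ValueError("fractions must lie strictly between 0 and 1")
    return math.log(frac_remaining) / math.log(frac_remaining_ref)


# Hypothetical numbers: the reference is 50% consumed, the variant 75% consumed,
# so the variant reacts about twice as fast as the reference.
k_rel = relative_rate_constant(0.25, 0.5)
print(round(k_rel, 6))
```

Because both substrates see the same free enzyme concentration over time, the enzyme term cancels in the ratio, which is why a single mixed reaction can report on thousands of variants at once.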

Extended Data Figure 4 Effect of the 21 nucleotide extension on ptRNA processing by RNase P.

a, Relative processing rate constants were measured for three sequence variants from different parts of the affinity distribution by PAGE. Reactions for each sequence variant were conducted in the presence of the randomized population (unlabelled) with equal amounts of substrate with (S/21) and without the 21-nucleotide extension (S/nL). The asterisk marks the position of the radiolabel at the 5′ end of the substrate. Reactions were conducted under the conditions described in the Methods. b, PAGE for the reaction of the reference sequence variant. The time point at 5 min is marked for reference. c, The effects of the 21-nucleotide extension on relative processing rate constants of the three indicated sequence variants. The position of each sequence variant in the affinity distribution of all sequence variants (Fig. 2d) is given for reference by the vertical line above the plot. The number indicates the factor (S/nL)/(S/21) by which the 21-nucleotide extension decreases the relative rate constant of the given sequence variant, given as the average from three independent experiments. The horizontal line approximates the degree of the relative change. The 21-nucleotide extension decreases the observed k^rel for sequence variant CTCCTG by a factor of 2.3. For the genomically encoded leader sequence AAAAAG, the 21-nucleotide extension decreases k^rel by a factor of 0.95; that is, the substrate with the extension reacts slightly faster than the substrate without extension. The fast-reacting substrate (TTATAT) is also only minimally affected by the extension (0.92). Together, the data show only minor effects of the 21-nucleotide extension on the position of a given sequence variant in the affinity distribution.

Extended Data Figure 5 Processing of ptRNA^Met(-3-8) by RNase P without C5.

Distribution of k^rel values for processing of ptRNA^Met(-3-8) by RNase P without C5 (black line). Data were obtained analogously to those with C5. For comparison, the distribution of k^rel values with C5 is shown (red line).

Extended Data Figure 6 Sequence logos are only associated with the high-affinity tail of the distribution.

a, Plot of sequence variants ranked from weakest to tightest binder to the specific transcription factor Arid3a (Fig. 2d), based on data published previously¹⁸. To facilitate direct comparison to the six-nucleotide binding site of C5, only approximately half of all sequences are shown in the plot, and only six positions (positions two to seven, as indicated) of the eight-nucleotide binding site are shown. The position in the binding site is marked on the right. The brackets mark 0.1% of sequence variants (33 sequences) that bind tightest, fall into the medium, and bind weakest. Sequence logos show the information content in these sequences. The logos were generated with Weblogo. Sequence signatures of the tightest binding variants are highly enriched in physiological substrates of Arid3a¹⁸. b, Plot of sequence variants ranked from weakest to tightest binder to another specific transcription factor, Hnf4a, based on data published previously¹⁸. Approximately half of all sequences are shown in the plot, and six positions (positions two to seven, as indicated) of the eight-nucleotide binding site. Sequence signatures of the tightest binding variants are highly enriched in physiological substrates of Hnf4a¹⁸. c, Plot of sequence variants ranked from slowest to fastest reacting for C5 (Fig. 2e). The brackets mark 1% of sequence variants that react fastest, fall into the medium and react slowest. Sequence logos were generated as in a.
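
The bracketed subsets and their logos can be approximated in a few lines: rank variants by their measured value, take a fixed fraction from one end, and compute per-position information content as the maximum entropy (2 bits for four bases) minus the observed Shannon entropy. This is a simplified sketch; it omits any small-sample correction that logo software may apply, and the function names are illustrative.

```python
import math
from collections import Counter


def information_content(sequences, alphabet="ACGT"):
    """Per-position information content in bits: log2(|alphabet|) minus entropy."""
    if not sequences:
        raise ValueError("need at least one sequence")
    total = len(sequences)
    max_bits = math.log2(len(alphabet))
    bits = []
    for pos in range(len(sequences[0])):
        counts = Counter(seq[pos] for seq in sequences)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits.append(max_bits - entropy)
    return bits


def top_fraction(values_by_sequence, fraction=0.01):
    """Return the sequences with the highest values, e.g. the fastest-reacting 1%."""
    ranked = sorted(values_by_sequence, key=values_by_sequence.get, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    return ranked[:n]
```

A subset with a strong sequence signature yields positions near 2 bits, whereas a subset without a signature, as described for the middle of the C5 distribution, yields values near 0 bits at every position.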

Extended Data Figure 7 Sequence determinants for substrate recognition by C5.

a, Model considering identity, but not position of a given base in the C5 binding site. Ranking of the four bases according to their potential to promote (positive linear coefficient) or decrease (negative linear coefficient) functional C5 binding. For calculation of linear coefficients, see the Methods. b, Position weight matrix (PWM) model considering both base identity and position in the binding site, but assuming independent contributions of each position. The plot shows the ranking of the bases according to their potential to promote (positive linear coefficient) or decrease (negative linear coefficient) functional C5 binding, relative to the reference sequence (AAAAAG, Fig. 1c). Bases are coloured as in a. For the calculation of linear coefficients, see the Methods.
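
The position weight matrix model in panel b treats each position as contributing independently, with every contribution expressed relative to the reference sequence. A minimal sketch of how such a model scores a variant is shown below; the coefficient values are invented for illustration, the scale of the score is whatever the Methods define for the linear coefficients, and `pwm_score` is a hypothetical name.

```python
REFERENCE = "AAAAAG"  # reference leader sequence named in the caption


def pwm_score(sequence, coefficients, reference=REFERENCE):
    """Sum independent per-position contributions relative to the reference.

    coefficients maps (position_index, base) -> linear coefficient; a base that
    matches the reference at its position contributes zero by construction.
    """
    if len(sequence) != len(reference):
        raise ValueError("sequence and reference must have the same length")
    total = 0.0
    for i, (base, ref_base) in enumerate(zip(sequence, reference)):
        if base != ref_base:
            total += coefficients.get((i, base), 0.0)
    return total


# Hypothetical coefficients, chosen only to show the bookkeeping.
example_coefficients = {(0, "T"): 0.5, (5, "A"): -0.3}
print(pwm_score("AAAAAG", example_coefficients))  # 0.0
```

Any coupling between positions falls outside this additive form, which is the independence assumption stated for the PWM model.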

Extended Data Figure 8 Neural network analysis.

Correlation between observed k^rel and values calculated with the best model obtained by neural network analysis (Methods).

Extended Data Table 1 Sequencing data.

Supplementary information

Supplementary Table 1

This file contains the Read Number for each sequence variant at each timepoint. N/A indicates reads below quality threshold for a variant. (XLSX 310 kb)

About this article

Cite this article

Guenther, UP., Yandek, L., Niland, C. et al. Hidden specificity in an apparently nonspecific RNA-binding protein. Nature 502, 385–388 (2013). https://doi.org/10.1038/nature12543

Received: 15 May 2013
Accepted: 14 August 2013
Published: 22 September 2013
Issue Date: 17 October 2013
DOI: https://doi.org/10.1038/nature12543
