The kinetic landscape of an RNA-binding protein in cells

Abstract

Gene expression in higher eukaryotic cells orchestrates interactions between thousands of RNA-binding proteins (RBPs) and tens of thousands of RNAs (ref. 1). The kinetics by which RBPs bind to and dissociate from their RNA sites are critical for the coordination of cellular RNA–protein interactions (ref. 2). However, these kinetic parameters have not been experimentally measured in cells. Here we show that time-resolved RNA–protein cross-linking with a pulsed femtosecond ultraviolet laser, followed by immunoprecipitation and high-throughput sequencing, allows the determination of binding and dissociation kinetics of the RBP DAZL for thousands of individual RNA-binding sites in cells. This kinetic cross-linking and immunoprecipitation (KIN-CLIP) approach reveals that DAZL resides at individual binding sites for time periods of only seconds or shorter, whereas the binding sites remain DAZL-free for markedly longer. The data also indicate that DAZL binds to many RNAs in clusters of multiple proximal sites. The effect of DAZL on mRNA levels and ribosome association correlates with the cumulative probability of DAZL binding in these clusters. Integrating kinetic data with mRNA features quantitatively connects DAZL–RNA binding to DAZL function. Our results show how kinetic parameters for RNA–protein interactions can be measured in cells, and how these data link RBP–RNA binding to the cellular function of RBPs.

Fig. 1: Time-resolved RNA–protein cross-linking with a fs laser.
Fig. 2: Kinetics of DAZL–RNA binding and dissociation in cells.
Fig. 3: Clustering of DAZL-binding sites in 3′-UTRs.
Fig. 4: Link between DAZL–RNA binding and the effects of DAZL on mRNA function.

Data availability

Sequencing data are available at the NCBI Gene Expression Omnibus with accession number GSE150214. Source data are provided with this paper.

Code availability

Customized R and Python scripts are available at https://github.com/deebratforlife/KIN-CLIP.

References

  1. Gerstberger, S., Hafner, M. & Tuschl, T. A census of human RNA-binding proteins. Nat. Rev. Genet. 15, 829–845 (2014).

  2. Licatalosi, D. D., Ye, X. & Jankowsky, E. Approaches for measuring the dynamics of RNA-protein interactions. Wiley Interdiscip. Rev. RNA 11, e1565 (2020).

  3. Corley, M., Burns, M. C. & Yeo, G. W. How RNA-binding proteins interact with RNA: molecules and mechanisms. Mol. Cell 78, 9–29 (2020).

  4. Ule, J., Hwang, H. W. & Darnell, R. B. The future of cross-linking and immunoprecipitation (CLIP). Cold Spring Harb. Perspect. Biol. 10, a032243 (2018).

  5. Van Nostrand, E. L. et al. Principles of RNA processing from analysis of enhanced CLIP maps for 150 RNA binding proteins. Genome Biol. 21, 90 (2020).

  6. Gleitsman, K. R., Sengupta, R. N. & Herschlag, D. Slow molecular recognition by RNA. RNA 23, 1745–1753 (2017).

  7. Jarmoskaite, I. et al. A quantitative and predictive model for RNA binding by human Pumilio proteins. Mol. Cell 74, 966–981 (2019).

  8. Sutandy, F. X. R. et al. In vitro iCLIP-based modeling uncovers how the splicing factor U2AF2 relies on regulation by cofactors. Genome Res. 28, 699–713 (2018).

  9. Hockensmith, J. W., Kubasek, W. L., Vorachek, W. R. & von Hippel, P. H. Laser cross-linking of nucleic acids to proteins. Methodology and first applications to the phage T4 DNA replication system. J. Biol. Chem. 261, 3512–3518 (1986).

  10. Pashev, I. G., Dimitrov, S. I. & Angelov, D. Crosslinking proteins to nucleic acids by ultraviolet laser irradiation. Trends Biochem. Sci. 16, 323–326 (1991).

  11. Russmann, C. et al. Crosslinking of progesterone receptor to DNA using tuneable nanosecond, picosecond and femtosecond UV laser pulses. Nucleic Acids Res. 25, 2478–2484 (1997).

  12. Steube, A., Schenk, T., Tretyakov, A. & Saluz, H. P. High-intensity UV laser ChIP–seq for the study of protein–DNA interactions in living cells. Nat. Commun. 8, 1303 (2017).

  13. Budowsky, E. I., Axentyeva, M. S., Abdurashidova, G. G., Simukova, N. A. & Rubin, L. B. Induction of polynucleotide-protein cross-linkages by ultraviolet irradiation. Peculiarities of the high-intensity laser pulse irradiation. Eur. J. Biochem. 159, 95–101 (1986).

  14. Auweter, S. D. et al. Molecular basis of RNA recognition by the human alternative splicing factor Fox-1. EMBO J. 25, 163–173 (2006).

  15. Chen, Y. et al. Targeted inhibition of oncogenic miR-21 maturation with designed RNA-binding proteins. Nat. Chem. Biol. 12, 717–723 (2016).

  16. Jenkins, H. T., Malkova, B. & Edwards, T. A. Kinked β-strands mediate high-affinity recognition of mRNA targets by the germ-cell regulator DAZL. Proc. Natl Acad. Sci. USA 108, 18266–18271 (2011).

  17. Zagore, L. L. et al. DAZL regulates germ cell survival through a network of polyA-proximal mRNA interactions. Cell Rep. 25, 1225–1240 (2018).

  18. Hofmann, M. C., Narisawa, S., Hess, R. A. & Millán, J. L. Immortalization of germ cells and somatic testicular cells using the SV40 large T antigen. Exp. Cell Res. 201, 417–435 (1992).

  19. Fu, X. F. et al. DAZ family proteins, key players for germ cell development. Int. J. Biol. Sci. 11, 1226–1235 (2015).

  20. Lin, Y. & Page, D. C. Dazl deficiency leads to embryonic arrest of germ cell development in XY C57BL/6 mice. Dev. Biol. 288, 309–316 (2005).

  21. Ruggiu, M. et al. The mouse Dazla gene encodes a cytoplasmic protein essential for gametogenesis. Nature 389, 73–77 (1997).

  22. Saunders, P. T. et al. Absence of mDazl produces a final block on germ cell development at meiosis. Reproduction 126, 589–597 (2003).

  23. Yang, C. R. et al. The RNA-binding protein DAZL functions as repressor and activator of mRNA translation during oocyte maturation. Nat. Commun. 11, 1399 (2020).

  24. Haberman, N. et al. Insights into the design and interpretation of iCLIP experiments. Genome Biol. 18, 7 (2017).

  25. Huppertz, I. et al. iCLIP: protein–RNA interactions at nucleotide resolution. Methods 65, 274–287 (2014).

  26. Reynolds, N. et al. Dazl binds in vivo to specific transcripts and can regulate the pre-meiotic translation of Mvh in germ cells. Hum. Mol. Genet. 14, 3899–3909 (2005).

  27. Itri, F. et al. Femtosecond UV-laser pulses to unveil protein-protein interactions in living cells. Cell. Mol. Life Sci. 73, 637–648 (2016).

  28. Brister, M. M. & Crespo-Hernández, C. E. Direct observation of triplet-state population dynamics in the RNA uracil derivative 1-cyclohexyluracil. J. Phys. Chem. Lett. 6, 4404–4409 (2015).

  29. Brister, M. M. & Crespo-Hernández, C. E. Excited-state dynamics in the RNA nucleotide uridine 5′-monophosphate investigated using femtosecond broadband transient absorption spectroscopy. J. Phys. Chem. Lett. 10, 2156–2161 (2019).

  30. Paschotta, R. Encyclopedia of Laser Physics and Technology (Wiley-VCH, 2008).

  31. Strober, W. Trypan blue exclusion test of cell viability. Curr. Protoc. Immunol. https://doi.org/10.1002/0471142735.ima03bs21 (2001).

  32. Moore, M. J. et al. Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis. Nat. Protocols 9, 263–293 (2014).

  33. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  34. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  35. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

  36. Weyn-Vanhentenryck, S. M. et al. HITS-CLIP and integrative modeling define the Rbfox splicing-regulatory network linked to brain development and autism. Cell Rep. 6, 1139–1152 (2014).

  37. Zhang, C. & Darnell, R. B. Mapping in vivo protein–RNA interactions at single-nucleotide resolution from HITS-CLIP data. Nat. Biotechnol. 29, 607–614 (2011).

  38. Schindler, D., Uschner, D., Hilgers, R.-D. & Heussen, N. randomizeR: randomization for clinical trials. R version 4.3.0 https://cran.r-project.org/web/packages/randomizeR/index.html (2019).

  39. Aken, B. L. et al. Ensembl 2017. Nucleic Acids Res. 45, D635–D642 (2017).

  40. O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–D745 (2016).

  41. Fox, J. An R Companion to Applied Regression 3rd edition (Sage, 2019).

  42. Thompson, H. W., Mera, R. & Prasad, C. The analysis of variance (ANOVA). Nutr. Neurosci. 2, 43–55 (1999).

  43. Leschinski, C. Vignette: the MonteCarlo package. R version 4.3.0 https://cran.r-project.org/web/packages/MonteCarlo/vignettes/MonteCarlo-Vignette.html (2019).

  44. Cao, J. & Zhang, S. A Bayesian extension of the hypergeometric test for functional enrichment analysis. Biometrics 70, 84–94 (2014).

  45. Jolliffe, I. T. & Cadima, J. Principal component analysis: a review and recent developments. Phil. Trans. R. Soc. Lond. A 374, 20150202 (2016).

  46. Kerr, G., Ruskin, H. J., Crane, M. & Doolan, P. Techniques for clustering gene expression data. Comput. Biol. Med. 38, 283–293 (2008).

  47. Krijthe, J. H. Rtsne: t-distributed stochastic neighbour embedding using a Barnes–Hut implementation. https://github.com/jkrijthe/Rtsne (2015).

  48. van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).

  49. Biggs, D., De Ville, B. & Suen, E. A method of choosing multiway partitions for classification and decision trees. J. Appl. Stat. 18, 49–62 (1991).

  50. Goodman, L. A. Simple models for the analysis of association in cross-classifications having ordered categories. J. Am. Stat. Assoc. 74, 537–552 (1979).

  51. Armstrong, R. A. When to use the Bonferroni correction. Ophthalmic Physiol. Opt. 34, 502–508 (2014).

  52. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995).

  53. Fabregat, A. et al. Reactome pathway analysis: a high-performance in-memory approach. BMC Bioinformatics 18, 142 (2017).

  54. Jassal, B. et al. The reactome pathway knowledgebase. Nucleic Acids Res. 48, D498–D503 (2020).

  55. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

  56. Magidson, J. Common pitfalls in causal analysis of categorical data. J. Mark. Res. 19, 461–471 (1982).

  57. Breiman, L., Friedman, J. H., Olshen, R. A. & Stone, C. J. Classification and Regression Trees (Chapman & Hall/CRC, 1984).

  58. Dua, D. & Graff, C. UCI Machine Learning Repository http://archive.ics.uci.edu/ml (University of California, School of Information and Computer Science, 2019).

  59. Kass, G. V. An exploratory technique for investigating large quantities of categorical data. Appl. Stat. 29, 119–127 (1980).

Acknowledgements

We thank G. Varani for providing purified RBFOX(RRM) and RBFOXmut(RRM); A. Komar for the design of the codon-optimized DAZL construct; and W. Huang for assistance with the fluorescence polarization experiments. This work was supported by the NIH (GM118088 to E.J. and GM107331 to D.D.L.) and the NSF (CHE-1800052 to C.E.C.-H.).

Author information

Contributions

D.S., C.E.C.-H., D.D.L. and E.J. conceptualized the study. C.E.C.-H. and M.M.B. adapted the pulsed fs-laser set-up for time-resolved cross-linking experiments. D.S. and M.M.B. optimized and performed fs-laser cross-linking experiments. D.S. and L.L.Z. optimized and performed NGS library preparations. L.L.Z. and D.D.L. performed iCLIP, RNA-seq and ribosome-profiling experiments. X.Y. optimized DAZL(RRM) overexpression, purified recombinant DAZL(RRM) and performed fluorescence anisotropy studies. D.S. and E.J. devised the framework for KIN-CLIP data analysis and the DAZL regulatory program. D.S. performed the data analysis. All authors contributed to the writing of the manuscript.

Corresponding authors

Correspondence to Donny D. Licatalosi or Eckhard Jankowsky.

Ethics declarations

Competing interests

D.S. and E.J. are founders of Bainom Inc. The remaining authors declare no competing interests.

Additional information

Peer review information Nature thanks Rick Russell and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Time-resolved RNA-protein cross-linking with a fs laser in vitro.

a, Schematic of the fs-laser set-up. b, Degradation of RNA (38 nt) under steady-state and fs-laser illumination. Data points represent averages of three independent measurements. Error bars indicate 1 s.d. Lines show a linear trend. c, Dose absorbed over time for cross-linking with conventional UV (Stratalinker, 200 mJ cm−2, λ = 254 nm) and fs laser (2.6 mW). Error bars indicate 1 s.d. (n = 3 independent measurements). Lines show a linear trend. d, Representative denaturing PAGE for a cross-linking reaction of 50 nM RBFOX(RRM) (laser: 2.6 mW) (lanes 5–12) and control reactions with RNA only (lanes 1–3) and RBFOX(RRM) only (lane 4), with (lanes 2–4) or without (lanes 1 and 5) cross-linking. Three independent measurements provided similar results. e, Representative denaturing PAGE for a cross-linking reaction of 50 nM RBFOX(RRM) with Stratalinker (200 mJ cm−2, λ = 254 nm; lanes 4–8) and control reactions (lanes 1–3). Three independent measurements provided similar results. f, Time course of cross-linking reaction of 50 nM RBFOX(RRM) with Stratalinker (200 mJ cm−2, λ = 254 nm) versus fs laser (Fig. 1d). Data points are averages from triplicate experiments (error bars indicate 1 s.d.). g, h, RNA cross-linking time courses for DAZL(RRM) (g) and RBFOXmut(RRM) (h) with fs laser at different laser power and protein concentrations. Data points represent averages of three independent measurements (error bars indicate 1 s.d.). Lines show the fit to the data in Fig. 1e. i–k, Binding isotherms for RBFOX(RRM) (i), RBFOXmut(RRM) (j) and DAZL(RRM) (k) to cognate RNAs, measured by fluorescence anisotropy. Experiments were performed multiple times; all data points are shown. Apparent equilibrium binding constants (K1/2; Fig. 1e) were calculated with the quadratic binding equation.
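
The quadratic binding equation mentioned for panels i–k can be illustrated with a minimal sketch. It assumes the standard 1:1 tight-binding form, in which the bound RNA fraction depends on total protein, total RNA and K1/2 (all in the same concentration unit). The function and parameter names are illustrative only and are not taken from the authors' scripts.

```python
import math


def fraction_bound(protein_total, rna_total, k_half):
    """Fraction of RNA bound for 1:1 binding (quadratic binding equation).

    All three arguments must use the same concentration unit (e.g. nM).
    Assumption: standard tight-binding form; names are illustrative.
    """
    s = protein_total + rna_total + k_half
    # Physically meaningful root of the quadratic for the complex concentration.
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * rna_total)) / 2.0
    return complex_conc / rna_total


def anisotropy(protein_total, rna_total, k_half, a_free, a_bound):
    """Observed anisotropy as a linear mix of free and bound RNA signals."""
    f = fraction_bound(protein_total, rna_total, k_half)
    return a_free + (a_bound - a_free) * f
```

When the RNA concentration is far below K1/2, the bound fraction approaches one half at a protein concentration equal to K1/2, which is the usual reading of an apparent binding constant from an isotherm.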

Extended Data Fig. 2 DAZL–RNA cross-linking with fs laser in GC-1 spg cells.

a, Western blot of doxycycline-dependent DAZL expression in GC-1 cells. Four independent experiments provided similar results. b, Schematic of the time-resolved cross-linking approach in cells. Numbers mark the respective CLIP libraries. c, Representative PAGE for bulk DAZL–RNA cross-linking. Three independent experiments provided similar results. The intensity of cross-linked RNA (marked) is used to convert NGS reads to a concentration-equivalent parameter (for bulk cross-linking intensities and associated standard errors see Supplementary Table 6). d, Distribution of CLIP sequencing reads across RNA classes and mRNA regions for fs laser (4.2× DAZL, 2.6 mW) and conventional cross-linking (Stratalinker; 4.2× DAZL). Distributions for laser cross-linking experiments were calculated for binding sites with sequencing reads for all 12 measurements. Distributions for iCLIP experiments were calculated from three independent measurements (ref. 17). e, DAZL-binding sites identified by fs laser (KIN-CLIP) and conventional UV cross-linking (iCLIP) on all RNAs and 3′-UTRs. f, Metagene distribution of DAZL-binding sites identified by KIN-CLIP and iCLIP on 3′-UTRs proximal to stop codons and the PAS. The dotted lines mark the background of a random distribution of binding sites on 3′-UTRs. g, CITS analysis (refs. 36, 37) of 6-mer and 4-mer enrichment at 5′-termini of sequencing reads for KIN-CLIP (top) and iCLIP (bottom). The data indicate a virtually identical sequence context of cross-linking sites for KIN-CLIP and iCLIP. Sequence enrichment reflects the statistical overrepresentation of 6-mer and 4-mer sequences with respect to randomized sequences (z-score, 11-nt region, ±5 nt from the 5′-terminal nucleotide).
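
The z-score enrichment described for panel g can be sketched as follows. This is an illustration under stated assumptions, not the CITS implementation: the background is generated by shuffling the nucleotides within each window, and the function names, the number of shuffles and the seed are hypothetical choices.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def count_kmers(seqs, k):
    """Count all overlapping k-mers across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def kmer_zscores(seqs, k, n_shuffles=200, seed=0):
    """z-score of each observed k-mer count against shuffled windows.

    Assumption: randomization = per-window nucleotide shuffling.
    """
    rng = random.Random(seed)
    observed = count_kmers(seqs, k)
    background = {kmer: [] for kmer in observed}
    for _ in range(n_shuffles):
        shuffled = ["".join(rng.sample(s, len(s))) for s in seqs]
        counts = count_kmers(shuffled, k)
        for kmer in background:
            background[kmer].append(counts.get(kmer, 0))
    zscores = {}
    for kmer, obs in observed.items():
        mu = mean(background[kmer])
        sd = pstdev(background[kmer])
        # z is undefined when the shuffled counts never vary.
        zscores[kmer] = (obs - mu) / sd if sd > 0 else float("nan")
    return zscores
```

A call such as kmer_zscores(windows, 6), where windows holds the 11-nt regions around the 5′-terminal nucleotides, would return one z-score per observed 6-mer.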

Extended Data Fig. 3 Determination of kinetic parameters from fs-laser time-resolved DAZL–RNA cross-linking in cells.

a, Flow chart of the approach to calculate kinetic parameters for individual DAZL–RNA binding sites in cells (for details, see Methods). Unless otherwise stated, rate constants averaged from both approaches are used in subsequent data analyses. b, Scaling of χ2 with the number of iterative fitting cycles for analytical and numerical approaches. c, d, Distribution of χ2 at first and last (642) fitting cycle for analytical (c) and numerical (d) approaches (COD, coefficient of determination; R2, linear correlation coefficient). e–i, Correlation of parameters calculated with analytical and numerical fitting procedures for kon(4.2× DAZL) (e), kon(1× DAZL) (f), kdiss (g), kXL(2.6 mW) (h) and kXL(1 mW) (i). j, Correlation between cross-linking rate constants for low and high laser power. Rate constants are averaged from parameters obtained with numerical and analytical approaches. Cross-linking rate constants at higher laser power were larger than at lower laser power for 92% of binding sites. k, Confidence range for dissociation rate constants (for details, see Methods). l, Normalized read densities measured experimentally and calculated from the kinetic parameters for all DAZL-binding sites. m, Distribution of χ2 for experimental values compared with values calculated with the kinetic parameters.
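
To make the roles of kon, kdiss and kXL concrete, the sketch below integrates a minimal three-state scheme (free site, bound site, cross-linked site) over time. The scheme is an assumption made for illustration: binding is treated as pseudo-first-order, the site is taken to be at binding equilibrium when irradiation starts, and cross-linking is irreversible. The authors' actual fitting framework is described in their Methods and may differ; all names here are hypothetical.

```python
def simulate_crosslinking(k_on, k_diss, k_xl, t_end, dt=0.01):
    """Return [(time, cross-linked fraction)] for a three-state scheme.

    free <-> bound (k_on, k_diss; k_on is an apparent first-order rate),
    bound -> cross-linked (k_xl). Integrated with a fixed-step RK4 scheme.
    Assumption: illustrative model, not the authors' fitting procedure.
    """
    def deriv(free, bound):
        d_free = -k_on * free + k_diss * bound
        d_bound = k_on * free - (k_diss + k_xl) * bound
        return d_free, d_bound

    # Start from binding equilibrium in the absence of cross-linking.
    free = k_diss / (k_on + k_diss)
    bound = k_on / (k_on + k_diss)
    trace = [(0.0, 0.0)]
    steps = int(round(t_end / dt))
    for i in range(steps):
        k1f, k1b = deriv(free, bound)
        k2f, k2b = deriv(free + 0.5 * dt * k1f, bound + 0.5 * dt * k1b)
        k3f, k3b = deriv(free + 0.5 * dt * k2f, bound + 0.5 * dt * k2b)
        k4f, k4b = deriv(free + dt * k3f, bound + dt * k3b)
        free += dt * (k1f + 2 * k2f + 2 * k3f + k4f) / 6.0
        bound += dt * (k1b + 2 * k2b + 2 * k3b + k4b) / 6.0
        # Whatever is neither free nor bound has been cross-linked.
        trace.append(((i + 1) * dt, 1.0 - free - bound))
    return trace
```

In this toy scheme a larger kXL, as expected for higher laser power, gives a faster rise of the cross-linked fraction for otherwise identical binding and dissociation rate constants.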

Extended Data Fig. 4 Kinetic parameters of DAZL-binding sites and sequence context.

a–d, Sequences surrounding DAZL-binding sites, arranged according to decreasing values for kon(4.2× DAZL) (a), kdiss (b), kXL(2.6 mW) (c) and Φmax (d). Sequences are aligned at the peak nucleotide (most frequent cross-link site (±11-nt peak nucleotide); Extended Data Fig. 2f, position 0). e–h, Frequency of 6-mer sequences surrounding DAZL cross-link sites (±111-nt peak nucleotide) in the top and bottom 5% of sequences arranged according to the kinetic parameters in a–d. i–l, Relative frequency of 6-mer sequences in the top and bottom 5% of sequences (e–h), arranged according to the kinetic parameters in a–d. Sequences below the diagonal line correspond to enrichment of a 6-mer in the top 5% versus the bottom 5%. A6, U6 and U3GU2 are most enriched in the vicinity of the binding sites with the fastest apparent association rate constants, compared to the binding sites with the slowest apparent association rate constants. No comparable enrichment is seen for other kinetic parameters. m–p, Relative frequency of 4-mers in the top and bottom 5% of sequences arranged according to the kinetic parameters in a–d. q–t, Distribution of association and dissociation rate constants, binding probabilities (P(4.2× DAZL)) and maximal fractional occupancy (Φmax) for binding sites (n = 8,696, binding sites with associated values for fractional occupancy) on different RNA classes. P values (one-way ANOVA, significant for P < 0.05) indicate inter-group differences. Φmax values, but not other parameters, vary significantly for different RNA classes (for box plots: vertical line, median; box limits, IQR; whiskers, 1.5 × IQR). u–x, Distributions of kinetic parameters for all binding sites (n = 8,212, binding sites with associated values for fractional occupancy) in the indicated mRNA regions (P value by one-way ANOVA; for box plots: vertical line, median; box limits, IQR; whiskers, 1.5 × IQR). kon(4.2× DAZL) and P(4.2× DAZL), but not the other parameters, vary significantly for different mRNA regions.

Extended Data Fig. 5 Arrangement of 3′-UTR DAZL-binding sites in clusters.

a, Arrangement of DAZL-binding sites in 3′-UTRs. Binding sites are coloured according to kon(4.2× DAZL) and kdiss as indicated in the key. Right, number of clusters in the corresponding 3′-UTR. Colours mark the number of binding sites in a cluster, as indicated in the legend bar (right) (n = 1,313 3′-UTRs, 1,690 clusters, 6,085 binding sites). b, c, Distribution of DAZL-binding sites in 3′-UTRs closer than 500 nt to the PAS (b) or further than 500 nt from the PAS (c) as a function of the distance between neighbouring binding sites. The grey lines show the distribution if sites were randomly distributed across all 3′-UTRs (P values by one-sided t-test). d, Large windows: genome browser traces of representative 3′-UTRs with five clusters (Nucks1) and two clusters (D’Rik, D030056L22Rik). Bars show the normalized read coverage for 4.2× DAZL, 2.6 mW laser and 680 s cross-linking time. Numbers mark the distance between clusters. Small windows: magnification of cluster 1 of Nucks1, with three binding sites, and cluster 1 of D’Rik, with two binding sites (numbers mark the distance between binding sites). e, Number of clusters in 3′-UTRs with DAZL-binding sites. Colours show the number of binding sites in a cluster as indicated in a (red, 20; light yellow, 1). f, Distances between clusters in 3′-UTRs with two to four clusters. Number 1 represents the cluster most proximal to the PAS. g, Distribution of distances between neighbouring binding sites (n = 2,888) in clusters (2–9 binding sites). Number 1 represents the 3′ binding site (for box plots: vertical line, median; box limits, IQR; whiskers, 1.5 × IQR). h–j, Correlation between the number of binding sites (n = 6,546) for clusters proximal (blue, <0.5 kb) and distant (red, ≥0.5 kb) to the PAS and P(4.2× DAZL) (h), dissociation rate constants (kdiss; i), and maximum fractional occupancy (Φmax; j), for individual binding sites in a given cluster. P values (one-way ANOVA) indicate significant inter-group differences for P(4.2× DAZL) and Φmax, but not for kdiss. P(4.2× DAZL) and Φmax depend on kon(4.2× DAZL), which correlates with the number of binding sites in a cluster (Fig. 3c). For box plots: vertical line, median; box limits, IQR; whiskers, 1.5 × IQR. k, Correlation between kinetic parameters of individual binding sites in clusters with 6, 5, 4 and 3 binding sites. The Pearson correlation coefficient is indicated. Binding site number 1 indicates the 3′ binding site in a cluster.
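
Grouping binding sites into clusters by the distance between neighbouring sites can be sketched as below. The legend does not state the distance criterion used to define a cluster, so max_gap is a hypothetical parameter, and the single-linkage rule (a site joins the current cluster if it lies within max_gap of the previous site) is an assumption for illustration.

```python
def cluster_sites(positions, max_gap):
    """Group binding-site positions (nt) into clusters of neighbouring sites.

    A site joins the current cluster when it lies within max_gap nt of the
    previous site; otherwise it starts a new cluster.
    Assumption: max_gap is a hypothetical threshold, not the authors' value.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def neighbour_distances(cluster):
    """Distances between consecutive sites within one cluster."""
    return [b - a for a, b in zip(cluster, cluster[1:])]
```

For example, cluster_sites([10, 25, 40, 300, 320, 900], 50) yields three clusters with three, two and one binding sites.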

Extended Data Fig. 6 Link between DAZL binding in 3′-UTRs and effects on mRNA level and ribosome association.

a, Correlation between cumulative binding probabilities (ΣB) and number of binding sites in a cluster (n = 1,313 3′-UTRs, 6,085 binding sites, 1,690 clusters in transcripts with associated values for ΔRNA and ΔRPF). b, Correlation between ΣB and distance of the cluster from the PAS. c, d, Correlation of ΣB terciles (Fig. 4a) and changes in ribosome association (ΔRPF; Fig. 4b) (c) or changes in transcript levels (ΔRNA; Fig. 4b) (d) for the corresponding transcripts (n = 968) between low (1× DAZL) and high (4.2× DAZL) concentration (P values by one-way ANOVA). For UTRs with multiple clusters, the cluster closest to the PAS was used (for box plots: vertical line, median; box limits, IQR; whiskers, 1.5 × IQR). e, Distribution of binding probabilities for individual DAZL-binding sites in 3′-UTRs for transcripts in THRH, THRM, TLRM, TLRL, TMRH, TMRL mRNA classes (Fig. 4b). The dotted lines mark terciles (for details, see Methods). f, Correlation between binding probabilities for individual binding sites and functional mRNA classes (Fig. 4b). Colours mark the enrichment of a given ΣB tercile compared to a random distribution (one-sided hypergeometric test; red, P < 0.0005 to 0.05; shades of yellow, P > 0.05 to 0.5 (not enriched)). No significant enrichment is observed. g, Distribution of cumulative binding probabilities for DAZL clusters in 3′-UTRs with scrambled binding sites. The dotted lines mark terciles. h, Correlation between cumulative binding probabilities of DAZL clusters with binding sites scrambled between clusters (g) and functional mRNA classes (Fig. 4b). Colours mark the enrichment of a given ΣB tercile compared to a random distribution (one-sided hypergeometric test; red, P < 0.0005 to 0.05; shades of yellow, P > 0.05 to 0.5 (not enriched)). No significant enrichment is observed. i, Correlation between additive binding probabilities of two DAZL sites in a cluster and functional mRNA classes. Colours mark the enrichment of a given ΣB tercile compared to a random distribution (one-sided hypergeometric test; red, P < 0.0005 to 0.05; shades of yellow, P > 0.05 to 0.5 (not enriched)). For clusters with more than two binding sites, permutations of two sites were tested and sites with the highest additive binding probability were selected. The model tests whether the additive binding probability of any two DAZL-binding sites in a given cluster can explain the effect of DAZL on the transcript to the same extent as considering cumulative binding probabilities for the entire cluster (Fig. 4c). The model is only able to explain the TLRL, TLRM mRNA classes, which frequently contain transcripts with clusters that have only few DAZL-binding sites. j, Correlation between conditional binding probabilities of two DAZL sites in a cluster (terciles) and functional mRNA classes. Colours mark the enrichment of a given ΣB tercile compared to a random distribution (one-sided hypergeometric test; red, P < 0.0005 to 0.05; shades of yellow, P > 0.05 to 0.5 (not enriched)). For clusters with more than two binding sites, permutations of two sites were tested and combinations of sites with the highest multiplicative binding probability were selected. The model tests whether the conditional binding probability of any two DAZL-binding sites (for example, whether DAZL needs to bind simultaneously to both sites) in a given cluster can explain the effect of DAZL on the transcript to the same extent as considering cumulative binding probabilities for the entire cluster (Fig. 4c). The model explains only mRNA classes that frequently contain transcripts with DAZL clusters that have only few binding sites. For these clusters cumulative and conditional binding probabilities scale similarly. The data suggest that simultaneous binding of DAZL to two sites in a cluster is not required for general DAZL function. k, Correlation between conditional binding probabilities of three DAZL sites in a cluster (terciles) and functional mRNA classes. Colours mark the enrichment of a given ΣB tercile compared to a random distribution (one-sided hypergeometric test; red, P < 0.0005 to 0.05; shades of yellow, P > 0.05 to 0.5 (not enriched)). Analysis was performed as in j (Fig. 4c). The data suggest that simultaneous binding of DAZL to two or more sites in a cluster is not required for DAZL function.
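
The three ways of combining site probabilities compared in panels f–k, and the one-sided hypergeometric enrichment test, can be sketched as follows. The sketch assumes that ΣB is the plain sum of per-site binding probabilities in a cluster (consistent with the occurrence of clusters with ΣB > 1), that the additive two-site model takes the largest pairwise sum, and that the conditional two-site model takes the largest pairwise product. Function names are illustrative and not taken from the authors' scripts.

```python
from itertools import combinations
from math import comb


def cumulative_binding_probability(site_probs):
    """Sum of per-site binding probabilities in a cluster (assumed ΣB)."""
    return sum(site_probs)


def best_pair_additive(site_probs):
    """Highest additive probability over all pairs of sites."""
    return max(a + b for a, b in combinations(site_probs, 2))


def best_pair_conditional(site_probs):
    """Highest multiplicative (simultaneous-binding) probability over pairs."""
    return max(a * b for a, b in combinations(site_probs, 2))


def hypergeom_enrichment_p(k, K, n, N):
    """One-sided P(X >= k): n draws from N items, K of which are 'successes'.

    Example reading: N transcripts in total, K in a given ΣB tercile,
    n in one functional mRNA class, k of those in the tercile.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)
```

For clusters with two sites the additive pair model equals ΣB by construction, which is in line with the legend's remark that the pair models only explain classes dominated by clusters with few binding sites.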

Extended Data Fig. 7 Link between DAZL clusters in 3′-UTRs and effects on mRNA level and ribosome association.

a, Distribution of transcript levels at 4.2× DAZL. b, Distribution of 3′-UTR lengths (refs. 17, 39, 40). For UTRs with multiple lengths, coordinates for the longest 3′-UTR were used. c, Distribution of distances of DAZL clusters from the PAS. d, Distribution of differential cumulative binding probability (ΔΣB) for all DAZL clusters. The dotted lines mark terciles. Terciles were defined by standard deviations from the mean for each feature described above. e, Link between the effect of DAZL on mRNA level and ribosome association and cluster features (top graphs: black line, number of DAZL clusters in 3′-UTR; blue vertical lines, ΣB, with the lower end marking ΣB at 1× DAZL and the upper end marking ΣB at 4.2× DAZL; middle graphs: ΔΣB for each cluster and number of DAZL-binding sites in each cluster; heat maps below the graphs: terciles of transcript features obtained from a–c). Each panel shows one functional mRNA class (defined in Fig. 4b). Functional classes not displayed contained too few or no transcripts (TLRH, 0; THRL, 2) or showed no change in ribosome association and transcript level (TMRM). Numbers represent the groups in the DAZL code (Fig. 4d). Clusters with ΣB > 1 (n = 4) are not shown.

Extended Data Fig. 8 The DAZL regulatory program.

a, Pairwise correlation between DAZL cluster features. Colours correspond to Pearson’s correlation coefficient. Cluster features are marked as indicated on the right. b, Variance of data reflected in the eigenvalues of the seven principal component axes obtained by PCA. Each eigenvalue corresponds to a principal component axis. Each axis reflects a linear combination of the seven characteristics of a DAZL cluster, obtained from a. The eigenvalues and the corresponding principal component axes are sorted according to the initial variance they represent. The first three principal component axes explain roughly 90% of the variance. c, Biplots of DAZL cluster features (arrows) projected on the first two principal components (PC1, PC2; b). Dots represent transcripts. Colours correspond to terciles of the distributions of values for ΔRPF and ΔRNA as defined in Fig. 4b. Each arrow represents a cluster feature (labels as in a). Proximity of arrows scales with correlation between the corresponding features. Arrows in the x direction (positive or negative) contribute to PC1; arrows in the y direction (positive or negative) contribute to PC2. Short arrows (transcript level, proximity to PAS) indicate that additional principal components (PC3–PC7) are required to explain the corresponding feature. d, t-SNE (perplexity = 10, iterations = 2,000) of cluster features (a). Identified groups are marked 1–21. Each point represents a transcript. e, Biplots of DAZL cluster features (arrows) projected on three principal components (PC1–PC3; b). Dots represent transcripts. Colours correspond to functional mRNA classes (THRH, THRM, TLRM, TLRL, TMRH, TMRL; Fig. 4b). Separation of transcripts in 21 groups is marked as 1–21. f, Link between functional mRNA classes and kinetic parameters (ΣB, ΔΣB), cluster features (number of binding sites in cluster, proximity to PAS) and UTR features (numbers of clusters on UTR, UTR length, transcript level). Left, enrichment of terciles (H, M, L; Fig. 4a, Extended Data Fig. 7a–d) for ΣB, ΔΣB, number of binding sites in cluster, cluster distance from PAS, UTR length and transcript level in group 1. Numbers and colour indicate the degree of enrichment. The row on the left marks the visualization of the DAZL code for group 1 that is used in Fig. 4d. Right, enrichment of terciles for the features indicated in the left panel for all groups (1–21). Functional mRNA classes for the respective groups are shown at the bottom. g, Genome browser traces of representative transcripts of select groups. mRNA classes are indicated. The y axis represents normalized coverage value. h, Mapping of transcripts from select groups on two biological networks. Groups are coloured as indicated. Proximity of transcripts of a given group in the network indicates closely related biological functions.
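
The relationship in panel b between sorted eigenvalues and the fraction of variance explained can be illustrated with a minimal sketch. The eigenvalues below are hypothetical placeholders chosen only so that they sum to seven (as for a correlation matrix of seven features); they are not the values from this study, and the function names are illustrative.

```python
def cumulative_explained_variance(eigenvalues):
    """Return cumulative variance fractions for eigenvalues sorted in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    fractions = []
    running = 0.0
    for value in ordered:
        running += value
        fractions.append(running / total)
    return fractions


def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative fraction reaches the threshold."""
    fractions = cumulative_explained_variance(eigenvalues)
    for count, fraction in enumerate(fractions, start=1):
        if fraction >= threshold:
            return count
    # Fallback for rounding: all components are needed.
    return len(eigenvalues)


# Hypothetical eigenvalues for seven features (assumption, not data from the paper).
example_eigenvalues = [3.5, 1.8, 1.0, 0.3, 0.2, 0.1, 0.1]
print(components_needed(example_eigenvalues, 0.85))
```

With these placeholder values the first three axes together account for about nine-tenths of the total variance, which mirrors the kind of statement made for panel b.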

Extended Data Fig. 9 Decision tree classification linking the DAZL code to the functional effects of DAZL binding.

a, Decision tree classifier (CHAID algorithm; refs. 56–58) of seven features (ΣB, ΔΣB, distance to PAS, 3′-UTR length, transcript level, number of clusters in a given 3′-UTR (Clust./3′-UTR), and number of sites in cluster; Extended Data Fig. 8) in terciles (Extended Data Fig. 7). Nodes mark the given feature and corresponding partition (high, medium, low). Circles indicate the number of transcripts; donut graphs mark the functional mRNA classes, colour-coded as shown on the right. Circled numbers left of the heat map with the DAZL code (identical to that in Fig. 4d) indicate the number of transcripts in a given group. The decision tree was calculated by cross-tabulation of predictor variables (transcripts, n = 413) with target variables (functional mRNA classes THRH, THRM, TLRM, TLRL, TMRH and TMRL; Fig. 4b) followed by partitioning of predictor variables into statistically significant subgroups (χ2 test for independence, significance threshold 0.05; ref. 59, Supplementary Table 10). b, Confusion matrix corresponding to the decision tree. Validation 1 (n = 24 transcripts) and Validation 2 (n = 21 transcripts) are predictions for transcripts that were not included in the decision tree classification.
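
The χ2 test for independence that underlies each partitioning step can be sketched as follows. This is an illustrative outline only, not the CHAID implementation used for the figure: the contingency table is a made-up example, the function names are hypothetical, and the closed-form P value applies only to an even number of degrees of freedom (which is always the case when one variable has three levels, such as terciles). All row and column totals are assumed to be non-zero.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def chi_square_p_value_even_dof(statistic, dof):
    """Upper-tail P value of the chi-square distribution for even dof (closed form)."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive, even number of degrees of freedom")
    half = statistic / 2.0
    term = 1.0
    series = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        series += term
    return math.exp(-half) * series


# Made-up counts: rows are terciles (H, M, L) of one feature, columns are two mRNA classes.
example_table = [[10, 20], [20, 10], [15, 15]]
stat, dof = chi_square_independence(example_table)
p_value = chi_square_p_value_even_dof(stat, dof)
print(stat, dof, p_value, p_value < 0.05)
```

In a CHAID-style procedure, a feature would be considered for a split only when a test of this kind falls below the chosen significance threshold (0.05 in the legend above).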

Extended Data Fig. 10 Linear regression models for linking the DAZL code to the effects of DAZL on changes in transcript levels, ribosome association and translation efficiency.

a, Distribution of changes in translational efficiency values (ΔTE) between high and low DAZL concentration for transcripts in the 21 groups of the DAZL regulatory program, defined in Fig. 4d. mRNA functional classes are defined in Fig. 4b. The grey area in the plot centre marks unchanged ΔTE (95% confidence interval). P values were calculated by one-way ANOVA of inter-group variations for each mRNA functional class (for box plots: horizontal line, median; box limits, IQR; whiskers, 1.5 × IQR). b, Linear regression models tested (yellow, dummy coding using terciles of the variables (Extended Data Fig. 8); red, no dummy coding, use of continuous data; grey, variable was omitted). c, Adjusted R² for each model. d, DILCs for each model. Grey boxes mark models without the respective variable. e, Significance of each DILC for each model (white, significant (P < 0.005 to 0.05); black, not significant (P > 0.05); P values by one-sided Student’s t-test on each coefficient). Only model 1 (M1) shows consistently significant DILCs. Models 24–27 include interaction terms corresponding to 7 independent variable terms and test the effect of multi-collinearity. Interaction terms for each of the models were as follows: M24: ΣB | ΔΣB and ΣB | number of binding sites in a cluster. M25: ΣB | ΔΣB. M26: ΣB | ΔΣB and ΣB | proximity to PAS. M27: ΣB | proximity to PAS. Interaction terms are the cross product of encompassing independent variable terms and were selected based on pairwise correlation coefficients (Extended Data Fig. 8a). f, Linear regression model linking the DAZL regulatory program to changes in translational efficiency values (ΔTE) (a). Points represent the DILC (red, DILCs for translational efficiencies that increase at high DAZL concentration; black, DILCs for translational efficiencies that decrease at high DAZL concentration). g, Correlation between experimental values for ΔTE and values predicted with the linear regression model for the test dataset.
h, Correlation between predicted values for ΔRPF and changes in luciferase activity between high and low DAZL concentration for reporter RNA constructs. Reporters were generated by appending the 3′-UTR of the respective transcripts to a luciferase ORF, and measurements were performed as described previously17. Error bars represent s.e.m. for each data point, corresponding to five independent experiments. Naa40 and Ptma were part of the model building dataset (training dataset). Calm2, Cxcl1, D’Rik and Spp1 were part of the test dataset.
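
Two ingredients mentioned in panels b and c, dummy coding of terciles and the adjusted R², can be sketched as below. This is a minimal illustration under stated assumptions: terciles are assigned by rank into three equal-sized bins, the lowest tercile serves as the reference level, and all function names and input values are hypothetical. The exact tercile boundaries and coding scheme used for the figure may differ.

```python
def tercile_labels(values):
    """Label each value 'L', 'M' or 'H' by rank-based terciles (assumed definition)."""
    n = len(values)
    order = sorted(range(n), key=lambda index: values[index])
    labels = [None] * n
    for rank, index in enumerate(order):
        labels[index] = "LMH"[min(2, rank * 3 // n)]
    return labels


def dummy_code(labels, reference="L"):
    """Encode tercile labels as indicator columns, omitting the reference level."""
    levels = [level for level in "LMH" if level != reference]
    return [[1 if label == level else 0 for level in levels] for label in labels]


def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted coefficient of determination for a model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Made-up feature values for six transcripts (assumption, not data from the paper).
example_values = [5, 1, 9, 3, 7, 11]
example_labels = tercile_labels(example_values)
print(example_labels)
print(dummy_code(example_labels))
print(adjusted_r2(0.5, 11, 2))
```

Each dummy-coded feature contributes two indicator columns rather than one continuous column, so the number of predictors entering the adjusted R² increases accordingly.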

Supplementary information

Supplementary Information

This file contains Supplementary Methods, Supplementary Tables S1–S7, Supplementary Figures S1–S7 and Supplementary Schemes S1 and S2.

Reporting Summary

Supplementary Table 8

Binding site kinetic parameters.

Supplementary Table 9

Metrics for Dazl binding site clusters.

Supplementary Table 10

Decision Tree parameters.

Supplementary Table 11

Ribo-Seq and RNA-seq data.

Supplementary Table 12

GO term gene list.

Supplementary Table 13

PCA metrics.

Cite this article

Sharma, D., Zagore, L.L., Brister, M.M. et al. The kinetic landscape of an RNA-binding protein in cells. Nature 591, 152–156 (2021). https://doi.org/10.1038/s41586-021-03222-x
