Many disordered proteins conserve essential functions in the face of extensive sequence variation, making it challenging to identify the mechanisms responsible for functional selection. Here we identify the molecular mechanism of functional selection for the disordered adenovirus early gene 1A (E1A) protein. E1A competes with host factors to bind the retinoblastoma (Rb) protein, subverting cell cycle regulation. We show that two binding motifs tethered by a hypervariable disordered linker drive picomolar affinity Rb binding and host factor displacement. Compensatory changes in amino acid sequence composition and sequence length lead to conservation of optimal tethering across a large family of E1A linkers. We refer to this compensatory mechanism as conformational buffering. We also detect coevolution of the motifs and linker, which can preserve or eliminate the tethering mechanism. Conformational buffering and motif–linker coevolution explain robust functional encoding within hypervariable disordered linkers and could underlie functional selection of many disordered protein regions.
SAXS raw data for Rb, E1AWT and the E1AWT–Rb complex have been deposited in SASBDB (https://www.sasbdb.org) with codes SASDNK6 (Rb 1 mg/ml), SASDNL6 (Rb 2 mg/ml), SASDNM6 (Rb 4 mg/ml), SASDNN6 (E1AWT 4.2 mg/ml), SASDNP6 (E1AWT 5.6 mg/ml), SASDNQ6 (E1AWT 7.0 mg/ml), SASDNR6 (E1AWT–Rb 0.7 mg/ml), SASDNS6 (E1AWT–Rb 1.4 mg/ml), SASDNT6 (E1AWT–Rb 2.7 mg/ml), SASDNU6 (E1AWT–Rb merged data) and SASDNV6 (E1AWT, SEC–SAXS). Refined conformational ensemble models for E1AWT and E1AWT–Rb have been deposited in the Protein Ensemble Database (https://proteinensemble.org) with codes PED00175 (E1AWT) and PED00174 (E1AWT–Rb). Unfiltered conformational ensembles for the E1AWT–Rb, E1AΔL–Rb and E1AΔE–Rb complexes are available at https://moma.laas.fr/data/ under the description ‘Conformational ensemble models of the IDP E1A bound to Rb protein’. NMR assignments of backbone resonances for E1AWT, E1AΔE and E1AΔL are provided in Supplementary Data 2. Trajectories for all E1A linker ensembles are provided at Zenodo (https://zenodo.org/record/6332925), and trajectory analysis results are provided at https://github.com/holehouse-lab/supportingdata/tree/master/2021/Gonzalez_Foutel_2021. PDB codes used in data analysis and prediction are: 1GUX, 3POM, 2R7G and 4YOZ. Source data are provided with this paper.
The loop sampling method used to model the linker between the two binding motifs can be used via a web server (https://moma.laas.fr/applications/LoopSampler/), and binaries can be provided upon request. All code used to analyze the E1A linker trajectories is provided at https://github.com/holehouse-lab/supportingdata/tree/master/2021/Gonzalez_Foutel_2021.
This work was supported by: Agencia Nacional de Promoción Científica y Tecnológica (ANPCyT) Grants PICT no. 2013-1895 and no. 2017-1924 (L. B. C.), no. 2012-2550 and no. 2015-1213 (I. E. S.) and no. 2016-4605 (G. P. G.); US National Institutes of Health no. GM115556 and no. CA141244 (G. W. D.) and no. 5R01NS056114 (R. V. P.); Florida Department of Health (FLDOH) no. 20B17 (G. W. D.); US National Science Foundation no. MCB-1614766 (R. V. P.); a travel award from the USF Nexus Initiative and a Creative Scholarship Grant from the USF College of Arts and Sciences (G. W. D. and L. B. C.); Labex EpiGenMed (Investissements d’avenir) program no. ANR-10-LABX-12-01 (P. B.); French National Research Agency no. ANR-10-INBS-04-01 and no. ANR-10-INBS-05 (P. B.); Spanish Ministerio de Ciencia y Universidades MICYU-FEDER no. RTI2018-097189-C2-1 (G. F.-B.); Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET, Argentina) doctoral fellowship (N. S. G.-F., M. S. and N. A. G.), postdoctoral fellowship (J. G.), and permanent researcher (L. B. C., G. d. P.-G. and I. E. S.); Fulbright Visiting Scholar Program (N. S. G.-F.); Ministerio de Ciencia e Innovación, España, no. BES-2013-063991 and no. EEBB-I-16-11670 (S. B.-V.); Longer Life Foundation: A RGA/Washington University Collaboration (A. S. H.); HPC resources of the CALMIP supercomputing center no. 2016-P16032 (G. F.-B.); and Cluster of Scientific Computing (http://ccc.umh.es/) of the Miguel Hernández University (G. F.-B.). The synchrotron SAXS data were collected at beamline P12, operated by EMBL Hamburg at the PETRA III storage ring (DESY, Hamburg, Germany). We thank K. Perez at the Protein Expression and Purification Core Facility at EMBL (Heidelberg) for critical help with ITC experiments, and P. Aramendia for providing critical access to fluorescence spectrometry equipment at Centro de Investigaciones en Bionanociencias (CIBION, Argentina).
A. S. H. is a scientific consultant with Dewpoint Therapeutics Inc. and R. V. P. is a member of the scientific advisory board of Dewpoint Therapeutics Inc. This work has not been influenced by the affiliation with Dewpoint. The rest of the authors have no competing interests.
Peer review information
Nature Structural and Molecular Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling editors: Anke Sparmann and Florian Ullrich, in collaboration with the Nature Structural & Molecular Biology team. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
a, Far UV-CD spectra of E1AWT (solid line), E1AΔE (dotted line) and E1AΔL (dashed line). Inset: 15% SDS-PAGE gel of purified recombinant E1A proteins (purity > 90%). b, Far UV-CD spectrum of Rb (RbAB domain). c, SEC-SLS experiments of E1AWT (solid line), E1AΔE (dotted line) and E1AΔL (dashed line). d, SEC-SLS experiment of Rb. For c and d, black bars correspond to the elution volume of globular protein markers: BSA 66 kDa (1), MBP 45 kDa (2) and Lysozyme 14.3 kDa (3). Black line: SEC profile; red line: measurement of the molecular weight. e, 12.5% SDS-PAGE of MBP-E1A fusion protein variants. Gel 1: Grafting of selected linkers from Human and Simian E1A proteins into the E1AWT construct containing the HAdV5 motifs. Types are: HAdV52, HAdV40, SAdV3, SAdV22, HAdV5, HAdV5ΔHyd, HAdV18, HAdV40-2x. Gel 2: Grafting of linkers from Bovine, Canine and Bat E1A proteins into the E1AWT sequence and endogenous variants carrying the cognate motifs for each species: BAdV2, BAdV2-ED, BAdV1, CAdV1, BtAdV2 and BtAdV2-ED. f, 17% SDS-PAGE of cleaved E1A protein variants: BAdV2, HAdV52, HAdV40, BtAdV2, HAdV5 and HAdV40-2x. g, Size exclusion chromatography experiment performed on a Superdex 200 column to determine Rh of cleaved E1A variants. Black bars correspond to Vo and Vo + Vi, and to the elution volume of globular protein markers: Gamma Globulin 150 kDa (1), Transferrin 80 kDa (2), BSA 66 kDa (3), MBP 45 kDa (4) and Trypsin Inhibitor 21 kDa (5). The E1A types are referenced to the names used in Fig. 4d.
Measurements were performed by loading the cell with Rb solution and the syringe with the different peptides or proteins as titrants. Panels show heat exchanged as a function of time (upper panel), and the enthalpy per mole of injectant plotted as a function of [peptide/protein]/[Rb] molar ratio (lower panel, black circles) and the corresponding fit using a single site binding model (lower panel, black lines). The binding traces shown correspond to: a, Rb (5 μM) and Human E2F2 (50 μM); b, Rb (30 μM) and E1AE2F (300 μM); c, Rb (15 μM) and E1ALxCxE (150 μM); d, Rb (15 μM) and E1ALxCxE-AC (150 μM); e, Rb (15 μM) and E1ALxCxE-ACP (150 μM); f, Rb (15 μM) and E1AWT (150 μM); g, Rb (15 μM) and E1AΔE (150 μM); h, Rb (30 μM) and E1AΔL (300 μM). Thermodynamic parameters derived from the fitting are shown in Supplementary Table 1. Exothermic binding to Rb was observed for the Human E2F2 peptide and for E1A peptides and protein fragments harboring the LxCxE motif, while E1AE2F and E1AΔL, which harbor only the E1A E2F motif, showed clearly endothermic behavior. i, ITC curve for binding to Rb of a peptide corresponding to the TAZ2 region in the E1A linker (63-80), which showed intensity decreases in the NMR experiments (Fig. 2). The titration was performed at 30 μM Rb and 300 μM E1A linker peptide at 20 °C. A schematic representation of each interacting pair is shown above the ITC traces: Rb (grey double circle) and each peptide/protein, where binding motifs are represented as follows: Human-E2F2 (green oval), E2F motif (blue oval), LxCxE motif (red oval), LxCxE acidic stretch (orange circle), phosphorylation (letter P).
Extended Data Fig. 3 Fluorescence Spectroscopy titration experiments of E1A-Rb and E2F-Rb interactions.
Representative titration binding curves at equilibrium for each FITC-labeled peptide/protein-Rb interaction tested in this work. Normalized anisotropy signals (circles) are shown, along with the global fit to a 1:1 binding model (lines) that yielded the KD value. The residuals for the fit are shown in the lower panels. The binding traces shown correspond to two probe (FITC-labeled peptide/protein) concentrations: a, Human E2F2: 1 nM (black) and 5 nM (red); b, E1AE2F: 100 nM (black) and 500 nM (red); c, E1ALxCxE: 100 nM (black) and 500 nM (red); d, E1ALxCxE-AC: 130 nM (black) and 700 nM (red); e, E1ALxCxE-ACP: 30 nM (black) and 100 nM (red); f, E1AWT: 0.5 nM (black) and 2 nM (red); g, E1AΔE: 200 nM (black) and 800 nM (red); h, E1AΔL: 200 nM (black) and 800 nM (red). The KD values obtained by global fitting to a 1:1 model (Supplementary Data Table 1) were in excellent agreement with those obtained when fitting individual binding curves using non-normalized anisotropy or fluorescence data (Supplementary Table 2). A schematic representation of each interacting pair is shown above the binding traces: Rb (grey double circle); FITC-moiety at the N-terminus of the sequence (light green circle). Binding motifs are represented as follows: Human-E2F2 (green oval), E2F motif (blue oval), LxCxE motif (red oval), acidic stretch (orange circle), phosphorylation (letter P). The linker is represented by a black line.
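As an illustration of the 1:1 binding model named in this caption, the following minimal Python sketch computes the fraction of labeled probe bound to Rb with ligand depletion taken into account, and recovers a single shared KD from two titrations at different probe concentrations. The function names, the log-spaced grid search (a simple stand-in for a proper nonlinear least-squares optimizer) and all concentrations are assumptions made for illustration. The data are synthetic and are not taken from this work.

```python
import math

def fraction_bound(rb_total, probe_total, kd):
    """Fraction of probe in a 1:1 complex; all concentrations share one unit (e.g. nM)."""
    s = rb_total + probe_total + kd
    # Physical root of the quadratic mass-balance equation for the complex.
    complex_conc = (s - math.sqrt(s * s - 4.0 * rb_total * probe_total)) / 2.0
    return complex_conc / probe_total

def global_sse(kd, datasets):
    """Sum of squared residuals over all titrations that share one KD."""
    sse = 0.0
    for probe_total, points in datasets.items():
        for rb_total, signal in points:
            sse += (signal - fraction_bound(rb_total, probe_total, kd)) ** 2
    return sse

def fit_shared_kd(datasets, kd_min=1e-3, kd_max=1e4, steps=2000):
    """Coarse log-spaced grid search for the shared KD (hypothetical helper)."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        sse = global_sse(kd, datasets)
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd

# Synthetic, noise-free normalized anisotropy data (hypothetical values, nM).
TRUE_KD = 10.0
rb_series = [0.5, 2.0, 5.0, 10.0, 20.0, 50.0, 200.0]
datasets = {}
for probe in (1.0, 5.0):  # two probe concentrations, fitted together
    datasets[probe] = [(rb, fraction_bound(rb, probe, TRUE_KD)) for rb in rb_series]

fitted_kd = fit_shared_kd(datasets)
print(f"shared KD from global fit: {fitted_kd:.2f} nM")
```

Fitting both probe concentrations with one KD constrains the fit more tightly than either curve alone, because the curves differ only in how strongly the probe depletes free Rb.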
a, Central region of 1H-15N TROSY spectra of free 15N-labeled E1A (black) and a 1:1 molar ratio complex of 15N-labeled E1A and unlabeled Rb (red) at 525 μM, with assigned peaks of the free form indicated. The full spectrum of this complex is shown in Fig. 2a. b, Left panel: Overlay of the 1H-15N TROSY spectra of free 15N-labeled E1AΔL (black) and a 1:1 molar ratio complex of 15N-labeled E1AΔL and unlabeled Rb (red) at 315 μM. Right panel: central region of the spectra with assigned peaks of the free form indicated. c, Left panel: Overlay of the 1H-15N TROSY spectra of free 15N-labeled E1AΔE (black) and a 1:1 molar ratio complex of 15N-labeled E1AΔE and unlabeled Rb (red) at 315 μM. Right panel: central region of the spectra with assigned peaks of the free form indicated. The low chemical shift dispersions in the 1H dimension for E1AΔL and E1AΔE denote their disordered nature, like that seen in E1A. There is no change in peak dispersion upon binding to Rb, indicating that the linker regions of the E1AΔL and E1AΔE mutants remain largely disordered in the [E1AΔL:Rb] and [E1AΔE:Rb] complexes. d, Plot of chemical shift changes upon binding as a function of residue number for E1AWT, E1AΔL and E1AΔE. Dashed line at 0.2 ppm corresponds to the digital resolution of the experiment. The small chemical shift changes for almost all of the linker residues suggest little, if any, interaction with Rb. The I/I0 ratio is overlaid for comparison (colored lines). Dots at the bottom correspond to the residues of each variant whose 1H-15N peak intensity in the bound state is 0, so the chemical shift changes could not be measured.
Measurements were performed by loading the cell with Rb or with a pre-assembled complex of Rb with peptide/proteins containing one of the interacting motifs and titrating with peptide/proteins containing the complementary motif loaded into the syringe. Panels show heat exchanged as a function of time (upper panel) and the enthalpy per mole of injectant plotted as a function of [peptide or protein]/[Rb] molar ratio (lower panel, black circles) along with the corresponding fit using a single site binding model (lower panel, black lines). Binding traces correspond to: a, Rb (30 μM, cell) titrated with E1AE2F (300 μM, syringe) at 10 °C; b, [E1ALxCxE:Rb] (30 μM, cell) titrated with E1AE2F (300 μM, syringe) at 10 °C; c, Rb (30 μM, cell) titrated with E1AΔL (300 μM, syringe) at 10 °C; d, [E1ALxCxE:Rb] (30 μM, cell) titrated with E1AΔL (300 μM, syringe) at 10 °C; e, Rb (15 μM, cell) titrated with E1ALxCxE (150 μM, syringe) at 20 °C; f, [E1AE2F:Rb] (15 μM, cell) titrated with E1ALxCxE (150 μM, syringe) at 20 °C; g, [E1AΔL:Rb] (15 μM, cell) titrated with E1ALxCxE (150 μM, syringe) at 20 °C. Thermodynamic parameters derived from the fitting are shown in Supplementary Table 1. A schematic representation of each titration design is shown above the ITC traces: Rb: grey double circle, E2F motif: blue oval, LxCxE motif: red oval. The E1A linker is depicted as a black line. h, ITC measurements of E1AE2F and E1AΔL at different temperatures. The heat capacity change (ΔCp) was calculated from the slope of the plot of ΔH vs temperature. E1AE2F: filled blue circles; E1AΔL: open blue circles. Thermodynamic parameters are reported in Supplementary Data Table 5.
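The ΔCp estimate described for panel h amounts to a straight-line fit of ΔH against temperature, with the slope taken as ΔCp. The sketch below shows this with an ordinary least-squares fit in plain Python. The helper name `linear_fit` and the temperature and enthalpy values are hypothetical placeholders, not measurements from this work.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical binding enthalpies (kcal/mol) at four temperatures (deg C).
# A temperature difference in deg C equals one in K, so the slope is per kelvin.
temperatures = [10.0, 15.0, 20.0, 25.0]
delta_h = [4.0, 3.0, 2.0, 1.0]

delta_cp, intercept = linear_fit(temperatures, delta_h)
print(f"Delta Cp = {delta_cp:.3f} kcal/mol/K")
```

This treatment assumes ΔCp is constant over the temperature range sampled, which is what a linear ΔH versus temperature plot implies.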
a, I. Experimental SAXS intensity profile (black empty circles) versus theoretical profiles obtained from the crystal structure of the unliganded RbAB domain (PDB ID: 3POM) (red line) or a refined model where flexible loops were added (AllosMod-FoXS, blue line). Residuals are shown below the fits. II. Kratky plots of Rb at 4.0 mg/ml (blue line), 2.0 mg/ml (red line) and 1.0 mg/ml (black line). III. Orthogonal views of the RbAB crystal structure (red) and optimized model (blue) (RMSD = 1.7 Å). b, I. SAXS intensity profile of E1AWT (black circles) and the best fit from the EOM method (red line). Residuals of the fit are shown below. II. Rg distribution of the E1AWT ensemble pool (black area) and EOM-selected ensemble (red area). III-IV. Kratky plots (III) or Guinier plots (IV) of E1AWT at 7.0 mg/ml (blue empty circles), 5.6 mg/ml (red empty circles) and 4.2 mg/ml (black empty circles). V. Overlay of SEC-SAXS profile of E1AWT (blue empty circles) and the merged curve from SAXS experiments at three concentrations (pink line). c, Theoretical SAXS profiles computed for a pool of 10250 [E1AWT:Rb] structures compared to experimental SAXS profiles and EOM fitting. Four fitting conditions are shown: I. 1000 generations with ensemble size N = 20, II. 1000 generations with N = 50, III. 500 generations with N = 20 and IV. 500 generations with N = 50. Left: experimental SAXS intensity profiles (grey circles) and EOM fitting (red lines). Middle: Rg distributions of pool ensembles (black line) and EOM-selected sub-ensembles (red line). Right: EOM-selected sub-ensembles. Fitting condition II is presented in Fig. 3. d, Calculated Rh for [E1AWT:Rb] (black), [E1AΔE:Rb] (green) and [E1AΔL:Rb] (blue) pool ensembles and the EOM-selected [E1AWT:Rb] sub-ensemble (red).
a, Linker length control titration experiment. End-to-end distance (Re) of natural sequences (colored circles) compared to synthetic sequences of varying length and constant sequence composition matching the HF_HAdV40 linker (yellow squares). Natural sequences: n = 15 independent simulations were run for each sequence, points represent the mean Re value and error bars represent the standard deviation over the population obtained from the total ensemble from 15 simulations. Synthetic sequences: n = 20 random permutations were generated for each length and simulated under equivalent conditions. The mean Re value (yellow square) is a double average over both conformational space and sequence space. Lines within the yellow squares represent the standard error of the mean across all simulations of a given length, shown to confirm that all random permutations have very similar Re values. b, Net charge per residue (NCPR) as a function of normalized end-to-end distance for the 27 linkers of Fig. 4a. Inset: NCPR as a function of linker length. Sequences used in the grafting experiment are shown as solid circles and the rest as transparent circles. R = Pearson’s correlation coefficient. c, Correlation between distinct sequence parameters and normalized end-to-end distance (upper panels) or linker length (lower panels) (Supplementary Text 1). R = Pearson’s correlation coefficient. Most R values are < 0.3 with several exceptions. d, Hydrodynamic radius (Rh) for motif-linker-motif constructs of five cleaved E1A variants (shown in Extended Data Fig. 1f,g). The length of each construct is indicated above each bar. Rh was determined from size exclusion chromatography run on Superdex 75 (n = 1, striped colored bars) or Superdex 200 (n = 1, cross-hatched colored bars). The height of each bar indicates the estimated Rh value and the error bars represent the standard deviation obtained from interpolation in the –logMW vs Kav calibration curve (see Methods). Rh was also predicted from all-atom simulations (colored bars). The height of each bar represents the mean Rh value from ten independent simulations of each construct (n = 10), while each individual marker is the mean of each independent simulation.
Competition displacement experiments were performed by competing a preassembled equimolar [FITC-E2F2:Rb] complex at 10 nM concentration with increasing concentrations of each variant. One representative example is shown for each variant reported in Supplementary Table 7. The displacement reaction was followed by recording the fluorescence anisotropy of the FITC moiety, with excitation at 490 nm and emission at 520 nm. In every case except for Bov-1-ED, the E1A variants were able to displace FITC-E2F2 from binding to Rb. The anisotropy value of free FITC-E2F2 was 0.042 ± 0.002 and the anisotropy value of the [FITC-E2F2:Rb] complex was 0.14 ± 0.01. In every case, the anisotropy value obtained at the end of the titration was equal to the anisotropy value of the free FITC-E2F2 peptide, confirming the complete displacement of FITC-E2F2. The anisotropy values were normalized to calculate the fraction of Rb-bound FITC-E2F2 and fitted to estimate the KD value for the [Variant:Rb] complex.
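The normalization step described above can be sketched as a linear rescaling between the free and bound anisotropy values quoted in the legend (0.042 and 0.14). This is a minimal illustration that assumes the fluorescence intensity of FITC-E2F2 does not change on binding; the function name `fraction_bound` is hypothetical, and the subsequent competition fit used to extract KD is not reproduced here.

```python
R_FREE = 0.042   # anisotropy of free FITC-E2F2 (from the legend)
R_BOUND = 0.14   # anisotropy of the [FITC-E2F2:Rb] complex (from the legend)


def fraction_bound(anisotropy, r_free=R_FREE, r_bound=R_BOUND):
    """Fraction of Rb-bound FITC-E2F2 from a measured anisotropy.

    Assumes anisotropy is a population-weighted average of the free and
    bound values, i.e. no quantum-yield change upon binding.
    """
    return (anisotropy - r_free) / (r_bound - r_free)


# Illustrative readings along a displacement curve (placeholder values).
for r in (0.14, 0.091, 0.042):
    print(f"r = {r:.3f} -> fraction bound = {fraction_bound(r):.2f}")
```

The normalized values would then be fitted against competitor concentration with a competition binding model to estimate KD for each [Variant:Rb] complex.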
Extended Data Fig. 9 Conservation of pocket domain structure and linear motif binding sites across mammalian pocket proteins.
a, Structural conservation of the pocket domain across mammalian pocket proteins. The human Rb pocket domain (PDB:1GUX) is shown aligned with 9 structural models of Rb pocket domains from representative mammalian species plus the human paralogs p107 (PDB:4YOZ) and p130. The models of the Rb pocket domains and p130 were obtained by using AlphaFold2 implemented in ColabFold (see Methods). Secondary structure is depicted in rainbow colors. The E2F (left) and LxCxE (right) motifs are depicted as green ribbons (PDB 2R7G and 1GUX respectively). b, Structural conservation of the E2F and LxCxE clefts in pocket proteins. Structural alignment shown in panel a with the residues that mediate binding to the E2F and LxCxE motifs (marked as asterisks in Supplementary Fig. 1) depicted as blue and red sticks respectively. c, The distance between the E2F and LxCxE binding sites is highly conserved across mammalian pocket proteins. The spacing was measured between the C-terminal anchor site of the E2F cleft (blue sphere) and the N-terminal anchor site of the LxCxE cleft (red sphere). Distances are: 46.0 Å (human, macaque and chicken), 46.1 Å (chimpanzee, dog, microbat, cow, sheep, pig, horse and tree shrew), 47.3 Å (p107) and 46.5 Å (p130). These distances are slightly shorter than the distance between binding sites used in the Ceff calculations (r0 = 49 Å), which was measured between the C-terminal residue of the E2F motif and the N-terminal residue of the LxCxE motif using the structures of the motifs bound to Rb (PDB: 2R7G and 1GUX).
a,b, Lp and Ceff values for E1A linkers. Boxplots: center line represents the median, lower and upper bounds represent the first and third quartiles and upper and lower whiskers extend from the top and bottom of the box by 1.4 times the interquartile range. Black dots: outliers. p-values were calculated using a two-sided permutation test (10000 permutations) and the Benjamini-Hochberg correction for multiple comparisons to control the false discovery rate. ***p-value < 0.001 (detection limit of the test). N = 110: All E1A linkers, N = 24: Simulated linkers. c, Ceff as a function of linker length for 24 linkers calculated using the WLC model (Lp = 3 Å) (green dots), or Lp values from all-atom simulations (Lp Sim, orange dots). Dark green/red dots: E1AWT. d, Upper panel: E2F (blue) and LxCxE (red) motifs from E1A bound to Rb. Green sticks: core residues, blue/red sticks: variable residues. Lower panel: FoldX energy matrices with energy normalized in the range 0-2 kcal/mol. e, Fold-change in affinity (KD,E1A (Lp = 3 Å) / KD,E1A (Lp Sim)) using naïve versus simulated Lp. Red dot: E1AWT. f, Predicted KD for the E1AE2F and E1ALxCxE SLiMs and for the motif-linker-motif construct for 110 sequences (E1A WLC) and for 24 simulated sequences using Lp = 3 Å (KD WLC) or sequence-specific Lp from the simulations (E1A Sim). Boxplot elements and p-values are defined as in panel a. Cyan dots: experimental value for E1AWT. Red line: E2F2 motif affinity. g, Global Rb binding affinity (KD,E1A) as a function of linker length for 24 sequences using the Lp Sim values. KD,E1A = (KD,E2F · KD,LxCxE) / Ceff. The low R2 value indicates that KD,E1A is uncorrelated with linker length. Upper panel: density plot of linker length for 107 E1A linkers (three short linkers were excluded). Right panel: density plot of KD,E1A. Red dot/line: Predicted KD,E1A for HAdV5 (E1AWT). Grey cross line: experimental KD,E1A for E1AWT.
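The tethering relation in panel g, in which the global KD is the product of the two motif KD values divided by the effective concentration Ceff, can be sketched as follows. This is a simplified illustration, not the authors' implementation: Ceff is estimated with a Gaussian-chain approximation to the worm-like chain (WLC), the contour length per residue (3.8 Å) is an assumed value, and the linker length and motif affinities in the example are hypothetical placeholders. Only Lp = 3 Å and r0 = 49 Å are taken from the legends.

```python
import math

AVOGADRO = 6.02214076e23      # molecules per mole
RESIDUE_LENGTH_A = 3.8        # assumed contour length per residue (Å)


def ceff_gaussian(n_residues, lp_a=3.0, r0_a=49.0):
    """Effective concentration (molar) of one chain end at distance r0.

    Gaussian-chain approximation to the WLC: <r^2> = 2 * Lp * Lc, valid
    when the contour length Lc is much longer than Lp.
    """
    lc = n_residues * RESIDUE_LENGTH_A
    mean_sq = 2.0 * lp_a * lc
    # End-to-end probability density at r0, in Å^-3.
    density = (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5 * math.exp(
        -3.0 * r0_a ** 2 / (2.0 * mean_sq)
    )
    # Convert Å^-3 to mol/L (1 L = 1e27 Å^3).
    return density * 1e27 / AVOGADRO


def global_kd(kd_e2f, kd_lxcxe, ceff):
    """KD of the motif-linker-motif construct, all quantities in molar."""
    return kd_e2f * kd_lxcxe / ceff


# Hypothetical inputs for illustration only (not values from the paper).
ceff = ceff_gaussian(n_residues=70)
kd = global_kd(kd_e2f=1e-6, kd_lxcxe=1e-7, ceff=ceff)
print(f"Ceff = {ceff:.2e} M, KD,E1A = {kd:.2e} M")
```

Because Ceff enters as a divisor, any change in linker length or Lp that leaves Ceff roughly constant also leaves the predicted global affinity roughly constant, which is the quantity compared across linkers in panels c, e and g.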
Supplementary Tables 1–7, Supplementary Figure 1 and Supplementary Text 1.
Sequence Alignment of 116 mastadenovirus E1A sequences.
Backbone chemical shifts from NMR experiments for E1AWT, E1AΔE and E1AΔL.
Raw/unprocessed data from c and d of Fig. 1.
Raw/unprocessed data from b(i–v) and c of Fig. 2.
Raw/unprocessed data from f of Fig. 3.
Raw/unprocessed data from d of Fig. 4 and sequences of E1A variants used in grafting experiments.
Raw/unprocessed data from a of Fig. 5.
Raw/unprocessed data from a–d and g of Extended Data Fig. 1.
Raw/unprocessed data from d of Extended Data Fig. 4.
Raw/unprocessed data from a and c of Extended Data Fig. 7.
Raw/unprocessed data from panels a–c and e–g of Extended Data Fig. 10.
About this article
Cite this article
González-Foutel, N.S., Glavina, J., Borcherds, W.M. et al. Conformational buffering underlies functional selection in intrinsically disordered protein regions. Nat Struct Mol Biol 29, 781–790 (2022). https://doi.org/10.1038/s41594-022-00811-w