Abstract
Previous studies have suggested that nature is restricted to about 1,000 protein folds to perform a great diversity of functions. Here, we use protein interaction data from different sources and three-dimensional structures to suggest that the total number of interaction types is also limited, and estimate that most interactions in nature will conform to one of about 10,000 types. We currently know fewer than 2,000, and at the present rate of structure determination, it will be more than 20 years before we know a full representative set.
Acknowledgements
We thank Andrej Sali (UCSF) for fruitful discussions and Peer Bork (EMBL) for useful comments on the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Cite this article
Aloy, P., Russell, R. Ten thousand interactions for the molecular biologist. Nat Biotechnol 22, 1317–1321 (2004). https://doi.org/10.1038/nbt1018