Abstract
Current structural genomics programs aim systematically to determine the structures of all proteins coded in both human and other genomes, providing a complete picture of the number and variety of protein structures that exist. In the past, estimates have been made on the basis of the incomplete sample of structures currently known. These estimates have varied greatly (between 1,000 and 10,000; see for example refs 1 and 2), partly because of limited sample size but also owing to the difficulties of distinguishing one structure from another. This distinction is usually topological, based on the fold of the protein; however, in strict topological terms (neglecting to consider intra-chain cross-links), protein chains are open strings and hence are all identical. To avoid this trivial result, topologies are determined by considering secondary links in the form of intra-chain hydrogen bonds (secondary structure) and tertiary links formed by the packing of secondary structures. However, small additions to or loss of structure can make large changes to these perceived topologies and such subjective solutions are neither robust nor amenable to automation. Here I formalize both secondary and tertiary links to allow the rigorous and automatic definition of protein topology.
References
Chothia, C. One thousand families for the molecular biologist. Nature 357, 543–544 (1992).
Orengo, C. A., Jones, D. T. & Thornton, J. Protein superfamilies and domain superfolds. Nature 372, 631–634 (1994).
Eidhammer, I., Jonassen, I. & Taylor, W. R. Structure comparison and structure patterns. J. Comput. Biol. 7, 685–716 (2000).
Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540 (1995).
Orengo, C. A. et al. CATH—a hierarchic classification of protein domain structures. Structure 5, 1093–1108 (1997).
Mizuguchi, K., Deane, C. M., Blundell, T. L. & Overington, J. P. HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci. 7, 2469–2471 (1998).
Holm, L. & Sander, C. Dali/FSSP classification of three-dimensional protein folds. Nucleic Acids Res. 25, 231–234 (1997).
Hadley, C. & Jones, D. T. A systematic comparison of protein structure classifications SCOP, CATH and FSSP. Structure 7, 1099–1112 (1999).
Chothia, C. & Finkelstein, A. V. The classification and origins of protein folding patterns. Ann. Rev. Biochem. 59, 1007–1039 (1990).
Finkelstein, A. V. & Ptitsyn, O. B. Why do globular proteins fit the limited set of folding patterns? Prog. Biophys. Mol. Biol. 50, 171–190 (1987).
Cohen, F. E., Sternberg, M. J. E. & Taylor, W. R. Analysis and prediction of protein β-sheet structures by a combinatorial approach. Nature 285, 378–382 (1980).
Cohen, F. E., Sternberg, M. J. E. & Taylor, W. R. Analysis and prediction of the packing of α-helices against a β-sheet in the tertiary structure of globular proteins. J. Mol. Biol. 156, 821–862 (1982).
Murzin, A. G., Lesk, A. M. & Chothia, C. Principles determining the structure of β-sheet barrels in proteins: I, A theoretical analysis. J. Mol. Biol. 236, 1369–1381 (1994).
Taylor, W. R. Defining linear segments in protein structure. J. Mol. Biol. 310, 1135–1150 (2001).
Taylor, W. R. Searching for the ideal forms of proteins. Biochem. Soc. Trans. 28, 264–269 (2000).
Murzin, A. G. & Finkelstein, A. V. General architecture of the α-helical globule. J. Mol. Biol. 204, 749–769 (1988).
Taylor, W. R., Jones, D. T. & Green, N. M. A method for α-helical integral membrane protein fold prediction. Proteins Struct. Funct. Genet. 18, 281–294 (1994).
Taylor, W. R. Protein structure domain identification. Protein Eng. 12, 203–216 (1999).
Jones, D. T., Taylor, W. R. & Thornton, J. M. A new approach to protein fold recognition. Nature 358, 86–89 (1992).
Chothia, C. & Murzin, A. G. New folds for all-β proteins. Structure 1, 217–222 (1993).
Taylor, W. R. Protein structure comparison using bipartite graph matching and its application to protein structure classification. Mol. Cell. Proteom. advance online publication 4 March 2002 (DOI 10.1074/mcp.T200001-MCP200).
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
About this article
Cite this article
Taylor, W. A ‘periodic table’ for protein structures. Nature 416, 657–660 (2002). https://doi.org/10.1038/416657a
DOI: https://doi.org/10.1038/416657a