A protein taxonomy based on secondary structure

Przytycka, Teresa; Aurora, Rajeev; Rose, George D.

doi:10.1038/10728

Insight
Published: July 1999

Nature Structural Biology volume 6, pages 672–682 (1999)

Abstract

Does a protein's secondary structure determine its three-dimensional fold? This question is tested directly by analyzing proteins of known structure and constructing a taxonomy based solely on secondary structure. The taxonomy is generated automatically, and it takes the form of a tree in which proteins with similar secondary structure occupy neighboring leaves. Our tree is largely in agreement with results from the structural classification of proteins (SCOP), a multidimensional classification based on homologous sequences, full three-dimensional structure, information about chemistry and evolution, and human judgment. Our findings suggest a simple mechanism of protein evolution.
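The idea of a tree built solely from secondary structure can be illustrated with a minimal sketch. It is not the authors' procedure (the Methods are not reproduced here). It assumes each protein is reduced to a string of per-residue states (H for helix, E for strand, C for coil), that pairs are compared by a Needleman-Wunsch global alignment score with illustrative match, mismatch, and gap values, and that the tree is built by plain average-linkage clustering. The function names, the scoring parameters, and the toy strings are all hypothetical.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of two strings (score only)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def distance(a, b):
    """Illustrative distance: 0 for identical strings, larger when less similar."""
    return 1.0 - nw_score(a, b) / max(len(a), len(b))


def build_tree(structures):
    """Average-linkage agglomerative clustering; returns nested tuples of names."""
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in structures]

    def avg(c1, c2):
        total = sum(
            distance(structures[x], structures[y]) for x in c1[1] for y in c2[1]
        )
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg(clusters[p[0]], clusters[p[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Toy secondary-structure strings (hypothetical, not taken from the data set).
toy = {
    "helical_a": "CHHHHHHCCHHHHHHC",
    "helical_b": "CHHHHHCCCHHHHHHC",
    "sheet_a": "CEEEECCEEEECCEEE",
    "sheet_b": "CEEEECCCEEECCEEE",
}
tree = build_tree(toy)
print(tree)
```

In this toy input the two helical strings differ at one position, as do the two strand-rich strings, so each pair ends up on neighboring leaves before the two groups are joined at the root.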

**Figure 1: Similarity tree for the 183 proteins in our data set, generated automatically as described in the Methods.**

**Figure 2: The tree obtained using VAST¹¹.**

**Figure 3: Primary sequence tree, constructed using the neighbor-joining (NJ) method³⁴.**

References

1. Minor, D.L. Jr. & Kim, P.S. Context-dependent secondary structure formation of a designed protein sequence. Nature 380, 730–734 (1996).
2. Itzhaki, L.S., Otzen, D.E. & Fersht, A.R. The structure of the transition state for folding of chymotrypsin inhibitor 2 analysed by protein engineering methods: evidence for a nucleation–condensation mechanism for protein folding. J. Mol. Biol. 254, 260–288 (1995).
3. Shao, X. & Matthews, C.R. Single-tryptophan mutants of monomeric tryptophan repressor: optical spectroscopy reveals nonnative structure in a model for an early folding intermediate. Biochemistry 37, 7850–7858 (1998).
4. Clark, P.L., Liu, Z.-P., Rizo, J. & Gierasch, L.M. Cavity formation before stable hydrogen bonding in the folding of a beta-clam protein. Nature Struct. Biol. 4, 883–886 (1997).
5. Yee, D.P., Chan, H.S., Havel, T.F. & Dill, K.A. Does compactness induce secondary structure in proteins? A study of poly-alanine chains computed by distance geometry. J. Mol. Biol. 241, 557–573 (1994).
6. Havel, T.F., Crippen, G.M. & Kuntz, I.D. Effects of distance constraints on macromolecular conformation. II. Simulation of experimental results and theoretical predictions. Biopolymers 18, 73–81 (1979).
7. Reymond, M.T., Merutka, G., Dyson, H.J. & Wright, P.E. Folding propensities of peptide fragments of myoglobin. Protein Sci. 6, 706–716 (1997).
8. Dyson, H.J. et al. Folding of peptide fragments comprising the complete sequence of proteins. Models for initiation of protein folding II. Plastocyanin. J. Mol. Biol. 226, 819–835 (1992).
9. Srinivasan, R. & Rose, G.D. LINUS—a simple algorithm to predict the fold of a protein. Proteins Struct. Funct. Genet. 22, 81–99 (1995).
10. Murzin, A.G., Brenner, S.E., Hubbard, T. & Chothia, C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540 (1995).
11. Madej, T., Gibrat, J-F. & Bryant, S.H. Threading a database of protein cores. Proteins Struct. Funct. Genet. 23, 356–369 (1995).
12. Mitchell, E.M., Artymiuk, P.J., Rice, D.W. & Willett, P. Use of techniques derived from graph theory to compare secondary structure motifs in proteins. J. Mol. Biol. 212, 151–166 (1990).
13. Di Francesco, V., Garnier, J. & Munson, P.J. Protein topology recognition from secondary structure sequences: application of the hidden Markov models to the alpha class proteins. J. Mol. Biol. 267, 446–463 (1997).
14. Russell, R.B., Copley, R.R. & Barton, G.J. Protein fold recognition by mapping predicted secondary structures. J. Mol. Biol. 259, 349–365 (1996).
15. Rost, B., Schneider, R. & Sander, C. Protein fold recognition by prediction-based threading. J. Mol. Biol. 270, 471–480 (1997).
16. Rice, D.W. & Eisenberg, D. A 3D–1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J. Mol. Biol. 267, 1026–1038 (1997).
17. Aurora, R. & Rose, G.D. Seeking an ancient enzyme in Methanococcus jannaschii using ORF, a program based on predicted secondary structure comparisons. Proc. Natl. Acad. Sci. USA 95, 2818–2823 (1998).
18. Holm, L. & Sander, C. Mapping the protein universe. Science 273, 595–603 (1996).
19. Needleman, S.B. & Wunsch, C.D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453 (1970).
20. Sander, C. & Schneider, R. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins Struct. Funct. Genet. 9, 56–68 (1991).
21. Doolittle, R.F. The multiplicity of domains in proteins. Annu. Rev. Biochem. 64, 287–314 (1995).
22. Doolittle, R.F. Of Urfs and Orfs 1-1–103 (University Science Books, Sausalito, California; 1986).
23. Altschul, S.F., Boguski, M.S., Gish, W. & Wootton, J.C. Issues in searching molecular sequence databases. Nat. Genet. 6, 119–129 (1994).
24. Smith, H.O., Annau, T.M. & Chandrasegaran, S. Finding sequence motifs in groups of functionally related proteins. Proc. Natl. Acad. Sci. USA 87, 826–830 (1990).
25. Lipman, D.J. & Pearson, W.R. Rapid and sensitive protein similarity searches. Science 227, 1435–1441 (1985).
26. Neuwald, A.F., Liu, J.S., Lipman, D.J. & Lawrence, C.E. Extracting protein alignment models from the sequence database. Nucleic Acids Res. 25, 1665–1677 (1997).
27. Henikoff, S. & Henikoff, J.G. Embedding strategies for effective use of information from multiple sequence alignments. Protein Sci. 6, 698–705 (1997).
28. Luthy, R., Bowie, J.U. & Eisenberg, D. Assessment of protein models with three-dimensional profiles. Nature 356, 83–85 (1992).
29. Gibrat, J-F., Madej, T. & Bryant, S.H. Surprising similarities in structure comparison. Curr. Opin. Struct. Biol. 6, 377–385 (1996).
30. Hobohm, U. & Sander, C. Enlarged representative set of protein structures. Protein Sci. 3, 522–524 (1994).
31. Bernstein, F.C. et al. The Protein Data Bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535–542 (1977).
32. Levitt, M. & Chothia, C. Structural patterns in globular proteins. Nature 261, 552–558 (1976).
33. Thompson, J.D., Higgins, D.G. & Gibson, T.J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994).
34. Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–424 (1987).
35. Richardson, J.S. The anatomy and taxonomy of protein structure. Adv. Prot. Chem. 34, 168–340 (1981).
36. Orengo, C.A., Michie, A.D., Jones, D.T., Swindells, M.B. & Thornton, J.M. CATH—a hierarchic classification of protein domain structures. Structure 5, 1093–1108 (1997).
37. Holm, L. & Sander, C. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138 (1993).
38. King, J. Genetic analysis of protein folding pathways. Biotechnology 4, 297–303 (1986).
39. Lattman, E.E. & Rose, G.D. Protein folding — what's the question? Proc. Natl. Acad. Sci. USA 90, 439–441 (1993).
40. Aurora, R., Creamer, T.P., Srinivasan, R. & Rose, G.D. Local interactions in protein folding: lessons from the α-helix. J. Biol. Chem. 272, 1413–1416 (1997).
41. Baldwin, R.L. & Rose, G.D. Is protein folding hierarchic? I. Local structure and peptide folding. Trends Biochem. Sci. 24, 26–33 (1999).
42. Holm, L. & Sander, C. An evolutionary treasure: unification of a broad set of amidohydrolases related to urease. Proteins Struct. Funct. Genet. 28, 72–82 (1997).
43. Waterman, M.S. Introduction to computational biology: maps, sequences, and genomes (Chapman & Hall, London; 1995).
44. Cohen, J. & Farach, M. In Proc. of eighth ann. ACM–SIAM symp. on discrete algorithms. (Association for Computing Machinery, New York; 410–416; 1997).

Acknowledgements

We thank R. Srinivasan, V. Murthy and P. Thiessen for helpful suggestions, and J. Cohen for providing access to his tree-construction program, Tande. We are particularly indebted to an anonymous referee for assistance in bringing this paper to fruition. Supported by the Sloan Foundation (T.P.) and the NIH (G.D.R.).

Author information

Rajeev Aurora
Present address: Monsanto, Mail Zone AA3G, 700 Chesterfield Parkway, St. Louis, Missouri 63198, USA

Authors and Affiliations

Department of Biophysics and Biophysical Chemistry, Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, Maryland 21205, USA
Teresa Przytycka, Rajeev Aurora & George D. Rose

Corresponding author

Correspondence to George D. Rose.

About this article

Cite this article

Przytycka, T., Aurora, R. & Rose, G. A protein taxonomy based on secondary structure. Nat Struct Mol Biol 6, 672–682 (1999). https://doi.org/10.1038/10728

Received: 24 July 1998
Accepted: 29 March 1999
Issue Date: July 1999
DOI: https://doi.org/10.1038/10728

This article is cited by

Combination of site directed mutagenesis and secondary structure analysis predicts the amino acids essential for stability of M. leprae MurE
- Anusuya Shanmugam
- Jeyakumar Natarajan
Interdisciplinary Sciences: Computational Life Sciences (2014)
Outer membrane proteins can be simply identified using secondary structure element alignment
- Ren-Xiang Yan
- Zhen Chen
- Ziding Zhang
BMC Bioinformatics (2011)
Improving protein secondary structure prediction based on short subsequences with local structure similarity
- Hsin-Nan Lin
- Ting-Yi Sung
- Wen-Lian Hsu
BMC Genomics (2010)
DescFold: A web server for protein fold recognition
- Ren-Xiang Yan
- Jing-Na Si
- Ziding Zhang
BMC Bioinformatics (2009)
TIM-Finder: A new method for identifying TIM-barrel proteins
- Jing-Na Si
- Ren-Xiang Yan
- Xiao-Dong Su
BMC Structural Biology (2009)
