Article

Nature 437, 512-518 (22 September 2005) | doi:10.1038/nature03991; Received 25 January 2005; Accepted 30 June 2005

Evolutionary information for specifying a protein fold

Michael Socolich1,2,5, Steve W. Lockless1,2,4,5, William P. Russ1,2, Heather Lee1,2, Kevin H. Gardner2,3 & Rama Ranganathan1,2

  1. Howard Hughes Medical Institute,
  2. Department of Pharmacology, and
  3. Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9050, USA
  4. †Present address: The Rockefeller University, 1230 York Avenue, New York, New York 10021, USA
  5. *These authors contributed equally to this work

Correspondence and requests for materials should be addressed to Rama Ranganathan1,2 (Email: rama.ranganathan@utsouthwestern.edu). Atomic coordinates for CC45 have been deposited in the Protein Data Bank under accession code 1YMZ.

Classical studies show that for many proteins, the information required for specifying the tertiary structure is contained in the amino acid sequence. Here, we attempt to define the sequence rules for specifying a protein fold by computationally creating artificial protein sequences using only statistical information encoded in a multiple sequence alignment and no tertiary structure information. Experimental testing of libraries of artificial WW domain sequences shows that a simple statistical energy function capturing coevolution between amino acid residues is necessary and sufficient to specify sequences that fold into native structures. The artificial proteins show thermodynamic stabilities similar to natural WW domains, and structure determination of one artificial protein shows excellent agreement with the WW fold at atomic resolution. The relative simplicity of the information used for creating sequences suggests a marked reduction to the potential complexity of the protein-folding problem.
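To make concrete what "statistical information encoded in a multiple sequence alignment" can look like, the following is a minimal illustrative sketch. It computes per-position amino acid frequencies and a pairwise mutual information score between positions for a toy alignment. Mutual information is used here only as a simple stand-in for a coevolution measure; it is not the statistical energy function used in this work, and the toy sequences and function names (`site_frequencies`, `mutual_information`) are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Toy alignment: hypothetical sequences, NOT real WW domains.
# Columns 0 and 2 vary together; columns 1 and 3 are fully conserved.
alignment = [
    "ACDA",
    "ACDA",
    "GCEA",
    "GCEA",
]


def column(msa, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in msa]


def site_frequencies(msa, i):
    """Site-independent statistics: residue frequencies at position i."""
    counts = Counter(column(msa, i))
    n = len(msa)
    return {aa: c / n for aa, c in counts.items()}


def mutual_information(msa, i, j):
    """Pairwise statistics: mutual information (bits) between positions i and j.

    Assumption: MI is a simple proxy for coevolution, chosen for illustration.
    """
    n = len(msa)
    fi = Counter(column(msa, i))
    fj = Counter(column(msa, j))
    fij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), c in fij.items():
        p_ab = c / n
        mi += p_ab * log2(p_ab / ((fi[a] / n) * (fj[b] / n)))
    return mi


# Score every pair of positions in the alignment.
n_positions = len(alignment[0])
couplings = {
    (i, j): mutual_information(alignment, i, j)
    for i, j in combinations(range(n_positions), 2)
}

for pair, score in sorted(couplings.items()):
    print(pair, round(score, 3))
```

In this toy case, positions 0 and 2 carry a nonzero score because their residues change together, while any pair involving a fully conserved position scores zero. A sequence generator that matches only the per-position frequencies would ignore that pairing; one that also respects pairwise statistics would preserve it.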
