Analysis

Understanding the physical properties that control protein crystallization by analysis of large-scale experimental data

Nature Biotechnology volume 27, pages 51–57 (2009)

Abstract

Crystallization is the most serious bottleneck in high-throughput protein-structure determination by diffraction methods. We have used data mining of the large-scale experimental results of the Northeast Structural Genomics Consortium and experimental folding studies to characterize the biophysical properties that control protein crystallization. This analysis leads to the conclusion that crystallization propensity depends primarily on the prevalence of well-ordered surface epitopes capable of mediating interprotein interactions and is not strongly influenced by overall thermodynamic stability. We identify specific sequence features that correlate with crystallization propensity and that can be used to estimate the crystallization probability of a given construct. Analyses of entire predicted proteomes demonstrate substantial differences in the amino acid–sequence properties of human versus eubacterial proteins, which likely reflect differences in biophysical properties, including crystallization propensity. Our thermodynamic measurements do not generally support previous claims regarding correlations between sequence properties and protein stability.

Acknowledgements

This work was supported by Protein Structure Initiative grants from the National Institutes of Health (NIH) to the Northeast Structural Genomics Consortium and the Center for High-Throughput Structural Biology. The full staffs of these consortia contributed to the experimental data analyzed in this paper. W.N.P. II was supported in part by an NIH training grant to the Department of Biological Sciences at Columbia, and S.K.H. was supported in part by a National Science Foundation grant to J.F.H. The authors thank Wayne Hendrickson and Liang Tong for support and advice and John Schwanoff and the New York Structural Biology Center for maintenance of the X4 beamlines at Brookhaven National Laboratory.

Author information

Affiliations

  1. Northeast Structural Genomics Consortium, 702A Fairchild Center, MC2434, Columbia University, New York, New York 10027, USA.

    W Nicholson Price II, Yang Chen, Samuel K Handelman, Helen Neely, Philip Manor, Richard Karlin, Rajesh Nair, Jinfeng Liu, Michael Baran, John Everett, Saichiu N Tong, Farhad Forouhar, Swarup S Swaminathan, Thomas Acton, Rong Xiao, Joseph R Luft, Angela Lauricella, George T DeTitta, Burkhard Rost, Gaetano T Montelione & John F Hunt
  2. Department of Biological Sciences, 702A Fairchild Center, MC2434, Columbia University, New York, New York 10027, USA.

    W Nicholson Price II, Yang Chen, Samuel K Handelman, Helen Neely, Philip Manor, Richard Karlin, Farhad Forouhar, Swarup S Swaminathan & John F Hunt
  3. Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA.

    Rajesh Nair, Jinfeng Liu & Burkhard Rost
  4. Department of Molecular Biology and Biochemistry, Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey 08854, USA.

    Michael Baran, John Everett, Saichiu N Tong, Thomas Acton, Rong Xiao & Gaetano T Montelione
  5. Hauptman-Woodward Institute, 700 Ellicott Street, Buffalo, New York 14203, USA.

    Joseph R Luft, Angela Lauricella & George T DeTitta
  6. Department of Biochemistry, Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, Piscataway, New Jersey 08854, USA.

    Gaetano T Montelione

Corresponding author

Correspondence to John F Hunt.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Figures 1–20, Tables 1–4, Methods, Notes

About this article

DOI

https://doi.org/10.1038/nbt.1514
