Crystallization is the most serious bottleneck in high-throughput protein-structure determination by diffraction methods. We have used data mining of the large-scale experimental results of the Northeast Structural Genomics Consortium and experimental folding studies to characterize the biophysical properties that control protein crystallization. This analysis leads to the conclusion that crystallization propensity depends primarily on the prevalence of well-ordered surface epitopes capable of mediating interprotein interactions and is not strongly influenced by overall thermodynamic stability. We identify specific sequence features that correlate with crystallization propensity and that can be used to estimate the crystallization probability of a given construct. Analyses of entire predicted proteomes demonstrate substantial differences in the amino acid–sequence properties of human versus eubacterial proteins, which likely reflect differences in biophysical properties, including crystallization propensity. Our thermodynamic measurements do not generally support previous claims regarding correlations between sequence properties and protein stability.
This work was supported by Protein Structure Initiative grants from the National Institutes of Health (NIH) to the Northeast Structural Genomics Consortium and the Center for High-Throughput Structural Biology. The full staffs of these consortia contributed to the experimental data analyzed in this paper. W.N.P. II was supported in part by an NIH training grant to the Department of Biological Sciences at Columbia, and S.K.H. was supported in part by a National Science Foundation grant to J.F.H. The authors thank Wayne Hendrickson and Liang Tong for support and advice and John Schwanoff and the New York Structural Biology Center for maintenance of the X4 beamlines at Brookhaven National Laboratory.
Cite this article
Price II, W., Chen, Y., Handelman, S. et al. Understanding the physical properties that control protein crystallization by analysis of large-scale experimental data. Nat Biotechnol 27, 51–57 (2009). https://doi.org/10.1038/nbt.1514