The energy-based refinement of low-resolution protein structure models to atomic-level accuracy is a major challenge for computational structural biology. Here we describe a new approach to refining protein structure models that focuses sampling in regions most likely to contain errors while allowing the whole structure to relax in a physically realistic all-atom force field. In applications to models produced using nuclear magnetic resonance data and to comparative models based on distant structural homologues, the method can significantly improve the accuracy of the structures in terms of both the backbone conformations and the placement of core side chains. Furthermore, the resulting models satisfy a particularly stringent test: they provide significantly better solutions to the X-ray crystallographic phase problem in molecular replacement trials. Finally, we show that all-atom refinement can produce de novo protein structure predictions that reach the high accuracy required for molecular replacement without any experimental phase information and in the absence of templates suitable for molecular replacement from the Protein Data Bank. These results suggest that the combination of high-resolution structure prediction with state-of-the-art phasing tools may be unexpectedly powerful in phasing crystallographic data for which molecular replacement is hindered by the absence of sufficiently accurate previous models.
We thank Rosetta@home participants for contributing computing power that made testing of many new ideas possible; the DOE INCITE program for access to Blue Gene/L at Argonne National Laboratory and the IBM Blue Gene Watson supercomputers; and the NCSA, SDSC and Argonne National Laboratory supercomputer centres for computer time and help with porting Rosetta to Blue Gene. We thank D. Kim and K. Laidig for developing the computational infrastructure underlying Rosetta@home; J. Abendroth for help with RESOLVE and ARP/wARP software; M. Kennedy of NESG for the NMR structure coordinates of protein 1xpw and for help with the molecular replacement calculations; and J. Abendroth, J. Bosch, J. Havranek and C. Wang for comments on the manuscript. We also thank the CASP organizers and contributing structural biologists for providing an invaluable test set for new structure refinement methods. This work was funded by the National Institute of General Medical Sciences, National Institutes of Health (to D.B.), the Wellcome Trust, UK (to R.J.R.), the Howard Hughes Medical Institute (to D.B.), a Leukemia and Lymphoma Society Career Development fellowship (to B.Q.), and a Jane Coffin Childs fellowship (to R.D.). Rosetta software and source code are available to academic users free of charge at http://www.rosettacommons.org/software/.
Author Contributions
B.Q., S.R. and R.D. contributed equally to this work. Structure predictions for NMR-based, comparative-model-based and de novo predictions were carried out by S.R., B.Q. and R.D. respectively, with advice and software from D.B. and P.B. Phasing trials were performed by R.J.R., B.Q., S.R. and R.D., with advice from R.J.R. and A.J.M. All authors discussed results and commented on the manuscript.
The authors declare no competing financial interests.
About this article
Cite this article
Qian, B., Raman, S., Das, R. et al. High-resolution structure prediction and the crystallographic phase problem. Nature 450, 259–264 (2007). https://doi.org/10.1038/nature06249