High-resolution structure prediction and the crystallographic phase problem


The energy-based refinement of low-resolution protein structure models to atomic-level accuracy is a major challenge for computational structural biology. Here we describe a new approach to refining protein structure models that focuses sampling in regions most likely to contain errors while allowing the whole structure to relax in a physically realistic all-atom force field. In applications to models produced using nuclear magnetic resonance data and to comparative models based on distant structural homologues, the method can significantly improve the accuracy of the structures in terms of both the backbone conformations and the placement of core side chains. Furthermore, the resulting models satisfy a particularly stringent test: they provide significantly better solutions to the X-ray crystallographic phase problem in molecular replacement trials. Finally, we show that all-atom refinement can produce de novo protein structure predictions that reach the high accuracy required for molecular replacement without any experimental phase information and in the absence of templates suitable for molecular replacement from the Protein Data Bank. These results suggest that the combination of high-resolution structure prediction with state-of-the-art phasing tools may be unexpectedly powerful in phasing crystallographic data for which molecular replacement is hindered by the absence of sufficiently accurate previous models.
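The idea of concentrating sampling on suspect regions while relaxing the whole model can be illustrated with a deliberately tiny toy. The sketch below is not the authors' method or any part of Rosetta: the one-dimensional "model", the double-well `toy_energy`, the use of ensemble variance as an error indicator, and every function name are assumptions made purely for illustration.

```python
import random

def toy_energy(model):
    # Hypothetical rugged energy: each coordinate has a deep well near -1
    # and a shallower, "wrong" well near +1.
    return sum((x * x - 1.0) ** 2 + 0.3 * x for x in model)

def relax(model, steps=200, rate=0.05):
    # Plain gradient descent over every coordinate (whole-model relaxation).
    relaxed = list(model)
    for _ in range(steps):
        relaxed = [x - rate * (4.0 * x * (x * x - 1.0) + 0.3) for x in relaxed]
    return relaxed

def per_position_spread(ensemble):
    # Variance of each coordinate across an ensemble of models.
    count = len(ensemble)
    spreads = []
    for values in zip(*ensemble):
        mean = sum(values) / count
        spreads.append(sum((v - mean) ** 2 for v in values) / count)
    return spreads

def pick_regions(spreads, fraction=0.25):
    # Assumption: the most variable positions are the likeliest errors.
    k = max(1, int(len(spreads) * fraction))
    ranked = sorted(range(len(spreads)), key=lambda i: spreads[i], reverse=True)
    return sorted(ranked[:k])

def refine(model, regions, trials=100, seed=0):
    # Resample only the flagged positions, relax everything, keep improvements.
    rng = random.Random(seed)
    best = relax(model)
    for _ in range(trials):
        candidate = list(best)
        for i in regions:
            candidate[i] = rng.uniform(-2.0, 2.0)
        candidate = relax(candidate)
        if toy_energy(candidate) < toy_energy(best):
            best = candidate
    return best

if __name__ == "__main__":
    starts = ([-1, 1, -1, -1], [-1, -1, -1, -1], [-1, 1, -1, -1])
    ensemble = [relax(m) for m in starts]
    regions = pick_regions(per_position_spread(ensemble))
    refined = refine(ensemble[0], regions)
    print(regions, toy_energy(ensemble[0]), toy_energy(refined))
```

Relaxation alone leaves a coordinate trapped in whichever well it started in; only the targeted resampling step can move it across the barrier. Because a candidate is kept only when its energy is lower, the refined model in this toy is never worse than the relaxed starting model.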

Figure 1: Overview of the rebuilding-and-refinement method.
Figure 2: Improvement in model accuracy produced by rebuilding and refinement.
Figure 3: Improvement in electron density using models from rebuilding and refinement in molecular replacement searches.
Figure 4: Ab initio phasing by ab initio modelling.



We thank Rosetta@home participants for contributing computing power that made testing of many new ideas possible; the DOE INCITE program for access to Blue Gene/L at Argonne National Laboratory and the IBM Blue Gene Watson supercomputers; and the NCSA, SDSC and Argonne National Laboratory supercomputer centres for computer time and help with porting Rosetta to Blue Gene. We thank D. Kim and K. Laidig for developing the computational infrastructure underlying Rosetta@home; J. Abendroth for help with RESOLVE and ARP/wARP software; M. Kennedy of NESG for the NMR structure coordinates of protein 1xpw and for help with the molecular replacement calculations; and J. Abendroth, J. Bosch, J. Havranek and C. Wang for comments on the manuscript. We also thank the CASP organizers and contributing structural biologists for providing an invaluable test set for new structure refinement methods. This work was funded by the National Institute of General Medical Sciences, National Institutes of Health (to D.B.), the Wellcome Trust, UK (to R.J.R.), the Howard Hughes Medical Institute (D.B.), a Leukemia and Lymphoma Society Career Development fellowship (to B.Q.), and a Jane Coffin Childs fellowship (to R.D.). Rosetta software and source code are available to academic users free of charge at

Author Contributions B.Q., S.R. and R.D. contributed equally to this work. Structure predictions for NMR-based, comparative-model-based and de novo predictions were carried out by S.R., B.Q. and R.D. respectively, with advice and software from D.B. and P.B. Phasing trials were performed by R.J.R., B.Q., S.R. and R.D., with advice from R.J.R. and A.J.M. All authors discussed results and commented on the manuscript.

Author information



Corresponding author

Correspondence to David Baker.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

The file contains Supplementary Tables S1-S2 and Supplementary Figures S1-S5 with Legends and additional acknowledgements. (PDF 985 kb)

About this article

Cite this article

Qian, B., Raman, S., Das, R. et al. High-resolution structure prediction and the crystallographic phase problem. Nature 450, 259–264 (2007).
