Article

High-resolution structure prediction and the crystallographic phase problem

Nature volume 450, pages 259–264 (08 November 2007)

Abstract

The energy-based refinement of low-resolution protein structure models to atomic-level accuracy is a major challenge for computational structural biology. Here we describe a new approach to refining protein structure models that focuses sampling in regions most likely to contain errors while allowing the whole structure to relax in a physically realistic all-atom force field. In applications to models produced using nuclear magnetic resonance data and to comparative models based on distant structural homologues, the method can significantly improve the accuracy of the structures in terms of both the backbone conformations and the placement of core side chains. Furthermore, the resulting models satisfy a particularly stringent test: they provide significantly better solutions to the X-ray crystallographic phase problem in molecular replacement trials. Finally, we show that all-atom refinement can produce de novo protein structure predictions that reach the high accuracy required for molecular replacement without any experimental phase information and in the absence of templates suitable for molecular replacement from the Protein Data Bank. These results suggest that the combination of high-resolution structure prediction with state-of-the-art phasing tools may be unexpectedly powerful in phasing crystallographic data for which molecular replacement is hindered by the absence of sufficiently accurate previous models.

Acknowledgements

We thank Rosetta@home participants for contributing computing power that made testing of many new ideas possible; the DOE INCITE program for access to Blue Gene/L at Argonne National Laboratory and the IBM Blue Gene Watson supercomputers; and the NCSA, SDSC and Argonne National Laboratory supercomputer centres for computer time and help with porting Rosetta to Blue Gene. We thank D. Kim and K. Laidig for developing the computational infrastructure underlying Rosetta@home; J. Abendroth for help with RESOLVE and ARP/wARP software; M. Kennedy of NESG for the NMR structure coordinates of protein 1xpw and for help with the molecular replacement calculations; and J. Abendroth, J. Bosch, J. Havranek and C. Wang for comments on the manuscript. We also thank the CASP organizers and contributing structural biologists for providing an invaluable test set for new structure refinement methods. This work was funded by the National Institute of General Medical Sciences, National Institutes of Health (to D.B.), the Wellcome Trust, UK (to R.J.R.), the Howard Hughes Medical Institute (D.B.), a Leukemia and Lymphoma Society Career Development fellowship (to B.Q.), and a Jane Coffin Childs fellowship (to R.D.). Rosetta software and source code are available to academic users free of charge at http://www.rosettacommons.org/software/.

Author Contributions

B.Q., S.R. and R.D. contributed equally to this work. The NMR-based, comparative-model-based and de novo structure predictions were carried out by S.R., B.Q. and R.D., respectively, with advice and software from D.B. and P.B. Phasing trials were performed by R.J.R., B.Q., S.R. and R.D., with advice from R.J.R. and A.J.M. All authors discussed results and commented on the manuscript.

Author information

Author notes

    Bin Qian, Srivatsan Raman & Rhiju Das

    These authors contributed equally to this work.

Affiliations

  1. University of Washington, Department of Biochemistry and Howard Hughes Medical Institute, Box 357350, Seattle 98195, USA

    Bin Qian, Srivatsan Raman, Rhiju Das, Philip Bradley & David Baker

  2. Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 0XY, UK

    Airlie J. McCoy & Randy J. Read

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to David Baker.

Supplementary information

PDF files

  1. Supplementary Information

    The file contains Supplementary Tables S1-S2 and Supplementary Figures S1-S5 with Legends and additional acknowledgements.

About this article

DOI

https://doi.org/10.1038/nature06249
