  • Brief Communication

Protein structure determination by combining sparse NMR data with evolutionary couplings

Abstract

Accurate determination of protein structure by NMR spectroscopy is challenging for larger proteins, for which experimental data are often incomplete and ambiguous. Evolutionary sequence information together with advances in maximum entropy statistical methods provide a rich complementary source of structural constraints. We have developed a hybrid approach (evolutionary coupling–NMR spectroscopy; EC-NMR) combining sparse NMR data with evolutionary residue-residue couplings and demonstrate accurate structure determination for several proteins 6–41 kDa in size.

Figure 1: The EC-NMR approach.
Figure 2: Performance of the EC-NMR method.

Accession codes

Accessions

Protein Data Bank

References

  1. Mao, B., Guan, R. & Montelione, G.T. Structure 19, 757–766 (2011).
  2. Mao, B., Tejero, R., Baker, D. & Montelione, G.T. J. Am. Chem. Soc. 136, 1893–1906 (2014).
  3. Gardner, K.H., Rosen, M.K. & Kay, L.E. Biochemistry 36, 1389–1401 (1997).
  4. Mueller, G.A. et al. J. Mol. Biol. 300, 197–212 (2000).
  5. Rosen, M.K. et al. J. Mol. Biol. 263, 627–636 (1996).
  6. Marks, D.S. et al. PLoS ONE 6, e28766 (2011).
  7. Morcos, F. et al. Proc. Natl. Acad. Sci. USA 108, E1293–E1301 (2011).
  8. Hopf, T.A. et al. Cell 149, 1607–1621 (2012).
  9. Marks, D.S., Hopf, T.A. & Sander, C. Nat. Biotechnol. 30, 1072–1080 (2012).
  10. Hopf, T.A. et al. eLife 3, e03430 (2014).
  11. Ovchinnikov, S., Kamisetty, H. & Baker, D. eLife 3, e02030 (2014).
  12. Sulkowska, J.I., Morcos, F., Weigt, M., Hwa, T. & Onuchic, J.N. Proc. Natl. Acad. Sci. USA 109, 10340–10345 (2012).
  13. Nugent, T. & Jones, D.T. Proc. Natl. Acad. Sci. USA 109, E1540–E1547 (2012).
  14. Evenas, J. et al. J. Mol. Biol. 309, 961–974 (2001).
  15. Araki, M. et al. J. Biol. Chem. 286, 39644–39653 (2011).
  16. Kainosho, M. et al. Nature 440, 52–57 (2006).
  17. Bhattacharya, A., Tejero, R. & Montelione, G.T. Proteins 66, 778–795 (2007).
  18. Huang, Y.J., Powers, R. & Montelione, G.T. J. Am. Chem. Soc. 127, 1665–1674 (2005).
  19. Huang, Y.J., Rosato, A., Singh, G. & Montelione, G.T. Nucleic Acids Res. 40, W542–W546 (2012).
  20. Tugarinov, V., Kanelis, V. & Kay, L.E. Nat. Protoc. 1, 749–754 (2006).
  21. Hiller, S. et al. Science 321, 1206–1210 (2008).
  22. Raman, S. et al. Science 327, 1014–1018 (2010).
  23. Lange, O.F. et al. Proc. Natl. Acad. Sci. USA 109, 10873–10878 (2012).
  24. Tugarinov, V., Choy, W.Y., Orekhov, V.Y. & Kay, L.E. Proc. Natl. Acad. Sci. USA 102, 622–627 (2005).
  25. Grishaev, A., Tugarinov, V., Kay, L.E., Trewhella, J. & Bax, A. J. Biomol. NMR 40, 95–106 (2008).
  26. Huang, Y.J., Tejero, R., Powers, R. & Montelione, G.T. Proteins 62, 587–603 (2006).
  27. Herrmann, T., Güntert, P. & Wüthrich, K. J. Mol. Biol. 319, 209–227 (2002).
  28. Rohl, C.A., Strauss, C.E., Misura, K.M. & Baker, D. Methods Enzymol. 383, 66–93 (2004).
  29. Eddy, S.R. PLoS Comput. Biol. 7, e1002195 (2011).
  30. Wishart, D.S. & Sykes, B.D. J. Biomol. NMR 4, 171–180 (1994).
  31. Wüthrich, K. NMR of Proteins and Nucleic Acids (Wiley, 1986).
  32. Maltsev, A.S., Ying, J. & Bax, A. J. Biomol. NMR 54, 181–191 (2012).
  33. Shen, Y., Delaglio, F., Cornilescu, G. & Bax, A. J. Biomol. NMR 44, 213–223 (2009).
  34. Jones, D.T., Buchan, D.W., Cozzetto, D. & Pontil, M. Bioinformatics 28, 184–190 (2012).
  35. Kamisetty, H., Ovchinnikov, S. & Baker, D. Proc. Natl. Acad. Sci. USA 110, 15674–15679 (2013).
  36. Ekeberg, M., Lovkvist, C., Lan, Y., Weigt, M. & Aurell, E. Phys. Rev. E 87, 012707 (2013).
  37. de Juan, D., Pazos, F. & Valencia, A. Nat. Rev. Genet. 14, 249–261 (2013).
  38. Ulrich, E.L. et al. Nucleic Acids Res. 36, D402–D408 (2008).
  39. Bartels, C., Xia, T.H., Billeter, M., Güntert, P. & Wüthrich, K. J. Biomol. NMR 6, 1–10 (1995).
  40. Diercks, T., Coles, M. & Kessler, H. J. Biomol. NMR 15, 177–180 (1999).
  41. Shen, Y. & Bax, A. J. Biomol. NMR 56, 227–241 (2013).
  42. Zweckstetter, M. & Bax, A. J. Am. Chem. Soc. 122, 3791–3792 (2000).
  43. Valafar, H. & Prestegard, J.H. J. Magn. Reson. 167, 228–241 (2004).
  44. Lovell, S.C. et al. Proteins 50, 437–450 (2003).
  45. Laskowski, R.A., Moss, D.S. & Thornton, J.M. J. Mol. Biol. 231, 1049–1067 (1993).
  46. Sippl, M.J. Proteins 17, 355–362 (1993).
  47. Luthy, R., Bowie, J.U. & Eisenberg, D. Nature 356, 83–85 (1992).
  48. Shen, Y. et al. Proc. Natl. Acad. Sci. USA 105, 4685–4690 (2008).
  49. Acton, T.B. et al. Methods Enzymol. 493, 21–60 (2011).
  50. Baran, M.C., Huang, Y.J., Moseley, H.N. & Montelione, G.T. Chem. Rev. 104, 3541–3556 (2004).
  51. Huang, Y.J. et al. Methods Enzymol. 394, 111–141 (2005).
  52. Araki, M. et al. J. Biol. Chem. 286, 39644–39653 (2011).
  53. Koradi, R., Billeter, M. & Wüthrich, K. J. Mol. Graphics 14, 51–55 (1996).
  54. Montelione, G.T. et al. Structure 21, 1563–1570 (2013).
  55. Tejero, R., Snyder, D., Mao, B., Aramini, J.M. & Montelione, G.T. J. Biomol. NMR 56, 337–351 (2013).
  56. Prestegard, J.H., Bougault, C.M. & Kishore, A.I. Chem. Rev. 104, 3519–3540 (2004).
  57. Bax, A. Protein Sci. 12, 1–16 (2003).
  58. Al-Hashimi, H.M. et al. J. Magn. Reson. 143, 402–406 (2000).
  59. Word, J.M., Lovell, S.C., Richardson, J.S. & Richardson, D.C. J. Mol. Biol. 285, 1735–1747 (1999).

Acknowledgements

We thank all of the members of the Northeast Structural Genomics Consortium who generated and archived NMR spectroscopy data used in this work, particularly scientists in the laboratories of C. Arrowsmith, M. Kennedy, G.T.M., T. Szyperski and J. Prestegard. We thank J. Aramini, G. Liu, G.V.T. Swapna, H. Valafar, M. Nilges and F. Xu for helpful discussions. This work was supported by US National Institutes of Health grant 1R01-GM106303 to C.S. and D.S.M. and Protein Structure Initiative grant U54-GM094597 to G.T.M.

Author information

Contributions

Y.T., Y.J.H., T.A.H., C.S., D.S.M. and G.T.M. designed the research. Y.J.H. wrote ASDP program code. Y.T., Y.J.H., T.A.H. and D.S.M. performed calculations. Y.T., Y.J.H., T.A.H., C.S., D.S.M. and G.T.M. analyzed data. Y.T., Y.J.H., T.A.H., C.S., D.S.M. and G.T.M. wrote the manuscript.

Corresponding authors

Correspondence to Chris Sander, Debora S Marks or Gaetano T Montelione.

Ethics declarations

Competing interests

G.T.M. is associated with Nexomics Biosciences, Inc., a scientific contract research organization.

Integrated supplementary information

Supplementary Figure 1 3D structure determination by the hybrid EC-NMR method.

The hybrid EC-NMR strategy combines Evolutionary Coupling (EC) information with sparse experimental nuclear magnetic resonance (NMR) data. Sparse NMR data typically include backbone (HN, 15N, 13Cα, 13Cβ, 13C’), sidechain amide 15N-1H, and in some cases some sidechain methyl resonance assignments, together with NOESY and 15N-1H residual dipolar coupling (RDC) data. The EC and NMR data are iteratively analyzed and refined to identify false-positive EC contacts, and to resolve ambiguities in the assignment of NOESY cross peaks, providing an expanded list of inferred residue pair contacts of higher accuracy than those available from either method alone. This information is then used to guide restrained energy refinement to provide accurate protein structures.

Supplementary Figure 2 Flow chart of iterative EC-NMR analysis process.

The EC-NMR analysis process has been fully automated by integration into the automated NOESY NMR data assignment program ASDP (refs. 16,17). NOESY cross peak and NMR resonance assignments for 1H-15N and (where available) 1H-13C methyl resonances are combined with high-scoring EC contacts predicted from sequence co-variance analysis (refs. 9,10). Specific restraints are identified by ASDP from these data, and used to generate structural models with the program CYANA (ref. 18). Initial structures are then used to assess false-positive EC contacts and false NOESY peaks (i.e., noise peaks), and to identify consistent Residue Pair Contacts (RPCs). Once the analysis converges, specific interatomic distance restraints are generated from the final RPC list and NOESY cross peak assignments, and used for restrained minimization of the EC-NMR structures with the program Rosetta (ref. 19).
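The pruning part of this loop can be illustrated with a minimal Python sketch. It is not the ASDP implementation: `build_model` is a hypothetical stand-in for a structure calculation (the role CYANA plays in the flow chart), the 8 Å cutoff is an assumed value, and the sketch only removes candidate contacts that disagree with the current model, whereas the actual process also resolves ambiguous NOESY assignments.

```python
from math import dist


def filter_contacts(contacts, coords, cutoff):
    """Keep contacts (i, j) whose residues lie within `cutoff` in the model."""
    return {(i, j) for (i, j) in contacts if dist(coords[i], coords[j]) <= cutoff}


def iterate_rpc(ec_contacts, noe_contacts, build_model, cutoff=8.0, max_cycles=10):
    """Iteratively prune a candidate residue-pair contact (RPC) list.

    build_model: hypothetical callable mapping a contact set to a dict
                 {residue_index: (x, y, z)}; it stands in for a real
                 structure calculation.
    cutoff:      assumed distance threshold in angstroms.
    """
    rpc = set(ec_contacts) | set(noe_contacts)
    for _ in range(max_cycles):
        coords = build_model(rpc)                 # model from current contacts
        kept = filter_contacts(rpc, coords, cutoff)
        if kept == rpc:                           # converged: nothing was removed
            break
        rpc = kept
    return rpc


if __name__ == "__main__":
    # Toy model with fixed coordinates, for illustration only.
    toy_coords = {1: (0.0, 0.0, 0.0), 10: (5.0, 0.0, 0.0), 20: (30.0, 0.0, 0.0)}
    print(iterate_rpc({(1, 10), (1, 20)}, set(), lambda rpc: toy_coords))
```

In the toy run, the pair (1, 20) is dropped because the two residues are 30 Å apart in the model, and the loop stops on the next cycle when no further contacts are removed.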

Supplementary Figure 3 Contact maps illustrating the EC-NMR process.

Residue-residue contact maps illustrating the EC-NMR hybrid method for structure determination. (A) A9CJD6_AGRTT5 (no experimental RDC data). (B) A9CJD6_AGRTT5 (2 simulated RDC alignment tensors). (C) Q6D6V0_ERWCT (2 experimental RDC alignment tensors). (D) Q9ZV63_ARATH (2 experimental RDC alignment tensors). (E) Q1LD49_RALME (1 experimental RDC alignment tensor). (F) Q1LD49_RALME (1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (G) RASH_HUMAN (no experimental RDC data). (H) RASH_HUMAN (2 simulated RDC alignment tensors). (I) YIAD_ECOLI (2 experimental RDC alignment tensors). (J) P74712_SYNY3 (2 experimental RDC alignment tensors). (K) MALE_ECOLI (1 experimental RDC alignment tensor). (L) MALE_ECOLI (1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). For each protein, the superimposed ribbon diagrams are shown above the first triangles. Green ribbon – final EC-NMR structure. Grey ribbon – reference X-ray crystal structures (G, H, J, K, L) or NMR structures (A, B, C, D, E, F, I). In the contact map: Grey contacts – contacts in the reference X-ray crystal structure (G, H, J, K, L) or NMR structure (A, B, C, D, E, F, I); Red contacts – initial EC residue-pair contacts; Blue contacts – contacts indicated by unambiguous NOESY peak assignments obtained by ASDP using only the sparse NMR data; Green contacts – final residue-pair contacts resulting from simultaneous analysis of EC and sparse NMR data; Orange contacts in the upper triangle of the middle panel – all ambiguous NOESY peak potential assignments obtained by ASDP in the first cycle prior to incorporating EC data; Orange contacts in the upper triangle of the right panel – final NOESY peak assignments used in conventional NMR structure determinations done with essentially complete chemical shift data including sidechain resonance assignments. These orange contacts in the right panels illustrate the contact information obtained for a fully protonated NMR sample with extensive sidechain assignments, which is available for some of the smaller proteins used to test the EC-NMR method. For P74712_SYNY3 (J) and MALE_ECOLI (K, L), all available NMR structures in the PDB were solved using sparse NMR data with limited sidechain resonance assignments; thus orange contacts in the upper triangle region of the right panel are not available.

Supplementary Figure 4 Box plots of backbone r.m.s. deviations relative to reference structures.

Box plots showing backbone RMSD statistics for protein structures determined using EC constraints alone (EC, red), sparse-NMR NOESY and chemical shift data alone (sparse NMR, blue), and the combined EC-NMR protocol (EC-NMR, green). (A) A9CJD6_AGRTT5 (no experimental RDC data). (B) A9CJD6_AGRTT5 (2 simulated RDC alignment tensors). (C) Q6D6V0_ERWCT (2 experimental RDC alignment tensors). (D) Q9ZV63_ARATH (2 experimental RDC alignment tensors). (E) Q1LD49_RALME (1 experimental RDC alignment tensor). (F) Q1LD49_RALME (1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (G) RASH_HUMAN (no experimental RDC data). (H) RASH_HUMAN (2 simulated RDC alignment tensors). (I) YIAD_ECOLI (2 experimental RDC alignment tensors). (J) P74712_SYNY3 (2 experimental RDC alignment tensors). (K) MALE_ECOLI (full length, 1 experimental RDC alignment tensor). (L) MALE_ECOLI (full length, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (M) MALE_ECOLI (NTD, 1 experimental RDC alignment tensor). (N) MALE_ECOLI (NTD, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (O) MALE_ECOLI (CTD, 1 experimental RDC alignment tensor). (P) MALE_ECOLI (CTD, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor).

Supplementary Figure 5 Box plots of all-heavy-atom r.m.s. deviations relative to reference structures.

Box plots showing all-heavy-atom RMSD statistics for protein structures determined using EC restraints alone (EC, red), sparse-NMR NOESY and chemical shift data alone (sparse NMR, blue), and the combined EC-NMR protocol (EC-NMR, green). (A) A9CJD6_AGRTT5 (no experimental RDC data). (B) A9CJD6_AGRTT5 (2 simulated RDC alignment tensors). (C) Q6D6V0_ERWCT (2 experimental RDC alignment tensors). (D) Q9ZV63_ARATH (2 experimental RDC alignment tensors). (E) Q1LD49_RALME (1 experimental RDC alignment tensor). (F) Q1LD49_RALME (1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (G) RASH_HUMAN (no RDC data). (H) RASH_HUMAN (2 simulated RDC alignment tensors). (I) YIAD_ECOLI (2 experimental RDC alignment tensors). (J) P74712_SYNY3 (2 experimental RDC alignment tensors). (K) MALE_ECOLI (full length, 1 experimental RDC alignment tensor). (L) MALE_ECOLI (full length, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (M) MALE_ECOLI (NTD, 1 experimental RDC alignment tensor). (N) MALE_ECOLI (NTD, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (O) MALE_ECOLI (CTD, 1 experimental RDC alignment tensor). (P) MALE_ECOLI (CTD, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor).

Supplementary Figure 6 EC-NMR structures of maltose-binding protein.

Comparison of the Maltose Binding Protein (MBP) EC-NMR structures and the corresponding X-ray crystal structure. EC-NMR structures are shown in green, and the X-ray crystal structure in grey. (A) Full-length MBP determined with 1 experimental RDC alignment tensor, superimposed on the full length X-ray crystal structure. (B) Full-length MBP determined with 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor, superimposed on the full length X-ray crystal structure. (C) MBP NTD determined with 1 experimental RDC alignment tensor. (D) MBP NTD determined with 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor. (E) MBP CTD determined with 1 experimental RDC alignment tensor. (F) MBP CTD determined with 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor.

Supplementary Figure 7 Statistics assessing performance of EC-NMR protocol in identifying false positive and false negative contacts.

Values of (A) number of long-range (|i-j| ≥ 5) contacts, (B) precision, (C) recall, and (D) F-measure are all significantly higher for the final Residue Pair Contact (RPC) list (red + green histograms) than for the initial EC list (red histograms).
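As an illustration of how such statistics are commonly computed, the following Python sketch compares a predicted contact list with a reference contact list, keeping only long-range pairs (|i-j| ≥ 5). The function name, the toy residue pairs, and the use of the standard harmonic-mean F-measure are assumptions made for this example; the exact definitions used in the figure are not given here.

```python
def contact_stats(predicted, reference, min_sep=5):
    """Return (n_long_range, precision, recall, f_measure) for contact lists.

    predicted, reference: iterables of residue-index pairs (i, j).
    min_sep: minimum sequence separation |i - j| for a long-range contact.
    """
    def long_range(pairs):
        # Normalise pair order so (i, j) and (j, i) count as the same contact.
        return {(min(i, j), max(i, j)) for i, j in pairs if abs(i - j) >= min_sep}

    pred = long_range(predicted)
    ref = long_range(reference)
    true_pos = len(pred & ref)

    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        # Harmonic mean of precision and recall (assumed F-measure definition).
        f_measure = 2 * precision * recall / (precision + recall)
    return len(pred), precision, recall, f_measure


if __name__ == "__main__":
    # Toy data, for illustration only.
    predicted = [(1, 10), (2, 20), (3, 5), (30, 4)]
    reference = [(1, 10), (4, 30), (6, 40), (7, 50)]
    print(contact_stats(predicted, reference))
```

Here the pair (3, 5) is ignored because its sequence separation is below 5, and (30, 4) matches the reference pair (4, 30) after the pair order is normalised.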

Supplementary Figure 8 Assessment of the accuracy of buried side chains.

Buried sidechains are compared between EC-NMR structures (green) and reference X-ray crystal structures (grey). (A) RASH_HUMAN (2 simulated RDC alignment tensors). (B) MALE_ECOLI (NTD, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (C) MALE_ECOLI (CTD, 1 experimental RDC alignment tensor and 1 simulated RDC alignment tensor). (D) MALE_ECOLI (NTD, 1 experimental RDC alignment tensor). (E) MALE_ECOLI (CTD, 1 experimental RDC alignment tensor).

Supplementary Figure 9 Evaluation of structure quality statistics for 20 ensembles of EC-NMR structures.

The 20 ensembles of EC-NMR structures generated in the sensitivity analysis outlined in Supplementary Table 6 were assessed using structural quality metrics of the Protein Structure Validation Software Suite (PSVS), including NMR DP scores. The top panel plots backbone RMSD to reference structures (black data points) and the Number of Reliable EC Pairs (red data points), as defined in the Online Methods, for each of the 20 EC-NMR analyses carried out using different amounts of sequence data (Neff/L). The bottom panel plots DP (left axis) or knowledge-based Z score (right axis) computed with the PSVS software. The metrics include the NMR DP score (red circles), ProCheck analysis of backbone dihedral angle distributions (Procheck, green boxes) and distributions for all backbone and sidechain dihedral angles (Procheck_all, blue diamonds), Verify3D (magenta triangles), ProsaII (grey triangles), and Molprobity atomic clash scores (black asterisks). Z scores are normalized against a collection of high-resolution X-ray crystal structures. “Reliable” structures (backbone RMSD < 3.5 Å from X-ray crystal structure), obtained using multiple-sequence alignments of Neff/L > 5, all have DP score > ~0.73 and knowledge-based Z scores > -2 (upper right quadrant), while less accurate structures all have DP < ~0.73 and Z scores for Verify3D < -2. 19 ensembles were generated using different amounts of evolutionary sequence data, and the 20th ensemble was generated using no evolutionary sequence data.

Supplementary Figure 10 Evaluation of structure quality statistics for reference X-ray crystal and NMR spectroscopy structures, and EC-NMR structures.

The ensembles of EC-NMR structures described in the main text were assessed using NMR DP scores (refs. 16,25) and structural quality metrics of the Protein Structure Validation Software Suite (PSVS). The reference structures and EC-NMR structures assessed include (A) RASH, (B) P74712, and (C) Maltose Binding Protein (MBP) generated by X-ray crystallography (X-ray, the reference structures), ECs alone (EC), sparse NMR data alone (sparse NMR), the EC-NMR method using only available experimental data (EC-NMR), and EC-NMR supplemented with additional RDC data as described in the Online Methods (EC-NMR*). PSVS structure quality Z scores < -6 were set to -6. According to the criteria outlined in the text (and Supplementary Figure 9), the X-ray reference structures and the EC-NMR structures are all classified as “reliable” structures (DP > ~0.73; structure quality Z scores > -2), while the EC-alone structures are classified as “less accurate” structures. Structures determined with sparse NMR data alone (sparse NMR) have marginal structure quality scores, which are improved in the more accurate EC-NMR structures.
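The thresholds quoted in this caption can be written as a small check. The sketch below is only an illustration of the stated criteria (DP > ~0.73, structure quality Z scores > -2, and Z scores below -6 set to -6); the function names, the dictionary layout, and the example scores are hypothetical, and the approximate DP threshold is treated here as exactly 0.73.

```python
DP_THRESHOLD = 0.73   # approximate threshold quoted in the caption
Z_THRESHOLD = -2.0    # structure quality Z-score threshold
Z_FLOOR = -6.0        # Z scores below this value are set to it


def clamp_z(z):
    """Apply the floor used for plotting: Z scores < -6 are set to -6."""
    return max(z, Z_FLOOR)


def meets_reliable_criteria(dp_score, z_scores):
    """Return True if an ensemble satisfies the 'reliable' criteria.

    dp_score: NMR DP score of the ensemble.
    z_scores: dict mapping metric name to Z score (hypothetical layout).
    """
    clamped = {name: clamp_z(z) for name, z in z_scores.items()}
    return dp_score > DP_THRESHOLD and all(z > Z_THRESHOLD for z in clamped.values())


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example = {"Verify3D": -1.1, "ProsaII": -0.4, "Procheck": -1.5}
    print(meets_reliable_criteria(0.80, example))
    print(meets_reliable_criteria(0.60, example))
```

An ensemble that fails the check is not necessarily "less accurate" in the caption's sense; the sparse-NMR structures, for example, are described only as having marginal quality scores.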

Supplementary Figure 11 The precision of ECs depends on the amount of available sequence information.

(A) The amount of sequence information available for EC inference is varied by sampling from the full alignment of P74712 (Neff/L = 227). The number of predicted high-confidence EC pairs exceeding the non-informative background level of coupling by a factor of two or more decreases sharply once more than 75% of sequences have been removed (dashed blue line). (B) The number of high-confidence EC pairs for each sampled alignment is a good predictor for the overall precision of the top L EC pairs when evaluating their distance in the protein structure. The size of the high-confidence EC set correlates with the size of the sequence alignments (Pearson r = 0.81), and starts to saturate at Neff/L = 75, where there is no more gain for increasing numbers of sequences (panel A). For this particular protein the high-confidence set of ECs ranges from 89 EC pairs for the full alignment, down to zero pairs for the smallest alignment. As one would expect, the proportion of ECs that are close in the crystal structure (out of the top 150 ECs) positively correlates with the number of sequences in the alignment, saturating at about Neff/L = 100 (Pearson r = 0.91, panel B). The true positive rate of L ECs is higher than the number of high-confidence ECs deduced from the corresponding alignment. This indicates that the scoring method is conservative and that the number of high-confidence ECs in a corresponding alignment sets a lower bound on the number of true positives.
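For readers who want to reproduce this kind of correlation on their own data, a Pearson coefficient can be computed as in the sketch below. The function is a plain implementation of the standard formula; the example values are invented for illustration (only the endpoints, 0 and 89 high-confidence pairs and Neff/L = 227, echo numbers quoted above) and are not the data behind the figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented values, for illustration only.
    neff_over_l = [5, 25, 75, 150, 227]
    n_high_conf_ecs = [0, 30, 80, 88, 89]
    print(round(pearson_r(neff_over_l, n_high_conf_ecs), 2))
```

Because the relationship saturates at large alignments, a rank-based measure could also be considered; the caption itself reports Pearson r.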

Supplementary Figure 12 Structural accuracy and NMR DP scores are correlated for 2H,13C,15N-enriched, I(δ)LV methyl-protonated proteins.

Protein samples were generated using standard protocols (ref. 26). Backbone resonance assignments were determined using standard 2H-decoupled triple-resonance NMR experiments, as described elsewhere (ref. 3). For each protein, 2000 decoy conformers, spanning a range of accuracies, were generated using the CS-Rosetta structure modeling program (ref. 27) together with these backbone chemical shift data. The resulting decoys were compared with reference structures determined by X-ray crystallography, which are deposited in the Protein Data Bank, by backbone RMSD (x-axis). DP scores (y-axis), comparing each decoy model with the NOESY peak list and chemical shift assignment list, were computed using the RPF-DP program (refs. 16,25). Backbone RMSDs were calculated by comparing the Rosetta decoy to the X-ray structure in well-ordered regions, which were defined consistently for all decoys using the corresponding NMR structure solved using a fully protonated sample, available from the PDB. (A) Protein GmR137, 78 residues, MW = 8.5 kDa (reference PDB ID 3CWI). (B) Protein HR3201A, 87 residues, MW = 9.9 kDa (reference PDB ID 3KW6). (C) Cold-shock protein A, 70 residues, MW = 7.4 kDa (reference PDB ID 1MJC). Most structures classified as “reliable”, with backbone RMSD < 4 Å from the reference structure, have DP scores > 0.73.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–12, Supplementary Tables 1–6 and Supplementary Notes 1–5 (PDF 11351 kb)

Cite this article

Tang, Y., Huang, Y., Hopf, T. et al. Protein structure determination by combining sparse NMR data with evolutionary couplings. Nat Methods 12, 751–754 (2015). https://doi.org/10.1038/nmeth.3455

  • DOI: https://doi.org/10.1038/nmeth.3455
