Abstract
Metals have vital roles in both the mechanism and architecture of biological macromolecules. Yet structures of metal-containing macromolecules in which metals are misidentified and/or suboptimally modeled are abundant in the Protein Data Bank (PDB). This shows the need for a diagnostic tool to identify and correct such modeling problems with metal-binding environments. The CheckMyMetal (CMM) web server (http://csgid.org/csgid/metal_sites/) is a sophisticated, user-friendly web-based method to evaluate metal-binding sites in macromolecular structures using parameters derived from 7,350 metal-binding sites observed in a benchmark data set of 2,304 high-resolution crystal structures. The protocol outlines how the CMM server can be used to detect geometric and other irregularities in the structures of metal-binding sites, as well as how it can alert researchers to potential errors in metal assignment. The protocol also gives practical guidelines for correcting problematic sites by modifying the metal-binding environment and/or redefining metal identity in the PDB file. Several examples where this has led to meaningful results are described in the ANTICIPATED RESULTS section. CMM was designed for a broad audience—biomedical researchers studying metal-containing proteins and nucleic acids—but it is equally well suited for structural biologists validating new structures during modeling or refinement. The CMM server takes the coordinates of a metal-containing macromolecule structure in the PDB format as input and responds within a few seconds for a typical protein structure with 2–5 metal sites and a few hundred amino acids.
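Because the server takes PDB-format coordinates as input, it can help to list the metal atoms present in a file before submitting it. The following is a minimal sketch, not part of CMM: the function name `find_metal_atoms`, the `METALS` subset and the demo coordinates are illustrative assumptions, and the parsing relies only on the fixed-column layout of PDB `HETATM` records (element symbol in columns 77-78).

```python
# Hypothetical helper, not part of the CMM server.
# Illustrative subset of metal element symbols; extend as needed.
METALS = {"NA", "MG", "K", "CA", "MN", "FE", "CO", "NI", "CU", "ZN"}


def find_metal_atoms(pdb_text):
    """Return (element, residue name, chain, residue number, xyz) per metal HETATM."""
    sites = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        element = line[76:78].strip().upper()  # columns 77-78: element symbol
        if element not in METALS:
            continue
        res_name = line[17:20].strip()         # columns 18-20: residue name
        chain = line[21]                       # column 22: chain identifier
        res_seq = int(line[22:26])             # columns 23-26: residue number
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        sites.append((element, res_name, chain, res_seq, xyz))
    return sites


# Demo record built with explicit column widths; the coordinates are made up.
demo_line = "HETATM%5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f          %2s" % (
    3001, "MG", " ", "MG", "A", 901, " ", 12.345, -6.789, 0.5, 1.0, 25.0, "MG")
demo_sites = find_metal_atoms(demo_line)
```

Files that omit the element column would need a fallback on the atom or residue name, which this sketch does not attempt.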
References
Harding, M.M., Nowicki, M.W. & Walkinshaw, M.D. Metals in protein structures: a review of their principal features. Crystallogr. Rev. 16, 247–302 (2010).
Berman, H.M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).
Pozharski, E., Weichenberger, C.X. & Rupp, B. Techniques, tools and best practices for ligand electron-density analysis and results from their application to deposited crystal structures. Acta Crystallogr. D 69, 150–167 (2013).
Chruszcz, M., Domagalski, M., Osinski, T., Wlodawer, A. & Minor, W. Unmet challenges of structural genomics. Curr. Opin. Struct. Biol. 20, 587–597 (2010).
Zheng, H., Chruszcz, M., Lasota, P., Lebioda, L. & Minor, W. Data mining of metal ion environments present in protein structures. J. Inorg. Biochem. 102, 1765–1776 (2008).
Branden, C. & Jones, T. Between objectivity and subjectivity. Nature 343, 687–689 (1990).
Adams, P.D. et al. Advances, interactions, and future developments in the CNS, Phenix, and Rosetta structural biology software systems. Annu. Rev. Biophys. 42, 265–287 (2013).
Minor, W., Cymborowski, M., Otwinowski, Z. & Chruszcz, M. HKL-3000: the integration of data reduction and structure solution—from diffraction images to an initial model in minutes. Acta Crystallogr. D 62, 859–866 (2006).
Chen, V.B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D 66, 12–21 (2010).
Abriata, L.A. Investigation of non-corrin cobalt(II)-containing sites in protein structures of the Protein Data Bank. Acta Crystallogr. B 69, 176–183 (2013).
Dauter, Z., Weiss, M.S., Einspahr, H. & Baker, E.N. Expectation bias and information content. Acta Crystallogr. F 69, 83 (2013).
Weichenberger, C.X., Pozharski, E. & Rupp, B. Visualizing ligand molecules in Twilight electron density. Acta Crystallogr. F 69, 195–200 (2013).
Wlodawer, A., Minor, W., Dauter, Z. & Jaskolski, M. Protein crystallography for non-crystallographers, or how to get the best (but not more) from published macromolecular structures. FEBS J. 275, 1–21 (2008).
Nayal, M. & Di Cera, E. Valence screening of water in protein crystals reveals potential Na+ binding sites. J. Mol. Biol. 256, 228–234 (1996).
Nabuurs, S.B., Spronk, C.A., Vuister, G.W. & Vriend, G. Traditional biomolecular structure determination by NMR spectroscopy allows for major errors. PLoS Comput. Biol. 2, e9 (2006).
Hsin, K., Sheng, Y., Harding, M.M., Taylor, P. & Walkinshaw, M.D. MESPEUS: a database of the geometry of metal sites in proteins. J. Appl. Crystallogr. 41, 963–968 (2008).
Abriata, L.A. Analysis of copper-ligand bond lengths in X-ray structures of different types of copper sites in proteins. Acta Crystallogr. D 68, 1223–1231 (2012).
Murshudov, G.N. et al. REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D 67, 355–367 (2011).
Adams, P.D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D 66, 213–221 (2010).
Sheldrick, G.M. A short history of SHELX. Acta Crystallogr. A 64, 112–122 (2008).
Bergerhoff, G. & Brandenburg, K. in International Tables for Crystallography (eds. Wilson, J.C. & Prince, E.) 778–789 (John Wiley & Sons, 2006).
Laskowski, R.A., MacArthur, M.W., Moss, D.S. & Thornton, J.M. PROCHECK—a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291 (1993).
Vaguine, A.A., Richelle, J. & Wodak, S.J. SFCHECK: a unified set of procedures for evaluating the quality of macromolecular structure-factor data and their agreement with the atomic model. Acta Crystallogr. D 55, 191–205 (1999).
Ascone, I. & Strange, R. Biological X-ray absorption spectroscopy and metalloproteomics. J. Synchrotron Radiat. 16, 413–421 (2009).
Garcia, J.S., Magalhaes, C.S. & Arruda, M.A. Trends in metal-binding and metalloprotein analysis. Talanta 69, 1–15 (2006).
Müller, P., Köpke, S. & Sheldrick, G.M. Is the bond-valence method able to identify metal atoms in protein structures? Acta Crystallogr. D 59, 32–37 (2003).
Tylichova, M. et al. Structural and functional characterization of plant aminoaldehyde dehydrogenase from Pisum sativum with a broad specificity for natural and synthetic aminoaldehydes. J. Mol. Biol. 396, 870–882 (2010).
Seff, A.L., Pilbak, S., Silaghi-Dumitrescu, I. & Poppe, L. Computational investigation of the histidine ammonia-lyase reaction: a modified loop conformation and the role of the zinc(II) ion. J. Mol. Model. 17, 1551–1563 (2011).
Srikanth, R., Mendoza, V.L., Bridgewater, J.D., Zhang, G. & Vachet, R.W. Copper binding to β-2-microglobulin and its pre-amyloid oligomers. Biochemistry 48, 9871–9881 (2009).
Cooper, D.R., Porebski, P.J., Chruszcz, M. & Minor, W. X-ray crystallography: assessment and validation of protein-small molecule complexes for drug discovery. Exp. Opin. Drug Discov. 6, 771–782 (2011).
Pietrzyk, A.J. et al. High-resolution structure of Bombyx mori lipoprotein 7: crystallographic determination of the identity of the protein and its potential role in detoxification. Acta Crystallogr. D 68, 1140–1151 (2012).
Brown, I.D. Recent developments in the methods and applications of the bond valence model. Chem. Rev. 109, 6858–6919 (2009).
Hanson, R.M. Jmol—a paradigm shift in crystallographic visualization. J. Appl. Crystallogr. 43, 1250–1260 (2010).
Allen, F.H. The Cambridge Structural Database: a quarter of a million crystal structures and rising. Acta Crystallogr. B 58, 380–388 (2002).
Brylinski, M. & Skolnick, J. FINDSITE-metal: integrating evolutionary information and machine learning for structure-based metal-binding site prediction at the proteome level. Proteins 79, 735–751 (2011).
Sodhi, J.S. et al. Predicting metal-binding site residues in low-resolution structural models. J. Mol. Biol. 342, 307–320 (2004).
Cai, C.Z., Han, L.Y., Ji, Z.L., Chen, X. & Chen, Y.Z. SVM-Prot: Web-based support vector machine software for functional classification of a protein from its primary sequence. Nucleic Acids Res. 31, 3692–3697 (2003).
Levy, R., Edelman, M. & Sobolev, V. Prediction of 3D metal binding sites from translated gene sequences based on remote-homology templates. Proteins 76, 365–374 (2009).
Passerini, A., Lippi, M. & Frasconi, P. MetalDetector v2.0: predicting the geometry of metal binding sites from protein sequence. Nucleic Acids Res. 39, W288–W292 (2011).
Hemavathi, K. et al. MIPS: metal interactions in protein structures. J. Appl. Crystallogr. 43, 196–199 (2010).
Castagnetto, J.M. et al. MDB: the Metalloprotein Database and Browser at The Scripps Research Institute. Nucleic Acids Res. 30, 379–382 (2002).
Andreini, C., Cavallaro, G., Lorenzini, S. & Rosato, A. MetalPDB: a database of metal sites in biological macromolecular structures. Nucleic Acids Res. 41, D312–D319 (2013).
Andreini, C., Bertini, I., Cavallaro, G., Holliday, G.L. & Thornton, J.M. Metal-MACiE: a database of metals involved in biological catalysis. Bioinformatics 25, 2088–2089 (2009).
Degtyarenko, K.N., North, A.C. & Findlay, J.B. PROMISE: a database of bioinorganic motifs. Nucleic Acids Res. 27, 233–236 (1999).
Laskowski, R.A. PDBsum new things. Nucleic Acids Res. 37, D355–D359 (2009).
Golovin, A. & Henrick, K. MSDmotif: exploring protein sites and motifs. BMC Bioinformatics 9, 312 (2008).
Pettersen, E.F. et al. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
Brese, N.E. & O'Keeffe, M. Bond-valence parameters for solids. Acta Crystallogr. B 47, 192–197 (1991).
Shields, G.P., Raithby, P.R., Allen, F.H. & Motherwell, W.D. The assignment and validation of metal oxidation states in the Cambridge Structural Database. Acta Crystallogr. B 56, 455–465 (2000).
Carugo, O. & Djinovic Carugo, K. When X-rays modify the protein structure: radiation damage at work. Trends Biochem. Sci. 30, 213–219 (2005).
Hersleth, H.P. & Andersson, K.K. How different oxidation states of crystalline myoglobin are influenced by X-rays. Biochim. Biophys. Acta 1814, 785–796 (2011).
Katz, A., Glusker, J., Beebe, S. & Bock, C. Calcium ion coordination: a comparison with that of beryllium, magnesium, and zinc. J. Am. Chem. Soc. 118, 5752–5763 (1996).
Harding, M.M. The architecture of metal coordination groups in proteins. Acta Crystallogr. D 60, 849–859 (2004).
Kuppuraj, G., Dudev, M. & Lim, C. Factors governing metal-ligand distances and coordination geometries of metal complexes. J. Phys. Chem. B 113, 2952–2960 (2009).
Bailey, S. The CCP4 suite—programs for protein crystallography. Acta Crystallogr. D 50, 760–763 (1994).
Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D 60, 2126–2132 (2004).
Lovell, S.C. et al. Structure validation by Cα geometry: φ, ψ and Cβ deviation. Proteins 50, 437–450 (2003).
Joosten, R.P., Joosten, K., Cohen, S.X., Vriend, G. & Perrakis, A. Automatic rebuilding and optimization of crystallographic structures in the Protein Data Bank. Bioinformatics 27, 3392–3398 (2011).
Ye, Q., Crawley, S.W., Yang, Y., Cote, G.P. & Jia, Z. Crystal structure of the α-kinase domain of Dictyostelium myosin heavy chain kinase A. Sci. Signal. 3, ra17 (2010).
Prasad, L., Leduc, Y., Hayakawa, K. & Delbaere, L.T. The structure of a universally employed enzyme: V8 protease from Staphylococcus aureus. Acta Crystallogr. D 60, 256–259 (2004).
Yoshiba, S. et al. Structural insights into the Thermus thermophilus ADP-ribose pyrophosphatase mechanism via crystal structures with the bound substrate and metal. J. Biol. Chem. 279, 37163–37174 (2004).
Chitale, M., Hawkins, T., Park, C. & Kihara, D. ESG: extended similarity group method for automated protein function prediction. Bioinformatics 25, 1739–1745 (2009).
Eustermann, S. et al. Combinatorial readout of histone H3 modifications specifies localization of ATRX to heterochromatin. Nat. Struct. Mol. Biol. 18, 777–782 (2011).
Kobashigawa, Y. et al. Autoinhibition and phosphorylation-induced activation mechanisms of human cancer and autoimmune disease-related E3 protein Cbl-b. Proc. Natl. Acad. Sci. USA 108, 20579–20584 (2011).
Loughlin, F.E. et al. Structural basis of pre-let-7 miRNA recognition by the zinc knuckles of pluripotency factor Lin28. Nat. Struct. Mol. Biol. 19, 84–89 (2011).
Veith, T. et al. Structural and functional analysis of the archaeal endonuclease Nob1. Nucleic Acids Res. 40, 3259–3274 (2011).
Li, H. et al. Molecular basis for site-specific read-out of histone H3K4me3 by the BPTF PHD finger of NURF. Nature 442, 91–95 (2006).
Acknowledgements
This work was supported by Federal funds from the National Institute of Allergy and Infectious Diseases, US National Institutes of Health, Department of Health and Human Services, under contract nos. HHSN272200700058C and HHSN272201200026C. We thank M. Grabowski, K.M. Langner and M. Domagalski for the CSGID website framework containing the CMM server; J. Hou, I.G. Shabalin, I.A. Shumilin, M. Demas and A.A. Knapik for server testing; W.F. Anderson for valuable discussion; and M.D. Zimmerman and H.C. Chapman for critically reading the manuscript.
Author information
Contributions
H.Z. designed, implemented, tested and maintained the CMM server; H.Z. developed, implemented and optimized the NEIGHBORHOOD database; M.D.C. and D.R.C. helped design the geometry assignment algorithm and server interface; D.R.C. implemented the first version of the Jmol applet; P.M. and G.M.S. introduced the CBVS and VECSUM methods, which were slightly modified for this study; H.Z., M.D.C., D.R.C., M.C., P.M., G.M.S. and W.M. wrote and approved the manuscript; and W.M. supervised the project.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Valence distribution for the benchmark dataset.
Sample sizes and peak values for metal-binding sites are indicated in parentheses. Valence values at which the distribution height was 50% or 10% of the peak value were used as thresholds to define the acceptable, borderline and outlier zones. Similar elements with the same expected valence are grouped (Supplementary Table 1). Poorly coordinated Na/K ions in the benchmark dataset resulted in valences close to 0 and were treated as noise; they were removed before threshold estimation.
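For readers unfamiliar with how a valence is obtained from a metal-binding site, the sketch below uses the standard bond-valence model, in which each metal-ligand bond contributes exp((R0 - R)/b) and the contributions are summed. It is an illustration only: the CBVS method used by CMM is described as a modified variant and is not reproduced here, the function names are hypothetical, and the Mg-O value R0 = 1.693 Å is a commonly tabulated parameter that should be checked against the bond-valence tables in use.

```python
import math

B = 0.37  # angstrom; constant widely used in the bond-valence model


def bond_valence(distance, r0, b=B):
    """Valence contributed by one metal-ligand bond of the given length (angstrom)."""
    return math.exp((r0 - distance) / b)


def valence_sum(distances, r0, b=B):
    """Sum of bond valences over all ligands bound to one metal."""
    return sum(bond_valence(d, r0, b) for d in distances)


# Assumed parameter: R0 for Mg-O as commonly tabulated (verify before use).
R0_MG_O = 1.693

# Idealized demo: six oxygen ligands at 2.09 angstrom versus only four.
six_coordinate = valence_sum([2.09] * 6, R0_MG_O)
four_coordinate = valence_sum([2.09] * 4, R0_MG_O)
```

With these demo distances the six-coordinate sum lies close to the expected valence of 2 for magnesium, whereas the four-coordinate sum falls well below it, which is the kind of deviation a valence check is meant to expose.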
Supplementary Figure 2 Distribution of nVECSUM, gRMSD, vacancy and B-factor agreement for the benchmark dataset.
Metal-binding sites with invalid parameter values are excluded from the statistics. Sample sizes and peak values for metal-binding sites are indicated in parentheses. Parameter values at which the distributions had heights of 50% or 10% of the peak height are used as thresholds to define the acceptable, borderline and outlier zones.
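The thresholding described in these captions can be sketched as follows: take a histogram of a parameter over the benchmark sites, find the range of values where the bin height is at least 50% (or 10%) of the peak, and classify a new value by the narrowest range that contains it. This is a simplified illustration under stated assumptions: the function names and the toy histogram are hypothetical, and any smoothing or interpolation used for the actual thresholds is not described here and is not modeled.

```python
def zone_bounds(centers, counts, fraction):
    """Lowest and highest bin centers whose count is at least fraction * peak."""
    peak = max(counts)
    kept = [c for c, n in zip(centers, counts) if n >= fraction * peak]
    return min(kept), max(kept)


def classify(value, centers, counts):
    """Label a value as acceptable, borderline or outlier from the histogram."""
    lo50, hi50 = zone_bounds(centers, counts, 0.5)
    lo10, hi10 = zone_bounds(centers, counts, 0.1)
    if lo50 <= value <= hi50:
        return "acceptable"
    if lo10 <= value <= hi10:
        return "borderline"
    return "outlier"


# Toy histogram (made-up numbers, not the benchmark data).
toy_centers = [1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6]
toy_counts = [2, 15, 60, 100, 55, 12, 1]
```

For one-sided parameters, where only large values are suspicious, the same idea applies with a single upper bound per zone.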
Supplementary Figure 3 One of the three modeled magnesiums (A901) in the catalytic center of the kinase domain of myosin heavy chain kinase A (PDB code: 3lkm).
Even though coordination by a phosphate of AMP often creates a favorable cation-binding environment, this site has a tetrahedral geometry, which is improbable for magnesium (magnesium prefers octahedral geometry) (a). The ion is better interpreted as a water molecule (b). The other two modeled magnesium ions (A902 and A903) in the same structure were re-refined as potassium ions. The re-refined site A902 is shown as an example in the main text in Fig. 4.
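The distinction between tetrahedral and octahedral coordination can be made concrete by comparing the ligand-metal-ligand angles of a site with those of the ideal polyhedra. The sketch below is an assumption-laden illustration rather than the gRMSD calculation used by CMM: the function names are hypothetical, and observed angles are matched to ideal ones simply by sorting both lists.

```python
import math


def ligand_angles(metal, ligands):
    """All ligand-metal-ligand angles in degrees, sorted ascending."""
    vecs = [tuple(l[i] - metal[i] for i in range(3)) for l in ligands]
    angles = []
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            a, b = vecs[i], vecs[j]
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding
            angles.append(math.degrees(math.acos(cos)))
    return sorted(angles)


# Ideal angle sets: 6 angles for 4 ligands, 15 angles for 6 ligands.
TETRAHEDRAL = [math.degrees(math.acos(-1.0 / 3.0))] * 6
OCTAHEDRAL = [90.0] * 12 + [180.0] * 3


def angle_rmsd(observed, ideal):
    """RMS deviation (degrees) between sorted angle lists; None if counts differ."""
    if len(observed) != len(ideal):
        return None
    pairs = zip(sorted(observed), sorted(ideal))
    return math.sqrt(sum((o - i) ** 2 for o, i in pairs) / len(ideal))


# Idealized demo sites centered on the origin (not taken from 3lkm).
origin = (0.0, 0.0, 0.0)
tetra_ligands = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
octa_ligands = [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 2), (0, 0, -2)]
```

A four-ligand site cannot be scored against the octahedral angle set at all in this sketch, which mirrors the caption's point that a tetrahedral environment is a poor match for an ion expected to be six-coordinate.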
Supplementary information
Supplementary Table 1
Comparison of CMM with other programs and services for metal binding site prediction or investigation. (PDF 221 kb)
Supplementary Table 2
Threshold values for CMM parameters. A parenthesis indicates that the nearest endpoint is excluded from the interval; a square bracket indicates that the endpoint is included in the interval. (PDF 306 kb)
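The bracket convention can be illustrated with a small membership check. This is only a sketch: the function name `in_interval` and the string form "[0.5, 1.0)" are assumptions for illustration, since the exact layout of the table entries is not shown here.

```python
def in_interval(value, interval):
    """True if value lies in an interval written like '[0.5, 1.0)' or '(0, 2]'."""
    text = interval.strip()
    left, right = text[0], text[-1]
    if left not in "[(" or right not in "])":
        raise ValueError("interval must start with [ or ( and end with ] or )")
    lo_text, hi_text = text[1:-1].split(",")
    lo, hi = float(lo_text), float(hi_text)
    above = value >= lo if left == "[" else value > lo   # [ includes the endpoint
    below = value <= hi if right == "]" else value < hi  # ) excludes the endpoint
    return above and below
```

For example, 1.0 lies inside "[0.5, 1.0]" but outside "[0.5, 1.0)", so adjacent zones written with complementary brackets never overlap.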
Supplementary Table 3
Re-refinement statistics and geometry for the examples described in the main text. Clashscore, rotamer outliers, and the number of residues in the favored regions of the Ramachandran plot were calculated using MolProbity (ref. 43). The Rfree set reported in the structure factor files available from the PDB was used for Rfree calculation. (PDF 296 kb)
About this article
Cite this article
Zheng, H., Chordia, M., Cooper, D. et al. Validation of metal-binding sites in macromolecular structures with the CheckMyMetal web server. Nat Protoc 9, 156–170 (2014). https://doi.org/10.1038/nprot.2013.172
DOI: https://doi.org/10.1038/nprot.2013.172