Predicting disease-associated mutation of metal-binding sites in proteins using a deep learning approach

Abstract

Metalloproteins play important roles in many biological processes. Mutations at metal-binding sites can disrupt metalloprotein function and lead to severe disease; however, no effective approach for predicting such mutations has been available. Here we develop a deep learning approach to predict disease-associated mutations that occur at the metal-binding sites of metalloproteins. We generate energy-based affinity grid maps and physicochemical features of the metal-binding pockets (obtained from different databases as spatial and sequential features) and feed these features into a multichannel convolutional neural network (MCCNN). After training, the MCCNN predicts disease-associated mutations in the first and second coordination spheres of zinc-binding sites with an area under the curve of 0.90 and an accuracy of 0.82. Our approach represents the first deep learning approach for predicting disease-associated mutations at metal-relevant sites in metalloproteins, and it provides a new platform for tackling human diseases.
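The two headline metrics quoted in the abstract, area under the (ROC) curve and accuracy, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions chosen only for illustration (1 stands for a disease-associated mutation, 0 for a benign one).

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of mutations whose thresholded score matches the label.

    The 0.5 threshold is an assumption for this sketch, not a value
    taken from the paper.
    """
    if len(labels) != len(scores) or len(labels) == 0:
        raise ValueError("labels and scores must be non-empty and equal in length")
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its rank interpretation.

    The AUC equals the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative
    example, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three disease-associated and three benign mutations.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print("AUC:", roc_auc(example_labels, example_scores))
print("Accuracy:", accuracy(example_labels, example_scores))
```

The pairwise AUC computation is quadratic in the number of examples, which is acceptable for a small illustration; a sorting-based implementation would be preferable for large test sets.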

Fig. 1: Workflow of data collection and feature extraction to train the deep learning model.
Fig. 2: A proportional graph of disease-associated mutations of the metal-binding pocket.
Fig. 3: Performance of MCCNN to predict disease-associated mutations.
Fig. 4: MCCNN can accurately predict disease-associated site mutations in both the first and second coordination spheres of metals in metalloproteins.

Data availability

The disease-associated and benign mutation data are attached as supplementary tables. The implemented model and the spatial and sequential features used to train it are available in a Bitbucket code repository: https://bitbucket.org/mkoohim/multichannel-cnn.

Code availability

The implemented MCCNN model is publicly available in a Bitbucket repository under the GPL v3.0 license: https://bitbucket.org/mkoohim/multichannel-cnn.

Acknowledgements

We thank the Research Grants Council of Hong Kong (grant nos. 17307017P and R7070-18), the National Science Foundation of China (grant no. 21671203), the University of Hong Kong (for a studentship for M.K. and a Norman and Cecilia Yip Foundation for H.S.) and the Hong Kong PhD Fellowship (HKPF for H.W.) for support. A startup fund from the Mayo Clinic Arizona, Mayo Clinic Center for Individualized Medicine and Mayo Clinic Cancer Center (grant no. P30CA015083-45 for M.K. and J.W.) is acknowledged for support. We thank G.H. Chen (University of Hong Kong) and X.H. Xia (University of Ottawa) for helpful comments.

Author information

Contributions

M.K. designed and implemented the pipeline. M.K., H.W., Y.W., X.Y. and H.L. performed data integration and result validation. M.K., H.W., H.L. and H.S. wrote the paper. Y.W., X.Y. and J.W. commented on and edited the manuscript. J.W. and H.S. provided overall project leadership.

Corresponding authors

Correspondence to Junwen Wang or Hongzhe Sun.

Ethics declarations

Competing interests

The authors declare no competing interests.

Supplementary information

Supplementary Information

Supplementary figs., tables and notes

Supplementary Table 1

Disease-associated mutations of metal-binding pocket

Supplementary Table 2

Benign mutations of metal-binding pocket

Supplementary Table 9

The prediction result of the unseen data

About this article

Cite this article

Koohi-Moghadam, M., Wang, H., Wang, Y. et al. Predicting disease-associated mutation of metal-binding sites in proteins using a deep learning approach. Nat Mach Intell 1, 561–567 (2019). https://doi.org/10.1038/s42256-019-0119-z
