Abstract
Knowledge of the oxidation state of metal centres in compounds and materials helps in the understanding of their chemical bonding and properties. Chemists have developed theories to predict oxidation states based on electron-counting rules, but these can fail to describe oxidation states in extended crystalline systems such as metal–organic frameworks. Here we propose the use of a machine-learning model, trained on assignments by chemists encoded in the chemical names in the Cambridge Structural Database, to automatically assign oxidation states to the metal ions in metal–organic frameworks. In our approach, only the immediate local environment around a metal centre is considered. We show that the strategy is robust to experimental uncertainties such as incorrect protonation, unbound solvents or changes in bond length. This method gives good accuracy and we show that it can be used to detect incorrect assignments in the Cambridge Structural Database, illustrating how collective knowledge can be captured by machine learning and converted into a useful tool.
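The idea of classifying a metal site from its immediate local environment alone can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two features (coordination number and mean bond length), the 2.6 Å cutoff, the nearest-neighbour vote that stands in for the trained model, and the toy labelled sites, which are placeholder numbers rather than data from the Cambridge Structural Database.

```python
from collections import Counter
from math import dist

# Hypothetical labelled sites: ((coordination number, mean bond length in angstrom), oxidation state).
# Placeholder values for illustration only, not taken from any database.
TOY_LABELLED_SITES = [
    ((4, 1.95), 1),
    ((4, 2.00), 1),
    ((5, 2.05), 2),
    ((6, 2.10), 2),
    ((6, 2.15), 2),
]


def local_features(metal_xyz, neighbour_xyz, cutoff=2.6):
    """Describe a metal site using only atoms within `cutoff` of the metal."""
    shell = [
        dist(metal_xyz, xyz)
        for xyz in neighbour_xyz
        if dist(metal_xyz, xyz) <= cutoff
    ]
    coordination_number = len(shell)
    mean_bond_length = sum(shell) / coordination_number if shell else 0.0
    return (coordination_number, mean_bond_length)


def vote(features, labelled_sites, k=3):
    """Return the majority label among the k labelled sites closest in feature space."""
    nearest = sorted(labelled_sites, key=lambda site: dist(features, site[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# A six-coordinate site plus one distant atom (for example an unbound solvent
# molecule) that lies outside the cutoff and therefore does not affect the features.
neighbours = [
    (2.0, 0.0, 0.0), (-2.0, 0.0, 0.0),
    (0.0, 2.1, 0.0), (0.0, -2.1, 0.0),
    (0.0, 0.0, 2.2), (0.0, 0.0, -2.2),
    (4.0, 0.0, 0.0),
]
features = local_features((0.0, 0.0, 0.0), neighbours)
prediction = vote(features, TOY_LABELLED_SITES)
print(features, prediction)
```

Because the atom at 4 Å falls outside the cutoff, it changes neither the features nor the vote, which is the sense in which a purely local description is insensitive to unbound species in the pores.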
Data availability
The feature matrices, labels and a pretrained model are deposited on the Materials Cloud archive (https://doi.org/10.24435/materialscloud:dq-ey). The data that reproduce the plots shown in the main text can be found in a Code Ocean Capsule (https://doi.org/10.24433/CO.3636895.v2).
Code availability
Predictions for MOF structures can be performed using the oximachinerunner Python package (https://github.com/kjappelbaum/oximachinerunner), which can be installed from PyPI. The code for parsing and featurization, as well as for the ML models, is available on GitHub (https://github.com/kjappelbaum/learn_mof_ox_state/tree/master and https://github.com/kjappelbaum/oximachine_featurizer) and deposited on Zenodo (10.5281/zenodo.3567011, 10.5281/zenodo.3567274). The web app is hosted on the work section of Materials Cloud (go.epfl.ch/oximachine; ref. 66). The code for this app, along with a Dockerfile, is also available on GitHub (https://github.com/kjappelbaum/oximachinetool) and deposited on Zenodo (10.5281/zenodo.3603606). The code used to generate the plots shown in the main text can be found in a Code Ocean capsule (https://doi.org/10.24433/CO.3636895.v2). The code used to generate the structure graphics in the graphical abstract is available in ref. 67.
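A prediction for a single structure might look like the following sketch. It is written from memory of the package README and should be checked against the oximachinerunner documentation: the class name `OximachineRunner`, the method `run_oximachine`, and the result keys `metal_symbols` and `prediction` are all assumptions here, and `summarise` is a hypothetical helper introduced only for this example.

```python
def assign_oxidation_states(cif_path):
    """Run the pretrained model on one CIF file.

    Requires `pip install oximachinerunner`. The import is done lazily so
    that the helper below can be used without the package installed.
    """
    # Assumption: class and method names as recalled from the package README.
    from oximachinerunner import OximachineRunner

    runner = OximachineRunner()  # assumption: the default constructor selects a default model
    return runner.run_oximachine(cif_path)


def summarise(result):
    """Pair each metal symbol with its predicted oxidation state.

    Assumption: `result` is a mapping with the keys "metal_symbols" and
    "prediction", each holding one entry per metal site.
    """
    return list(zip(result["metal_symbols"], result["prediction"]))
```

With the package installed, `summarise(assign_oxidation_states("structure.cif"))` would then list one (symbol, oxidation state) pair per metal site, where `structure.cif` is a placeholder path.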
References
Walsh, A., Sokol, A. A., Buckeridge, J., Scanlon, D. O. & Catlow, C. R. A. Oxidation states and ionicity. Nat. Mater. 17, 958–964 (2018).
Jensen, W. B. The origin of the oxidation-state concept. J. Chem. Educ. 84, 1418 (2007).
Wöhler, F. Grundriss Der Chemie: Unorganische Chemie 3rd edn, 4 (Duncker & Humblot, 1835).
Latimer, W. M. The Oxidation States of the Elements and Their Potentials in Aqueous Solutions (Prentice-Hall Chemistry Series) 2nd edn (Prentice-Hall, 1952).
Connelly, N. G., Damhus, T., Hartshorn, R. M. & Hutton, A. T. (eds.) Nomenclature of Inorganic Chemistry. IUPAC Recommendations 2005 (RSC and IUPAC, 2005).
Kroll, J. H. et al. Carbon oxidation state as a metric for describing the chemistry of atmospheric organic aerosol. Nat. Chem. 3, 133–139 (2011).
Terrett, J. A., Cuthbertson, J. D., Shurtleff, V. W. & MacMillan, D. W. C. Switching on elusive organometallic mechanisms with photoredox catalysis. Nature 524, 330–334 (2015).
Jørgensen, C. K. Oxidation Numbers and Oxidation States (Springer, 1969).
Ball, P. Beyond the bond. Nature 469, 26–28 (2011).
Gold, V. (ed.) The IUPAC Compendium of Chemical Terminology: The Gold Book (IUPAC, 2019); https://doi.org/10.1351/goldbook
Karen, P., McArdle, P. & Takats, J. Comprehensive definition of oxidation state (IUPAC recommendations 2016). Pure Appl. Chem. 88, 831–839 (2016).
Brown, I. D. Recent developments in the methods and applications of the bond valence model. Chem. Rev. 109, 6858–6919 (2009).
Pauling, L. Atomic radii and interatomic distances in metals. J. Am. Chem. Soc. 69, 542–553 (1947).
Shields, G. P., Raithby, P. R., Allen, F. H. & Motherwell, W. D. S. The assignment and validation of metal oxidation states in the Cambridge Structural Database. Acta Crystallogr. B 56, 455–465 (2000).
Reeves, M. G., Wood, P. A. & Parsons, S. Automated oxidation-state assignment for metal sites in coordination complexes in the Cambridge Structural Database. Acta Crystallogr. B 75, 1096–1105 (2019).
Taylor, R. & Wood, P. A. A million crystal structures: the whole is greater than the sum of its parts. Chem. Rev. 119, 9427–9477 (2019).
O’Keeffe, M. A proposed rigorous definition of coordination number. Acta Crystallogr. A 35, 772–775 (1979).
Walsh, A., Sokol, A. A., Buckeridge, J., Scanlon, D. O. & Catlow, C. R. A. Electron counting in solids: oxidation states, partial charges, and ionicity. J. Phys. Chem. Lett. 8, 2074–2075 (2017).
Pan, H. et al. Benchmarking coordination number prediction algorithms on inorganic crystal structures. Inorg. Chem. 60, 1590–1603 (2021).
Conry, R. R. in Encyclopedia of Inorganic Chemistry (eds King, R. B. et al.) https://doi.org/10.1002/0470862106.ia052 (Wiley, 2006).
Wang, L., Maxisch, T. & Ceder, G. Oxidation energies of transition metal oxides within the GGA + U framework. Phys. Rev. B 73, 195107 (2006).
Stevanović, V., Lany, S., Zhang, X. & Zunger, A. Correcting density functional theory for accurate predictions of compound enthalpies of formation: fitted elemental-phase reference energies. Phys. Rev. B 85, 115104 (2012).
Raebiger, H., Lany, S. & Zunger, A. Charge self-regulation upon changing the oxidation state of transition metals in insulators. Nature 453, 763–766 (2008).
Bendix, J., Brorson, M. & Schäffer, C. E. in Coordination Chemistry Vol. 565 (ed. Kauffman, G. B.) 213–225 (American Chemical Society, 1994).
Jansen, M. & Wedig, U. A piece of the picture-misunderstanding of chemical concepts. Angew. Chem. Int. Ed. 47, 10026–10029 (2008).
Groom, C. R., Bruno, I. J., Lightfoot, M. P. & Ward, S. C. The Cambridge Structural Database. Acta Crystallogr. B 72, 171–179 (2016).
Holgate, S. CSD data curation – the human touch. The Cambridge Crystallographic Data Centre https://www.ccdc.cam.ac.uk/Community/blog/CSD-data-curation-the-human-touch/ (2019).
Allen, F. H. & Taylor, R. Research applications of the Cambridge Structural Database (CSD). Chem. Soc. Rev. 33, 463–475 (2004).
Bürgi, H.-B. & Dunitz, J. D. (eds) Structure Correlation (Wiley, 1994).
Jablonka, K. M., Ongari, D., Moosavi, S. M. & Smit, B. Big-data science in porous materials: materials genomics and machine learning. Chem. Rev. 120, 8066–8129 (2020).
Janet, J. P. & Kulik, H. J. Resolving transition metal chemical space: feature selection for machine learning and structure–property relationships. J. Phys. Chem. A 121, 8939–8954 (2017).
Pauling, L. The principles determining the structure of complex ionic crystals. J. Am. Chem. Soc. 51, 1010–1026 (1929).
Prodan, E. & Kohn, W. Nearsightedness of electronic matter. Proc. Natl Acad. Sci. USA 102, 11635–11638 (2005).
Baur, W. H. Bond length variation and distorted coordination polyhedra in inorganic crystals. Trans. Am. Crystallogr. Assoc. 6, 129–155 (1970).
George, J. et al. The limited predictive power of the Pauling rules. Angew. Chem. Int. Ed. 59, 7569–7575 (2020).
Müller, P., Köpke, S. & Sheldrick, G. M. Is the bond-valence method able to identify metal atoms in protein structures? Acta Crystallogr. D 59, 32–37 (2003).
Harvey, M. A., Baggio, S. & Baggio, R. A new simplifying approach to molecular geometry description: the vectorial bond-valence model. Acta Crystallogr. A 62, 1038–1042 (2006).
Brown, I. D. View of lone electron pairs and their role in structural chemistry. J. Phys. Chem. A 115, 12638–12645 (2011).
Liu, S., Grinberg, I., Takenaka, H. & Rappe, A. M. Reinterpretation of the bond-valence model with bond-order formalism: an improved bond-valence-based interatomic potential for PbTiO3. Phys. Rev. B 88, 104102 (2013).
Jahn, H. & Teller, E. Stability of polyatomic molecules in degenerate electronic states - I—Orbital degeneracy. Proc. R. Soc. Lond. A 161, 220–235 (1937).
Gillespie, R. J. & Hargittai, I. The VSEPR Model of Molecular Geometry (Dover Publications, 2012).
Zimmermann, N. E. R., Horton, M. K., Jain, A. & Haranczyk, M. Assessing local structure motifs using order parameters for motif recognition, interstitial identification, and diffusion path characterization. Front. Mater. 4, 34 (2017).
Davies, D. W., Butler, K. T., Isayev, O. & Walsh, A. Materials discovery by chemical analogy: role of oxidation states in structure prediction. Faraday Discuss. 211, 553–568 (2018).
Ward, L. et al. Including crystal structure attributes in machine learning models of formation energies via Voronoi tessellations. Phys. Rev. B 96, 024104 (2017).
Rokach, L. Ensemble-based classifiers. Artif. Intell. Rev. 33, 1–39 (2010).
Ahmed, A. et al. Cu(I)Cu(II)BTC, a microporous mixed-valence MOF via reduction of HKUST-1. RSC Adv. 6, 8902–8905 (2016).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30 (eds Guyon, I. et al.) 4765–4774 (Curran Associates, 2017).
Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Interpretable (Leanpub, 2020); https://christophm.github.io/interpretable-ml-book/
Barthelet, K., Marrot, J., Riou, D. & Férey, G. A breathing hybrid organic–inorganic solid with very large pores and high magnetic characteristics. Angew. Chem. Int. Ed. 41, 281–284 (2002).
Centrone, A., Harada, T., Speakman, S. & Hatton, T. A. Facile synthesis of vanadium metal–organic frameworks and their magnetic properties. Small 6, 1598–1602 (2010).
Leclerc, H. et al. Influence of the oxidation state of the metal center on the flexibility and adsorption properties of a porous metal organic framework: MIL-47(V). J. Phys. Chem. C 115, 19828–19840 (2011).
Kozachuk, O. et al. A solid-solution approach to mixed-metal metal–organic frameworks – detailed characterization of local structures, defects and breathing behaviour of Al/V frameworks. Eur. J. Inorg. Chem. 2013, 4546–4557 (2013).
Krakowiak, J., Lundberg, D. & Persson, I. A coordination chemistry study of hydrated and solvated cationic vanadium ions in oxidation states +III, +IV, and +V in solution and solid state. Inorg. Chem. 51, 9598–9609 (2012).
Bloch, E. D. et al. Selective binding of O2 over N2 in a redox–active metal–organic framework with open iron(II) coordination sites. J. Am. Chem. Soc. 133, 14814–14822 (2011).
Jain, A. et al. Commentary: the Materials Project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
Janet, J. P. & Kulik, H. J. Predicting electronic structure properties of transition metal complexes with neural networks. Chem. Sci. 8, 5137–5152 (2017).
Ongari, D., Yakutovich, A. V., Talirz, L. & Smit, B. Building a consistent and reproducible database for adsorption evaluation in covalent–organic frameworks. ACS Cent. Sci. 5, 1663–1675 (2019).
Jiang, L., Levchenko, S. V. & Rappe, A. M. Rigorous definition of oxidation states of ions in solids. Phys. Rev. Lett. 108, 166403 (2012).
Moghadam, P. Z. et al. Development of a Cambridge Structural Database subset: a collection of metal–organic frameworks for past, present, and future. Chem. Mater. 29, 2618–2625 (2017).
Ward, L. et al. Matminer: an open source toolkit for materials data mining. Comput. Mater. Sci. 152, 60–69 (2018).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Komer, B., Bergstra, J. & Eliasmith, C. Hyperopt-sklearn: automatic hyperparameter configuration for Scikit-learn. In Proc. 13th Python in Science Conference (eds van der Walt, S. & Bergstra, J.) 32–37 (SciPy, 2014).
Sechidis, K., Tsoumakas, G. & Vlahavas, I. On the stratification of multi-label data. In Machine Learning and Knowledge Discovery in Databases Vol. 6913 (eds Gunopulos, D. et al.) 145–158 (Springer, 2011).
Schreiber, J., Bilmes, J. & Noble, W. S. apricot: Submodular selection for data summarization in Python. Preprint at http://arxiv.org/abs/1906.03543 (2019).
Momma, K. & Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Crystallogr. 44, 1272–1276 (2011).
Talirz, L. et al. Materials Cloud, a platform for open computational science. Sci. Data 7, 299 (2020).
Dubbeldam, D., Calero, S. & Vlugt, T. J. iRASPA: GPU-accelerated visualization software for materials scientists. Mol. Simul. 44, 653–676 (2018).
Acknowledgements
This work was supported by a European Research Council (ERC) Advanced Grant (Grant Agreement No. 666983, MaGic), the Swiss National Science Foundation (SNSF) under Grant 200021_172759 and the National Center of Competence in Research (NCCR) through the Materials’ Revolution: Computational Design and Discovery of Novel Materials (MARVEL). We thank L. Talirz and the Materials Cloud team for feedback on the web app, the integration into AiiDA workflows for DFT optimization (https://github.com/lsmo-epfl/aiida-lsmo) and A. Yakutovich for help with the integration in AiiDAlab. Moreover, we are grateful for all the feedback we received from chemists all over the world on the potential errors we found in their CSD entries.
Author information
Contributions
K.M.J. developed the ML workflows. D.O. carried out the bond valence sum analyses. B.S., S.M.M. and K.M.J. developed the featurization. All authors contributed to the analysis of the data and the writing of the article.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information: Nature Chemistry thanks Joshua Schrier, Vivek Sinha and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–36, Discussion, Tables 1–34 and refs. 1–327.
About this article
Cite this article
Jablonka, K.M., Ongari, D., Moosavi, S.M. et al. Using collective knowledge to assign oxidation states of metal cations in metal–organic frameworks. Nat. Chem. 13, 771–777 (2021). https://doi.org/10.1038/s41557-021-00717-y
DOI: https://doi.org/10.1038/s41557-021-00717-y
This article is cited by
- Nano-enhanced solid-state hydrogen storage: Balancing discovery and pragmatism for future energy solutions. Nano Research (2024)
- Direct prediction of gas adsorption via spatial atom interaction learning. Nature Communications (2023)
- Predicting the oxidation states of Mn ions in the oxygen-evolving complex of photosystem II using supervised and unsupervised machine learning. Photosynthesis Research (2023)
- cell2mol: encoding chemistry to interpret crystallographic data. npj Computational Materials (2022)
- A data-science approach to predict the heat capacity of nanoporous materials. Nature Materials (2022)