Abstract
Understanding material surfaces and interfaces is vital in applications such as catalysis or electronics. By combining energies from electronic structure with statistical mechanics, ab initio simulations can, in principle, predict the structure of material surfaces as a function of thermodynamic variables. However, accurate energy simulations are prohibitive when coupled to the vast phase space that must be statistically sampled. Here we present a bi-faceted computational loop to predict surface phase diagrams of multicomponent materials that accelerates both the energy scoring and statistical sampling methods. Fast, scalable and data-efficient machine learning interatomic potentials are trained on high-throughput density-functional-theory calculations through closed-loop active learning. Markov chain Monte Carlo sampling in the semigrand canonical ensemble is enabled by using virtual surface sites. The predicted surfaces for GaN(0001), Si(111) and SrTiO3(001) are in agreement with past work and indicate that the proposed strategy can model complex material surfaces and discover previously unreported surface terminations.
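As a rough illustration of the sampling idea in the abstract, the sketch below runs Metropolis Monte Carlo over a fixed set of virtual surface sites in the semigrand canonical ensemble, accepting a move with probability min(1, exp(−Δ(E − Σᵢ μᵢNᵢ)/k_BT)). It is not the VSSR-MC implementation: the energy model `toy_energy`, the binding energies, the chemical potentials and the eight-site surface are all hypothetical stand-ins for a trained interatomic potential and a real slab.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def toy_energy(sites):
    """Hypothetical energy model (eV): a fixed binding energy per adsorbed atom."""
    binding = {"Ga": -3.0, "N": -2.0}  # assumed values, for illustration only
    return sum(binding[s] for s in sites if s is not None)


def omega(sites, mu):
    """Semigrand potential E - sum_i mu_i N_i for one site occupation."""
    return toy_energy(sites) - sum(mu[s] for s in sites if s is not None)


def mc_step(sites, mu, temperature, rng):
    """Propose a new occupant for one virtual site and apply the Metropolis test."""
    i = rng.randrange(len(sites))
    # Symmetric proposal: any occupant (or vacancy) other than the current one.
    choices = [c for c in (None, *mu) if c != sites[i]]
    trial = list(sites)
    trial[i] = rng.choice(choices)
    delta = omega(trial, mu) - omega(sites, mu)
    if delta <= 0 or rng.random() < math.exp(-delta / (K_B * temperature)):
        return trial
    return sites


rng = random.Random(0)
mu = {"Ga": -2.5, "N": -2.5}  # assumed chemical potentials in eV
sites = [None] * 8            # eight empty virtual sites
for _ in range(2000):
    sites = mc_step(sites, mu, temperature=300.0, rng=rng)
```

In this toy setting Ga adsorption lowers the semigrand potential by 0.5 eV per site, so the chain settles on a Ga-covered surface; in practice the energy call would be replaced by a relaxation and evaluation with the learned potential.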
Data availability
The trained models, DFT data and Jupyter notebooks used for data analysis are available on Zenodo at https://doi.org/10.5281/zenodo.7758174 (ref. 72). Source data are provided with this paper.
Code availability
The VSSR-MC algorithm reported in this work is available on GitHub at https://github.com/learningmatter-mit/surface-sampling. The version of code used in this work is available on Zenodo at https://doi.org/10.5281/zenodo.10086398 (ref. 73).
References
Shi, R., Waterhouse, G. I. & Zhang, T. Recent progress in photocatalytic CO2 reduction over perovskite oxides. Solar RRL 1, 1700126 (2017).
Sumaria, V., Nguyen, L., Tao, F. F. & Sautet, P. Atomic-scale mechanism of platinum catalyst restructuring under a pressure of reactant gas. J. Am. Chem. Soc. 145, 392–401 (2023).
Fabbri, E. et al. Dynamic surface self-reconstruction is the key of highly active perovskite nano-electrocatalysts for water splitting. Nat. Mater. 16, 925–931 (2017).
Zhang, Z., Wei, Z., Sautet, P. & Alexandrova, A. N. Hydrogen-induced restructuring of a Cu(100) electrode in electroreduction conditions. J. Am. Chem. Soc. 144, 19284–19293 (2022).
Sha, Z., Shen, Z., Cali, E., Kilner, J. A. & Skinner, S. J. Understanding surface chemical processes in perovskite oxide electrodes. J. Mater. Chem. A 11, 5645–5659 (2023).
Jung, S.-K. et al. Understanding the degradation mechanisms of LiNi0.5Co0.2Mn0.3O2 cathode material in lithium ion batteries. Adv. Energy Mater. 4, 1300787 (2014).
Han, B. et al. From coating to dopant: how the transition metal composition affects alumina coatings on Ni-rich cathodes. ACS Appl. Mater. Interfaces 9, 41291–41302 (2017).
Xu, C. et al. Bulk fatigue induced by surface reconstruction in layered Ni-rich cathodes for Li-ion batteries. Nat. Mater. 20, 84–92 (2021).
Hirata, A., Saiki, K., Koma, A. & Ando, A. Electronic structure of a SrO-terminated SrTiO3(100) surface. Surf. Sci. 319, 267–271 (1994).
Castell, M. R. Scanning tunneling microscopy of reconstructions on the SrTiO3(001) surface. Surf. Sci. 505, 1–13 (2002).
Erdman, N. et al. The structure and chemistry of the TiO2-rich surface of SrTiO3(001). Nature 419, 55–58 (2002).
Heifets, E., Piskunov, S., Kotomin, E. A., Zhukovskii, Y. F. & Ellis, D. E. Electronic structure and thermodynamic stability of double-layered SrTiO3(001) surfaces: ab initio simulations. Phys. Rev. B 75, 115417 (2007).
Li, H., Jiao, Y., Davey, K. & Qiao, S.-Z. Data-driven machine learning for understanding surface structures of heterogeneous catalysts. Angew. Chem. Int. Ed. 135, e202216383 (2023).
Merte, L. R. et al. Structure of an ultrathin oxide on Pt3Sn(111) solved by machine learning enhanced global optimization. Angew. Chem. Int. Ed. 61, e202204244 (2022).
Foiles, S. M., Baskes, M. I. & Daw, M. S. Embedded-atom-method functions for the fcc metals Cu, Ag, Au, Ni, Pd, Pt, and their alloys. Phys. Rev. B 33, 7983–7991 (1986).
Nord, J., Albe, K., Erhart, P. & Nordlund, K. Modelling of compound semiconductors: analytical bond-order potential for gallium, nitrogen and gallium nitride. J. Phys. Condensed Matter 15, 5649 (2003).
Kolpak, A. M., Li, D., Shao, R., Rappe, A. M. & Bonnell, D. A. Evolution of the structure and thermodynamic stability of the BaTiO3(001) surface. Phys. Rev. Lett. 101, 036102 (2008).
Wexler, R. B., Qiu, T. & Rappe, A. M. Automatic prediction of surface phase diagrams using ab initio grand canonical Monte Carlo. J. Phys. Chem. C 123, 2321–2328 (2019).
Zhou, X.-F., Oganov, A. R., Shao, X., Zhu, Q. & Wang, H.-T. Unexpected reconstruction of the α-boron (111) surface. Phys. Rev. Lett. 113, 176101 (2014).
Timmermann, J. et al. IrO2 surface complexions identified through machine learning and surface investigations. Phys. Rev. Lett. 125, 206101 (2020).
Wales, D. J. & Doye, J. P. K. Global optimization by basin-hopping and the lowest energy structures of Lennard–Jones clusters containing up to 110 atoms. J. Phys. Chem. A 101, 5111–5116 (1997).
Panosetti, C., Krautgasser, K., Palagin, D., Reuter, K. & Maurer, R. J. Global materials structure search with chemically motivated coordinates. Nano Lett. 15, 8044–8048 (2015).
Obersteiner, V., Scherbela, M., Hörmann, L., Wegner, D. & Hofmann, O. T. Structure prediction for surface-induced phases of organic monolayers: overcoming the combinatorial bottleneck. Nano Lett. 17, 4453–4460 (2017).
Egger, A. T. et al. Charge transfer into organic thin films: a deeper insight through machine-learning-assisted structure search. Adv. Sci. 7, 2000992 (2020).
Bauer, M. N., Probert, M. I. J. & Panosetti, C. Systematic comparison of genetic algorithm and basin hopping approaches to the global optimization of Si(111) surface reconstructions. J. Phys. Chem. A 126, 3043–3056 (2022).
Wang, Q., Oganov, A. R., Zhu, Q. & Zhou, X.-F. New reconstructions of the (110) surface of rutile TiO2 predicted by an evolutionary method. Phys. Rev. Lett. 113, 266101 (2014).
Schusteritsch, G. & Pickard, C. J. Predicting interface structures: from SrTiO3 to graphene. Phys. Rev. B 90, 035424 (2014).
Meldgaard, S. A., Mortensen, H. L., Jørgensen, M. S. & Hammer, B. Structure prediction of surface reconstructions by deep reinforcement learning. J. Phys. Condensed Matter 32, 404005 (2020).
Hess, F. & Yildiz, B. Polar or not polar? The interplay between reconstruction, Sr enrichment, and reduction at the La0.75Sr0.25MnO3(001) surface. Phys. Rev. Mater. 4, 015801 (2020).
Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).
Axelrod, S. et al. Learning matter: materials design with machine learning and atomistic simulations. Acc. Mater. Res. 3, 343–357 (2022).
Bisbo, M. K. & Hammer, B. Efficient global structure optimization with a machine-learned surrogate model. Phys. Rev. Lett. 124, 086102 (2020).
Bisbo, M. K. & Hammer, B. Global optimization of atomic structure enhanced by machine learning. Phys. Rev. B 105, 245404 (2022).
Timmermann, J. et al. Data-efficient iterative training of Gaussian approximation potentials: application to surface structure determination of rutile IrO2 and RuO2. J. Chem. Phys. 155, 244107 (2021).
Rønne, N. et al. Atomistic structure search using local surrogate model. J. Chem. Phys. 157, 174115 (2022).
Han, Y. et al. Prediction of surface reconstructions using MAGUS. J. Chem. Phys. 158, 174109 (2023).
Xu, J., Xie, W., Han, Y. & Hu, P. Atomistic insights into the oxidation of flat and stepped platinum surfaces using large-scale machine learning potential-based grand-canonical Monte Carlo. ACS Catal. 12, 14812–14824 (2022).
Bernardin, F. E. & Rutledge, G. C. Semi-grand canonical Monte Carlo (SGMC) simulations to interpret experimental data on processed polymer melts and glasses. Macromolecules 40, 4691–4702 (2007).
Damewood, J., Schwalbe-Koda, D. & Gómez-Bombarelli, R. Sampling lattices in semi-grand canonical ensemble with autoregressive machine learning. npj Comput. Mater. 8, 61 (2022).
Carrete, J., Montes-Campos, H., Wanzenböck, R., Heid, E. & Madsen, G. K. H. Deep ensembles vs committees for uncertainty estimation in neural-network force fields: comparison and application to active learning. J. Chem. Phys. 158, 204801 (2023).
Tan, A. R., Urata, S., Goldman, S., Dietschreit, J. C. B. & Gómez-Bombarelli, R. Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles. Preprint at https://arxiv.org/abs/2305.01754 (2023).
Schwalbe-Koda, D., Tan, A. R. & Gómez-Bombarelli, R. Differentiable sampling of molecular geometries with uncertainty-based adversarial attacks. Nat. Commun. 12, 5104 (2021).
Fu, X. et al. Forces are not enough: benchmark and critical evaluation for machine learning force fields with molecular simulations. Transactions on Machine Learning Research https://openreview.net/forum?id=A8pqQipwkt (2023).
Damewood, J. et al. Representations of materials for machine learning. Annu. Rev. Mater. Res. 53, 399–426 (2023).
Stephenson, P. C. L., Radny, M. W. & Smith, P. V. A modified Stillinger–Weber potential for modelling silicon surfaces. Surf. Sci. 366, 177–184 (1996).
Northrup, J. E., Neugebauer, J., Feenstra, R. M. & Smith, A. R. Structure of GaN(0001): the laterally contracted Ga bilayer model. Phys. Rev. B 61, 9932–9935 (2000).
Štich, I., Payne, M. C., King-Smith, R. D., Lin, J.-S. & Clarke, L. J. Ab initio total-energy calculations for extremely large systems: application to the Takayanagi reconstruction of Si(111). Phys. Rev. Lett. 68, 1351–1354 (1992).
Smeu, M., Guo, H., Ji, W. & Wolkow, R. A. Electronic properties of Si(111)-7×7 and related reconstructions: density functional theory calculations. Phys. Rev. B 85, 195315 (2012).
Herger, R. et al. Surface of strontium titanate. Phys. Rev. Lett. 98, 076102 (2007).
Hong, C. et al. Anomalous intense coherent secondary photoemission from a perovskite oxide. Nature 617, 493–498 (2023).
Szot, K. & Speier, W. Surfaces of reduced and oxidized SrTiO3 from atomic force microscopy. Phys. Rev. B 60, 5909–5926 (1999).
Kubo, T. & Nozoye, H. Surface structure of SrTiO3(100). Surf. Sci. 542, 177–191 (2003).
Winter, G. & Gómez-Bombarelli, R. Simulations with machine learning potentials identify the ion conduction mechanism mediating non-Arrhenius behavior in LGPS. J. Phys. Energy 5, 024004 (2023).
Millan, R., Bello-Jurado, E., Moliner, M., Boronat, M. & Gomez-Bombarelli, R. Effect of framework composition and NH3 on the diffusion of Cu+ in Cu-CHA catalysts predicted by machine-learning accelerated molecular dynamics. ACS Cent. Sci. 9, 2044–2056 (2023).
Thompson, A. P. et al. LAMMPS—a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condensed Matter 29, 273002 (2017).
Boes, J. R., Mamun, O., Winther, K. & Bligaard, T. Graph theory approach to high-throughput surface adsorption structure generation. J. Phys. Chem. A 123, 2281–2285 (2019).
Ong, S. P. et al. Python Materials Genomics (pymatgen): a robust, open-source python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).
Momma, K. & Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Crystallogr. 44, 1272–1276 (2011).
Jain, A. et al. The Materials Project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In Proc. 38th International Conference on Machine Learning, Proc. Machine Learning Research Vol. 139 (eds Meila, M. & Zhang, T.) 9377–9388 (PMLR, 2021).
Martinez-Cantin, R., Tee, K. & McCourt, M. Practical Bayesian optimization in the presence of outliers. In Proc. Twenty-First International Conference on Artificial Intelligence and Statistics, Proc. Machine Learning Research Vol. 84 (eds Storkey, A. & Perez-Cruz, F.) 1722–1731 (PMLR, 2018).
Ramachandran, P., Zoph, B. & Le, Q. V. Searching for activation functions. Preprint at https://arxiv.org/abs/1710.05941 (2017).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations, ICLR 2015 (eds Bengio, Y. & LeCun, Y.) (2015).
Gasteiger, J., Giri, S., Margraf, J. T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. Machine Learning for Molecules Workshop, NeurIPS 2020 https://ml4molecules.github.io/papers2020/ML4Molecules_2020_paper_35.pdf (2020).
Reuter, K. & Scheffler, M. Composition, structure, and stability of RuO2(110) as a function of oxygen pressure. Phys. Rev. B 65, 035406 (2001).
Heifets, E., Ho, J. & Merinov, B. Density functional simulation of the BaZrO3(011) surface structure. Phys. Rev. B 75, 155431 (2007).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758–1775 (1999).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Tadmor, E. B., Elliott, R. S., Sethna, J. P., Miller, R. E. & Becker, C. A. The potential of atomistic simulations and the knowledgebase of interatomic models. JOM 63, 17 (2011).
Du, X. Data for: Machine-learning-accelerated simulations to enable automatic surface reconstruction. Zenodo https://doi.org/10.5281/zenodo.7758174 (2023).
Du, X. learningmatter-mit/surface-sampling. Zenodo https://doi.org/10.5281/zenodo.10086398 (2023).
Acknowledgements
We thank G. Winter, J. Peng, N. Frey and M. Liu for helpful discussions. We also appreciate editing by J. Peng and A. Hoffman. X.D. acknowledges support from the National Science Foundation Graduate Research Fellowship under grant no. 2141064. J.K.D. was supported by the Department of Defense through the National Defense Science and Engineering Graduate Fellowship Program. We are grateful for computation time allocated on the MIT SuperCloud cluster, the MIT Engaging cluster and the NERSC Perlmutter cluster. This material is based on work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering. Delivered to the US Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (February 2014). Notwithstanding any copyright notice, US Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the US Government may violate any copyrights that exist in this work.
Author information
Contributions
X.D. implemented the sampling algorithm, performed surface modeling, ran DFT calculations, trained the neural networks and carried out surface stability analysis. J.K.D. assisted with sampling algorithm implementation and provided guidance with surface modeling. J.R.L. provided guidance with surface modeling and ran DFT calculations. R.M. provided guidance with neural network training and active learning. B.Y. provided guidance with the choice of surfaces and surface stability analysis. L.L. supervised the research and contributed to securing funding. R.G.-B. conceived the project, supervised the research and contributed to securing funding. All authors contributed to results discussion and paper writing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Mie Andersen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.
Extended data
Extended Data Fig. 1 Top view of additional GaN(0001) MC-sampled structures.
The surface reconstructions are rotated in comparison with the reference structure from ref. 46 but contain the same hexagonal pattern.
Extended Data Fig. 2 Comparing classical potential and DFT energies of Si(111) sampled surface reconstructions.
a–c, Structures shown were obtained from constant-composition (canonical) VSSR-MC sampling using the SRS modified Stillinger–Weber potential (ref. 45) with 3×3 (a), 5×5 (b) and 7×7 (c) unit cells. The SRS energies were obtained from the depicted structures, whereas the DFT energies came from structures further relaxed at the DFT level. * Further relaxation using DFT resulted in the 3×3 DAS structure.
Extended Data Fig. 3 Correlation plot of force MAE with force s.d. over AL generations.
At each AL generation, an ensemble of just three NFF models was able to estimate force s.d. that correlated strongly with force error. Each individual data point represents a sampled structure. Each blue ‘X’ represents a binned average and a best-fit line is drawn through the binned averages. The binned average is calculated by dividing both the force s.d. and force MAE into equal-sized bins. The average force MAE is then plotted against the median force s.d. for each corresponding bin.
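The binning procedure in this caption can be sketched as follows. This is one possible reading, not the authors' analysis script: it assumes "equal-sized bins" means bins holding equal numbers of structures after sorting by force s.d., and the function name `binned_average`, the bin count and the synthetic data are all illustrative.

```python
import numpy as np


def binned_average(force_sd, force_mae, n_bins=10):
    """Return (median force s.d., mean force MAE) per bin, binning by sorted s.d."""
    force_sd = np.asarray(force_sd, dtype=float)
    force_mae = np.asarray(force_mae, dtype=float)
    order = np.argsort(force_sd)  # sort structures by predicted uncertainty
    sd_bins = np.array_split(force_sd[order], n_bins)
    mae_bins = np.array_split(force_mae[order], n_bins)
    sd_median = np.array([np.median(b) for b in sd_bins])
    mae_mean = np.array([np.mean(b) for b in mae_bins])
    return sd_median, mae_mean


# Synthetic example: the error grows linearly with the ensemble s.d., plus noise.
rng = np.random.default_rng(0)
sd = rng.uniform(0.01, 1.0, size=500)
mae = 2.0 * sd + rng.normal(0.0, 0.05, size=500)

sd_median, mae_mean = binned_average(sd, mae, n_bins=10)
# Best-fit line through the binned averages, as drawn in the figure.
slope, intercept = np.polyfit(sd_median, mae_mean, deg=1)
```

Using the median for the s.d. axis makes each bin's position less sensitive to a few structures with very large predicted uncertainty.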
Extended Data Fig. 4 Force distribution over AL generations.
The majority of high-force structures were added in AL generations 1, 2 and 6, which correspond either to random structures or structures obtained through adversarial attack. The three VSSR-MC AL generations produced structures with low force values, mostly around 50 eV Å⁻¹ or less.
Extended Data Fig. 5 Test performance of the best NFF model.
As described in the main paper, the test data is obtained from VSSR-MC runs using the sixth-generation NFF model.
Extended Data Fig. 6 Strengths and limitations of VSSR-MC.
a,b, Comparison of a limited set of fixed on-lattice sites (a) and denser, algorithmically generated virtual surface sites that can overlap (b). c, Off-lattice reconstructions can be obtained by VSSR-MC discrete sampling at virtual sites followed by continuous relaxation of surface atoms and adsorbates. d, Amorphous reconstructions with many local minima, however, will likely be difficult for VSSR-MC to sample.
Extended Data Fig. 7 Side view of virtual sites for surfaces studied in this work.
a–d, Pymatgen (a) and CatKit (b) virtual sites for GaN(0001) against the contracted Ga monolayer reconstruction, two-layer pymatgen sites for Si(111) against the 5×5 DAS reconstruction (c), and pymatgen virtual sites for SrTiO3(001) against the double-layer TiO2 reconstruction (d). The dashed lines are a guide for the eye.
Extended Data Fig. 8 Visualizations in the latent space.
a, Clustering of VSSR-MC structures in the NFF latent space visualized in the first three principal components. In the VSSR-MC with clustering AL method, the surface from each cluster with the highest force s.d. is selected for DFT evaluation. b, PCA of training data and the dominant terminations (term.) in the latent space of the sixth-generation model.
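The selection rule in panel a (one surface per cluster, chosen by highest force s.d.) can be sketched as below. The clustering step itself is not specified in this caption, so the sketch assumes cluster labels have already been assigned; `select_per_cluster` and the example values are hypothetical.

```python
import numpy as np


def select_per_cluster(labels, force_sd):
    """Map each cluster label to the index of its highest-force-s.d. structure."""
    labels = np.asarray(labels)
    force_sd = np.asarray(force_sd, dtype=float)
    selected = {}
    for cluster in np.unique(labels):
        members = np.flatnonzero(labels == cluster)  # indices of structures in this cluster
        selected[int(cluster)] = int(members[np.argmax(force_sd[members])])
    return selected


# Six sampled structures assigned to three clusters (illustrative values).
labels = [0, 0, 1, 1, 1, 2]
force_sd = [0.10, 0.30, 0.05, 0.20, 0.15, 0.40]
picked = select_per_cluster(labels, force_sd)
print(picked)  # {0: 1, 1: 3, 2: 5}
```

Picking one structure per cluster spreads the DFT budget across distinct regions of the latent space, while the s.d. criterion targets the structure in each region where the ensemble is least certain.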
Supplementary information
Supplementary Information
Supplementary Sections: (1) abbreviations used; and (2) surface stability analysis.
Supplementary Data 1
Comparison of AutoSurfRecon with existing computational methods for surface reconstruction. AutoSurfRecon automatically samples across many surface compositions and configurations while training an accurate NFF for low-cost energy prediction.
Source data
Source Data Fig. 3
Statistical source data: Typical GaN(0001) VSSR-MC run profile.
Source Data Fig. 4
Statistical source data: (b) force error and predicted force s.d. for the sixth-generation model; (c) latent space embedding PCA of surfaces acquired at each AL generation; (d) force and energy predictions of the model at each AL generation on the final test set.
Source Data Fig. 5
Statistical source data: (b) predicted surface free energies for each dominant termination across Sr and O chemical potentials; (c-e) predicted surface free energies of sampled structures at Sr chemical potentials of −10, −7 and −4 eV and O chemical potential of 0 eV.
Source Data Extended Data Fig. 3
Statistical source data: force error and predicted force s.d. over six AL generations.
Source Data Extended Data Fig. 4
Statistical source data: distribution of force magnitudes over six AL generations.
Source Data Extended Data Fig. 5
Statistical source data: predictions of the sixth-generation AL model on final test data.
Source Data Extended Data Fig. 8
Statistical source data: (a) PCA of test data in the latent space of the sixth-generation model; (b) PCA of the sixth-generation training data and dominant terminations in the latent space of the sixth-generation model.
About this article
Cite this article
Du, X., Damewood, J.K., Lunger, J.R. et al. Machine-learning-accelerated simulations to enable automatic surface reconstruction. Nat Comput Sci 3, 1034–1044 (2023). https://doi.org/10.1038/s43588-023-00571-7
DOI: https://doi.org/10.1038/s43588-023-00571-7