Machine-learning-based dynamic-importance sampling for adaptive multiscale simulations

Bhatia, Harsh; Carpenter, Timothy S.; Ingólfsson, Helgi I.; Dharuman, Gautham; Karande, Piyush; Liu, Shusen; Oppelstrup, Tomas; Neale, Chris; Lightstone, Felice C.; Van Essen, Brian; Glosli, James N.; Bremer, Peer-Timo

doi:10.1038/s42256-021-00327-w

Article
Published: 22 April 2021

Nature Machine Intelligence volume 3, pages 401–409 (2021)

Abstract

Multiscale simulations are a well-accepted way to bridge the length and time scales required for scientific studies with the solution accuracy achievable through available computational resources. Traditional approaches either solve a coarse model with selective refinement or coerce a detailed model into faster sampling, both of which have limitations. Here, we present a paradigm of adaptive, multiscale simulations that couple different scales using a dynamic-importance sampling approach. Our method uses machine learning to dynamically and exhaustively sample the phase space explored by a macro model using microscale simulations and enables an automatic feedback from the micro to the macro scale, leading to a self-healing multiscale simulation. As a result, our approach delivers macro length and time scales, but with the effective precision of the micro scale. Our approach is arbitrarily scalable as well as transferable to many different types of simulations. Our method made possible a multiscale scientific campaign of unprecedented scale to understand the interactions of RAS proteins with a plasma membrane in the context of cancer research running over several days on Sierra, which is currently the second-most-powerful supercomputer in the world.
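One way to picture importance-driven selection of microscale simulations is a greedy "most novel first" rule. The sketch below is illustrative only and is not taken from the DynIm code: it assumes each candidate configuration has already been encoded as a point in some low-dimensional space, and it assumes that importance means distance to the nearest configuration already simulated. The function name `select_most_novel` and the toy coordinates are hypothetical.

```python
import numpy as np


def select_most_novel(candidates, selected, k):
    """Greedily pick k candidate indices, most novel first.

    Novelty is assumed here to be the Euclidean distance from a candidate
    to its nearest already-selected point (an illustrative definition).
    """
    candidates = np.asarray(candidates, dtype=float)
    selected = np.asarray(selected, dtype=float)

    if selected.size == 0:
        # Nothing simulated yet: every candidate is maximally novel.
        novelty = np.full(len(candidates), np.inf)
    else:
        # Distance from each candidate to its nearest selected point.
        diffs = candidates[:, None, :] - selected[None, :, :]
        novelty = np.linalg.norm(diffs, axis=2).min(axis=1)

    chosen = []
    for _ in range(min(k, len(candidates))):
        idx = int(np.argmax(novelty))
        chosen.append(idx)
        # The new pick reduces the novelty of candidates close to it.
        dist_to_pick = np.linalg.norm(candidates - candidates[idx], axis=1)
        novelty = np.minimum(novelty, dist_to_pick)
        novelty[idx] = -np.inf  # never pick the same candidate twice
    return chosen


# Toy usage: one configuration at the origin has already been simulated.
candidates = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [10.0, 0.0]]
already_simulated = [[0.0, 0.0]]
print(select_most_novel(candidates, already_simulated, k=2))  # prints [3, 2]
```

Updating the novelty scores after every pick keeps the selection spread out: a candidate that sits next to a fresh pick loses priority immediately, so repeated calls tend to cover the explored space rather than oversample its densest regions.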

**Fig. 1: ML-based DynIm sampling framework.**

**Fig. 2: Patches represent concentrations of lipids on 30 × 30 nm² local neighbourhoods of RAS discretized into 5 × 5 grids.**

**Fig. 3: Event history of our ML-based DynIm sampling for a period of 14 days of wall-clock time.**

**Fig. 4: Evaluation and comparison of DynIm sampling.**

**Fig. 5: The self-healing mechanism enabled by the in situ computation of DynIm weights allows refinement of macro model parameters, such as RAS−lipid RDFs, by appropriately aggregating the results of micro scale simulations.**

**Fig. 6: Convergence of DynIm sampling.**

Data availability

Sample data for DynIm is made available along with the code repository. The data related to the multiscale simulation described in the paper will be made available upon reasonable request; the size of all raw data is hundreds of terabytes. For more information, please see details of the simulation¹⁴.

Code availability

The framework for DynIm has been released open source under the MIT license: https://github.com/LLNL/dynim.

References

Ingram, G., Cameron, I. & Hangos, K. Classification and analysis of integrating frameworks in multiscale modelling. Chem. Eng. Sci. 59, 2171–2187 (2004).
Weinan, E. Principles of Multiscale Modeling (Cambridge Univ. Press, 2011).
Hoekstra, A., Chopard, B. & Coveney, P. Multiscale modelling and simulation: a position paper. Phil. Trans. R. Soc. A 372, 20130377 (2014).
Chopard, B., Borgdorff, J. & Hoekstra, A. A framework for multi-scale modelling. Phil. Trans. R. Soc. A 372, 20130378 (2014).
Geweke, J. Bayesian inference in econometric models using Monte Carlo integration. Econometrica 57, 1317–39 (1989).
MacKay, D. J. C. Information Theory, Inference, and Learning Algorithms (Cambridge Univ. Press, 2003).
Liang, F. Dynamically weighted importance sampling in Monte Carlo computation. J. Am. Stat. Assoc. 97, 807–821 (2002).
Liang, F. & Cheon, S. Monte Carlo dynamically weighted importance sampling for spatial models with intractable normalizing constants. J. Phys. Conf. Ser. 197, 012004 (2009).
Joubert, D. J. & Marwala, T. Monte Carlo dynamically weighted importance sampling for finite element model updating. In Topics in Modal Analysis and Testing (ed. Mains, M.) Vol. 10, 303−312 (Springer, 2016).
Katharopoulos, A. & Fleuret, F. Not all samples are created equal: deep learning with importance sampling. In Proc. 35th Int. Conf. on Machine Learning (eds Dy, J. & Krause, A.) Vol. 80, 2525−2534 (Proceedings of Machine Learning Research, 2018). http://proceedings.mlr.press/v80/katharopoulos18a.html
Johnson, T. B. & Guestrin, C. Training deep models faster with robust, approximate importance sampling. In Advances in Neural Information Processing Systems (eds Bengio, S. et al.) Vol. 31, 7265−7275 (Curran Associates, 2018). http://papers.nips.cc/paper/7957-training-deep-models-faster-with-robust-approximate-importance-sampling.pdf
Simanshu, D. K., Nissley, D. V. & McCormick, F. RAS proteins and their regulators in human disease. Cell 170, 17–33 (2017).
Waters, A. M. & Der, C. J. KRAS: the critical driver and therapeutic target for pancreatic cancer. Cold Spring Harbor Persp. Med. 8, a031435 (2018).
Ingólfsson, H. I. et al. Machine learning-driven multiscale modeling reveals lipid-dependent dynamics of RAS signaling proteins. Preprint at Research Square https://www.researchsquare.com/article/rs-50842/v1 (2020).
Marrink, S. J., Risselada, H. J., Yefimov, S., Tieleman, D. P. & de Vries, A. H. The MARTINI Force Field: coarse grained model for biomolecular simulations. J. Phys. Chem. B 111, 7812–7824 (2007).
Di Natale, F. et al. A massively parallel infrastructure for adaptive multiscale simulations: modeling RAS initiation pathway for cancer. In Supercomputing ’19: Int. Conf. High Performance Computing, Networking, Storage, and Analysis 57 (ACM, 2019).
November 2019 TOP500 The List https://www.top500.org/lists/2019/11/ (2019).
Doersch, C. Tutorial on variational autoencoders. Preprint at https://arxiv.org/abs/1606.05908 (2016).
Jégou, H., Douze, M., Johnson, J. & Hosseini, L. FAISS. GitHub https://github.com/facebookresearch/faiss (2021).
Johnson, J., Douze, M. & Jégou, H. Billion-scale similarity search with GPUs. IEEE Trans. Big Data https://doi.org/10.1109/TBDATA.2019.2921572 (2019).
Zhang, X. et al. ddcMD: a fully GPU-accelerated molecular dynamics program for the Martini Force Field. J. Chem. Phys. 153, 045103 (2020).
Correa, C., Lindstrom, P. & Bremer, P. T. Topological spines: a structure-preserving visual representation of scalar fields. IEEE Trans. Visualiz. Comput. Graph. 17, 1842–1851 (2011).
Liu, S. et al. Scalable topological data analysis and visualization for evaluating data-driven models in scientific applications. IEEE Trans. Visualiz. Comput. Graph. 26, 291–300 (2020).
Korn, F. & Muthukrishnan, S. Influence sets based on reverse nearest neighbor queries. SIGMOD Rec. 29, 201–212 (2000).
Kullback, S. & Leibler, R. A. On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951).
Marconi, U. M. B. & Tarazona, P. Dynamic density functional theory of fluids. J. Chem. Phys. 110, 8032–8044 (1999).
Tonks, M. R., Gaston, D., Millett, P. C., Andrs, D. & Talbot, P. An object-oriented finite element framework for multiphysics phase field simulations. Comput. Mater. Sci. 51, 20–29 (2012).
Streitz, F. H., Glosli, J. N. & Patel, M. V. Beyond finite-size scaling in solidification simulations. Phys. Rev. Lett. 96, 225701 (2006).
Wassenaar, T. A., Ingólfsson, H. I., Böckmann, R. A., Tieleman, D. P. & Marrink, S. J. Computational lipidomics with insane: a versatile tool for generating custom membranes for molecular simulations. J. Chem. Theor. Comput. 11, 2144–2155 (2015).
Abraham, M. J. et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1/2, 19–25 (2015).
Streitz, F. H. et al. 100+ TFlop solidification simulations on BlueGene/L. In Proc. 2005 ACM/IEEE Conf. Supercomputing (SC ’05) http://www.cresco.enea.it/SC05/schedule/pdf/pap307.pdf (ACM, 2005).
Glosli, J. N. et al. Extending stability beyond CPU millennium: a micron-scale atomistic simulation of Kelvin−Helmholtz instability. In Proc. 2007 ACM/IEEE Conf. Supercomputing (SC ’07) 58:1–58:11, https://doi.org/10.1145/1362622.1362700 (ACM, 2007).
Ingólfsson, H. I. et al. Capturing biologically complex tissue-specific membranes at different levels of compositional complexity. J. Phys. Chem. B 124, 7819–7829 (2020).

Acknowledgements

This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) programme established by the US Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health (NIH). For computing time, we thank Livermore Computing (LC) and Livermore Institutional Grand Challenge. This work was performed under the auspices of the US DOE by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 and Los Alamos National Laboratory under contract DEAC5206NA25396. Release number: LLNL-JRNL-806073.

Author information

Authors and Affiliations

Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
Harsh Bhatia, Shusen Liu, Brian Van Essen & Peer-Timo Bremer
Physical and Life Sciences, Lawrence Livermore National Laboratory, Livermore, CA, USA
Timothy S. Carpenter, Helgi I. Ingólfsson, Gautham Dharuman, Tomas Oppelstrup, Felice C. Lightstone & James N. Glosli
Computational Engineering, Lawrence Livermore National Laboratory, Livermore, CA, USA
Piyush Karande
Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM, USA
Chris Neale

Contributions

The framework was designed by H.B., T.S.C., H.I.I., G.D., B.V.E., J.N.G., P.-T.B. and F.C.L.; the framework was implemented by H.B. Sampling analysis was done by H.B., P.K., S.L., T.O. and C.N. All authors contributed to the writing of the paper.

Corresponding author

Correspondence to Harsh Bhatia.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Shangying Wang, and the other anonymous reviewer(s), for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Fig. 1 and one algorithm.

About this article

Cite this article

Bhatia, H., Carpenter, T.S., Ingólfsson, H.I. et al. Machine-learning-based dynamic-importance sampling for adaptive multiscale simulations. Nat Mach Intell 3, 401–409 (2021). https://doi.org/10.1038/s42256-021-00327-w

Received: 01 June 2020
Accepted: 26 February 2021
Published: 22 April 2021
Issue Date: May 2021
DOI: https://doi.org/10.1038/s42256-021-00327-w
