Machine-learning-based dynamic-importance sampling for adaptive multiscale simulations

Abstract

Multiscale simulations are a well-accepted way to bridge the length and time scales required for scientific studies with the solution accuracy achievable through available computational resources. Traditional approaches either solve a coarse model with selective refinement or coerce a detailed model into faster sampling, both of which have limitations. Here, we present a paradigm of adaptive, multiscale simulations that couple different scales using a dynamic-importance sampling approach. Our method uses machine learning to dynamically and exhaustively sample the phase space explored by a macro model using microscale simulations and enables an automatic feedback from the micro to the macro scale, leading to a self-healing multiscale simulation. As a result, our approach delivers macro length and time scales, but with the effective precision of the micro scale. Our approach is arbitrarily scalable as well as transferable to many different types of simulations. Our method made possible a multiscale scientific campaign of unprecedented scale to understand the interactions of RAS proteins with a plasma membrane in the context of cancer research running over several days on Sierra, which is currently the second-most-powerful supercomputer in the world.
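The selection idea in the abstract can be illustrated with a minimal sketch. It is not the DynIm implementation. It assumes each candidate macro-model configuration has already been encoded as a point in a low-dimensional space, and it scores a candidate by its distance to the nearest configuration already simulated at the microscale, so that the most novel candidate is picked next. The function names `novelty`, `select_next` and `run_sampling`, and the sample points, are hypothetical.

```python
import math

def novelty(candidate, selected):
    """Distance from a candidate to its nearest already-selected point."""
    if not selected:
        return math.inf  # nothing simulated yet: every candidate is novel
    return min(math.dist(candidate, s) for s in selected)

def select_next(candidates, selected):
    """Index of the candidate farthest from all selected points."""
    return max(range(len(candidates)),
               key=lambda i: novelty(candidates[i], selected))

def run_sampling(candidates, budget):
    """Greedily pick `budget` candidates, re-ranking after every pick."""
    pool = list(candidates)
    selected = []
    for _ in range(min(budget, len(pool))):
        idx = select_next(pool, selected)
        selected.append(pool.pop(idx))
    return selected

# Hypothetical encoded configurations: two tight clusters.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
chosen = run_sampling(points, budget=2)
```

With a budget of two, the sketch picks one point from each cluster rather than two near-duplicates, which is the behaviour an importance-driven sampler is meant to favour. A brute-force nearest-neighbour search is used here for clarity; at scale an approximate index would likely be needed.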

Fig. 1: ML-based DynIm sampling framework.
Fig. 2: Patches represent concentrations of lipids on 30 × 30 nm² local neighbourhoods of RAS discretized into 5 × 5 grids.
Fig. 3: Event history of our ML-based DynIm sampling for a period of 14 days of wall-clock time.
Fig. 4: Evaluation and comparison of DynIm sampling.
Fig. 5: The self-healing mechanism enabled by the in situ computation of DynIm weights allows refinement of macro model parameters, such as RAS–lipid RDFs, by appropriately aggregating the results of microscale simulations.
Fig. 6: Convergence of DynIm sampling.
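
The patch layout named in the caption of Fig. 2 can be sketched as a simple binning step. This is an illustration under stated assumptions, not the authors' code: it handles a single lipid species, counts positions instead of computing concentrations, and ignores periodic boundaries. The function `make_patch`, its parameters and the coordinates are hypothetical.

```python
def make_patch(center, lipids, size_nm=30.0, bins=5):
    """Count lipids per cell of a bins x bins grid centred on `center`."""
    half = size_nm / 2.0
    cell = size_nm / bins  # 6 nm per cell for a 30 nm patch and 5 bins
    grid = [[0] * bins for _ in range(bins)]
    for x, y in lipids:
        # Shift so the patch spans [0, size_nm) in both directions.
        dx = x - center[0] + half
        dy = y - center[1] + half
        if 0.0 <= dx < size_nm and 0.0 <= dy < size_nm:
            grid[int(dy // cell)][int(dx // cell)] += 1
    return grid

# Hypothetical positions in nm; the last lipid lies outside the patch.
ras_position = (100.0, 100.0)
lipid_positions = [(100.0, 100.0), (86.0, 86.0), (114.9, 114.9), (200.0, 200.0)]
patch = make_patch(ras_position, lipid_positions)
```

Each row of `patch` is one strip of the 5 × 5 grid; dividing the counts by the cell area would turn them into a number density per nm².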

Data availability

Sample data for DynIm are available in the code repository. The data related to the multiscale simulation described in the paper will be made available upon reasonable request; the raw data total hundreds of terabytes. For more information, see the details of the simulation in ref. 14.

Code availability

The framework for DynIm has been released open source under the MIT license: https://github.com/LLNL/dynim.

References

  1. Ingram, G., Cameron, I. & Hangos, K. Classification and analysis of integrating frameworks in multiscale modelling. Chem. Eng. Sci. 59, 2171–2187 (2004).

  2. Weinan, E. Principles of Multiscale Modeling (Cambridge Univ. Press, 2011).

  3. Hoekstra, A., Chopard, B. & Coveney, P. Multiscale modelling and simulation: a position paper. Phil. Trans. R. Soc. A 372, 20130377 (2014).

  4. Chopard, B., Borgdorff, J. & Hoekstra, A. A framework for multi-scale modelling. Phil. Trans. R. Soc. A 372, 20130378 (2014).

  5. Geweke, J. Bayesian inference in econometric models using Monte Carlo integration. Econometrica 57, 1317–1339 (1989).

  6. MacKay, D. J. C. Information Theory, Inference, and Learning Algorithms (Cambridge Univ. Press, 2003).

  7. Liang, F. Dynamically weighted importance sampling in Monte Carlo computation. J. Am. Stat. Assoc. 97, 807–821 (2002).

  8. Liang, F. & Cheon, S. Monte Carlo dynamically weighted importance sampling for spatial models with intractable normalizing constants. J. Phys. Conf. Ser. 197, 012004 (2009).

  9. Joubert, D. J. & Marwala, T. Monte Carlo dynamically weighted importance sampling for finite element model updating. In Topics in Modal Analysis and Testing (ed. Mains, M.) Vol. 10, 303–312 (Springer, 2016).

  10. Katharopoulos, A. & Fleuret, F. Not all samples are created equal: deep learning with importance sampling. In Proc. 35th Int. Conf. on Machine Learning (eds Dy, J. & Krause, A.) Vol. 80, 2525–2534 (Proceedings of Machine Learning Research, 2018). http://proceedings.mlr.press/v80/katharopoulos18a.html

  11. Johnson, T. B. & Guestrin, C. Training deep models faster with robust, approximate importance sampling. In Advances in Neural Information Processing Systems (eds Bengio, S. et al.) Vol. 31, 7265–7275 (Curran Associates, 2018). http://papers.nips.cc/paper/7957-training-deep-models-faster-with-robust-approximate-importance-sampling.pdf

  12. Simanshu, D. K., Nissley, D. V. & McCormick, F. RAS proteins and their regulators in human disease. Cell 170, 17–33 (2017).

  13. Waters, A. M. & Der, C. J. KRAS: the critical driver and therapeutic target for pancreatic cancer. Cold Spring Harbor Persp. Med. 8, a031435 (2018).

  14. Ingólfsson, H. I. et al. Machine learning-driven multiscale modeling reveals lipid-dependent dynamics of RAS signaling proteins. Preprint at Research Square https://www.researchsquare.com/article/rs-50842/v1 (2020).

  15. Marrink, S. J., Risselada, H. J., Yefimov, S., Tieleman, D. P. & de Vries, A. H. The MARTINI Force Field: coarse grained model for biomolecular simulations. J. Phys. Chem. B 111, 7812–7824 (2007).

  16. Di Natale, F. et al. A massively parallel infrastructure for adaptive multiscale simulations: modeling RAS initiation pathway for cancer. In Supercomputing ’19: Int. Conf. High Performance Computing, Networking, Storage, and Analysis 57 (ACM, 2019).

  17. November 2019 TOP500 The List https://www.top500.org/lists/2019/11/ (2019).

  18. Doersch, C. Tutorial on variational autoencoders. Preprint at https://arxiv.org/abs/1606.05908 (2016).

  19. Jégou, H., Douze, M., Johnson, J. & Hosseini, L. FAISS. GitHub https://github.com/facebookresearch/faiss (2021).

  20. Johnson, J., Douze, M. & Jégou, H. Billion-scale similarity search with GPUs. IEEE Trans. Big Data https://doi.org/10.1109/TBDATA.2019.2921572 (2019).

  21. Zhang, X. et al. ddcMD: a fully GPU-accelerated molecular dynamics program for the Martini Force Field. J. Chem. Phys. 153, 045103 (2020).

  22. Correa, C., Lindstrom, P. & Bremer, P. T. Topological spines: a structure-preserving visual representation of scalar fields. IEEE Trans. Visualiz. Comput. Graph. 17, 1842–1851 (2011).

  23. Liu, S. et al. Scalable topological data analysis and visualization for evaluating data-driven models in scientific applications. IEEE Trans. Visualiz. Comput. Graph. 26, 291–300 (2020).

  24. Korn, F. & Muthukrishnan, S. Influence sets based on reverse nearest neighbor queries. SIGMOD Rec. 29, 201–212 (2000).

  25. Kullback, S. & Leibler, R. A. On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951).

  26. Marconi, U. M. B. & Tarazona, P. Dynamic density functional theory of fluids. J. Chem. Phys. 110, 8032–8044 (1999).

  27. Tonks, M. R., Gaston, D., Millett, P. C., Andrs, D. & Talbot, P. An object-oriented finite element framework for multiphysics phase field simulations. Comput. Mater. Sci. 51, 20–29 (2012).

  28. Streitz, F. H., Glosli, J. N. & Patel, M. V. Beyond finite-size scaling in solidification simulations. Phys. Rev. Lett. 96, 225701 (2006).

  29. Wassenaar, T. A., Ingólfsson, H. I., Böckmann, R. A., Tieleman, D. P. & Marrink, S. J. Computational lipidomics with insane: a versatile tool for generating custom membranes for molecular simulations. J. Chem. Theor. Comput. 11, 2144–2155 (2015).

  30. Abraham, M. J. et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1/2, 19–25 (2015).

  31. Streitz, F. H. et al. 100+ TFlop solidification simulations on BlueGene/L. In Proc. 2005 ACM/IEEE Conf. Supercomputing (SC ’05) http://www.cresco.enea.it/SC05/schedule/pdf/pap307.pdf (ACM, 2005).

  32. Glosli, J. N. et al. Extending stability beyond CPU millennium: a micron-scale atomistic simulation of Kelvin–Helmholtz instability. In Proc. 2007 ACM/IEEE Conf. Supercomputing (SC ’07) 58:1–58:11, https://doi.org/10.1145/1362622.1362700 (ACM, 2007).

  33. Ingólfsson, H. I. et al. Capturing biologically complex tissue-specific membranes at different levels of compositional complexity. J. Phys. Chem. B 124, 7819–7829 (2020).

Acknowledgements

This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) programme established by the US Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health (NIH). For computing time, we thank Livermore Computing (LC) and Livermore Institutional Grand Challenge. This work was performed under the auspices of the US DOE by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 and Los Alamos National Laboratory under contract DE-AC52-06NA25396. Release number: LLNL-JRNL-806073.

Author information

Contributions

The framework was designed by H.B., T.S.C., H.I.I., G.D., B.V.E., J.N.G., P.-T.B. and F.C.L.; the framework was implemented by H.B. Sampling analysis was done by H.B., P.K., S.L., T.O. and C.N. All authors contributed to the writing of the paper.

Corresponding author

Correspondence to Harsh Bhatia.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Shangying Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Fig. 1 and one algorithm.

About this article

Cite this article

Bhatia, H., Carpenter, T.S., Ingólfsson, H.I. et al. Machine-learning-based dynamic-importance sampling for adaptive multiscale simulations. Nat Mach Intell (2021). https://doi.org/10.1038/s42256-021-00327-w
