Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators

Abstract

It is widely known that neural networks (NNs) are universal approximators of continuous functions. However, a lesser-known but powerful result is that a NN with a single hidden layer can accurately approximate any nonlinear continuous operator. This universal approximation theorem of operators is suggestive of the structure and potential of deep neural networks (DNNs) in learning continuous operators or complex systems from streams of scattered data. Here we extend this theorem to DNNs. We design a new network with small generalization error, the deep operator network (DeepONet), which consists of a DNN for encoding the discrete input function space (branch net) and another DNN for encoding the domain of the output functions (trunk net). We demonstrate that DeepONet can learn various explicit operators, such as integrals and fractional Laplacians, as well as implicit operators that represent deterministic and stochastic differential equations. We study different formulations of the input function space and their effect on the generalization error for 16 diverse applications.
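
The branch–trunk decomposition described in the abstract can be summarized in a few lines of code. Below is a minimal sketch of the unstacked DeepONet forward pass, written here in PyTorch purely for illustration; the number of sensors m, the number of branch/trunk features p and the layer widths are illustrative assumptions, not the paper's settings. The branch net encodes the input function u through its values at m fixed sensor locations, the trunk net encodes a query location y in the domain of the output function, and the two p-dimensional feature vectors are merged by a dot product to approximate G(u)(y).

    import torch
    import torch.nn as nn

    class DeepONet(nn.Module):
        def __init__(self, m: int = 100, p: int = 40, width: int = 40):
            super().__init__()
            # Branch net: encodes u via its values at m fixed sensor locations.
            self.branch = nn.Sequential(nn.Linear(m, width), nn.Tanh(),
                                        nn.Linear(width, p))
            # Trunk net: encodes the query location y in the output domain.
            self.trunk = nn.Sequential(nn.Linear(1, width), nn.Tanh(),
                                       nn.Linear(width, p), nn.Tanh())
            self.bias = nn.Parameter(torch.zeros(1))

        def forward(self, u_sensors: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
            # u_sensors: (batch, m) values u(x_1), ..., u(x_m); y: (batch, 1).
            b = self.branch(u_sensors)   # (batch, p)
            t = self.trunk(y)            # (batch, p)
            # G(u)(y) ≈ sum_k b_k(u) t_k(y) + bias (dot-product merger).
            return (b * t).sum(dim=1, keepdim=True) + self.bias

Because the two encodings are separate, a trained network can be evaluated at arbitrary locations y, independent of the sensor grid used to sample u.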

Fig. 1: Illustrations of the problem set-up and new architectures of DeepONets that lead to good generalization.
Fig. 2: Learning explicit operators using different V spaces and different network architectures.
Fig. 3: Fast learning of implicit operators in a nonlinear pendulum (k = 1 and T = 3).
Fig. 4: Fast learning of implicit operators in a diffusion-reaction system.
Fig. 5: DeepONet prediction for a stochastic ODE.
Fig. 6: DeepONet prediction for a stochastic elliptic equation.
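
As a concrete illustration of one choice of input function space V (compare Fig. 2), the sketch below generates training data for an integral operator, the antiderivative G(u)(y) = ∫_0^y u(t) dt, with input functions u drawn from a mean-zero Gaussian random field with a radial-basis-function covariance kernel. The grid size, length scale and sample count are illustrative assumptions.

    import numpy as np

    def sample_grf(n_funcs, m=100, length_scale=0.2, seed=0):
        # Sample n_funcs functions on [0, 1] from a mean-zero GRF with an RBF kernel.
        rng = np.random.default_rng(seed)
        x = np.linspace(0.0, 1.0, m)
        cov = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / length_scale ** 2)
        # A small diagonal jitter keeps the Cholesky factorization well conditioned.
        L = np.linalg.cholesky(cov + 1e-6 * np.eye(m))
        return x, rng.standard_normal((n_funcs, m)) @ L.T

    def antiderivative(x, u):
        # Cumulative trapezoidal rule gives the target G(u)(y) at every grid point y.
        dx = x[1] - x[0]
        s = np.zeros_like(u)
        s[:, 1:] = np.cumsum(0.5 * (u[:, 1:] + u[:, :-1]) * dx, axis=1)
        return s

    x, u = sample_grf(1000)     # sensor values u(x_1), ..., u(x_m) per function
    s = antiderivative(x, u)    # labels G(u)(y) on the same grid

Triples of this form — input function values at the sensors, a query location and the operator value there — are exactly what the branch and trunk nets consume during training.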

Data availability

All the datasets in the study were generated directly from the code.

Code availability

The code used in the study is publicly available from the GitHub repository at https://github.com/lululxvi/deeponet (ref. 55).

References

  1. Rico-Martinez, R., Krischer, K., Kevrekidis, I. G., Kube, M. C. & Hudson, J. L. Discrete- vs. continuous-time nonlinear signal processing of Cu electrodissolution data. Chem. Eng. Commun. 118, 25–48 (1992).

  2. Rico-Martinez, R., Anderson, J. S. & Kevrekidis, I. G. Continuous-time nonlinear signal processing: a neural network based approach for gray box identification. In Proc. IEEE Workshop on Neural Networks for Signal Processing 596–605 (IEEE, 1994).

  3. González-García, R., Rico-Martínez, R. & Kevrekidis, I. G. Identification of distributed parameter systems: a neural net based approach. Comput. Chem. Eng. 22, S965–S968 (1998).

  4. Psichogios, D. C. & Ungar, L. H. A hybrid neural network-first principles approach to process modeling. AIChE J. 38, 1499–1511 (1992).

  5. Kevrekidis, I. G. et al. Equation-free, coarse-grained multiscale computation: enabling microscopic simulators to perform system-level analysis. Commun. Math. Sci. 1, 715–762 (2003).

  6. Weinan, E. Principles of Multiscale Modeling (Cambridge Univ. Press, 2011).

  7. Ferrandis, J., Triantafyllou, M., Chryssostomidis, C. & Karniadakis, G. Learning functionals via LSTM neural networks for predicting vessel dynamics in extreme sea states. Preprint at https://arxiv.org/pdf/1912.13382.pdf (2019).

  8. Qin, T., Chen, Z., Jakeman, J. & Xiu, D. Deep learning of parameterized equations with applications to uncertainty quantification. Preprint at https://arxiv.org/pdf/1910.07096.pdf (2020).

  9. Chen, T. Q., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems 6571–6583 (NIPS, 2018).

  10. Jia, J. & Benson, A. R. Neural jump stochastic differential equations. Preprint at https://arxiv.org/pdf/1905.10403.pdf (2019).

  11. Greydanus, S., Dzamba, M. & Yosinski, J. Hamiltonian neural networks. In Advances in Neural Information Processing Systems 15379–15389 (NIPS, 2019).

  12. Toth, P. et al. Hamiltonian generative networks. Preprint at https://arxiv.org/pdf/1909.13789.pdf (2019).

  13. Zhong, Y. D., Dey, B. & Chakraborty, A. Symplectic ODE-Net: learning Hamiltonian dynamics with control. Preprint at https://arxiv.org/pdf/1909.12077.pdf (2019).

  14. Chen, Z., Zhang, J., Arjovsky, M. & Bottou, L. Symplectic recurrent neural networks. Preprint at https://arxiv.org/pdf/1909.13334.pdf (2019).

  15. Winovich, N., Ramani, K. & Lin, G. ConvPDE-UQ: convolutional neural networks with quantified uncertainty for heterogeneous elliptic partial differential equations on varied domains. J. Comput. Phys. 394, 263–279 (2019).

  16. Zhu, Y., Zabaras, N., Koutsourelakis, P.-S. & Perdikaris, P. Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J. Comput. Phys. 394, 56–81 (2019).

  17. Trask, N., Patel, R. G., Gross, B. J. & Atzberger, P. J. GMLS-Nets: a framework for learning from unstructured data. Preprint at https://arxiv.org/pdf/1909.05371.pdf (2019).

  18. Li, Z. et al. Neural operator: graph kernel network for partial differential equations. Preprint at https://arxiv.org/pdf/2003.03485.pdf (2020).

  19. Rudy, S. H., Brunton, S. L., Proctor, J. L. & Kutz, J. N. Data-driven discovery of partial differential equations. Sci. Adv. 3, e1602614 (2017).

  20. Zhang, D., Lu, L., Guo, L. & Karniadakis, G. E. Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems. J. Comput. Phys. 397, 108850 (2019).

  21. Pang, G., Lu, L. & Karniadakis, G. E. fPINNs: fractional physics-informed neural networks. SIAM J. Sci. Comput. 41, A2603–A2626 (2019).

  22. Lu, L., Meng, X., Mao, Z. & Karniadakis, G. E. DeepXDE: a deep learning library for solving differential equations. SIAM Rev. 63, 208–228 (2021).

  23. Yazdani, A., Lu, L., Raissi, M. & Karniadakis, G. E. Systems biology informed deep learning for inferring parameters and hidden dynamics. PLoS Comput. Biol. 16, e1007575 (2020).

  24. Chen, Y., Lu, L., Karniadakis, G. E. & Negro, L. D. Physics-informed neural networks for inverse problems in nano-optics and metamaterials. Opt. Express 28, 11618–11633 (2020).

  25. Holl, P., Koltun, V. & Thuerey, N. Learning to control PDEs with differentiable physics. Preprint at https://arxiv.org/pdf/2001.07457.pdf (2020).

  26. Lample, G. & Charton, F. Deep learning for symbolic mathematics. Preprint at https://arxiv.org/pdf/1912.01412.pdf (2019).

  27. Charton, F., Hayat, A. & Lample, G. Deep differential system stability—learning advanced computations from examples. Preprint at https://arxiv.org/pdf/2006.06462.pdf (2020).

  28. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2, 303–314 (1989).

  29. Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Networks 2, 359–366 (1989).

  30. Chen, T. & Chen, H. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Trans. Neural Networks 6, 911–917 (1995).

  31. Chen, T. & Chen, H. Approximations of continuous functionals by neural networks with application to dynamic systems. IEEE Trans. Neural Networks 4, 910–918 (1993).

  32. Mhaskar, H. N. & Hahm, N. Neural networks for functional approximation and system identification. Neural Comput. 9, 143–159 (1997).

  33. Rossi, F. & Conan-Guez, B. Functional multi-layer perceptron: a non-linear tool for functional data analysis. Neural Networks 18, 45–60 (2005).

  34. Chen, T. & Chen, H. Approximation capability to functions of several variables, nonlinear functionals, and operators by radial basis function neural networks. IEEE Trans. Neural Networks 6, 904–910 (1995).

  35. Brown, T. B. et al. Language models are few-shot learners. Preprint at https://arxiv.org/pdf/2005.14165.pdf (2020).

  36. Lu, L., Su, Y. & Karniadakis, G. E. Collapse of deep and narrow neural nets. Preprint at https://arxiv.org/pdf/1808.04947.pdf (2018).

  37. Jin, P., Lu, L., Tang, Y. & Karniadakis, G. E. Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness. Neural Networks 130, 85–99 (2020).

  38. Lu, L., Shin, Y., Su, Y. & Karniadakis, G. E. Dying ReLU and initialization: theory and numerical examples. Commun. Comput. Phys. 28, 1671–1706 (2020).

  39. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).

  40. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 5998–6008 (NIPS, 2017).

  41. Dumoulin, V. et al. Feature-wise transformations. Distill https://distill.pub/2018/feature-wise-transformations (2018).

  42. Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 3104–3112 (NIPS, 2014).

  43. Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. Preprint at https://arxiv.org/pdf/1409.0473.pdf (2014).

  44. Britz, D., Goldie, A., Luong, M. & Le, Q. Massive exploration of neural machine translation architectures. Preprint at https://arxiv.org/pdf/1703.03906.pdf (2017).

  45. Gelbrich, M. On a formula for the L2 Wasserstein metric between measures on Euclidean and Hilbert spaces. Math. Nachr. 147, 185–203 (1990).

  46. Podlubny, I. Fractional Differential Equations: An Introduction to Fractional Derivatives, Fractional Differential Equations, to Methods of their Solution and Some of their Applications (Elsevier, 1998).

  47. Zayernouri, M. & Karniadakis, G. E. Fractional Sturm–Liouville Eigen-problems: theory and numerical approximation. J. Comput. Phys. 252, 495–517 (2013).

  48. Lischke, A. et al. What is the fractional Laplacian? A comparative review with new results. J. Comput. Phys. 404, 109009 (2020).

  49. Born, M. & Wolf, E. Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light (Elsevier, 2013).

  50. Mitzenmacher, M. & Upfal, E. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis (Cambridge Univ. Press, 2017).

  51. Shwartz-Ziv, R. & Tishby, N. Opening the black box of deep neural networks via information. Preprint at https://arxiv.org/pdf/1703.00810.pdf (2017).

  52. Cai, S., Wang, Z., Lu, L., Zaki, T. A. & Karniadakis, G. E. DeepM&Mnet: inferring the electroconvection multiphysics fields based on operator approximation by neural networks. Preprint at https://arxiv.org/pdf/2009.12935.pdf (2020).

  53. Tai, K. S., Bailis, P. & Valiant, G. Equivariant transformer networks. Preprint at https://arxiv.org/pdf/1901.11399.pdf (2019).

  54. Hanin, B. Universal function approximation by deep neural nets with bounded width and ReLU activations. Preprint at https://arxiv.org/pdf/1708.02691.pdf (2017).

  55. Lu, L. DeepONet https://doi.org/10.5281/zenodo.4319385 (13 December 2020).

Acknowledgements

This work was supported by the DOE PhILMs project (no. DE-SC0019453) and DARPA-CompMods grant no. HR00112090062.

Author information

Authors and Affiliations

Authors

Contributions

L.L. and G.E.K. designed the study based on G.E.K.’s original idea. L.L. developed DeepONet architectures. L.L., P.J. and Z.Z. developed the theory. L.L. performed the experiments for the integral, nonlinear ODE, gravity pendulum and stochastic ODE/PDE operators. L.L. and P.J. performed the experiments for the Legendre transform, diffusion-reaction, advection and advection-diffusion PDEs. G.P. performed the experiments for fractional operators. L.L., P.J., G.P., Z.Z. and G.E.K. wrote the manuscript. G.E.K. supervised the project.

Corresponding author

Correspondence to George Em Karniadakis.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Irina Higgins, Jian-Xun Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Lu, L., Jin, P., Pang, G. et al. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell. 3, 218–229 (2021). https://doi.org/10.1038/s42256-021-00302-5
