Affine transformations accelerate the training of physics-informed neural networks of a one-dimensional consolidation problem

Physics-informed neural networks (PINNs) leverage both data and knowledge about a problem. They provide a nonnumerical pathway to solving partial differential equations by expressing the field solution as an artificial neural network. This approach has been applied successfully to various types of differential equations. A major research focus for PINNs is their application to coupled partial differential equations, where a general breakthrough is still lacking. In coupled equations, the optimization operates under a critical conflict between boundary conditions and the underlying equations, which often requires either many iterations or complex schemes to avoid trivial solutions and to achieve convergence. We provide empirical evidence that bad initial conditioning in PINNs for one-dimensional consolidation problems of porous media can be mitigated by introducing affine transformations after the classical output layer of artificial neural network architectures, effectively accelerating the training process. These affine physics-informed neural networks (AfPINNs) then produce nontrivial and accurate field solutions even in parameter spaces spanning diverging orders of magnitude.
On average, AfPINNs improve the $\mathscr{L}_2$ relative error by $64.84\%$ after 25,000 epochs for a one-dimensional consolidation problem based on Biot's theory, and by an average of $58.80\%$ with a transfer approach to the theory of porous media.


A AfPINNs for Burgers' Equation
To supplement the consolidation problem with the two theories described in the main body of the paper, we also studied the original example with Burgers' equation from the paper by Raissi et al. [1], which is also part of the examples of DeepXDE [2]. The problem is governed by Burgers' equation in one spatial dimension with Dirichlet BCs, with fluid velocity u(x, t) in the spatiotemporal domain given by x ∈ [−1 m, 1 m] and t ∈ [0 s, 1 s]. We adopted the DeepXDE default values for our hyperparameters, but increased the number of collocation points by a factor of 10 to constrain the position of the discontinuity in the solution more sharply and thus avoid a spurious increase of the L_2 relative error through a detrimental choice of collocation points. Hence, we used 80 sample points on the BC, 160 sample points on the IC, and 25,400 collocation points. The network architecture consists of 2 input neurons in the input layer (x and t), 3 hidden layers with 20 neurons each and a nonlinear tanh activation, as well as 1 output neuron (u) with linear activation. We used ADAM optimization [3] for 15,000 epochs with a fixed learning rate of 10^−3 on the full data batch. All weights and biases were initialized using the Glorot normal initializer [4]. A Bayes optimization with 300 calls yielded the parameters of the affine transformation to w_u = 0.
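The architecture described above can be sketched as follows. This is a minimal illustration in PyTorch rather than DeepXDE, and the affine parameter values `w_u` and `b_u` are placeholders, not the Bayes-optimized ones; only the layer sizes, activations, and initializer follow the text.

```python
import torch
import torch.nn as nn

class AfPINN(nn.Module):
    """PINN body with an affine transformation after the output layer.

    Follows the supplementary description: 2 inputs (x, t), 3 hidden
    layers of 20 tanh neurons, 1 linear output neuron (u). The affine
    parameters w_u and b_u mirror the paper's notation; the defaults
    here are illustrative placeholders.
    """

    def __init__(self, w_u: float = 1.0, b_u: float = 0.0):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(2, 20), nn.Tanh(),
            nn.Linear(20, 20), nn.Tanh(),
            nn.Linear(20, 20), nn.Tanh(),
            nn.Linear(20, 1),  # classical linear output layer
        )
        self.w_u = w_u
        self.b_u = b_u

    def forward(self, xt: torch.Tensor) -> torch.Tensor:
        u = self.body(xt)
        # affine transformation applied after the output layer
        return self.w_u * u + self.b_u

def glorot_init(module: nn.Module) -> None:
    """Glorot (Xavier) normal initialization of weights, zero biases."""
    if isinstance(module, nn.Linear):
        nn.init.xavier_normal_(module.weight)
        nn.init.zeros_(module.bias)

model = AfPINN()
model.apply(glorot_init)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

The PDE residual loss itself (via automatic differentiation of `u` with respect to `x` and `t`) is omitted here; in the paper's setup it is assembled by DeepXDE.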

B Influence of Collocation Points for AfPINNs and the One-Dimensional Consolidation Problem
In addition to the results obtained in the main part of this work with 1000 collocation points, we compared the performance of AfPINNs with 100 and 10,000 collocation points. For this, all other hyperparameters were left unchanged, including the affine weights obtained, i.e., w_u = w_p = b_u = b_p = 1e2, as well as the 100 sample points on the BCs and 100 sample points on the ICs, respectively.
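The collocation-point variation and the shared affine weights can be sketched as follows. The domain bounds in `sample_collocation` are illustrative placeholders (the consolidation problem defines its own spatial extent and simulation time); only the affine values w_u = w_p = b_u = b_p = 1e2 come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_collocation(n, x_range=(0.0, 1.0), t_range=(0.0, 1.0)):
    """Uniformly sample n collocation points (x, t) in the
    space-time domain; bounds here are placeholders."""
    x = rng.uniform(*x_range, size=(n, 1))
    t = rng.uniform(*t_range, size=(n, 1))
    return np.hstack([x, t])

def affine_outputs(u_raw, p_raw, w=1e2, b=1e2):
    """Apply the affine transform w*y + b to both field outputs,
    with w_u = w_p = b_u = b_p = 1e2 as stated in the supplement.
    This rescales displacement u and pressure p toward the orders
    of magnitude of the physical solution."""
    return w * u_raw + b, w * p_raw + b

# the three studied sample sizes
for n in (100, 1000, 10000):
    pts = sample_collocation(n)
```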

Figure S.2. Comparison of AfPINNs and vanilla PINNs for Burgers' equation by mean (thick line), along with minimum and maximum values as shaded areas, calculated over 500 runs each, with intermediate values taken every 500 steps for 15,000 total epochs. MSE loss, L_2 relative error, and maximum of the absolute error for fluid speed u are plotted with a logarithmic y-scale. AfPINNs perform better on average and show significantly lower variability than PINNs.

Figure S.3.

Figure S.4. Comparison of AfPINNs and vanilla PINNs for Biot's theory by mean (thick line), along with minimum and maximum values as shaded areas, calculated over 250 runs each, with intermediate values taken every 500 steps for 25,000 total epochs. MSE loss and L_2 relative error are plotted with a logarithmic y-scale for 100, 1000, and 10,000 collocation points, respectively.

Table S.1. Mean (a) and standard deviation (b) of 500 PINN and AfPINN runs for MSE loss, L_2 relative error, and maximum of the absolute error for displacement u and pressure p, every 5000 epochs, as absolute values, for Biot's theory.

Table S.2. Mean and standard deviation of 500 PINN and AfPINN runs for MSE loss, L_2 relative error, and maximum of the absolute error for displacement u and pressure p, every 5000 epochs, as absolute values, for the TPM.