Introduction

Partial differential equations (PDEs), especially second-order PDEs, are used extensively in physics, engineering, finance, and other fields. Some simple PDEs can be solved analytically, but most complex PDEs must be solved numerically; such problems are usually divided into forward and backward (inverse) problems. Common numerical methods include finite difference1,2,3,4,5,6, finite element7,8,9,10,11,12,13, and Lagrange multiplier14,15,16,17,18 methods. These conventional methods have been applied extensively to forward PDEs in many practical problems. The deep neural network (DNN), however, offers an alternative approach to complex nonlinear PDEs that avoids the domain discretization required by numerical methods and is therefore suitable for both forward and backward problems19,20,21,22,23.

DNNs have shown major advances in solving PDEs and have attracted increasing attention across research areas owing to their universal approximation capability24,25,26,27. However, training DNN-based PDE solvers usually requires a large amount of labeled data, which is often unavailable in physical applications. To overcome this limitation, researchers proposed the physics-informed neural network (PINN), in which loss functions are constructed from the PDE residuals so that the reliance on labeled training data is greatly reduced20,28,29,30,31. Residual design also plays an important role in network approximation, and residual DNNs are a highly effective type of neural network32,33,34. A PINN can encode the underlying physical laws, and the differential operators in the governing PDEs can be approximated by automatic differentiation35,36. With these advantages, PINNs have been applied extensively to complex PDEs in various application areas in recent years37,38,39,40,41,42,43,44,45,46,47. On the one hand, a growing number of studies seek to build improved PINN models by combining them with other methods. For example, Karniadakis et al. introduced the first systematic PINN model and presented a series of PINN variants, including Bayesian PINN48, fractional PINN49, extended PINN50, parareal PINN51, non-Newtonian PINN52, hp-variational PINN53, and nonlocal PINN54. These extended PINN models approximate various forward PDEs, linear or nonlinear, in application areas ranging from engineering to finance. On the other hand, PINN models have also been extended to backward problems, such as advection-dispersion equations55, stochastic problems56, flow problems57, and conservation laws58. In these backward problems, training data are fed into the DNNs, and the unknown parameters in the PDEs are identified by constructing PINN loss functions.

In DNN-based solvers, data sampling is another important factor in solving second-order or higher-order PDEs21,59,60,61. Latin hypercube sampling (LHS) filters out the variance associated with the additive components of a transformation and is a powerful sampling method for data analysis in nearly every field of computational science, engineering, and mathematics62,63,64,65. In LHS, the sample space is divided into a series of subspaces that are randomly paired, and the algorithm iterates to determine optimal pairings according to specified criteria. To solve PDEs with neural networks, data sets can be sampled by LHS28 or by simple random sampling (RS)66,67. LHS and RS are usually employed to solve PDEs in regular domains, although sampling methods based on domain decomposition or irregular domains have also been developed50,68,69.

In this study, we developed an improved LHS method, named GLHS, in which LHS and Cartesian grid sampling (GS) are merged and optimized to handle data sets under the periodic boundary condition that commonly appears in the theory and simulation of polymer chains in bulk. In polymer physics, especially in self-consistent field theory (SCFT), the modified diffusion equation (MDE), a parabolic second-order PDE, is the key equation for Gaussian and wormlike chains70,71,72. Several classical numerical methods have been developed to solve MDEs in SCFT and have been very successful in reproducing the microstructures and properties of self-assembled polymer chains73,74,75,76,77,78. Recently, Chen et al. trained a traditional DNN to solve the MDE in diblock copolymer systems and the static Schrödinger equation in quantum systems and analyzed the efficiency of the solver67,79. In this work, we developed a PINN with residual units, combined with GLHS, to solve the forward and backward MDEs used in polymer physics.

Several important issues are addressed in the current study. In the following section, we describe the PINN with residual units and the mixed sampling method. We then solve the forward and backward problems in MDEs as examples to examine the PINN solver based on mixed data sampling, optimizing the parameters of both the PINN and the sampling. We also compare the PINN results with numerical results and with a traditional NN to verify the accuracy and efficiency of the PINN. A summary is presented in the final section.

Neural network and data sampling

PINN with residual units

We describe the PINN with residual units in Fig. 1. To address the convergence problems caused by vanishing gradients or network degradation in traditional neural networks80,81, we apply a neural network based on residual units to solve second-order PDEs. A residual unit is shown in Fig. 1a, where the input layer (IL) is a neural network layer with weights, biases, and an activation function. For this layer, the output for an input tensor \(\textbf{X}^{i-1}\) is

$$\begin{aligned} {\textbf{I}}{\textbf{L}}(\textbf{X}^{i-1}) = \sigma (\textbf{W}^{i}\textbf{X}^{i-1} + \textbf{b}^{i}), \end{aligned}$$
(1)

where \(\textbf{W}^{i} \in \mathbb {R}^{n_{i-1} \times n_i}\) is the weight matrix of the layer, \(\textbf{b}^{i} \in \mathbb {R}^{n_{i}}\) is the bias vector, and \(n_i\) denotes the width of the current layer, that is, its number of neurons. The nonlinear activation function \(\sigma (\cdot )\) is the key to the universal approximation capability of the neural network; common choices are sigmoid and tanh. Here, we choose tanh as the activation function22,66, i.e.,

$$\begin{aligned} \sigma (\textbf{X}) = \tanh (\textbf{X}) = \frac{e^{\textbf{X}} - e^{-\textbf{X}}}{e^{\textbf{X}} + e^{-\textbf{X}}}. \end{aligned}$$
(2)

The output layer (OL) is an ordinary network layer with weights and biases. For input tensor \(\textbf{X}^{i-1}\), its output can be expressed as

$$\begin{aligned} {\textbf{O}}{\textbf{L}}(\textbf{X}^{i-1}) = \textbf{W}^{i}\textbf{X}^{i-1} + \textbf{b}^{i}. \end{aligned}$$
(3)
Figure 1
figure 1

PINN with residual units. (a) The composition of a residual unit; (b) the PINN based on residual units; (c) the forward solution process with physical constraints on the partial derivatives; (d) the backward solution process with physical constraints on the partial derivatives.

The residual unit is constructed by combining IL and OL, as illustrated in Fig. 1a. First, input tensor \(\textbf{X}^{i-1}\) in the residual unit goes through IL and OL to obtain

$$\begin{aligned} f(\textbf{X}^{i-1}) = \textbf{W}^{i+1}\sigma (\textbf{W}^{i}\textbf{X}^{i-1}+\textbf{b}^{i})+\textbf{b}^{i+1}. \end{aligned}$$
(4)

Second, input tensor \(\textbf{X}^{i-1}\) is passed through an identity skip connection, added to \(f(\textbf{X}^{i-1})\), and the sum is passed through the tanh activation function, namely,

$$\begin{aligned} g(\textbf{X}^{i-1}) = \sigma (f(\textbf{X}^{i-1}) + \textbf{X}^{i-1}). \end{aligned}$$
(5)

This residual design transmits the input data directly from a lower network layer to a higher one, in contrast to the plain stacking of two neural network layers. The residual unit therefore eases the optimization of the neural network and mitigates network degradation to a certain extent.
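As a concrete illustration, the residual unit and the stacked network of Fig. 1a,b can be sketched in PyTorch as follows. This is a minimal sketch of Eqs. (1)-(5); the class and variable names (ResidualUnit, ResPINN) are ours and do not come from the original implementation.

import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    # One residual unit: IL (linear + tanh) followed by OL (linear), plus an identity skip, Eqs. (1)-(5).
    def __init__(self, width: int):
        super().__init__()
        self.il = nn.Linear(width, width)   # W^i, b^i, Eq. (1)
        self.ol = nn.Linear(width, width)   # W^{i+1}, b^{i+1}, Eq. (3)

    def forward(self, x):
        f = self.ol(torch.tanh(self.il(x)))   # f(X^{i-1}), Eq. (4)
        return torch.tanh(f + x)              # g(X^{i-1}) = sigma(f + X^{i-1}), Eq. (5)

class ResPINN(nn.Module):
    # Input layer -> D residual units -> linear output layer (Fig. 1b).
    def __init__(self, in_dim=2, width=20, depth=6, out_dim=1):
        super().__init__()
        self.inp = nn.Linear(in_dim, width)
        self.units = nn.ModuleList([ResidualUnit(width) for _ in range(depth)])
        self.out = nn.Linear(width, out_dim)

    def forward(self, x):
        h = torch.tanh(self.inp(x))
        for unit in self.units:
            h = unit(h)
        return self.out(h)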

Parameterized parabolic-type PDEs, such as MDE, can be expressed in a general form as follows:

$$\begin{aligned} u_t + N[u(\textbf{r}, t);\lambda ] = 0, \textbf{r} \in \Omega , t \in [0, T], \end{aligned}$$
(6)

where \(u(\textbf{r}, t)\) denotes the solution with respect to the time and space variables, \(N[u(\textbf{r}, t);\lambda ]\) represents the differential operator parameterized by \(\lambda \), \(\Omega \) is the spatial definition domain, and \(u_t\) denotes the partial derivative with respect to time t.

In accordance with the universal approximation property of neural networks19,82, the solution \(u(\textbf{r}, t)\) of the PDE can be equivalently represented by a DNN. In the current study, we use a PINN with residual units, rather than a common neural network, to solve the PDE, as shown in Fig. 1b. In this PINN, the input variables enter the input layer and pass through several residual units (Fig. 1a), the solutions are assessed by physics-informed loss functions, and the solution \(u(\textbf{r}, t)\) is obtained at the output layer.

Figure 1c and d illustrate how the partial derivatives with physical constraints are computed during the solution process, using the chain rule through automatic differentiation35. In the PINNs, the forward and backward solutions of the PDEs are learned by optimizing the physics-informed loss functions. The forward solution process is shown in Fig. 1c. The physical constraint of the differential equation can be defined as

$$\begin{aligned} \Gamma _e = u_t + N[u(\textbf{r}, t);\lambda ]. \end{aligned}$$
(7)

The network learns a solution of the differential equation by driving \(\Gamma _e \rightarrow 0\) while simultaneously satisfying the periodic boundary condition (PBC) constraint \(\Gamma _p \rightarrow 0\) and the initial condition constraint \(\Gamma _i \rightarrow 0\). We can then define the total loss function as

$$\begin{aligned} J_0(\theta _0) = \Gamma _e + \Gamma _p + \Gamma _i, \end{aligned}$$
(8)

where \(\theta _0\) denotes the set of parameters appearing in \(\Gamma _e\), \(\Gamma _p\), and \(\Gamma _i\). The forward learning task stops when \(J_0(\theta _0) \rightarrow 0\). For the backward problem, as shown in Fig. 1d, the network must satisfy \(\Gamma _e \rightarrow 0\) together with the data constraint \(\Gamma _b \rightarrow 0\), which is described in detail in the section on solving the MDE. During backpropagation, the gradient with respect to a lower-layer output can be expanded as

$$\begin{aligned} \frac{\partial {J_1}}{\partial {\textbf{X}^{i-1}}} = \frac{\partial {J_1}}{\partial {g}}\frac{\partial {g}}{\partial {\textbf{X}^{i-1}}} + \frac{\partial {J_1}}{\partial {g}}\frac{\partial {g}}{\partial {f}}\frac{\partial {f}}{\partial {\textbf{X}^{i-1}}}. \end{aligned}$$
(9)

Equation (9) shows that the gradient with respect to the lower-layer output can be decomposed into two terms. The first term shows that the error signal can be transmitted directly to the lower layers without passing through any intermediate weight matrix, which alleviates gradient vanishing to a certain extent: the gradient does not vanish even if the weights of the intermediate layers are small. The residual unit, which has been applied successfully to image recognition32,34, thus provides an efficient tool for solving the backward problems in PDEs in the current study.
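For reference, the derivatives entering \(\Gamma _e\) in Eq. (7) can be obtained by automatic differentiation as sketched below. The helper assumes column tensors x and t of shape (N, 1) and a network mapping (x, t) to u; it is our illustration rather than the authors' code.

import torch

def derivatives(model, x, t):
    # Partial derivatives of the network output u(x, t) via automatic differentiation.
    # x and t are column tensors of shape (N, 1); returns u, u_t, u_x, u_xx at these points.
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = model(torch.cat([x, t], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u, u_t, u_x, u_xx

Combining \(\Gamma _e\), \(\Gamma _p\), and \(\Gamma _i\) evaluated at their respective point sets then gives \(J_0\) of Eq. (8).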

Mixed GS and LHS method

The basic requirement in solving PDEs with a neural network is to produce results that satisfy the physical conditions in the differential equations at the data points fed to the network from the defined domain. Selecting appropriate data points for training the DNN is therefore crucial. In the current work, we adopt a mixed sampling method (GLHS), i.e., a mixture of Cartesian GS and LHS, in the PINN solver.

Figure 2
figure 2

The data sampling distribution of GLHS, illustrated with a total of 20 data points in 2 dimensions, each dimension defined on [0,1]. The blue dots represent the Cartesian grid data points, and the crossed dots denote the LHS data points. (a) The data sampling in each dimension; (b) the overall data distribution after mixing.

Figure 3
figure 3

Comparison of three types of sampling. The blue Y-axis on the left gives the number of data points \(N_{1,ij}\) in each of the 100 equal square data collection areas of \(\Omega _1=[0,1;0,1]\), and the purple Y-axis on the right gives the number of data points \(N_{2,ij}\) in each of the 100 data collection areas of \(\Omega _2=[0,0.1;0,0.1]\). The box spans the 25%-75% range of the \(N_{1,ij}\) and \(N_{2,ij}\) distributions, the solid line in the box is the median, and the line caps mark the maximum and minimum values. The red dotted lines mark the ideal numbers of data points per area, 20,000 and 200, respectively.

For the GLHS method, we assume that the number of data points in Cartesian GS is \(\alpha {N}\), where N is the total number of data points and \(\alpha \in [0,1]\) is a proportionality coefficient. All the data lie in the domain \(\mathbb {R}^{\Omega }\). We allocate the \(\alpha {N}\) data points to the grid points of an n-dimensional Cartesian grid with INT(\(\root n \of {\alpha {N}}\)) grid points in each dimension, where INT(\(\cdots \)) denotes the integer part of a number. Thus, a total of \(N_G\) = [INT\((\root n \of {\alpha {N}})]^{n}\) data points are placed on the grid in an n-dimensional space, one data point per grid point. The remaining \(N_L = N - [\)INT(\(\root n \of {\alpha {N}})]^{n}\) data points are sampled by the LHS method, where the \(N_L\) data points are assigned to \(N_L\) equal subdomains for random sampling, with each subdomain contributing one LHS data point62,64. To illustrate the GLHS method, we show a simple example in Fig. 2, with \(\alpha =0.5\), \(N=20\), and \(n=2\). For simplicity, we use the definition domain \(\Omega _1=[0, 1;0, 1]\) (other examples are given in Fig. S1 of Supplementary information). The blue dots represent the Cartesian grid data points, and the crossed dots denote the LHS data points. In each dimension there are 3 Cartesian grid points and 11 LHS points; for the latter, the dimension is divided into 11 subdomains and one data point is randomly sampled in each subdomain, as shown in Fig. 2a. We expect the random data to be evenly distributed over the definition domain, so a linear cumulative density function, CDF\((x)=\Omega ^{-1}{x}\), is used in each dimension for LHS. The two-dimensional grid points are then obtained as the Cartesian product, giving a total of 9 data points on the Cartesian grid, while the 11 LHS data points are randomly paired across the two dimensions of \(\Omega _1=[0, 1;0, 1]\), as shown in Fig. 2b. Finally, the Cartesian grid and LHS data are randomly mixed to obtain the GLHS data used as input for the PINN.
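The sampling procedure described above can be sketched in NumPy as follows. The implementation assumes the unit hypercube \([0,1]^n\) and a uniform CDF; the function name glhs and its internal details are our illustrative choices.

import numpy as np

def glhs(N, n, alpha, rng=None):
    # Mixed Cartesian-grid / Latin-hypercube sampling on the unit hypercube [0, 1]^n.
    # A fraction alpha of the N points is placed on a Cartesian grid with
    # m = INT((alpha*N)^(1/n)) points per dimension; the remaining points use LHS.
    rng = np.random.default_rng() if rng is None else rng
    m = int((alpha * N) ** (1.0 / n))
    if m >= 1:
        axes = [np.linspace(0.0, 1.0, m) for _ in range(n)]
        grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, n)
    else:
        grid = np.empty((0, n))
    N_L = N - grid.shape[0]
    if N_L > 0:
        # LHS: split each dimension into N_L equal subintervals, draw one point per
        # subinterval, then randomly pair the dimensions by permuting each column.
        lhs = (np.arange(N_L)[:, None] + rng.random((N_L, n))) / N_L
        for j in range(n):
            lhs[:, j] = lhs[rng.permutation(N_L), j]
    else:
        lhs = np.empty((0, n))
    pts = np.vstack([grid, lhs])
    rng.shuffle(pts)   # randomly mix the grid and LHS points
    return pts

# Example corresponding to Fig. 2: N = 20, n = 2, alpha = 0.5 gives 9 grid + 11 LHS points.
samples = glhs(20, 2, 0.5)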

We present an example illustrating the advantages of the GLHS method in Fig. 3, where three types of sampling, namely, RS, LHS, and GLHS, are compared. For each sampling type, two two-dimensional definition domains, \(\Omega _1=[0, 1;0, 1]\) and \(\Omega _2=[0, 0.1;0, 0.1]\), are used as examples, into which 2,000,000 and 20,000 data points, respectively, are randomly imported. To examine the uniformity of the data distribution in \(\Omega _1\) and \(\Omega _2\), we divide each definition domain into 100 equal square data collection areas and denote the number of data points in each area by \(N_{1,ij}\) and \(N_{2,ij}\), where the subscripts i and j label the ij-th collection area in the two dimensions. More detailed data distributions in the two-dimensional definition domains are plotted in Fig. S2 of Supplementary information. Thus, 100 data collection areas are obtained in each definition domain for RS, LHS, and GLHS. We sort \(N_{1,ij}\) and \(N_{2,ij}\) by value and mark the values in the middle 25–75% of the sequence with a dotted box, as shown in Fig. 3a–c. Among the three sampling types, RS has the largest box and GLHS the smallest in the \(\Omega _1\) domain, indicating that GLHS has the most uniform data distribution over the definition domain. For the \(\Omega _2\) domain, GLHS also has the most uniform distribution, although the values in the middle of the sequence lie below 200, the average number of data points per collection area in \(\Omega _2\); this is probably because the amount of input data in \(\Omega _2\) is not large enough. Recently, simple LHS with point transformation was used to increase the uniformity of data distributions83. In this study, we observe that GLHS has a clear advantage over RS and LHS, even though RS and LHS have similar sampling procedures in uncertainty analysis62.

Solving for forward and backward MDE

In polymer physics, the MDE is the key equation of self-consistent field theory and has been solved numerically by many methods70,71. We adopt the MDE as an example to illustrate the use of the PINN based on GLHS for solving forward and backward problems. First, we present the general form of the forward and backward MDE. Second, we discuss the efficiency of GLHS and of the PBC loss functions in solving the MDE. Lastly, we discuss the forward and backward MDE solutions obtained with the PINN based on GLHS.

Problem setup

As an example, we take a simple form of forward and backward MDE with the initial conditions and PBCs, which can be expressed as

$$\begin{aligned} \frac{\partial }{\partial {t}}u(x, t) = \frac{\partial ^{2}}{\partial {x}^{2}}u(x, t) - \lambda \sin (2\pi {x})u(x,t), \end{aligned}$$
(10)

where the initial condition is \(u(x,0) = 1\) and the periodic boundary condition is \(u(0,t) = u(L,t)\). Here, only one spatial dimension x and the time t are considered, and L is the period in the x dimension. When \(\lambda \) is a given parameter, such as \(\lambda = 7\) used in this work, the problem is a forward problem; when \(\lambda \) is unknown, solving the equation becomes a backward problem. The MDE is a linear, second-order, parabolic-type PDE whose forward problem has been addressed by DNNs28,79; however, the backward problem in the MDE remains poorly understood. Here, we use the PINN based on GLHS to solve both the forward and backward problems in the MDE.
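A minimal sketch of the forward loss \(J_0 = \Gamma _e + \Gamma _p + \Gamma _i\) for Eq. (10) is given below, reusing the derivatives helper sketched earlier. The equal weighting of the three terms and the choice of collocation, boundary, and initial-condition point sets are our assumptions.

import torch

def forward_loss(model, xt_col, t_bc, x_ic, lam=7.0, L=1.0):
    # J_0 = Gamma_e + Gamma_p + Gamma_i for the MDE of Eq. (10).
    x, t = xt_col[:, :1], xt_col[:, 1:]
    u, u_t, _, u_xx = derivatives(model, x, t)   # autodiff helper sketched earlier
    # PDE residual: u_t - u_xx + lam*sin(2*pi*x)*u
    gamma_e = torch.mean((u_t - u_xx + lam * torch.sin(2 * torch.pi * x) * u) ** 2)
    # Periodic boundary condition u(0, t) = u(L, t)
    u0 = model(torch.cat([torch.zeros_like(t_bc), t_bc], dim=1))
    uL = model(torch.cat([torch.full_like(t_bc, L), t_bc], dim=1))
    gamma_p = torch.mean((u0 - uL) ** 2)
    # Initial condition u(x, 0) = 1
    ui = model(torch.cat([x_ic, torch.zeros_like(x_ic)], dim=1))
    gamma_i = torch.mean((ui - 1.0) ** 2)
    return gamma_e + gamma_p + gamma_i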

Optimization scheme for GLHS and PINN

The neural network and the sampling method should both be optimized for a given PDE. In this study, the core issue in GLHS is how to find the optimal mixture coefficient \(\alpha \) for the specific PDE; meanwhile, the depth D and width W of the neural network are also important PINN parameters to be optimized. Given that the two groups of parameters are independent, we adopt the independent variable method to optimize the GLHS and PINN parameters.

Figure 4
figure 4

Comparison of training time (t) and standard error (\(\sigma \)) for two data sampling methods in the PINN with residual units. (a) The mixture of GS and RS at an \(\alpha :(1-\alpha )\) ratio, where \(\alpha =0\) denotes simple random sampling; (b) the mixture of GS and LHS, i.e., GLHS, at an \(\alpha :(1-\alpha )\) ratio, where \(\alpha =0\) denotes simple LHS. The purple bars on the left represent the standard error of the network training results, with the abscissa (0.001–0.5) shown on a log\(_{10}\) scale. The yellow bars on the right represent the time spent on network training.

The primary issue in GLHS, namely, how to find the optimal mixture coefficient \(\alpha \) for the specific PDE, is addressed within the PINN. We further characterize GLHS by comparing it with another mixed sampling method for solving the MDE with the PINN, as shown in Fig. 4. For comparison, we adopt two types of mixed sampling: the mixture of random sampling and grid sampling in Fig. 4a, and GLHS in Fig. 4b. The data come from solving the MDE in Eq. (10) with PBC of \(L=1.0\), using 6 residual units and a neural network layer width of 20 in the PINN. We use a total of \(N=301 \times 301\) data points for both the mixture of random and grid sampling and GLHS, where the number of data points in Cartesian GS is \(\alpha {N}\). To quantify the efficiency of the sampling method, we employ the sampling standard error

$$\begin{aligned} \sigma = \sqrt{\sum _{i=1}^{N}\frac{(u_i-u_{0i})^2}{N-1}}, \end{aligned}$$
(11)

where \(u_i\) is the PINN solution value and \(u_{0i}\) is the numerical value obtained with the Crank-Nicolson method, which has been used for the MDE in previous simulation calculations74,84,85,86. The sum runs over all the discrete data points of the definition domain used in the Crank-Nicolson method, and N is the total number of discrete data points.
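For completeness, a compact Crank-Nicolson reference solver for Eq. (10) with periodic boundaries, together with the standard error of Eq. (11), might look as follows. The grid construction and step sizes are illustrative and not taken from the original calculations.

import numpy as np

def crank_nicolson_mde(lam=7.0, L=1.0, T=1.0, nx=301, nt=301):
    # Reference solution of Eq. (10): u_t = u_xx - lam*sin(2*pi*x)*u,
    # with u(x, 0) = 1 and periodic boundaries on [0, L].
    x = np.linspace(0.0, L, nx, endpoint=False)   # periodic grid (x_nx coincides with x_0)
    dx, dt = L / nx, T / (nt - 1)
    # Periodic second-difference operator minus the potential term.
    D2 = (np.diag(-2.0 * np.ones(nx)) + np.diag(np.ones(nx - 1), 1)
          + np.diag(np.ones(nx - 1), -1))
    D2[0, -1] = D2[-1, 0] = 1.0
    Lop = D2 / dx**2 - np.diag(lam * np.sin(2 * np.pi * x))
    A = np.eye(nx) - 0.5 * dt * Lop
    B = np.eye(nx) + 0.5 * dt * Lop
    u = np.ones(nx)
    history = [u.copy()]
    for _ in range(nt - 1):
        u = np.linalg.solve(A, B @ u)   # (I - dt/2 L) u^{n+1} = (I + dt/2 L) u^n
        history.append(u.copy())
    return x, np.linspace(0.0, T, nt), np.array(history)

def standard_error(u_pinn, u_ref):
    # Eq. (11): sigma = sqrt( sum (u_i - u_0i)^2 / (N - 1) ).
    diff = np.asarray(u_pinn).ravel() - np.asarray(u_ref).ravel()
    return np.sqrt(np.sum(diff**2) / (diff.size - 1))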

For the mixture of random and grid sampling, the results show that \(\sigma \) and the training time t are difficult to optimize simultaneously as \(\alpha \) varies, as shown in Fig. 4a; in particular, when the standard error \(\sigma \) is at its minimum, the training time t is the longest. For GLHS, \(\sigma \) and t reach their minima simultaneously, as shown in Fig. 4b. The optimal GLHS has \(\alpha =0.5\), indicating that the minimum error and training time are achieved when the numbers of data points in GS and LHS are equal. Figure 4 also shows the results of RS (\(\alpha =0.0\) in Fig. 4a), LHS (\(\alpha =0.0\) in Fig. 4b), and GS (\(\alpha =1.0\) in Fig. 4a and b) with the PINN. In terms of t and \(\sigma \), these three sampling methods are worse choices than GLHS with \(\alpha =0.5\). These results agree with the data distributions described in Fig. 3. We note that LHS has been used in PINN before28; here, we use GLHS instead of LHS to enhance the efficiency of solving PDEs with the PINN.

The PBC is important in PDEs, especially for bulk polymer systems, and boundary conditions have been reinforced in previous studies through the design of the network structure36,87,88. We therefore consider PBC optimization when choosing the network parameters. As shown in Eq. (10), the PBC with spatial period L can be written as \(u(0,t)=u(L,t)\). In numerical methods, the PBC is imposed by setting the calculation cell size to L, which satisfies \(u(0,t)=u(L,t)\)72,74. In neural networks, however, the PBC must be imposed through the left condition \(u(x,t)=u(x-L,t)\), the right condition \(u(x,t)=u(x+L,t)\), or both, which we call the squeeze period condition, with \(x \in [0,L]\). To identify the best-performing PBC among these formulations, we construct the following loss functions:

$$\begin{aligned} J_{p1}= & {} \frac{1}{N}\sum _{i=1}^{N}[u(0,t_i)-u(L,t_i)]^2, \end{aligned}$$
(12)
$$\begin{aligned} J_{p2}= & {} \frac{1}{N}\sum _{i=1}^{N}[u(x_i,t_i)-u(x_i-L,t_i)]^2, \end{aligned}$$
(13)
$$\begin{aligned} J_{p3}= & {} \frac{1}{N}\sum _{i=1}^{N}[u(x_i,t_i)-u(x_i+L,t_i)]^2, \end{aligned}$$
(14)
$$\begin{aligned} J_{p4}= & {} J_{p2}+J_{p3}, \end{aligned}$$
(15)

where \(x_i \in [0,L]\) and \(t_i \in [0,T]\), and the summation runs over both domains. Equation (12) is the form commonly used in numerical methods and is listed here for comparison.
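The four candidate PBC losses of Eqs. (12)-(15) can be written compactly as shown below. The sketch assumes a network u(x, t) taking column tensors of equal length, and the sampling of the boundary points is left to the caller.

import torch

def pbc_losses(model, x, t, L=1.0):
    # Four candidate PBC losses, Eqs. (12)-(15), with x in [0, L] and t in [0, T].
    def u(xx, tt):
        return model(torch.cat([xx, tt], dim=1))
    j_p1 = torch.mean((u(torch.zeros_like(t), t) - u(torch.full_like(t, L), t)) ** 2)
    j_p2 = torch.mean((u(x, t) - u(x - L, t)) ** 2)   # left condition, Eq. (13)
    j_p3 = torch.mean((u(x, t) - u(x + L, t)) ** 2)   # right condition, Eq. (14)
    j_p4 = j_p2 + j_p3                                # squeeze period condition, Eq. (15)
    return j_p1, j_p2, j_p3, j_p4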

Figure 5
figure 5

Comparison of the training processes for the four types of PBC loss functions. The red line and the red Y-axis on the left show the evolution of the loss functions, while the purple line and the purple Y-axis on the right show the standard error between the network results and the numerical solutions.

The training processes for the four PBC loss functions are shown in Fig. 5. The standard errors \(\sigma _p\) between the PINN results u(x,t) and the Crank-Nicolson numerical results \(u_0{(x,t)}\) are also shown on the right axes. For Type 1, shown in Fig. 5a, the loss function \(J_{p1}\) converges to the desired value, but the standard error \(\sigma _{p1}\) is too large to yield a correct solution. For Types 2 and 3, shown in Fig. 5b and c, the loss functions \(J_{p2}\) and \(J_{p3}\) converge to the desired values, and the standard errors \(\sigma _{p2}\) and \(\sigma _{p3}\) are still able to converge. When the squeeze period condition is used, the loss function \(J_{p4}\) and the standard error \(\sigma _{p4}\) likewise converge to the desired values during training. Furthermore, we calculate the Pearson correlation coefficient, \(\rho _{X,Y}\), between the standard error \(\sigma _p\) and the loss function \(J_p\), defined in general as

$$\begin{aligned} \rho _{X,Y}=\frac{cov(X,Y)}{\sigma _X\sigma _Y}, \end{aligned}$$
(16)

where X and Y denote the standard error \(\sigma _p\) and the loss function \(J_p\), respectively, \(\sigma _X\) denotes the standard deviation of X, and cov(X,Y) is the covariance between X and Y. The \(\rho _{X,Y}\) values for the four types of PBC are listed in Table 1. The data indicate that for the first three types of PBC, \(\sigma _p\) and \(J_p\) are strongly correlated, whereas for the squeeze boundary condition they are almost uncorrelated. This result indicates that the squeeze period method is feasible and can effectively improve the training accuracy when solving PDEs with PBC in the PINN.

Table 1 Pearson correlation coefficient between the standard error \(\sigma _p\) and the loss function \(J_p\).

We then optimize the neural network parameters, namely, the depth D and width W of the PINN. D is the number of residual units, and W is the number of neurons per layer. D, W, and the amount of training data N strongly influence the output accuracy of the neural network. We train the PINN over a multi-parameter space of D, W, and N, with \(D \in \{3,4,5,6,7,8\}\), \(W \in \{10,15,20,25,30\}\), and \(N \in \{2000,4000,6000,8000,10000,20000\}\); Table 2 shows several typical combinations. In total, \(6 \times 5 \times 6 = 180\) combinations exist in the full parameter space [D, W, N]. Other combinations are detailed in Figs. S3 and S4 of Supplementary information.

Table 2 Standard error (\(\sigma \)) and loss function (J) corresponding to different parameters, depth (D), width (W), and number of data (N).
Figure 6
figure 6

Comparison of PINN results with numerical results. (a) Overall view of the PINN results u(x,t) in the two-dimensional space; (b) the difference between the PINN and numerical results; (c) the accuracy in one-dimensional space at given \(t=\{0.00,0.25,0.50,0.75,1.00\}\); (d) the accuracy in one-dimensional space at given \(x=\{0.00,0.25,0.75\}\). The blue dotted lines represent the numerical results and the red lines the PINN results.

We use the loss function J and the standard error \(\sigma \) to screen the desired parameters. The optimal parameter set, corresponding to the minimum \(\sigma \), is the combination \(D=6\), \(W=20\), and \(N=20000\). In general, increasing N decreases J and \(\sigma \), a simple relationship, whereas the optimal output depends on the more complex interplay of D and W. The data in Table 2 confirm that the best combination is \(D=6\), \(W=20\), and \(N=20000\). Although using many hidden layers during training often results in a large precision loss, the optimal combination found here shows that this is not the case89.

Forward and backward solutions

First, we discuss the forward solutions of the MDE obtained by the PINN. We use the optimal PINN parameters, \(D=6\), \(W=20\), and \(N=20000\), and the GLHS parameter \(\alpha =0.5\) to solve the forward and backward problems of the MDE in Eq. (10). We take the definition domains \(x \in [0,1]\) and \(t \in [0,1]\), and to evaluate the accuracy of the PINN solution u, we define the relative errors with respect to the numerical solution \(u_0\) as

$$\begin{aligned} \delta _{ij}=\vert \frac{u(x_i,t_j)-u_0{(x_i,t_j)}}{u_0{(x_i,t_j)}}\vert , \end{aligned}$$
(17)

where the subscripts i and j label the grid values \(x_i\) and \(t_j\) in the definition domains of x and t, respectively. We plot the neural network results and compare them with the numerical results in Fig. 6. The overall view in the two-dimensional space, shown in Fig. 6a and b, indicates that the PINN results are accurate to within \(10^{-3}\) throughout the definition domain. We then examine the accuracy in one-dimensional space at given x or t, as shown in Fig. 6c and d, which confirms that the PINN results agree well with the numerical results at the given x or t. The data for the given x or t, together with the relative errors, are listed in Table 3.

Table 3 The relative errors corresponding to different time and space positions.

For the inverse problem in the MDE, discovering the unknown parameter is difficult because of the complex physical constraints and gradient vanishing. Unlike approaches that employ sparse regression on time-series measurements in the spatial domain to identify the PDE21, in this study we design an interleaved training method with two discontinuous loss functions, \(\Gamma _e\) and \(\Gamma _b\). The loss function \(\Gamma _e\) is defined in Eq. (7), and \(\Gamma _b\) can be defined as

$$\begin{aligned} \Gamma _b=\frac{1}{N}\sum _{i=1}^{N}[u(x_i,t_i)-u_0{(x_i,t_i)}]^2. \end{aligned}$$
(18)

where the sum runs over the entire definition domain and \(u_0{(x_i,t_i)}\) is the numerical solution, taken as the reference value. In this method, we search for the unknown parameter \(\lambda \) through the loss function \(\Gamma _e\) and optimize the network solution through \(\Gamma _b\). That is, \(\Gamma _e\) is optimized to identify the parameter \(\lambda \) in the first training stage; we then lock the parameter \(\lambda \) and optimize the network solution through \(\Gamma _b\) to obtain a high-accuracy solution.

Specifically, we construct a PINN with four residual units, each with a fully connected layer width of 20 neurons, to solve the inverse problem. We optimize the network parameters and the unknown parameter, \(\theta _0{(w,b,\lambda )}\), via the loss function \(J_0{(\theta _0)}\) in the first stage. We then lock the parameter \(\lambda \) and optimize \(\theta _1{(w,b)}\) via the loss function \(J_1{(\theta _1)}\) in the second stage until the loss function reaches \(10^{-5}\), as shown in Fig. 7. The blue and brown dotted lines mark \(\lambda =10\) and \(J_0{(\theta _0)}=0\), respectively. The results show that \(\lambda =10.00111\) at \(t=108\,s\) when \(J_0{(\theta _0)}\le 10^{-5}\), as shown in Fig. 7a. At this point, we lock the parameter \(\lambda \), and the loss function \(J_0{(\theta _0)}\) is automatically switched to \(J_1{(\theta _1)}\); the network parameters \(\theta _1{(w,b)}\), which determine the network solution, are then further optimized in the second stage. The loss function is discontinuous at this point, as shown in the inset of Fig. 7b. In general, the loss function drops suddenly when the optimizer switches during training52; here we observe a small discontinuity of \(\Delta {J}=0.05\) at the switch. In the current study, we obtain the parameter \(\lambda \) with high accuracy: the absolute error of \(\lambda \) is 0.0011, the relative error is \(\delta _\lambda =0.011\%\), and the standard error is \(\sigma _\lambda =0.0046\). This high-efficiency interleaved training method for identifying unknown parameters in the inverse MDE problem can reasonably be extended to other PDEs.
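A schematic of this two-stage interleaved training is sketched below, under our reading that the first-stage loss combines \(\Gamma _e\) with the data term \(\Gamma _b\) and that the second stage minimizes \(\Gamma _b\) alone. The optimizer, learning rate, initial guess for \(\lambda \), and stopping criteria are illustrative assumptions, and the derivatives helper sketched earlier is reused.

import torch

def inverse_train(model, xt_col, xt_data, u_data, tol=1e-5, max_iter=50000):
    # Two-stage interleaved training for the inverse MDE problem.
    lam = torch.nn.Parameter(torch.tensor(1.0))   # unknown PDE parameter (initial guess)
    x, t = xt_col[:, :1], xt_col[:, 1:]

    def gamma_e():
        u, u_t, _, u_xx = derivatives(model, x, t)   # autodiff helper sketched earlier
        return torch.mean((u_t - u_xx + lam * torch.sin(2 * torch.pi * x) * u) ** 2)

    def gamma_b():
        return torch.mean((model(xt_data) - u_data) ** 2)   # data loss, Eq. (18)

    # Stage 1: optimize the network weights and lambda through J_0.
    opt = torch.optim.Adam(list(model.parameters()) + [lam], lr=1e-3)
    for _ in range(max_iter):
        opt.zero_grad()
        loss = gamma_e() + gamma_b()
        loss.backward()
        opt.step()
        if loss.item() < tol:
            break

    # Stage 2: lock lambda and refine the network solution through J_1 = Gamma_b.
    lam.requires_grad_(False)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(max_iter):
        opt.zero_grad()
        loss = gamma_b()
        loss.backward()
        opt.step()
        if loss.item() < tol:
            break
    return lam.item()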

Figure 7
figure 7

The training process for solving the unknown parameter of the inverse problem in the MDE. (a) The optimization process of the unknown parameter \(\lambda \); (b) the optimization process of the loss function J. The blue and brown dotted lines represent \(\lambda =10\) and \(J=0\), respectively.

Comparison between PINN and NN

Figure 8
figure 8

Comparison of training time (t) and standard error (\(\sigma \)) between the PINN with residual units and a traditional NN. Here, the network width W is 20, the number of sampled data points N is 20000, and the GLHS mixture coefficient \(\alpha \) is 0.5 in both cases. The yellow and green bars denote the training time t for the PINN and traditional NN, respectively, while the blue dots in the bars represent the corresponding standard error \(\sigma \).

To illustrate the advantages of the PINN with residual units, we compare its results with those of a traditional NN in terms of the standard error \(\sigma \) and training time t, as shown in Fig. 8. Here, the traditional NN is constructed without residual units, similar to previous work22,25. For a fair comparison, all other conditions are identical in both cases, including the data sampling (GLHS with \(\alpha =0.5\)), the number of input data points (\(N=20000\)), and the network width (\(W=20\)). Since a residual unit consists of two neural network layers, we count two layers of the traditional NN as one layer for comparison, so the traditional NN has 2D network layers. We then vary only the PINN depth D and record the training times and standard errors for both cases. The results show that as the number of network layers increases, the advantage of the PINN with residual units becomes increasingly clear. In particular, at D = 6 the traditional neural network suffers a gradient explosion, leading to meaningless values of the loss and the other parameters, as shown in Fig. 8; at D = 7 and 8, the traditional NN can hardly be optimized because of vanishing gradients, which results in extremely short training times t and large standard errors \(\sigma \). In addition, even for shallow networks with small D, the training time of the PINN with residual units remains shorter than that of the traditional NN, as shown in Fig. 8. Furthermore, we compared the efficiencies of the PINN and the traditional NN for other values of W and obtained similar results; see Table S1 in Supplementary information. We expect this PINN with residual units based on mixed data sampling to be applied to other problems, especially three-dimensional PDEs, in the future.

Summary

We developed a PINN based on GLHS to solve the forward and backward MDEs by optimizing the corresponding parameters. This solver provides high efficiency and accuracy for forward and backward problems in one-dimensional MDEs and effectively avoids the gradient vanishing and network degradation problems of traditional feedforward neural networks. For the neural network, we designed residual units for the PINN to solve the MDE and adopted the squeeze PBC. For data sampling, we used the mixture of GS and LHS. We believe that this method can also be used in other dimensions, which we will confirm in future work.

We then optimized the parameters used in the PINN, taking the PBC loss function into account; specifically, the depth D and width W of the neural network were optimized, and the results indicated that the squeeze condition is suitable for the MDE. We also optimized the GLHS data sampling by adjusting the mixture coefficient \(\alpha \). The results revealed that the parameter combination \([D,W,N,\alpha ]\) should be set to [6, 20, 20000, 0.5] for high precision in the given MDE. We demonstrated how the hybrid solver deals with forward and backward problems in a specific MDE and found good agreement between the neural network solver results and the numerical solutions. For the forward MDE, the PINN solutions were accurate to within \(10^{-3}\) of the numerical results. For the inverse problem, we designed the interleaved training method (ITM) to identify the unknown parameters; the unknown parameter \(\lambda \) was recovered with a relative error of \(\delta _\lambda =0.011\%\) and a standard error of \(\sigma _\lambda =0.0046\). This PINN with residual units based on the mixed data sampling GLHS can be generalized to other PDEs.