On fast simulation of dynamical system with neural vector enhanced numerical solver

The large-scale simulation of dynamical systems is critical in numerous scientific and engineering disciplines. However, traditional numerical solvers are limited by the choice of step sizes when estimating integration, resulting in a trade-off between accuracy and computational efficiency. To address this challenge, we introduce a deep learning-based corrector called Neural Vector (NeurVec), which can compensate for integration errors and enable larger time step sizes in simulations. Our extensive experiments on a variety of complex dynamical system benchmarks demonstrate that NeurVec exhibits remarkable generalization capability on a continuous phase space, even when trained using limited and discrete data. NeurVec significantly accelerates traditional solvers, achieving speeds tens to hundreds of times faster while maintaining high levels of accuracy and stability. Moreover, NeurVec’s simple-yet-effective design, combined with its ease of implementation, has the potential to establish a new paradigm for fast-solving differential equations based on deep learning.


Introduction
2][3] Studying the temporal evolution of dynamical systems and their statistics can help uncover the physics behind the dynamics and predict future states of the systems. 3Typically, a time-dependent d-dimensional state u(t) is depicted by a system of ordinary differential equations (ODEs): where c 0 represents an initial condition.4][15] To obtain a numerical solution of (1), one may employ an integration method 16,17 (Fig. 1b) given by the iterative formula where S represents a numerical scheme (for example, S(f, u n , ∆t n ) := f (u n )∆t n when the Euler method 18 is used), ∆t n is the step size of the nth time step, and u n ∈ R d is an approximated solution at time n i=0 ∆t n .When approximating a solution at a specific time given an initial condition, we readily customize accuracy and speed via tuning integration strategies (e.g., different scheme S and step size ∆t n selection).However, many real-world applications [19][20][21][22] require simulating many trajectories.In particular, large-scale simulation, which produces forecasts on a set of initial conditions simultaneously, is more useful for these applications.Compared with a single simulation, ensemble-based large-scale simulation is a computationally challenging problem but plays a critical role in a variety of demanding applications.For illustration, we present a few scenarios of such simulation.Fast simulation.Sine late 2019, the epidemic of COVID-19 has raged around the world, hitting the global health and economy. 23Scientists need to perform simulations of virus propagation under thousands of different circumstances. 24,25 hese predictions provide the scientific reference for governments to make quick responses and control policies. 26,27 he virus, such as the Delta and Omicron variants, spreads rapidly and mutates frequently. 28 slow simulation may lead to a delay in decision-making and worsen the situation.Synchronous simulation.Particle systems are a graphical technique that simulates complex physical effects (such as smoke, 29 water flow, 30 and object collision. 31).This is widely used in applications in games, movies, and animation. 32These applications involve synchronously simulating thousands of particles at one time.Yet supporting the real-time simulation of these particle motions with satisfactory visual enjoyment is expensive.Reliable model.Weather forecasting is beneficial for making a proper plan for production and living. 33,34  single forecast of the weather model essentially suffers from considerable errors introduced by the highly simplified model formulation and the chaotic nature of the atmosphere evolution equations.In order to avoid a misleading single forecast, ensemble forecasting [35][36][37] presents a range of possible future weather states through conducting simulations from multiple initial conditions and models.
To meet the demands of these applications, we need to develop a fast solver that is capable of simultaneously simulating the dynamical system over a large batch of initialization data.The advances in processors, such as graphics processing units (GPUs), 38 tensor processing units, 39 and natural graphics processing units, 40 provide the possibility of accelerating the numerical computation via parallel computing of batch data.However, most hardware implements restrictive SIMD-based (single instruction, multiple data) models. 41The numerical method that needs individual processing of each trajectory is not appropriate for SIMD processors directly.For example, the adaptive time-step integrator (e.g., the Runge-Kutta--Fehlberg method 42 ) determines a step size at each step based on an estimate of the local error, making the iterative computation in Eq. ( 2) asynchronous for each trajectory in the batch and affecting the efficiency of parallel computing.On the other hand, we may control the step size to be the same at each step for all trajectories by adding one dimension to Eq. (2), for example, u n+1 ∈ R N ×d with N representing the batch size.Controlling the step size requires considering a combined ODE system and estimating the error on all batch elements 43 .The step size is limited by the largest local truncation error in a batch, making it difficult to use a large step size especially when the batch size is large. 43,44 f the step size is always small in each step, it becomes slow for integration.Therefore, SIMD processors prefer a fixed time-step integrator (i.e., ∆t := ∆t However, a fixed step size integrator encounters a trade-off 44,45 on step size between accuracy and computational

2/27
efficiency: a large step size has a fast simulation but leads to a less accurate solution, while a small step size has a slow simulation but achieves a more accurate solution (see Table 1 for the comparison of evaluation time and theoretical error between the traditional solvers with fine or coarse step size).This trade-off limits the feasibility of large-scale simulation if high accuracy is required.
To break through this speed-accuracy trade-off, in this paper we propose an open-source and data-driven corrector, called neural vector (NeurVec), which enables integration with coarse step size while maintaining the accuracy of fine step size in large-scale simulations.We emperically demonstrate that NeurVec is capable of overcoming the stability restriction of explicit integration methods for ODEs.The deployment of NeurVec comprises offline training and inference (Fig. 1a).During offline training, NeurVec is trained with the accurate solution, while during inference NeurVec is employed to the solver to compensate for the error caused by the coarse step size.Our results on some complex dynamical system benchmarks show that NeurVec is capable of learning the error term and accelerating the large-scale simulation of dynamical system significantly.Additionally, we have found that NeurVec not only overcomes the stability restriction of explicit integration methods for ODEs but also exhibits excellent generalization capabilities.
Some related works [46][47][48] centered around purely data-driven methods for enhancing the accuracy of learning unknown dynamics from data.These approaches employ neural networks to characterize the system dynamics.For example, Ref. 46 improve long-term predictive modelling of nonlinear systems approximated by neural networks, by introducing supplementary regularization loss terms.In Ref 48 , neural networks are fused with existing coarse modeling techniques to improve the accuracy of a reduced/coarse model.Different from the previous works that focus on enhancing the accuracy of learning unkown dynamics 49 , our objective is to accelerate the numerical simulation of multiple trajectories, each initialized differently, for dynamics that are already known.In our approach, we use neural networks to compensate the time integration errors that arise because of large time step sizes.There are some previous works also consider accelerating the solution estimation via deep learning, but they emphasize adopting pure data-driven approaches, 3,45,50 without using any explicit formula of the equation.Because of chaos [51][52][53] (solution is sensitive to small perturbations) and stiffness 44,54 (solution is unstable unless a sufficiently small step size is used), the pure data-driven method still suffers from large errors in prediction, especially for long-term prediction 55 .

Framework of NeurVec
In this study, we aim to discretize and solve the differential equation given by Eq.(1) using a larger step size, specifically k times the step size (k∆t).To illustrate this approach, we consider the Euler method as an example, and use Taylor expansion to obtain: By neglecting all high-order error terms err n and discretizing u, we can obtain the corresponding Euler method.However, when using a large step size of k∆t, discarding all high-order error terms will significantly reduce the accuracy of the Euler method, leading to poor predictions.Notably, from Eq.(1), we have err n (k, ∆t, u(t)) ≜ , which depends on u(t) and the constants k and ∆t.Therefore, we consider utilizing a corrector called NeurVec by a neural network with input u(t) to approximate ∞ n=2 err n .In other words, the corrector NeurVec, in the form of a vector function, is directly added to the estimated solution to compensate for the error caused by the use of the coarse step size (Fig. 1b).Following this idea, other forward numerical solvers besides the Euler method can also benefit from NeurVec to compensate for errors.Specifically, NeurVec, a neural network parameterized by Θ, maps from the state R d to the error correction R d (see the Methods section) and is added to the iterative formula of the solver with the step size k∆t to get the solution {û kn } ∞ k=0 , i.e., ûk(n+1) = ûkn + S(f , ûkn , k∆t) + NeurVec(û kn ; Θ), û0 = c 0 , n = 0, 1, • • • . ( With NeurVec, we just need to estimate the solution on every k steps instead of step by step as in Eq. (3).NeurVec is trained from the more accurate solutions with fine step size ∆t to characterize the error caused by the use of the coarse step size k∆t.The parameter Θ in NeurVec can be optimized by minimizing the mean squared difference between the predicted error and the error of the solver with the coarse step size: where G is the number of training samples {u s } (G+1)k s=1 from the fine step size.Table 1 displays a comparison of evaluation time and theoretical error between the traditional solver and NeurVec.We use ϵ to denote the runtime ratio of NeurVec to the scheme S. NeurVec inevitably increases the relative time complexity for each step by ϵ since an additional computation module is used (see supplementary material for details).When k > (1 + ϵ), NeurVec with the coarse step size k∆t is faster than the solver with the fine step size ∆t, while achieving comparable accuracy.Moreover, the runtime increment ϵ of NeurVec can be lessened.For example, the more complicated scheme S increases the time complexity, and built-in parallel computing in Pytorch, 56 TensorFlow, 57 or other deep learning frameworks enables smaller time complexity of NeurVec.As we will see in the Results section, ϵ < 1 uniformly, so NeurVec can accelerate the solver when k ≥ 2. To characterize the solution error of NeurVec, we consider the Euler method, a simple ODE solver, as a proof of concept.The global truncation error of the Euler method linearly grows with the step size, 16 namely, O(∆t) when the step size is ∆t and O(k∆t) when the step size is k∆t.In our theory, we show that NeurVec of sufficient width can achieve an error of O(∆t) when the step size is k∆t, which breaks the accuracy-speed trade-off.

Method
Step size Evaluation time Theory error (Euler scheme)

Results
We verify the capabilities of NeurVec in two aspects: (1) NeurVec is capable of stabilizing and accelerating the simulation on widely used numerical solvers with consistent performance improvement; and (2) NeurVec can be applied effectively to various complex dynamical system benchmarks.
To illustrate the performance of NeurVec, we employ a simple network structure, a one-hidden-layer fully connected neural network 44 , to model NeurVec, where the number of the hidden neurons is 1,024 and a rational function 58 is used (see the Methods section for details).The training and inference of NeurVec are all performed on a single GeForce RTX 3080 GPU with a memory of 10 gigabytes.The simulations in the training and testing sets are uniformly sampled every time interval η, and hence we have discrete solutions on the time 0, η, 2η, • • • .To compare the speed, we average the clock time of simulations over 70 trials.The complete statistical tests can be found in the supplementary material.
In the preceding sections we used ∆ F t and ∆ C t to distinguish the fine step size from the coarse one used in traditional solver (e.g., ∆ F t = ∆t and ∆ C t = k∆t in our previous notation).∆ NV t represents using NeurVec and integrating with step size ∆ NV t.

NeurVec for different numerical solvers.
We demonstrate the performance of NeurVec on widely used numerical solvers with consistent performance improvement.We perform NeurVec with four solvers: Euler, improved Euler, third-order Runge-Kutta (RK3), and fourth-order Runge-Kutta (RK4) (see the Methods section) on a high-dimensional spring-chain system. 59The spring-chain system we consider describes the motion of d masses linked by d + 1 springs, and the springs are placed horizontally with the two ends connected to a fixed wall.The ODE of the system is given by where the variables q i and p i represent position and momentum of the ith mass, respectively, i = 1  Next, we demonstrate the performance of NeurVec in terms of accuracy and speed.NeurVec learns from the simulations of ∆ F t = 1e − 3 and is applied to the numerical solver with ∆ C t = 2e − 1.We characterize accuracy by the MSE between the reference and the simulated solution.In the short-term simulation on the time interval [0, 17], the numerical solutions of the coarse step size (∆ C t ≥ 1e − 1) incur considerable simulation error and become unstable if Euler and the improved Euler are used (Fig. 2a).By contrast, NeurVec (∆ NV t = 2e − 1) achieves a stable solution with accuracy comparable to that of the fine step size ∆ F t = 1e − 3 (Fig. 2a), which means that NeurVec can overcome the stability restriction.These observations indicate that NeurVec learns the error distribution from the fine step size dataset and compensates for errors caused by the use of the coarse step size, demonstrating that NeurVec is compatible to these solvers.To better display the runtime, we benchmark the runtime of ∆ C t = 2e − 1 as one unit.The use of NeurVec increases for a certain runtime (ϵ ≤ 0.38) for a single step (compared with ∆ C t = 2e − 1), but NuerVec has accuracy comparable to that of ∆ F t = 1e − 3, which needs 200 steps for integrating over the time interval 2e − 1.The runtime of ∆ F t = 1e − 3 is much higher than that of NeurVec (∆ C t = 2e − 1) (P value ≪ 0.001 under two-sided t-tests), and NeurVec enables these numerical methods to have more than 150× speedup on the spring-chain systems (Fig. 2b).Energy error

NeurVec on complex dynamical systems.
We verify the effectiveness of NeurVec on challenging complex systems, including the classical chaotic system Hénon-Heiles system, elastic pendulum, and K-link pendulum.For more challenging systems, such as the Kuramoto-Sivashinsky equation (KSE), please refer to the supplementary materials..The chaotic system is sensitive to perturbation to the initial state, and small errors are increased exponentially by the dynamics.For all of these examples, we generate the testing set using RK4 with step size 1e − 4 while the training set is generated with ∆ F t = 1e − 3. The initial conditions are randomly and uniformly sampled on a range of values (see the supplementary material).NeurVec is applied to RK4.
(1) Hénon-Heiles system.The Hénon-Heiles system is a Hamiltonian system 60 that describes the motion of a body around a center on the x-y plane.Let (q x , q y ) and (p x , p y ) denote the positions and momenta of a particle, respectively.The ODE is given by d dt (q x , q y , p x , p y ) = p x , p y , −q x − 2λq x q y , −q y − λ(q The Hamiltonian (energy) function H, defined by must be conserved during the time evolution.This property is used as an additional metric to evaluate the accuracy of our method.We characterize the energy error by the absolute difference between the energy of the simulated trajectory and the initial energy.The datasets are generated with initial energy between [ 1 12 , 1 6 ] and −1 < q x < 1, −0.5 < q y < 1 such that the equipotential curves of the system form an inescapable interior region and exhibit chaotic behavior 61 .The simulations are uniformly sampled every time interval η = 5e − 1.
We find that NeurVec (∆ NV t = 5e − 1) vastly improves the accuracy of the ODE solvers, achieves almost the same accuracy as RK4 with ∆ F t = 1e − 3 on the time interval [0, 42.5] (Fig. 3a), and works well for a much larger time interval [450, 500] (Fig. 3c).Furthermore, NeurVec with ∆ NV t = 5e − 1 almost maintains the same system energy as does the reference method with ∆ F t = 1e − 3 (Fig. 3d).To illustrate the error correction capability of NeurVec, we visualize three trajectories of the first three components (q x , q y , p x ) in Fig. 3e.For Examples 1-3 of Fig. 3e, NeurVec with ∆ NV t = 5e − 1 produces orbits similar to those of the reference method with ∆ F t = 1e − 3 while having an energy error of the same magnitude.Furthermore, the reference method with ∆ C t = 5e − 1 yields a larger energy error and pathwise difference.Integrating with ∆ F t = 1e − 3 over the time interval η = 5e − 1 takes 500 steps, so it is not surprising that the runtime for ∆ F t = 1e − 3 is much larger than NeurVec with ∆ NV t = 5e − 1.Based on our test, NeurVec (∆ NV t = 5e−1) reaches more than 390× speedup over the reference method with ∆ F t = 1e−3 (Fig. 3b).
(2) Elastic pendulum.The elastic pendulum describes a point mass connected to a spring swinging freely (Fig. 4a), which may exhibit chaotic behavior under the force of gravity and spring 62 .We denote θ as the angle between the spring and the vertical line and r as the length of the spring.θ and ṙ correspond to the time derivative of θ and r, respectively.The motion of this system is governed by the ODE, where k, m, l 0 , and g are spring constant, mass, original length, and gravity constant, respectively.The initial length of r is r(0) = l 0 = 10, ṙ and θ are initialized by constant 0, and θ is randomly sampled from the uniform distribution U([0, π 8 ]).The simulations in the training and testing sets are uniformly sampled every time interval η = 1e − 1. NeurVec is trained on the simulation generated by ∆ F t = 1e − 3. NeurVec (∆ NV t = 1e − 1) has accuracy of the same order as does ∆ F t = 1e − 3 on both short-term prediction (time interval [0, 8.5]) (Fig. 4b) and long-term prediction (time interval [25, 50]) (Fig. 4d).NeurVec (∆ NV t = 1e − 1) is much faster than ∆ F t = 1e − 3 (P value ≪ 0.001 under two-sided t-tests), reaching about 70× speedup (Fig. 4c).
(3) K-link pendulum.A K-link pendulum is a body suspended from a fixed point (Fig. 4a) with K rods and K bobs so that the body can swing back and forth under gravity. 63The system exhibits chaotic behavior.For simplification, the length of each rod and the mass of each bob are set to 1, and the gravity constant g is set to 9.8.Let variables θ := (θ 1 , θ 2 , • • • , θ K ), where θ i is the angle between the ith rod and the vertical axis.The system is governed by the ODE Here , where c(i, j) = K − max(i, j) + 1, for i, j = 1, • • • , K. We use the example of a 2-link pendulum (K = 2) to verify the advantage of NeurVec in terms of efficiency.The trajectory (θ, θ) is a 4-dimensional vector.The initial conditions θ(0) are randomly sampled on U[0, π/8], and θ are set to zero (see the supplementary material).NeurVec (∆ NV t = 1e − 1) significantly improves the accuracy of ∆ C t = 1e − 1, and achieves accuracy similar to that of RK4 (∆ F t = 1e − 3) for both short-term and long-term prediction (Fig. 4b and Fig. 4c).Moreover, NeurVec has over 63× speedup (Fig. 4d).Furthermore, Fig. 4e illustrates the energy error over time.We can see that NeurVec effectively conserves energy in simulations of a K-link pendulum and an elastic pendulum.

Analysis
In this section we provide further analysis of the performance of NeurVec from three aspects: (1) NeurVec maintains the statistic of the solutions from large-scale simulation, which is crucial for constructing reliable models; (2) NeurVec learns the leading error term of the numerical solver, which enables a more accurate estimation for each step; and (3) we compare the evaluation time and solution error among solver with fine or coarse step size and NeurVec.

Maintaining solution statistics in large-scale simulations.
We validate the performance of NeurVec on producing consistent statistical observations for ensemble forecasting.
The ability to enable a large step size for a set of sampled initial conditions is critical for real applications such as weather forecasting.We visualize the time series histogram of the testing set for variables (a) r and ṙ in the elastic pendulum (10) and (b) q y and p y in the Hénon-Heiles system (8) in Fig. 5.The time series histogram is generated by dividing axes into 800 × 100 bins and counting the curves that cross the bins.
For the elastic pendulum, we find that starting from a time T ≥ 25, the statistical difference of ṙ and r between ∆ F t = 1e − 3 and ∆ C t = 1e − 1 becomes larger.When the step size is ∆ F t = 1e − 3, ṙ exhibits periodic behavior, which is in accordance with the periodic variation of the spring during its extend-retract.However, let ∆ C t = 1e − 1, ṙ and r show a trend of approaching specific values, and the change range gradually narrows.On the other hand, the simulations with NeurVec (∆ NV t = 1e − 1) have a pattern smilar to that of ∆ F t = 1e − 3. We have a similar observation for q y and p y in Hénon-Heiles system (Fig. 5b).Therefore, we conclude that NeurVec produces more accurate solutions compared with the reference method with large step size, enabling better and more consistent statistical observation.10) and (b) q y and p y in the Hénon-Heiles system (8).The color represents the number count (the lighter color and the larger frequency).The solutions generated by the solver with coarse step size exhibits a trend of convergence to a specific value, while solutions of the solver with fine step size are distributed within a range, and NeurVec with coarse step size produces a histogram visually identical with that of the solver with fine step size.This result shows that NeurVec has a more accurate solution than does the solver with fine step size.

Learnability and generalizability.
In Eq.( 4), we start from the mathematical expression of the error term err n (k, ∆t, u(t)) in the form of and consider using a neural network to approximate the sum of all error terms.However, neural networks are generally regarded as black box functions with a lack of interpretability.Therefore, in order to explain the good performance of NeurVec and confirm the learnability of the error terms, in this section, we explore and visualize what the neural network in NeurVec learns.We consider solving a 1-link pendulum with the Euler method.Our consideration for testing NeurVec on this system is based on the following motivations.First, the dimension of the state is 2, which facilitates error visualization on phase space.Second, we derive the error term of the Euler method explicitly through the Taylor formula: where ∇f is the Jacobian matrix of f .The second-order term 1 2 (∇f )f (u)∆t 2 is the leading error term of the Euler method, which is supposed to be captured by NeurVec from data of fine step size.Denote R NV (u) := ∥NeurVec(u)∥ 2 2 as the norm of error learned by NeurVec and R EL (u) := 1 2 (∇f ) (f (u) ∆t 2 2 2 as the norm of the leading error term of the Euler method.To train NeurVec for the Euler method, we generate the dataset by randomly sampling the initial conditions of θ and θ from uniform distributions U([0, π/2]) and U([0, 0.5]), respectively, and then use the Euler method to simulate the data with dt = 1e − 3. We train NeurVec with coarse step size dt = 1e − 1.
We found that the learned error R NV (Fig. 6a) is visually consistent with the leading error R EL of the Euler method (Fig. 6b).The squared difference only a small part of the difference near the boundary is relatively large (Fig. 6c).The relatively significant differences near the boundary may be attributed to the limited availability of data that encompasses those boundary regions.
(see the supplementary material for details).Through the training data of high accuracy, NeurVec captured the leading error term of the numerical solver.NeurVec may even capture the higher-order error terms, which enables the use of a coarse step size.Furthermore, we can analyze the generalization ability of NeurVec from Fig. 6.On one hand, the results displayed in Fig. 6c demonstrate that NeurVec is capable of generalizing well on a continuous phase space, even when trained using limited and discrete data.This suggests that NeurVec exhibits good interpolation generalization abilities.On the other hand, it is noteworthy that the initial conditions of our training data were sampled from U([0, π/2]) and U([0, 0.5]), and the testing range shown in Fig. 6 shows the neural network's effective approximation of the error term and implicit prior information, indicating a certain degree of extrapolation capability.Such findings suggest that NeurVec demonstrates a promising potential for generalizing beyond the training data.

Theoretical analysis.
We analyze the runtime and the global error in the solution approximated with NeurVec.Let 0 = t 0 < t 1 < • • • < t pk = T be uniform points on [0, T ] and ∆t = T pk .We compare the runtime of fine and coarse step size.If the step size ∆t is used, then the number of steps for integration is T ∆t .If the step size is k∆t, then the number of steps needed is T k∆t .ϵ is the ratio of the runtime of NeurVec to that of scheme S for one step.Hence, when NeurVec is used to integrate with step size k∆t, we need ϵ × 100% extra time for each step; and the time becomes O( T (1+ϵ) k∆t ).Next we study the error of solvers with fine or coarse step size.For simplification, we focus on the Euler solver and characterize the global discretization error (difference between the true solution and the estimated solution) at the time T .When the Euler scheme is used, Proposition 0.1.We assume that (1) f is Lipschitz continuous with Lipschitz constant L and ( 2) the second derivative of the true solution u is uniformly bounded by M > 0, namely., ∥u ′′ ∥ ∞ ≤ M on [0, T ].Then, using (13), we have For the proof, see the work of Atkinson et al. 64 .Proposition 0.1 shows that the Euler method converges linearly.The error is O(∆t) when the step size ∆t is used.Similarly, the error becomes O(k∆t) when the step size is k∆t.We next derive the error for the Euler method with step size k∆t using NeurVec.The iterative formula is given by ûk 10/27 We can use the following loss function to identify the learnable parameter Θ in NeurVec.V n denotes the residual error for each term.

LS = 1 p
In the next theorem we characterize the error of NeurVec (k∆t) by the quality of the training data and the neural network training error.In addition to the assumption in Proposition 0.1, we assume NeurVec is Lipschitz continuous and the Lipschitz constant is of order k∆.This assumption is reasonable based on the following motivation.According to Taylor expansion v(t + ∆t) = v(t) + v ′ (t)∆t + o(∆t), from our objective we expect that NeurVec ∼ o(k∆t).
Theorem 0.1.Assumptions ( 1) and ( 2) in Proposition 0.1 hold.In addition, we assume that NeurVec is Lipschitz continuous with Lipschitz constant k∆tL N V , which is independent of θ.Then the error is ( The first term in the right-hand side of ( 16) comes from the error of the training data, Euler simulation with step size ∆t, while the second term is the training error of NeurVec.A series of works [65][66][67] utilize a neural tangent kernel to prove the global convergence of a neural-network-based least squares method.Under the assumptions of training data distribution, when the width of one hidden layer network is sufficiently large, gradient descent converges to a globally optimal solution for the quadratic loss function.We might assume that the training error LS → 0 as the increasing update iteration.Then in (16),

Discussion
To address the speed-accuracy trade-off in large-scale simulations of dynamical systems, we proposed NeurVec, a deep learning based corrector in the numerical solver, which can compensate for the error caused by the use of coarse step size for numerical solvers.Through extensive experiments and preliminary theoretical evidence, we show that NeurVec is general and can be applied to widely used explicit integration methods and learn the error distribution through simulations with fine step size.However, there are still some limitations while using NeurVec.
Notice that, according to the analysis in Table 1, NeurVec can achieve fast simulation, i.e., a speedup of O(k/(1+ϵ)) times, under the condition that, after the neural network is trained, it can maintain the same accuracy as the training data at a sufficiently large step size.However, there exist some complex differential equations, such as ultra-high dimensional dynamical systems, may not satisfy this condition and prevent the term 1 2 in Eq.( 16) from converging to 0, which would make NeurVec not necessarily achieve the target accuracy O(∆t).Specifically, we consider a general high-dimensional dynamical system with dimension d, and a shallow network structure with a width of N 1 as shown in Fig. 7 as an example.In addition, Lu et al 68 reveal that for an infinitely deep network with input dimension d and some mild assumptions, it can achieve an arbitrary accuracy while the width of network is at least O(d).Therefore, the dimension of differential equations that NeurVec can currently handle will not exceed O(N 1 ), and in practice, since the neural network depth is limited, the available dimension may be much lower than O(N 1 ).To improve the accuracy, the easiest way is to widen the neural network as much as possible or consider more advanced learning architectures and training algorithms, for example, an attention mechanism, [69][70][71] neural network structure search, [72][73][74] , and large-scale pretrained models 75 .However, these improvements will increase ϵ, and the speedup ratio O(k/(1 + ϵ)) will become relatively small at this time, even accelerated simulations cannot be achieved while ϵ is large enough.
In Table 2, we consider the Sping-chain system to explore the impact of ultra-high dimensional dynamical systems on NeurVec.In this paper, we set N 1 = 1024, the experimental results show that as the dimension increases, the accuracy of NeurVec becomes harder to maintain, especially in the case that the dimension close to κ, which is consistent with our analysis.
In addition, we may extend NeurVec in several ways.
• Generalizing NeurVec.We implemented NeurVec using the simulation data with fixed system parameters and fixed step size.Recently, the neural network, as a universal approximator, shows promising results on learning the nonlinear continuous operator.Motivated by operator learning, 45,50 we may add additional dimensions to the input of NeurVec, such as the step size for integration and the physical parameters in the system.Then NeurVec can be trained with more diverse simulation data, such that NeurVec can be used with different dt for systems with varied physical parameters (such as k and m in the spring-chain system (7)).

Method
Step Table 2.The impact on accuracy (MSE at evaluation time T = 20) under the ultra-high dimension systems.All experimental settings are following the experiments in Fig. 2. The "Dim" is the dimension of the dynamical system (Spring-chain).We also consider the student t-test with a significant level of 0.05 for NeurVec.If the p-value P < 0.05, there exists a significant difference between NeurVec and the corresponding traditional numerical solvers and it also means NeurVec can not maintain consistent accuracy.
• Continual model update.Neural networks may sometimes have inaccurate predictions when encountering abnormal situations.Therefore, we may need to maintain and update NeurVec regularly to learn from new data.The simplest strategy is to retrain the network from scratch, but it needs considerable computing resources to train and memory resources to store the data.To address such a problem, we can fine-tune NeurVec via incremental learning 76 for a small amount of newly collected training data to achieve low-cost model updates.
• Numerical solvers and cutting-edge problems.In this paper, we consider four forward numerical solvers and four kinds of ODE problems.NeurVec can be extended to other types of numerical solvers, such as backward methods and implicit methods.These methods are mainly aimed at improving the simulation accuracy, but the simulation cost for one step may be large.For more intricate cutting-edge problems, NeurVec may require further tuning of training parameters, neural network architecture, and data processing, as explored in Refs. 47,49,77 .Nevertheless, the NeurVec framework we introduce is general and easily extendable.
We believe NeurVec has the potential to be adopted for more complex dynamical systems.

Methods
Datasets.We summarize the simulated training and testing datasets used in the main text in Table 3.For each dataset we integrate with the step size δ using the numerical solver over N random initializations.We obtain the discrete solutions every δ up to the model time T .Next, we sample the solution every time interval η (η is a multiple of δ).Numerical solvers.We introduce four numerical solvers used in our paper: the Euler method, improved Euler method, and 3rd-and 4th-order Runge-Kutta methods.These solvers have different S(f, u n , ∆t n ) in the iterative formula (2).( 1) The Euler method can be written as

Problem
12/27 Rational Input Output (2) The improved Euler method 78,79 can be written as The improved Euler method is a numerical method that uses an implicit trapezoidal formulation to improve the accuracy of the Euler method.Specifically, it first takes a one-step Euler method to obtain ũn+1 = u n + ∆t n f (u n ) and then uses the implicit trapezoidal formula to obtain Even though the improved Euler method requires more computation compared with the Euler method, it has a higher accuracy with a local error of O(∆t 3 ).( 3) The mth-order Runge-Kutta method can be written as For the 3rd-order RK method, the number of stages m = 3, the coefficients λ 1 = λ 3 = 1 6 and λ 2 = 2 3 , and the update rule . For the 4th-order RK method, m = 4, λ 1 = λ 4 = 1 6 and λ 2 = λ 3 = 1 3 .The update rule . Runge-Kutta methods, especially the 4th-order Runge-Kutta method, are widely used in engineering and natural sciences.The Euler method and improved Euler method can also be seen as special Runge-Kutta methods.When using the larger order m in Eq. ( 19), we need to compute iteratively a series of K i , i = 1, 2, ..., m, increasing the computation cost for each step.Implementation details of NeurVec.We use a fully connected neural network to model NeurVec in Eq. ( 5).The fully connected feed-forward neural network is the composition of L nonlinear functions: where , σ is a nonlinear activation function, for example, a rectified linear unit (ReLU) σ(x) = max{x, 0} or hyperbolic tangent function tanh(x), d is the dimension of the state, and N is the batch size.Each h ℓ is referred to as a hidden layer, where N ℓ is the width of the ℓth layer.In this formulation, θ := {W a , W ℓ , b ℓ : 1 ≤ ℓ ≤ L} denotes the set of all parameters in ϕ, which uniquely determines the underlying neural network.In our implementation (Fig. 7), the feed-forward neural network is of one hidden layer (L = 1) with the width N 1 = 1024.The activation function used is the rational activation function 58 defended by where a i , 0 ≤ i ≤ , respectively.We optimize the ϕ for 500 epochs with the Adam optimizer.Moreover, we use the mean square error as the objective function in Eq. ( 6), and we set the initial learning rate to 1e-3.
Simulations on Hénon-Heiles system.10) in the main text, and q x and p x in the Hénon-Heiles system, i.e., Eq. ( 8) in the main text.The statistical difference of θ and θ among fine step size, coarse step size and NeurVec with coarse step size is not large.Yet that of q x and p x is large.10) in the main text.Unlike the results about r and ṙ in Fig. 5a of the main text, the θ and θ generated by different step sizes have similar trends, although there are minor differences among them.However, in (b) q x and p x in the Hénon-Heiles system, i.e., Eq. ( 8) in the main text, there is a consistent observation for the solutions under different step sizes with Fig. 5b in the main text.

23/27
The distribution of training data for 1-link pendulum.
In Fig. S3, we present the training data for the analysis in Figure 6 in the main text, where the background illustrates the difference between the leading error term and NeurVec.The relatively significant differences near the boundary might be attributed to the limited availability of data that encompasses those boundary regions.Furthermore, in regions with limited training data, such as when θ = −1.0 and θ = 1.5, the high accuracy demonstrated by our NeurVec also indicates to some extent the generalization capability.Application of NeurVec on the Kuramoto-Sivashinsky equation.
We intend to showcase NeurVec's potential in handling more intricate scenarios by presenting its numerical results on the Kuramoto-Sivashinsky equation (KSE), thus demonstrating its potential to more challenging systems.We consider KSE on a domain with periodic boundary conditions.The equation is given by: Here, L represents the domain length, and we choose L = 2π √ 0.085 .To solve this equation numerically, we use the exponential time-differencing fourth-order Runge-Kutta method (ETDRK4) as a forward scheme S. The spatial variable x is discretized with a resolution of 48 uniformly spaced points on the interval [0, L].
Our training dataset consists of a single trajectory with an initial state given by u(x, 0) = cos((π/L)x).This trajectory is generated with a time step size of ∆ F t = 2 × 10 −2 , and the model time reaches T = 200k.For testing, we have reference simulations with a smaller time step size of 5 × 10 −3 .
NeurVec is trained using the simulations with ∆ F t = 2 × 10 −2 to learn the error correction of ETDRK4 with a coarse time step size ∆ NV t = 1.To assess accuracy, we use the mean square error (MSE) between the reference solution and the simulated solution of ∆ NV t = 1, averaged over 20 different initializations.
Fig. S4a presents the MSE, where the solid curves and the shaded areas represents the mean and one standard deviation, respectively, calculated from 20 simulation runs.We can observe that NeurVec manages to maintain short-term accuracy, closely matching the reference solution, up until the model time reaches T = 75.However, beyond this point, NeurVec starts to deviate significantly from the reference solution, yet still maintains smaller error than the simulations of ∆ C t = 1.Fig. S4b shows two examples of the visualized solution for a short period of time.Even though the highly chaotic nature of KSE, NeurVec can still maintain the short-term accuracy as the reference one.

Figure 1 .
Figure 1.The structure of NeurVec.a, Deployment of NeurVec for fixed step size solver.During offline training, NeurVec learns from the solutions of high accuracy to characterize the error caused by the use of a coarse step size.During inference, NeurVec applies to the solver and integrates with the coarse step size.b, Comparison of the tradition solver and solver armed by NeurVec.Left: Forward loop of traditional solver with step size ∆t.Right: Forward loop of NeurVec with step size k∆t.

Figure 2 .
Figure 2. Application of NeurVec on different numerical solvers.a, The mean square error (MSE) between the reference solution and the numerical solutions with different configurations (step size ∆C t = 1e − 1, ∆ C t = 2e − 1, ∆ F t = 1e − 3, and NeurVec (∆ NV t = 2e − 1)) on the spring-chain system, averaged over 10.5k simulations.The reference solution is obtained by using the 4th-order Runge-Kutta with step size 1e − 4. NeurVec is trained on the simulations of ∆ F t = 1e − 3. The numerical solution of ∆ C t = 2e − 1 becomes unstable using Euler or the improved Euler formula, while NeurVec (∆ NV t = 2e − 1) achieves a stable solution with accuracy comparable to that of ∆ F t = 1e − 3. b, The (normalized) runtime of the numerical solver with ∆ F t = 1e − 3, ∆ C t = 2e − 1, and NeurVec (∆ NV t = 2e − 1).The runtime of ∆ C t = 2e − 1 is benchmarked to one unit.NeurVec (∆ NV t = 2e − 1) has accuracy similar to that of ∆ F t = 1e − 3 and is over 150 times faster.

Figure 3 .
Figure 3. Performance comparison on the Hénon-Heiles system.a, MSE with varied time on the time interval [0, 42.5] under different configurations (step size ∆ C t = 5e − 1, ∆ C t = 2e − 1, ∆ F t = 1e − 3, and NeurVec (∆ NV t = 5e − 1)).b, MSE with varied time on the longer time interval [450, 500].The upper and lower bounds of the light color indicate the maximal and minimal error, respectively.c, The (normalized) runtime of the numerical solver with ∆ F t = 1e − 3, ∆ C t = 5e − 1, and NeurVec (∆ NV t = 5e − 1).d, Energy error with varied time on [0, 50].e, We provide three examples of trajectories projected on the coordinates (q x , q y , p x ), and the corresponding energy error on [0, 50].More examples can be found in the supplementary material.

Figure 4 .
Figure 4. Performance comparison on the pendulum systems.a, The pendulum systems.b, MSE with varied time on the time interval [0, 8.5].c, MSE with varied time on the longer time interval [25, 50].d, The runtime comparison among the numerical solvers with ∆ F t = 1e − 3, ∆ C t = 1e − 1, and NeurVec (∆ NV t = 1e − 1).e, The energy error of two pendulum systems.

Figure 5 .
Figure 5.Time series histogram.We visualize the time series histogram of a test set for variables (a) r and ṙ in the elastic pendulum(10) and (b) q y and p y in the Hénon-Heiles system(8).The color represents the number count (the lighter color and the larger frequency).The solutions generated by the solver with coarse step size exhibits a trend of convergence to a specific value, while solutions of the solver with fine step size are distributed within a range, and NeurVec with coarse step size produces a histogram visually identical with that of the solver with fine step size.This result shows that NeurVec has a more accurate solution than does the solver with fine step size.

2 2Figure 6 .
Figure 6.Numerical error visualization on the phase space of 1-link pendulum.a, The square sum of leading (second-order) error term of the Euler method, denoted by R EL .The error is calculated by using the true dynamics f .b, The square sum norm of error compensation learned by NeurVec, denoted by R NV .c, The difference between the leading error term and NeurVec.

Figure 7 .
Figure 7. Neural network structure in NeurVec.NeurVec consists of two linear transformations layers (yellow) and one nonlinear activation function layer (red), which is a feed-forward neural network of one hidden layer (L = 1) with the width N 1 = 1024.

Fig
Fig. S1 displays more examples of simulations on Hénon-Heiles system, an implement to Fig. 3e in the main text.

Figure S1 :
Figure S1: More performance comparison on the Hénon-Heiles system.We provide five additional examples of trajectories projected on the coordinates (q x , q y , p x ).NeurVec (∆ NV t = 5e − 1) produces the most orbits similar to ∆ F t = 1e − 3.

Fig
Fig. S2 displays the time series histogram of θ and θ in the elastic pendulum, i.e., Eq. (10) in the main text, and q x and p x in the Hénon-Heiles system, i.e., Eq. (8) in the main text.The statistical difference of θ and θ among fine step size, coarse step size and NeurVec with coarse step size is not large.Yet that of q x and p x is large.

Figure S2 :
Figure S2: Additional experiments for time series histogram.We visualize the time series histogram of the test set for variables in the elastic pendulum and Hénon-Heiles system.The color represents the number count (the lighter color and the larger frequency).(a), θ and θ in the elastic pendulum, i.e., Eq. (10) in the main text.Unlike the results about r and ṙ in Fig.5aof the main text, the θ and θ generated by different step sizes have similar trends, although there are minor differences among them.However, in (b) q x and p x in the Hénon-Heiles system, i.e., Eq. (8) in the main text, there is a consistent observation for the solutions under different step sizes with Fig.5bin the main text.

Figure S3 :
Figure S3: The training data (orange dots) used in 1-link pendulum.The background represents the difference between the leading error term and NeurVec.

a b Example 1 Example 2 Figure
Figure S4: a, The mean square error (MSE) between the reference solution and the numerical solutions with different configurations (step size ∆ C t = 1 and NeurVec with ∆ NV t = 1) for KSE.b, we provide two examples of different initializations and plot the absolute differences.

Table 1 .
Comparison of evaluation time and theoretical error (based on the Euler scheme) among the numerical solvers with fine (∆t) or coarse step size (k∆t) and NeurVec (k∆t).Here ϵ denotes the ratio of the runtime of NeurVec to that of scheme S for one step.The fixed step size solvers suffer from the accuracy-speed trade-off on the step size.NeurVec learns from the solutions of fine step size.Then NeurVec is applied to the solver and integrates with the coarse step size (k∆t) but still has the theoretical accuracy of the fine step size, O(∆t).

Table 3 .
Summary of the simulated datasets used in the main text.