Online real-time learning of dynamical systems from noisy streaming data

Recent advancements in sensing and communication facilitate obtaining high-frequency real-time data from various physical systems like power networks, climate systems, biological networks, etc. However, since the data are recorded by physical sensors, it is natural that the obtained data is corrupted by measurement noise. In this paper, we present a novel algorithm for online real-time learning of dynamical systems from noisy time-series data, which employs the Robust Koopman operator framework to mitigate the effect of measurement noise. The proposed algorithm has three main advantages: (a) it allows for online real-time monitoring of a dynamical system; (b) it obtains a linear representation of the underlying dynamical system, thus enabling the user to use linear systems theory for analysis and control of the system; (c) it is computationally fast and less intensive than the popular extended dynamic mode decomposition (EDMD) algorithm. We illustrate the efficiency of the proposed algorithm by applying it to identify the Van der Pol oscillator, the chaotic attractor of the Henon map, the IEEE 68 bus system, and a ring network of Van der Pol oscillators.


Introduction
The field of dynamical systems was born with the works of Sir Issac Newton [1].Since its humble beginnings, it has developed into a matured branch of pure mathematics with applications to most branches of science and engineering.For almost three hundred years, the approach to studying dynamical systems was model-based, in which the system under study is modeled from first principles.However, this approach has severe limitations when studying systems like power networks, biological networks, stock markets, etc., which are inherently highly nonlinear and typically large-scale.As such it is almost impossible to model these dynamical systems from first principles.The way to circumvent this problem is to use data-driven techniques, in which the governing equations of motion are learned directly from time-series data of the states of the system.
Of the various data-driven learning methods, in recent years, the Koopman operator approach [2,3,4,5,6,7,8,9] has been used extensively for the data-driven learning of dynamical systems.Compared to machine learning (ML) methods, the Koopman operator framework has multiple advantages, with the major factor being the fact that the amount of training data required by Koopman operator approaches is far less compared to ML methods.Furthermore, the Koopman operator is a linear operator that generates an equivalent linear representation of the nonlinear system, which allows us to use concepts from linear systems theory for the analysis and control design for nonlinear systems [10,11].Given a dynamical system, instead of looking at the evolution of trajectories, the Koopman operator governs the evolution of functions on the state space under the dynamical map.The main advantage of this framework, as mentioned earlier, is the fact that the Koopman operator is a linear operator, but the trade-off is that it is an infinite-dimensional operator.Hence, given any general nonlinear system, the corresponding Koopman operator generates an infinite-dimensional linear system, which is an exact representation of the Consider a discrete-time dynamical system where T : X ⊂ R N → Z is assumed to be at least of class C 1 .Associated with the dynamical system (1) is the Borel-σ algebra B(X) on X and the vector space M(X) of bounded complex-valued measures on X.
With this, two linear operators, namely, the Perron-Frobenius (P-F) and the Koopman operator, can be defined as follows [2]: Definition 1 (Perron-Frobenius Operator).The Perron-Frobenius operator P : M(X) → M(X) is given by where δ T (x) (A) is the stochastic transition function that measures the probability that point x will reach the set A in one-time step under the system mapping T .
Definition 2 (Invariant measures).Invariant measures are the fixed points of the P-F operator P that are also probability measures.Let μ be the invariant measure then, μ satisfies If the state space Z is compact, it is known that the P-F operator admits at least one invariant measure.
Definition 3 (Koopman Operator).Given any h ∈ F, U : F → F is defined by where F is the space of function (observables) invariant under the action of the Koopman operator. g < l a t e x i t s h a 1 _ b a s e 6 4 = " z Z v c J R n e I U 3 5 9 F 5 c d 6 d j 0 V r w c l n j u E P n M 8 f t w e M 4 A = = < / l a t e x i t > X < l a t e x i t s h a 1 _ b a s e 6 4 = " k p Both the P-F and the Koopman operators are linear operators even if the underlying system is nonlinear and while the analysis is made tractable by linearity, the tradeoff is that these operators are typically infinite-dimensional.In particular, the P-F and Koopman operators lift a dynamical system from a finitedimensional space to generate an infinite-dimensional linear system on the space of distributions and functions respectively.

Robust Koopman Operator
The Koopman operator is infinite-dimensional, but for computational purposes, one has to compute a finite-dimensional approximation of the Koopman operator from time-series data.The most commonly used approach for computing the finite-dimensional approximation is known as Extended Dynamic Mode Decomposition (EDMD) [4].

Extended Dynamic Mode Decomposition
be M data points from a N -dimensional system x t → T (x t ), which evolves on R N .Let be the set of vector-valued Koopman observables such that each Then the finite-dimensional approximation K ∈ R K×K is obtained as the solution of the optimization problem min where G and A are given by (2).

Robust Koopman Operator
The standard approach to compute the Koopman operator from time-series data is by solving the optimization problem (3).However, in presence of noise in the data, (3) leads to erroneous Koopman operators and one has to explicitly take into account the noise in the data, while computing the Koopman operator.We assume that the measurements are of the form where x m t is the measured value of the states at time t, x t is the true value of the states at time t and δx t is the noise in the measurement at time t.We further assume ∥ δx i ∥≤ Λ for some Λ > 0, that is δx i ∈ ∆, where ∆ is a compact subset of R N .The uncertainty in the data acts as an adversary which tries to maximize the residual error of the optimization problem which computes the Koopman operator, and hence to obtain the Koopman operator K for uncertain data, a robust optimization problem is formulated as the following min − max optimization problem [26,6]. where with K, G δ , A δ ∈ C K×K and ∆ is the bounded uncertainty set.
The robust optimization problem (5), is in general non-convex because the cost J may not be a convex function of δ.
Proposition 4. The optimization problem (5) can be approximated as where U is a compact set in R K×K .
Proof.From Taylor series expansion we have, the first derivative of Ψ(x) at x i .Hence, Hence, δG belongs to a compact set U 1 .Similarly, one can show A δ ≈ A + δA and δA belongs to a compact set U 2 .Letting U = U 1 ∪ U 2 , proves the proposition.
With this, we have the following theorem, which derives a regularized least-squares optimization problem whose solution is the Robust Koopman Operator.
is equivalent to the following optimization problem where λ > 0 is a constant dependent on the bound of the noise.
This follows from the fact that For two matrices A and B, let A ⊗ B denote the Kronecker product of A and B. Let K be the vector form of K and let A and δA be the vector forms of A and δA, respectively.Then, the min-max optimization problem can be written as where I K is the K × K identity matrix.Writing G ⊗ I K as Ĝ and δG ⊗ I K as δ Ĝ, the optimization problem (10) can be written as where be the worst-case residual.Then, where Then, Hence, from ( 13) and ( 15), the worst case residual is Since, K is a constant, the K that minimizes r in ( 16) is the same K that minimizes

Online Learning of Robust Koopman Operator
The Koopman operator is generally computed by solving the optimization problem (9) or directly using the formula where I is the identity matrix of appropriate dimensions.Note that for computing the Koopman operator one needs to use the entire dataset and hence when a new data point streams in the Koopman operator needs to be recalculated from scratch using the new larger data set.However, this requires the inversion of a matrix, which is computationally expensive, especially for data-sets obtained from large-dimensional systems.This necessitates developing a recursive algorithm for Robust Koopman operator computation.In this section, we describe an algorithm that computes the Koopman operator recursively and thus reducing the computational cost.Let be M data points obtained from simulation of a dynamical system x → T (x) or from an experiment, where be the data points in the lifted space (R K ), where the points x i and y i are mapped by the Koopman dictionary functions Ψ.Let be the Koopman operator obtained by using the M data points.Now, a new data point (x M +1 , y M +1 ) is acquired.The problem is to update the Koopman operator K M to K M +1 , without explicitly computing the inverse of (G M +1 + λI).
Note that (20) can be rewritten as where and Ψ(X i ) and Ψ(Y i ) are the ith columns of Ψ(X M ) and Ψ(Y M ) respectively.Now, as the new data point y M +1 = T (x M +1 ) streams in, the updated Koopman operator Hence, using the Matrix Inversion Lemma, we have Ĝ−1 Moreover, Hence, from ( 23), Equation (27) gives the formula for updating the Koopman operator as new data streams in, without explicitly computing the inverse at every step, thus reducing the computational cost and hence improving efficiency.

Initialization of the Algorithm
Equation ( 27) gives the updated Koopman K M +1 operator in terms of quantities computed from the previous time step.Note that the updated Koopman operator K M +1 depends on an inverse, namely, Ĝ−1 M .Hence, for computing the Koopman operator K 1 , one needs to initialize both Ĝ0 and A 0 .One potential way out of this situation is to compute the Koopman operator K q using the initial q data points (x i , y i ), i = 1, 2, • • • , q, q < M as K q = Ĝ−1 q A q and use the corresponding Ĝq and A q to compute the updated Koopman operators K n , n > q.
For practical applications, we set Ĝ0 = λI K , A 0 = 0 K , where λ > 0, I K is the K × K identity matrix, and 0 K is the K × K zero matrix.
Remark 6. Choosing the initialization parameter δ can be tricky and usually one should run the algorithm multiple times, with different δ on a given training dataset and choose the one which has the lowest error for some validation dataset.
Algorithm 1 Algorithm for online Koopman Operator computation using streaming data.

Fix the dictionary functions
As a new data point (x M +1 , y M +1 ) streams in, lift the data point to R K using the dictionary function Ψ. 4. Update A M and Ĝ−1 M as

Simulation Results
In this section, we demonstrate and discuss the properties of the proposed framework for iterative learning of dynamical models from noisy time-series data.

Van der Pol Oscillator
Consider a noisy Van der Pol oscillator, whose equation of motion is given by ẍt where x ∈ R is the position variable, µ ≥ 0 is the damping co-efficient, σ > 0 is a constant, and ξ t ∈ R 2 is independent and identically distributed Gaussian noise of zero mean and unit variance.For simulation purposes, we chose µ = 0.8 and σ = 0.2 and data were sampled at an interval of 0.01 seconds.The phase portrait of the noisy Van der Pol oscillator is shown in Fig. 2. We considered a single trajectory data, from a random initial condition, as the training data and used 40 radial Gaussian basis functions as the Koopman observables.Fig. 3 shows that the RR-EDMD (Recursive Robust Extended Dynamic Mode Decomposition) algorithm recovers the stable limit cycle of the Van der Pol oscillator as new data-points stream in and the Koopman operator gets updated.In particular, Fig. 3(a) shows that the eigenvector corresponding to the unit eigenvalue of the Koopman operator obtained after 500 data points have streamed in has partially identified the stable limit cycle.However, as more data points stream in, the identified region of the limit cycle grows and in Fig. 3(b) we find that with 2000 data points, the complete limit cycle has been identified.However, the biggest advantage of the RR-EDMD formulation is the fact that it is much faster compared to EDMD.To compare the computational cost of RR-EDMD and EDMD, we present in Fig. 4 the computation times of both the algorithms for learning the Koopman operator as new data-points stream in.It can be seen the computation time of the RR-EDMD algorithm varies linearly with the number of iterates, whereas for EDMD it varies almost quadratically.This establishes a clear advantage of the RR-EDMD algorithm over EDMD and it is this fact that facilitates the use of RR-EDMD framework for online real-time learning of dynamical systems from streaming data.

IEEE 68 Bus System
We now discuss the application of the proposed RR-EDMD algorithm to a power system.We consider synthetic time-series measurements corresponding to the 68-bus power network with 16 generators shown in Fig. 5.The synthetic measurements are generated using GridSTAGE [27], a power system toolbox-based simulation platform that emulates PMU measurements corresponding to load changes or faults or adversarial actions.For the real-time learning of power system dynamics, random and large load changes are created using GridSTAGE.Although the synthetic time-series measurements obtained through GridSTAGE are free of noise, the practical PMU measurements will not be due to the effect of communication uncertainties.Hence, random noise is artificially added to the synthetic PMU measurements to mimic practical use cases and ensure the proper applicability of the proposed real-time learning framework.The noise-corrupted PMU measurements are shown in Fig. 6.Since the operating point of the underlying power network keeps changing due to random load changes or faults or due to the change in the operating conditions, the dynamics learnt from PMU data will not be valid for all operating conditions and need to be updated.Therefore, the proposed recursive learning method is particularly useful here for continuously learning the system dynamics from PMU data.
The PMU measurements record data at a rate of 40-80 measurements per second.In this study, we fix its rate to 50 measurements per second and each PMU measures several system states such as frequency, rate of change of frequency, voltage magnitude, voltage angle, current flow through the connected branches are available.Furthermore, to capture the noise in the real-time PMU measurements, we manually add the noise to the measurements such that the signal-to-noise ration (SNR) of the resultant data is 85dB.To demonstrate the proposed robust recursive Koopman operator to understand the system evolution from noisy data in real-time, we perform the analysis considering the important states, frequency and voltage magnitude.
Considering the initialization given in algorithm 1, a robust Koopman operator corresponding to the IEEE 68 bus system representing the underlying load changes is computed in real-time from streaming noisy PMU data.As the frequency and voltage magnitudes are considered at each bus, there is a total of 136 states and we use 150 Gaussian radial functions as observables, to compute the robust Koopman using RR-EDMD.Fig. 7 (a), (b) and (c) respectively show the eigenvalues of the learnt Robust Koopman operator after 100, 500 and 1000 measurements.Moreover, we can observe that the robust Koopman updated with every new measurement yields a stable representation (all eigenvalues lying inside the unit circle).Essentially, applying the proposed RR-EDMD, the power system dynamics is learnt using measurements from a few seconds.However, the Koopman operator learning is a continuous process as the PMU data streams in and the power system properties such as stability margins can be studied at any given time-point.Furthermore, we compared the computational time of RR-EDMD against that of EDMD, shown in Fig. 8.It can be seen that the proposed RR-EDMD outperforms the standard EDMD method.

Scalability of RR-EDMD with Dimension of the Underlying System
In the previous two examples, we compared the computation times of the proposed RR-EDMD method and the existing EDMD method and found that RR-EDMD outperforms EDMD.In this subsection, we analyze how the scalability of the RR-EDMD algorithm as the number of states in the underlying system increases.In particular, we considered a ring network (Fig. 9(a)) of Van der Pol oscillators and recorded the computation times of the Koopman operator using the RR-EDMD algorithm as the number of oscillators in the system increases.The equation of motion of the ith oscillator is given by ẍi where x i (t) is the position variable of the ith oscillator, x(t) = [x 1 (t), • • • , x n (t)] ⊤ , L i is the ith row of the network Laplacian, µ ≥ 0 is the damping constant, σ > 0 is a constant and ξ i (t) ∈ R 2 is an i.i.d.Gaussian noise of zero mean and unit variance.We varied the number of oscillators from 50 to 500, so that the number of states of the system varied from 100 to 1000.However, it is to be noted that the Koopman operator is computed in the space of the Koopman observables.In this example, we used 15 radial Gaussian basis functions per oscillator, so that the size of the Koopman operator varied from 750 × 750 to 7500 × 7500.For simulation purposes, we used the same µ = 0.8 and σ = 0.2 for all the oscillators.Fig. 9(b) shows how the computation time varies with the number of states.On the face of it, it seems that the computation time varies almost quadratically with the number of states.However, it should be noted as the number of states increases, so does the number of dictionary functions, leading to the quadratic nature of the plot.However, if the number of dictionary functions remains constant, the computation time varies linearly with the number of states (Fig. 4).Since the RR-EDMD algorithm scales almost linearly with the dimension of the system, it can be used for the identification of large-dimensional systems in real time as well.

Conclusions
In this paper, we address the important problem of learning the dynamics of a general dynamical system, from noisy measurements in real time.In particular, we use the Koopman operator framework for datadriven learning, so that one obtains a linear system representation of the underlying dynamical system.We  < l a t e x i t s h a 1 _ b a s e 6 4 = " L Z 0 l c h g u 9  resort to Robust Koopman operator estimation to mitigate the effect of measurement noise and we propose an iterative algorithm for recursively learning the Robust Koopman operator from streaming data.We show that the proposed framework is substantially faster than the existing EDMD algorithm, thus making it practical to learn the dynamics in an online fashion and in real time.We also demonstrate the efficiency of the proposed algorithm by applying it to identify the Van der Pol oscillator, the IEEE 68 bus system and a ring network of Van der Pol oscillators.

Figure 1 :
Figure 1: Perron-Frobenius and Koopman operators corresponding to a dynamical system.

Theorem 5 .
The optimization problem min K max δG,δA∈U

Figure 2 :
Figure 2: Phase portrait of noisy Van der Pol oscillator.

Figure 3 :
Figure 3: (a) The Koopman spectrum, computed from 500-time points data identifies the stable limit cycle of the Van der Pol oscillator partially.(b) After 2000 iterations, the Koopman spectra recover the entire stable limit cycle.

Figure 4 :
Figure 4: Comparison of computation time for recursive learning using normal EDMD and RR-EDMD.

Figure 5 :
Figure 5: One line diagram representation of IEEE 68 bus network.

Figure 6 :
Figure 6: Noisy time-series data of frequency and voltage obtained from PMUs.

2 <
m 9 d R h D M 4 h 0 v w 4 B o a c A 9 N a A G B J 3 i G V 3 h z p s 6 L 8 + 5 8 L E c L T r 5 z C n / g f P 4 A w l G S G g = = < / l a t e x i t > V l a t e x i t s h a 1 _ b a s e 6 4 = " k V 9 Z s f r P g E l M m 0 k n A d o N c e p Y W F I = " > A A A B 9 H i c b V D L S g M x F L 1 T X 7 W + q i 7 d B I v g q s x o U Z d F N y 4 r 2 A e 0 Q 8 m k m T Y 0 k 4 x J p l C G f o c b F 4 q 4 9 W P c + T d m 2 l l o 6 4 H A 4 Z x 7 u S c n i D n T x n W / n c L a + s b

4 < 5 <Figure 9 :
Figure 9: (a) Network of Van der Pol oscillators.(b) Computation time for recursive learning using RR-EDMD, as the number of states in the system increases.