Disentangling the stochastic behavior of complex time series

Complex systems involving a large number of degrees of freedom, generally exhibit non-stationary dynamics, which can result in either continuous or discontinuous sample paths of the corresponding time series. The latter sample paths may be caused by discontinuous events – or jumps – with some distributed amplitudes, and disentangling effects caused by such jumps from effects caused by normal diffusion processes is a main problem for a detailed understanding of stochastic dynamics of complex systems. Here we introduce a non-parametric method to address this general problem. By means of a stochastic dynamical jump-diffusion modelling, we separate deterministic drift terms from different stochastic behaviors, namely diffusive and jumpy ones, and show that all of the unknown functions and coefficients of this modelling can be derived directly from measured time series. We demonstrate appli- cability of our method to empirical observations by a data-driven inference of the deterministic drift term and of the diffusive and jumpy behavior in brain dynamics from ten epilepsy patients. Particularly these different stochastic behaviors provide extra information that can be regarded valuable for diagnostic purposes.

Systems under the influence of random forcing or in the presence of non-linear interactions with other systems can behave in a very complex stochastic manner [1][2][3][4] . The analysis of such systems must be based on determining characteristics and strength of fluctuating forces as well as on assessing properties of non-linear interactions. This leads to the problem of retrieving a stochastic dynamical system from measured time series. Addressing the question of how to extract a dynamical system from experimental data with a suitable analysis will provide important information on the properties of the system under consideration 3 .
A widely used non-parametric approach for the modelling of complex dynamical systems employs the conventional Langevin equation that is based on the first-and second-order Kramers-Moyal (KM) coefficientsknown as drift and diffusion terms -, and all functions and parameters of this modelling can be found directly from the measured time series 1,2 . The Langevin equation generates a continuous sample path. However, complex systems generally exhibit non-stationary dynamics that can also result in discontinuous sample paths of the corresponding time series, which might challenge the use of the Langevin equation.
There is now growing evidence 5-9 that a continuous-time modelling of time series of complex systems, which exhibit distinct characteristics such as heavy tails and occasionally sudden large jumps, should account for the presence of discontinuous jump components. Indeed, a non-parametric modelling of time series with jumps provides an attractive means of conducting research to gain intuition of such processes. Processes with jumps have been widely used to describe the random evolution of, e.g., neuron dynamics 10,11 , of soil moisture dynamics 12 , or of financial figures such as stock prices, market indices, and interest rates 13 . Nevertheless, one of the main problems in the study of discontinuous stochastic processes is estimating parameters that represent a jump and the distribution of its size. Another problem is to discriminate between variations caused by a continuous stochastic process and genuine discontinuities in the path of the process when using data sampled at discrete time intervals.
Among the many theoretical models 14 that have been developed to describe discontinuous components, jump-diffusion modelling has received great attention in the literature, since the non-parametric estimation procedure allows to study potential nonlinearities in the drift, in the diffusion, and in the intensity of the discontinuous jump component 5 . Here we show that a stochastic dynamical jump-diffusion modelling enables us to separate the deterministic drift term as well as different stochastic behaviors, namely diffusive and jumpy behavior. We will demonstrate that all of the unknown functions and coefficients of a dynamical stochastic equation that describe a jump-diffusion process can be derived directly from measured time series.
To illustrate the applicability of our approach, we investigate the stochastic behavior of epileptic brain dynamics by analysing long-lasting multi-channel electroencephalographic recordings from ten epilepsy patients. As was shown previously 15,16 , the dynamics of the seizure-generating brain area (epileptic focus)-but not of non-affected brain regions-is characterised by a non-vanishing fourth-order KM coefficient and would thus be assigned to the class of discontinuous processes. Here we relate the higher-order (≥ 4) KM coefficients to the contribution of an underlying jump process and show that during the seizure-free interval, the dynamics of the epileptic focus can be characterised as a stochastic process with small mean diffusion and small mean jump amplitudes compared to the dynamics of non-affected brain regions.

Results
From Langevin to jump-diffusion modelling. By definition, a process x(t) has continuous sample paths if the following relations for the conditional moments hold for small time increments dt 1 : Here o(dt) denotes terms of order higher than linear, which implies that o(dt)/dt vanishes in the limit dt → 0. The Langevin-modelling-based KM coefficients are defined as and can be determined directly from measured time series 3 (we use the subscript L here to distinguish these KM coefficients from the ones that we derive in the jump-diffusion modelling). Note that these KM coefficients 17 differ by a factor of 1/j! from the commonly defined 1 ones. M (1) (x, t) and M (2) (x, t) are conditional infinitesimal mean and variance parameters, which are known as the deterministic drift and the diffusion coefficients. In section Methods, we show that processes with non-vanishing and smooth M (1) (x, t) and M (2) (x, t) but with vanishing higher-order (j ≥ 3) KM coefficients generate continuous sample paths. The probability distribution function p(x, t) for processes with continuous sample paths satisfies a partial differential Fokker-Planck equation, which is of second order in the state variable x and of first order in time t. There is an equivalence between the Fokker-Planck equation and Langevin dynamics, which means that for a continuous diffusion process the dynamics of x(t) will be given by a white-noise-driven stochastic equation and has the following expression (using Itô's interpretation of stochastic integrals 1,2 ): {w(t), t ≥ 0} is a scalar Wiener process and D x t ( , ) and D x t ( , ) L (2) are the drift and diffusion coefficients, respectively. A process x(t) generated with Eq. (3) is a continuous diffusion process for D x t ( , ) and D x t ( , ) that satisfies the Lipschitz condition. We define the function f(x) to satisfy a Lipschitz condition on the interval [a, b] if there exists a constant  (dependent on both f and the interval) such that |f(x 1 ) − f(x 2 )| ≤ |x 1 − x 2 |. Any vanishing KM coefficients of order higher than two, particularly the fourth-order coefficient M (4) (x, t), will guarantee that x(t) is statistically continuous (according to the Pawula theorem a vanishing M (4) (x, t) means that all KM coefficients M (j) (x, t) for j ≥ 3 will also vanish) 1,18 .
Non-vanishing higher-order (> 2) KM coefficients, however, have been observed in various systems 3,15,[19][20][21][22] , which indicates that the corresponding measured time series do not belong to the class of continuous diffusion processes 1,2 . In order to improve modelling of such processes, we argue that if all the conditional moments of KM coefficients of order larger than two are non-vanishing, jump events should play a significant role in the underlying stochastic process. We now build a dynamical equation-a jump-diffusion equation-that is able to produce discontinuous sample paths.
A typical jump-diffusion process is given by a dynamical stochastic equation: (1) ( 2) where {w(t), t ≥ 0} is a scalar Wiener process and J(t) is a time-homogeneous Poisson jump process 23 . This process is characterised by the rate λ(x, t) and the size ξ, which we assume to be normally distributed ( ). We call σ ξ 2 the jump amplitude that may depend on the state variable x. Building upon previous works 5,17,23 , we show below that the drift and diffusion coefficients (D (1) (x, t) and D (2) (x, t)) of a jump-diffusion process can be related to the conditional moments M (1) (x, t) and M (2) (x, t). Before doing so, we illustrate the influence of jump events on higher-order conditional moments by calculating the third-and fourth-order conditional moments for the simplest case of a Poisson jump process with a constant jump rate λ (see section Methods). We find M (3) (x, t) = M (4) (x, t) = λ. This is the simplest example that provides already an intuitive meaning of higher-order KM coefficients.
Generally, the non-vanishing of the fourth-order KM coefficient could indicate that a jump-diffusion modelling is more appropriate than a Langevin-type modelling, particularly in cases where the corresponding measured Scientific RepoRts | 6:35435 | DOI: 10.1038/srep35435 time series do not belong to the class of continuous diffusion processes (see section Methods). We now discuss a non-parametric approach to estimate drift, diffusion, and jump characteristics which can be applied to both stationary and non-stationary time series in the presence of discontinuous jump components.

Theorem. A general jump-diffusion process is given by the dynamical stochastic equation Eq. (4), and all of the functions and parameters in this modelling can be found non-parametrically from measured time series by estimating the KM coefficients as:
In section Methods, we provide a proof for this theorem. In addition, we derive the conditional moments for bivariate jump-diffusion processes, and its generalisation to higher dimensions is straightforward. From this theorem, we can derive an equation for the characteristics of the jump process as follows: Using the relation ξ ξ = n n (2 )!/2 ! n n n 2 2 for the Gaussian random variable ξ and the last relation in Eq. (5), with j = 4 and j = 6, we first estimate the jump amplitude σ ξ x t ( , ) 2 and then the jump rate λ(x, t) as: Once the jump characteristics are identified, the second moment M (2) (x, t) identifies the diffusion coefficient D (2) (x, t) and the first moment gives us the estimate for the drift coefficient D (1) (x, t). To estimate the conditional moments M (j) , we can use the Nadaraya-Watson estimator [24][25][26] , which is a kernel estimator as: Reconstruction of stochastic processes with jumps. To demonstrate the validity of our approach, we estimate drift, diffusion, and jump characteristics from time series of well-known jump-diffusion processes with preset coefficients. First, we investigate a Black-Scholes process 27 in the presence of jumps 28 (jBSP). In jBSP, we consider a linear drift (D (1)  and two constant jump rates (λ = 0.1 and λ = 0.4). Finite-N synthetic data from Eq. (4) may not have jump number n j ≃ λN. Counting the jumps in each run gives the corresponding correct jump rate for each simulation. We generate synthetic time series by a numerical simulation of the corresponding dynamical equation 29 using a sampling interval Δ t = 0.001. For all estimated functions and coefficients, we obtain a very good agreement with theory (see Fig. 1). The fourth-order moment does not vanish (confirming our approach), and increases with an increasing jump rate λ. In addition, we verified that it approaches zero for λ → 0. We note that the diffusion coefficient D (2) (x) and the jump characteristics λ(x) and σ ξ = ξ x ( ) 2 2 contribute to the second-order KM coefficient (see Eq. (5)). Thus, with a Langevin-type modelling, it is not possible to separate these diffusive from jump contributions. This nonlinear example indicates the importance and physical meanings of higher-order KM coefficients.
As a second example, we investigate an Ornstein-Uhlenbeck process with jumps (jOUP). For D (1) , and constant jump rate λ = 0.6 we proceed as before, and for all estimated functions and coefficients, we obtain a very good agreement with theory (data not shown).
Given that data sampled at discrete times will always appear as a succession of jumps, even if the underlying path is continuous, we check the robustness of the jump-diffusion reconstruction using different sampling intervals [30][31][32][33] . For both, jBSP and jOUP, all estimated functions and coefficients remain almost unchanged when increasing or decreasing Δ t by a factor of 10. Eventually, we turn off the jump process and reconstruct the dynamics in terms of a jump-diffusion process. For all considered sampling intervals, we observe σ ξ x ( ) 2 to attain values close to 2 of the generated time series, which can thus be regarded the resolution limit for the jump amplitude.
Detailing the stochastic behavior of epileptic brain dynamics. Stochastic qualifiers of epileptic brain dynamics that are based on specific characteristics of the first-and second-order KM coefficients estimated using the Langevin-type modelling of electroencephalographic time series can yield valuable information for diagnostic purposes. Previous studies 15,16 have shown that particularly diffusion-coefficient-based qualifiers allow a more detailed characterisation of spatial and temporal aspects of the epileptic process in the affected and the non-affected brain hemisphere. The dynamics of the brain region that is responsible for the generation of focal epileptic seizures (epileptic focus), however, is characterised by a non-vanishing fourth-order KM coefficient, in contrast to the dynamics of other, non-affected brain regions 15 . Thus, pathological brain dynamics appear to not belong to the class of continuous diffusion processes and consequently, the Langevin-type modelling may not capture all aspects of this dynamics 15 . Due to the highly non-linear properties of pathological electroencephalographic time series 34,35 , we aim at disentangling the stochastic part of epileptic brain dynamics by explicitly estimating diffusion and jump characteristics. We begin by investigating exemplary brain dynamics from the seizure-free interval of an epilepsy patient. We consider intracranial electroencephalographic (iEEG) time series of 2000 s duration (corresponding to 4 · 10 5 data points) from within the epileptic focus and from a distant brain region (see Fig. 2). For both time series, we found evidence that our data are Markovian down to the sampling interval employing a least-squares method 3 . Next, we calculate the respective conditional moments of orders 1, 2, 4, and 6, using the Nadaraya-Watson estimator with a Gaussian kernel. For these moments, we obtain finite values in the limit of vanishing time increments, which allows us to conclude that the influence of the measurement noise can be neglected (see section Methods). In the following, we consider the interval x ∈ (− 200 mV, 200 mV) and report on averaged amplitudes of drift, diffusion, and jump characteristics (thereby we have assumed that the state space of the process is discretized and the conditional average has to be calculated separately for every x i , with binning the state variable into n b intervals).
The drift coefficients (data not shown) indicate an overall linear damping behavior, however, with small nonlinearities toward larger values of x for the dynamics within the epileptic focus 15 . The slopes of drift coefficients differ by about an order of magnitude (within epileptic focus: ≃ − 0.51 ± 0.05; distant brain region: ≃ − 3.60 ± 0.12, and from the inverse of the slope of the linear part of the drift coefficients we observe correlation time scales in the order of 1.96 ± 0.20 s for the dynamics within the epileptic focus and of 0.28 ± 0.01 s for the dynamics of the distant brain region. A comparable ratio holds for the averaged amplitudes of the diffusion coefficients, with the one for the former dynamics amounting to about a third of the one seen for the latter dynamics. The diffusion coefficient for the dynamics within the epileptic focus is largely independent of the state variable x and for the dynamics of the distant brain region, it depends parabolically on x (see Fig. 3a). Overall, these dependences on x indicate a multiplicative influence of the noise (cf. Eq. (4)).
From Eq. (5), it is known that the second conditional moment contains contributions from the jumpy dynamics, thus we further disentangle the stochastic part of epileptic brain dynamics and explicitly estimate jump characteristics. We find that the dynamics within the epileptic focus is characterised by a smaller averaged jump amplitude (within epileptic focus: 3400 ± 270 (mV) 2 , distant brain region: 4400 ± 350 (mV) 2 ; see Fig. 3b) and by a higher averaged jump rate (within epileptic focus: 18 ± 3 Hz, distant brain region: 8 ± 1 Hz; see Fig. 3c).
Next we demonstrate how the aforementioned findings translate to the long-term brain dynamics from all sampled brain regions. To this end, we perform a time-resolved analysis 36 of the patient's iEEG time series that were recorded over a period of more than eight days. We subdivide the time series into n w non-overlapping windows of size 10 5 data points, calculate estimators σ λ ∈ ξ e D { , , } (2) 2 for each window as outlined above, and by averaging over all windows we obtain their means (denoted as e) and standard deviations for each recording site (see parts a-c of Fig. 4). Eventually, we derive -separately for the sites from within the epileptic focus and for the distant sites -spatial means and standard deviations of the temporally averaged estimators (see parts d-f of Fig. 4). In general, the dynamics of the epileptic focus in this patient can be characterised by a mean diffusion coefficient ∼ D (2) whose amplitude is about half the one seen for the dynamics of the other brain areas. The same holds for the mean jump amplitude σ ξ  2 , however, the differences between affected and non-affected brain regions are more pronounced, with a mean jump amplitude in the epileptic focus amounting to about a sixth of the one of the other brain areas. The mean jump rate λ attains high values at some-though not all-recording sites capturing the dynamics of the epileptic focus, and comparably high mean jump rates can also be observed at distant sites. Consequently, differentiability between affected and non-affected brain regions with the mean jump rate λ  is insignificant. To demonstrate extendability of our observations beyond exemplary data, we now apply the aforementioned steps of a time-resolved analysis for the long-term dynamics of all sampled brain regions from all patients (see section Methods). Figure 5 summarizes our main findings (since differentiability between affected and non-affected brain regions with the mean jump rate λ  is again insignificant, we omit the display of averaged jump rates). In general, both the mean diffusion coefficient ∼ D (2) and the mean jump amplitude σ ξ  2 demonstrate a high interindividual variability, both in terms of their sizes and with respect to differentiability between affected and non-affected brain regions. In 5 (out of 10) patients (see Fig. 5c), differentiability with the mean jump amplitude clearly exceeds the one obtained with the mean diffusion coefficient (≈ 28%), and in 4 patients the opposite holds true (≈ 21%). In only one case (patient B), both estimators allow for a comparable differentiability (≈ 22%). , and jump rates λ(x) together with the respective probability distribution functions P(x) estimated from normalised iEEG time series (4 · 10 5 data points) recorded during the seizure-free interval from within the epileptic focus (red, contact TL4) and from a distant brain region (black, contact GLC6; cf. Fig. 2).

Conclusion
We presented a method-based on a stochastic dynamical jump-diffusion modelling-that allows one to separate the deterministic drift term as well as different stochastic behaviors, namely diffusive and jumpy behavior. We have argued that when the infinitesimal moments of Kramers-Moyal (KM) coefficients of order larger than two are non-vanishing, jump events should play a significant role in a stochastic process. Indeed, these higher-order KM coefficients carry information about the probability of arrival and about the features of the distribution of the jump size. Our method thus allows one to assign a physical meaning to higher-order KM coefficients in terms of jump rate and jump amplitude. We demonstrated that all of the unknown functions and coefficients of a dynamical stochastic equation that describe a jump-diffusion process can be derived non-parametrically from measured time series, both stationary and non-stationary in the presence of discontinuous jump components.
Through extensive analyses of multi-day, multi-channel electroencephalographic recordings from ten epilepsy patients we demonstrated that the dynamics of the epileptic focus can be characterised as a stochastic process with a smaller mean diffusion coefficient and a smaller mean jump amplitude as compared to the dynamics of distant brain regions. Higher-order KM coefficients thus provide extra information that can be regarded valuable for diagnostic purposes, their relationship to actual physiological/pathophysiological activities, however, would need to be investigated in future studies. Taken together, the findings of our proof-of-concept study underline the high suitability of our generalisation of the Langevin-type modelling to a  (2) ), jump amplitudes σ ξ ( ) 2 , and jump rates (λ) for iEEG time series from all recording sites (cf. Fig. 2). Data from sites within the epileptic focus are colored red. (d-f) Spatial means and standard deviations of temporally averaged diffusion coefficients ∼ D (2) . jump amplitudes σ ξ  2 , and jump rates λ  (epileptic focus: red; distant sites: gray).
Scientific RepoRts | 6:35435 | DOI: 10.1038/srep35435 jump-diffusion modelling to improve the characterisation of pathological brain dynamics beyond a continuous process. We expect that our approach also contributes to a detailed understanding of stochastic dynamics of other complex systems.

Continuity of processes generated by the Langevin equation.
In terms of the conditional probability distribution, a continuous process x(t) satisfies the following continuity condition, given some δ > 0 1 (8) is sometimes called Lindeberg's condition 1 . A violation of this continuity condition indicates that the smoothness of the process is not given any more and that discontinuous events like jumps are present in the process.
Here we prove that the processes with non-vanishing and smooth and with vanishing M (j) (x, t) with j ≥ 3 have continuous sample paths. In this case, the Kramers-Moyal expansion for the conditional probability distributions reduces to a Fokker-Planck equation. The short-term propagator p(x, t + dt|x′ , t) for the corresponding Fokker-Planck equation is given by 1 , (1) 2 (2) Figure 5. Stochastic qualifiers of brain dynamics for each epilepsy patient. Spatial means and standard deviations of temporally averaged diffusion coefficients ∼ D (2) (top) and jump amplitudes σ ξ  2 (middle), calculated separately for recordings from within the epileptic focus (black bars) and from distant sites (gray bars). Relative differentiability Δ (bottom) between non-affected (suffix d) and affected (suffix f) brain dynamics using diffusion coefficients ( ; red bars) and jump amplitudes ( Now we show that the conditional probability distribution function p(x, t + dt|x′ , t) satisfies the continuity condition Eq. (8), which means that it describes a continuous stochastic process. Assume that KM coefficients D x t ( , ) L (1) and D x t ( , ) are smooth and not changing dramatically over a short time interval and by substituting p(x, t + dt|x′ , t) from Eq. (9) we have, The first term (A) in Eq. (10) can be written as: (2) ( ( , )d )/ 4 ( , )d 2 . erfc(x) can be written in terms of the error function as . Expanding the expression for A in terms of t gives, (1) 2 (2) 2 (1) 4 In the limit d t → 0 + we find that approaches zero as A similar analysis shows that the second term (B) in Eq. (10) approaches zero in the limit dt → 0 + . We therefore conclude that the short-time propagator of the Fokker-Planck equation satisfies the continuity condition Eq. (8).

Non-vanishing higher-order Kramers-Moyal coefficients and the continuity condition.
For general processes, the continuity condition (Eq. (8)) can be written in terms of conditional moments M (j) (x, t) with j ≥ 1 and some δ > 0 as 37 . To do so, let us consider the order-j conditional moment as, • epileptic focus: comprises iEEG data from electrode contacts within the clinically defined epileptic focus; • distant: comprises iEEG data from all other electrode contacts.