Financial markets’ deterministic aspects modeled by a low-dimensional equation

We ask whether empirical financial market data (Financial Stress Index, swap and equity, emerging and developed, corporate and government, short and long maturity), with their recently observed alternations between calm periods and financial turmoil, could be described by a low-dimensional deterministic model, or whether such a description requires a stochastic approach. We find that a deterministic model performs at least as well as one of the best stochastic models, but may offer additional insight into the essential mechanisms that drive financial markets.

In particular, an obvious asset of the deterministic model is that it operates in a low-dimensional subspace of models defined by the form of the deterministic equation, whereas the stochastic approach lacks such a limiting element. Whereas the ARIMA-GARCH approach deals separately with mean and volatility, in our two-dimensional deterministic approach the first vector component represents an instantaneous activity that is functionally linked to an underlying trend represented by the second vector component. With this, the Rulkov map equation offers insight into how the driving sub-processes are entwined and into the effects to which, e.g., a change of one of its few parameters may lead. By this, our explicit deterministic approach opens a new perspective towards understanding financial markets. In what follows, we present a description of our novel modeling approach, followed by a review of the classical modeling tools that we compare with. Then, we recall the most popular measures for model comparison, before we display and finally discuss the results obtained, including the performance obtained from two more recent models. Our main focus will, however, not be a competition between the stochastic and the deterministic approach, but to show that there is a substantial deterministic part in the data that can be exploited for the explanation and prediction of financial market data.

Modeling tools
Generalized Rulkov map. During the recent turbulent years, ten Asian emerging stock markets showed non-linear serial dependence, and mean reversion was observed in real exchange rates of developing and emerging market economies 28 , in four USA stock market indices 29 , and in USA short-rate indices 30 ; jumps are generally common in financial time series 31 . Similarly, biological neurons exhibit a two-phase response: an activation phase, characterized by a transient and the presence of fast oscillations, and a subsequent plateau phase characterized by more regular oscillations. The two-dimensional Rulkov map family of biological neurons 26 combines a one-dimensional fast subsystem (x) and a one-dimensional slow subsystem (y), where the latency to a stimulating event depends on the fast-slow dynamics. While this interplay between fast and slow dynamics is a general mechanism underlying pattern formation, the Rulkov map offers, through the involved parameters, the possibility to adapt the general mechanism to specific features exhibited by the process modeled. In particular by its recursive nonlinear and mean-reverting properties, the Rulkov map may be expected to be highly suitable for the modeling of financial time series, in particular regarding the occurrence of data clusters, heteroskedasticity, mutual synchronization, chaos and regularization of bursts of activity across the markets, after-shock asset classes, and more 32 . In our application, we will optimize the Rulkov family model to best reproduce the processes and events observed in our data.
To provide a first general insight into the nature of Rulkov maps, we start from a two-dimensional non-linear recurrence of the form

(1) x_{t+1} = f_α(x_t) + γ y_t + δ, with f_α(x) = α / (1 + (2x)^n), α, γ, δ, x ∈ R and n an even integer.

Such a setting is widely used for modelling short-time effects like pulses and shocks in time series. Vector component y captures a moving-average trend implemented as

(2) y_{t+1} = β y_t − µ x_t + η, with β, µ, η, y ∈ R.

Combining Eq. (1) with Eq. (2) leads to

(3) x_{t+1} = α / (1 + (2x_t)^n) + γ y_t + δ, y_{t+1} = β y_t − µ x_t + η.

This version, which differs slightly from the original Rulkov version, is the map that we will use for our financial time series modeling; its two-dimensional form takes equal care of the short-term (x) and the average (y) prediction horizon (the two main subjects of interest in financial modeling), which provides the basis for an optimal modeling framework. Of importance are the following characteristics hosted by the function f_α.

Proposition 2.1 For any even integer n, the function f_α(x), defined in (1), satisfies: (i) x = 0 is a global maximum, with f_α(0) = α; (ii) f_α is even, i.e., f_α(−x) = f_α(x); (iii) lim_{n→+∞} f_α(x) = α χ(x), where χ(x) denotes the indicator function of the interval (−1/2, 1/2).

Proof Without loss of generality we set α > 0. The derivative f′_α(x) = −2αn(2x)^{n−1} / (1 + (2x)^n)^2 for any x ∈ R. This equals zero only for x = 0, is larger than zero for x < 0 and smaller than zero for x > 0. Hence, x = 0 is a global maximum for f_α. (ii) is immediate, since n is even. (iii) is the consequence of lim_{n→+∞} 1/(1+(2x)^n) = χ(x), for any even integer n ∈ N and α, γ, δ, x ∈ R.

Figure 1 exhibits that the choice of n = 2 provides the x-sensitivity that, according to our tests, is desirable; this will be our choice in the following, unless specified otherwise.
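As an illustration, the fast-slow recursion described above can be sketched in a few lines. The nonlinearity f_α(x) = α/(1 + (2x)^n) and all parameter values below are illustrative assumptions for demonstration, not the calibrated values used for the financial data.

```python
def f(x, alpha, n=2):
    """Fast-subsystem nonlinearity f_a(x) = alpha / (1 + (2x)^n); n even."""
    return alpha / (1.0 + (2.0 * x) ** n)

def rulkov_step(x, y, alpha, beta, gamma, mu, eta, delta, n=2):
    """One iteration of the two-dimensional map: fast component x, slow component y."""
    x_next = f(x, alpha, n) + gamma * y + delta
    y_next = beta * y - mu * x + eta
    return x_next, y_next

def simulate(T, x0=0.0, y0=0.0, alpha=4.0, beta=0.99, gamma=1.0,
             mu=0.01, eta=0.0, delta=0.0, n=2):
    """Iterate the map T times from (x0, y0); returns the two trajectories."""
    xs, ys = [x0], [y0]
    for _ in range(T):
        x, y = rulkov_step(xs[-1], ys[-1], alpha, beta, gamma, mu, eta, delta, n)
        xs.append(x)
        ys.append(y)
    return xs, ys
```

The slow y-equation feeds back into x with weight γ, while x drags y through the mean-reverting term −µx, which is what produces the burst-and-plateau alternation exploited below.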
For our demonstrations, we will mainly use single Rulkov maps; to simulate the interaction between markets, asset classes, and more, Rulkov maps labelled by indices i and j were coupled according to

x^i_{t+1} = f_α(x^i_t) + γ y^i_t + δ + κ (x^j_{t−τ} − θ), y^i_{t+1} = β y^i_t − µ x^i_t + η,

where the coupling term κ (x^j_{t−τ} − θ) describes a time-lagged interaction between the time series i and j (and likewise for interchanged indices). The idea underlying this specific coupling is that such an interaction will mostly occur on a short time-scale. α, β, γ, µ, η, δ, κ, θ are system parameters, whereas τ coincides with the time-series specific embedding dimension used for the reconstruction of the phase-space from the one-dimensional time series 2,33 . If a single time series is our object of study, we will use the specification "single"; if we consider the coupling among two time series, we will use the specification "coupled". For our applications of the Rulkov map, the involved parameters must be chosen appropriately. For this, we extract the x and y initial conditions from the beginning of the data (for details see "Forecasting" section), then calibrate the parameters by running over the rest of the data a robust nonlinear least squares regression, using an iteratively re-weighted least squares algorithm that, at each iteration, recomputes the weights based on the residuals from the previous iteration. This process continues until the weights converge. For details see Holland et al. 34 .
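The iteratively re-weighted least squares idea behind the calibration can be sketched on a deliberately simple problem. Here we robustly fit a single slope parameter with Huber-type weights; the one-parameter model, the cutoff c, and the toy data are hypothetical stand-ins for the full nonlinear Rulkov calibration described in Holland et al.

```python
def irls_slope(xs, ys, c=5.0, iterations=20):
    """Robustly fit a in y = a*x, re-weighting residuals at each iteration."""
    w = [1.0] * len(xs)
    a = 0.0
    for _ in range(iterations):
        # weighted least-squares closed form for the slope
        a = (sum(wi * xi * yi for wi, xi, yi in zip(w, xs, ys))
             / sum(wi * xi * xi for wi, xi in zip(w, xs)))
        # recompute Huber weights from the current residuals
        w = [1.0 if abs(yi - a * xi) <= c else c / abs(yi - a * xi)
             for xi, yi in zip(xs, ys)]
    return a

# a clean linear relation y = 2x, contaminated by one large outlier at x = 6
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
ys[5] += 50.0
```

Plain least squares on these data gives a slope near 2.8; the re-weighting progressively suppresses the outlier and returns a slope close to the true value 2.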

ARIMA-GARCH.
To assure validity, we will mainly benchmark our optimized deterministic Rulkov-map approach against an optimized ARIMA-GARCH model (an optimized Auto-Regressive Integrated Moving Average (ARIMA) model combined with an optimized Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model), probably the currently most used and best-documented stochastic approach 23 . A less detailed comparison with the performance of more advanced versions of the stochastic approach is furnished in the context of the conclusions from this comparison.
In financial data, potential serial correlations deserve special attention. Their effects can be eliminated by differencing the original non-stationary time series once, or if required, a higher number of times (the procedure must, however, not be pushed too far, as higher (> 2) orders tend to distort the desired frequency components 35 ). In our tests up to order four, only the first order yielded improvements. The resulting stationary time series can then be treated by the Box-Jenkins methodology (ARMA or ARIMA) 23,36 .
The AR part of ARIMA contributes a linear extrapolation based on past observations, and the MA part contributes the influence of past errors. In short, the optimal ARIMA(p,d,q)-model satisfies the equation

(1 − Σ_{i=1}^{p} ϕ_i L^i)(1 − L)^d x_t = (1 + Σ_{i=1}^{q} θ_i L^i) ε_t,

where x denotes the data values, ε the errors (white noise), L the time-shift operator needed for a d-fold differencing towards stationary signals, p the lag-level considered, and q the order of the moving-average model. ϕ_i are the linear autoregressive coefficients and θ_i the moving-average coefficients that apply to past errors. From this, the GARCH part then evaluates the variance of the process conditioned on the observed past data values and variances. The ARIMA(p,d,q)-model providing the optimal description of the data can then be used for forecasting. For example, given an ARIMA(2,1,0)-model, we obtain X_t = µ + ϕ_1(x_{t−1} − µ) + ϕ_2(x_{t−2} − µ) + ε_t for the stationary time series, where µ is the mean of time series x_t (expected to be small), and ε_t is a white-noise term with zero mean and constant variance. The GARCH part of the approach estimates the volatility based on a convex linear combination of its past values, its long-average persistence term, and the exponentially weighted moving average of past data; GARCH(a,b) models take care of contributions that lag further back in time, a with respect to the generating time series, b with respect to the past variances. The determination of the {p, d, q} and {a, b} orders is made by using the Bayesian information (BIC) and the Akaike information (AIC) criteria 37 . The parameter-optimized ARIMA-GARCH model will be denoted in the following by ARIMA-GARCH * . When in the following we mention the calibration of an algorithm, we mean that its parameters are optimized over the entire data set or a part of it.
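The ARIMA(2,1,0) example above can be made concrete as a toy one-step forecast: difference the series once, apply the AR(2) recursion with ε_t set to its zero mean, then undifference. The coefficient values in the test are illustrative, not fitted ones.

```python
def difference(series):
    """First differences of a level series."""
    return [b - a for a, b in zip(series, series[1:])]

def ar2_forecast(diffs, phi1, phi2, mu=0.0):
    """One-step-ahead AR(2) forecast of the differenced series,
    X_t = mu + phi1*(x_{t-1}-mu) + phi2*(x_{t-2}-mu), with eps_t = 0."""
    return mu + phi1 * (diffs[-1] - mu) + phi2 * (diffs[-2] - mu)

def arima210_forecast(series, phi1, phi2, mu=0.0):
    """Forecast the next level value by undifferencing the AR(2) forecast."""
    d = difference(series)
    return series[-1] + ar2_forecast(d, phi1, phi2, mu)
```

For the level series [1, 2, 4, 7] the differences are [1, 2, 3]; with ϕ_1 = 0.5 and ϕ_2 = 0.25 the forecast difference is 2.0, hence the forecast level is 9.0.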

Forecasting.
A forecasting process consists of multiple estimations over a rolling window, i.e., in distinction to the parameter calibration over the full data set, we calibrate the ARIMA-GARCH and the Rulkov map on a window, or partition, of the data set. Given our weekly data, we take the full time series and consider only the first 52 data points (i.e., a trading year). We extract over that partition the best ARIMA-GARCH or Rulkov model, and we forecast the next value (i.e., the 53rd data point). In the next step, we consider the interval starting with the second data point and ending with the 53rd data point, and redo the calibration; with the calibrated model we obtain a new forecast for the 54th data point, etc. For the ARIMA-GARCH process, this involves two calibrations (for x and y separately); for the Rulkov map, it needs to be done only once. A standard cost-effective zero-level benchmark against which more sophisticated model predictions can be compared is the naïve model 36 . This model forecasts a value equal to last period's observed value, resulting in an in-sample one-step-ahead algorithm.
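The rolling-window scheme just described can be sketched generically: calibrate on a 52-point window, forecast one step ahead, slide the window by one. The forecaster passed in is exchangeable; here we use the naïve model (forecast = last observed value), the zero-level benchmark mentioned in the text, as a placeholder for the calibrated Rulkov or ARIMA-GARCH models.

```python
def rolling_forecast(series, window, forecaster):
    """One-step-ahead forecasts over a rolling window.

    forecaster maps a training window (list) to a single forecast value;
    preds[k] is the forecast of series[window + k].
    """
    preds = []
    for start in range(len(series) - window):
        train = series[start:start + window]
        preds.append(forecaster(train))
    return preds

def naive(train):
    """Naive benchmark: next value equals the last observed value."""
    return train[-1]
```

With 60 weekly observations and a 52-point window, this yields 8 out-of-window forecasts, each produced by a model recalibrated on the most recent year of data.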

Elements of model results comparison.
To compare modeling approaches, we use a set of metrics.

Normalized root mean squared error. The root mean squared error (RMSE), defined as RMSE = sqrt( (1/N) Σ_{h=1}^{N} e_h^2 ), measures the accuracy of a model in terms of goodness of fit, where e_h denote the residuals between the observations and the simulations over N time steps. RMSE, however, depends on the scale of the observed data and is outlier-sensitive. For this reason, we will apply the normalized root mean squared error (NRMSE): NRMSE = RMSE / (x_max − x_min), where x_max denotes the maximum value and x_min the minimum value of the observed sample data. Values well below unity indicate a good fit.
Relative mean absolute error. The relative mean absolute error is RMAE = MAE (1) / MAE (2) , where MAE (1) is the mean absolute error of the model of interest and MAE (2) is the error of some benchmark model 38 . A value greater than unity indicates that the chosen model performs worse than the benchmark.
Mean absolute percentage error. To measure the prediction quality of a model, often the mean absolute percentage error (MAPE) is used, MAPE = (100/N) Σ_{h=1}^{N} |e_h / x_h|, where e_h denotes the residuals between the observations x_h and their simulations over N time steps 39 . Table 1 exhibits the generally accepted MAPE accuracy levels 40 .
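The three error measures above translate directly into code; a minimal sketch, with observations and simulations given as equal-length lists:

```python
import math

def nrmse(obs, sim):
    """RMSE normalized by the observed range (x_max - x_min)."""
    n = len(obs)
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / n)
    return rmse / (max(obs) - min(obs))

def rmae(obs, sim_model, sim_benchmark):
    """MAE of the model of interest relative to the MAE of a benchmark."""
    def mae(sim):
        return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)
    return mae(sim_model) / mae(sim_benchmark)

def mape(obs, sim):
    """Mean absolute percentage error (requires nonzero observations)."""
    return 100.0 / len(obs) * sum(abs((o - s) / o) for o, s in zip(obs, sim))
```

Note that MAPE is undefined whenever an observation is exactly zero, one reason the literature treats it as a forecast-quality measure rather than a universal fit criterion.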
Dynamic time warping distance. To scrutinize how close models are, the dynamic time warping distance (DTWD) can be used. This technique finds the optimal alignment between two time series by stretching or shrinking one time series along the time axis, and then evaluates the Euclidean distance between the two 41,42 . This commonly used technique in finance and econometrics 32,43,44 copes with slightly temporally varying responses in one of the time series 45 . This feature has made DTWD the predominant tool in speech recognition (for determining whether two waveforms represent the same spoken phrase) and one of the standard methods used for classification problems in data mining 46,47 . In our application, DTWD will include information on how much the predicted time series is shifted against the data modeled. In the comparison between different modelings, such effects may play a role, but are expected to stay close to the ordinary Euclidean distance between time series. DTWD results obtained for the artificial system displayed in Fig. 2 will later be used to assess how strongly distinct modeling approaches of financial market data differ.
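A minimal sketch of the classic dynamic-programming formulation of DTWD, with absolute difference as the local cost (for one-dimensional series this coincides with the Euclidean point distance):

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    D[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j];
    each step may advance one series, the other, or both (stretch/shrink).
    """
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # shrink b
                                 D[i][j - 1],      # shrink a
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

A time-warped copy of a series (one value repeated) has distance zero, which is exactly the tolerance to temporally shifted responses exploited in the text.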

Data investigated
In Table 2 we provide an overview of the empirical financial time series used for our numerical evaluations. Common standard dataset abbreviations will be used throughout; the abbreviation BAMLEM refers to the Emerging Markets Corporate Plus Index Total Return Index Value.

Rulkov-map modeling of the financial stress index
In finance, the fast x and the slow y components correspond to the actual data and their memory, respectively, where the latter is approximated by the exponentially weighted moving average (EWMA). Since the two components follow well-separated time scales, this suggests characterizing the instability of each component individually by means of a one-dimensional Lyapunov exponent, although the underlying system is in fact two-dimensional. The Financial Stress Index STLFSI2 48 , see Fig. 3, will be our primary focus in financial data modeling. It is obtained from seven interest rates, six yield spreads, and five other indicators by means of principal component analysis (PCA), as each of these variables captures aspects of financial stress. Accordingly, whenever the level of financial stress in the economy changes, the data series are expected to change in a coherent manner. Positive stress values indicate higher-than-average stress in financial markets, while negative values indicate lower-than-average levels of stress. Our financial stress data cover the 2008-2009 financial crisis, as well as the ongoing turmoil in financial markets stemming from the fear and uncertainty associated with the COVID-19 pandemic. As the best ARIMA-GARCH models in terms of Bayesian information criterion (BIC) and Akaike information criterion (AIC) were those integrated of order one 36 , we calibrated all models investigated (Rulkov map, Naive model and the optimized ARIMA-GARCH model) over the first differences of the STLFSI2.
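The slow component y can be obtained from the data as an EWMA; a sketch, where the decay factor λ = 0.94 (the RiskMetrics-style convention) is an illustrative assumption, not the value used for the calibration:

```python
def ewma(series, lam=0.94):
    """Exponentially weighted moving average: y_t = lam*y_{t-1} + (1-lam)*x_t.

    The decay factor lam is assumed here for illustration; larger lam means
    a longer memory, i.e., a slower y-component.
    """
    y = [series[0]]
    for x in series[1:]:
        y.append(lam * y[-1] + (1.0 - lam) * x)
    return y
```

Feeding each observed x together with its EWMA y into the map then gives the {x, ŷ} input/output pairs described below.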
Below we report the accuracy of data modeling achieved by our Rulkov map approach, then the latter's potential for making forecasts, and we finally present evidence of chaotic behavior in the data. For the Rulkov map, an {x, y}-pair provided the input, from which a corresponding output {x,ŷ} was generated, where for every x-value, an exponentially weighted STLFSI2 moving average yielded the corresponding y-value. In the stochastic approach, after the two optimized ARIMA-GARCH * models were determined (one for the x and one for the y-coordinate), similarly each {x, y}-pair was fed into the model, generating a corresponding output {x,ŷ} (for model details, cf. "ARIMA-GARCH"). In Fig. 4 top row, we show how the Rulkov map, given the properly calibrated parameters of Table 3, replicates the bursting dynamics of STLFSI2. In the second row, we display the measured error and, in the third row, we compare this to the one obtained from a calibrated ARIMA-GARCH * model (Table 4). The greater simplicity of the model, with a substantial difference in the number of involved parameters, leads to a speed-up of the Rulkov calibration process by a factor of 20, compared to the ARIMA-GARCH calibration process. Among the remaining five financial market time series, there are twenty possible couplings. For space reasons, we will only display the most representative ones; the omitted ones follow their behavior closely. In their treatment, we occasionally observed correlations in the residuals, which led us into further investigations of integration, mean reversion, and ARCH effects, suggesting that some financial time series might not properly satisfy the finiteness condition imposed on the first four distribution moments 49 .

Chaotic or random data?
To check whether expected chaotic aspects are indeed in the data, and if so, how well they are reflected by the model, we followed the standard approach and evaluated, for the data and the obtained simulations, the maximal Lyapunov exponents (MLE) 2,3,50-55 . The MLE were calculated on the first differences, following Orlando 62,63 , see also 59,64 .

Table 4. ARIMA-GARCH* (2,1,2)-(1,1) coefficients for variable x and ARIMA-GARCH* (2,1,1)-(2,1) coefficients for variable y, listed according to the AR-I-MA-G-ARCH parts of the algorithm (where two of the constants appearing in the ARIMA as well as in the GARCH parts have been added, leading to 16 (instead of 20) parameters to be listed).

Generally, the interpretation of such estimates must, however, be taken with care, as differencing can be seen as filtering, which will generally affect the results 22,65 . Moreover, a difficulty of the method is to arrive at the required saturation of the curves for increasing embedding dimensions in the presence of noise. Therefore, the exhibited results are more of a corroborating nature. The Hurst exponent (H) of a single time series is directly related to the fractal dimension D by the relation D = 2 − H 66-68 . For the first difference of STLFSI2 and the corresponding Rulkov map, we found H = 0.40 for the x-dynamics and H = 0.69 for the y-dynamics, respectively. For various time series, Table 5 lists the MLE, AE, D and H, together with the values obtained from the corresponding Rulkov map approach, evidencing a good agreement between the model and the empirical time-series data properties. For all these computations, the investigated time series were of a sufficient length.
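A rescaled-range (R/S) estimate of the Hurst exponent, from which D = 2 − H then follows, can be sketched as below. The window sizes and the averaging scheme are simplified illustrative choices, not the exact estimator used for the results in Table 5.

```python
import math
import random

def rescaled_range(x):
    """R/S statistic of one window: range of cumulative mean-deviations
    divided by the window's standard deviation."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    z, cum = [], 0.0
    for d in dev:
        cum += d
        z.append(cum)
    r = max(z) - min(z)
    s = math.sqrt(sum(d * d for d in dev) / n)
    return r / s if s > 0 else 0.0

def hurst(series, window_sizes=(8, 16, 32, 64)):
    """Hurst exponent H as the slope of log(R/S) versus log(window size)."""
    logs_n, logs_rs = [], []
    for w in window_sizes:
        rs_vals = [rescaled_range(series[i:i + w])
                   for i in range(0, len(series) - w + 1, w)]
        rs = sum(rs_vals) / len(rs_vals)
        logs_n.append(math.log(w))
        logs_rs.append(math.log(rs))
    # least-squares slope of the log-log points
    n = len(logs_n)
    mx, my = sum(logs_n) / n, sum(logs_rs) / n
    return (sum((a - mx) * (b - my) for a, b in zip(logs_n, logs_rs))
            / sum((a - mx) ** 2 for a in logs_n))
```

White noise should give H near 0.5 (anti-persistent series like the x-differences above fall below it), while a strongly trending series gives H near 1, i.e., D near 1.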

Details of model comparison
The autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the residuals for state variables x and y of the Rulkov map and of the ARIMA-GARCH * model (Fig. 5) indicate that the first-order differencing is more effective in removing autocorrelation from the residuals for the Rulkov than for the ARIMA-GARCH * approach. This could be seen as a sign of the greater explanatory power of the Rulkov map: although sometimes correlated residuals are features of the research design or intrinsic to the studied phenomenon, "…the inclusion of correlated residuals in latent-variable models is often regarded as a statistical sleight of hand, if not an outright form of cheating" 69 . STLFSI2 is a combination of other indices and, according to Lo et al. 70,71 , there is significant evidence of positive serial correlation for weekly and monthly holding-period index returns. Generally, weekly and monthly stock returns are weakly negatively correlated, whilst daily, weekly and monthly index returns are positively correlated 72 . This effect has been explained by positive cross-autocorrelations across individual securities across time, and it may happen that it cannot be removed completely 72 . The absence of a unit root, however, has been confirmed using the KPSS test 73 .
To determine how close the Rulkov and the ARIMA-GARCH * approaches are, we applied the commonly used DTWD measure. We first calibrated DTWD on the simple example of a perturbed sawtooth wave that was replicated upon adding Gaussian white noise of variable strength (cf. Fig. 2). Figure 6 shows that the warped residuals of the Rulkov and the ARIMA-GARCH * x-component are almost identical and closely follow the original timeline (in the sense that the warping did not essentially compress or stretch the dependent time series). For the y component, we obtained corroborating results (Table 6; the corresponding plots are withheld for space reasons). With the only exception of the DTWD for x j of the IBOV, our distances in Table 6 are below the DTWD generated by Gaussian white noise. From this, we conclude that the distinction between the ARIMA-GARCH * and the Rulkov map approaches should be of, or even below, the order of the influence of noise.
In Table 6 we report the RMAE accuracy of the Rulkov map, of the Naive, and of the ARIMA-GARCH * models, the NRMSE of the Rulkov map and of the ARIMA-GARCH * model, and their DTWD. As may be expected, in all instances the Rulkov map performs substantially better than the Naive model, whereas the comparison between the Rulkov map and ARIMA-GARCH * is much closer. For forecasting, we run a robust estimation with the iteratively re-weighted least squares algorithm 34 which, at each iteration, recomputes the weights based on the residuals from the previous iteration. This process continues until the weights converge. Using MAPE, the generated forecasts are compared with those of the ARIMA-GARCH * model, see Table 7 and Fig. 7. The comparison provides evidence of fits of similar quality, both for the process and its trend, confirming that a deterministic approach performs as well as a stochastic approach. Finally, we include a summary of results that we have obtained from the application of two more recent approaches to our data. The first is the G2++ model, a popular two-factor Gaussian model where the state process is given by the sum of two correlated Gaussian factors plus a deterministic function properly chosen to fit the real data. The model is analytically tractable, and its (conditional) distribution, as well as its moments, is given in closed form. For the aforementioned reasons, Gaussian models like this G2++ model are useful in practice (see 74 , Chapter IV). The second, the JOU-model, is a (mean-reverting) Ornstein-Uhlenbeck process with Poisson jumps 75,76 . The results in terms of error for the models calibrated with our optimized parameters evidence that, in terms of RMAE and DTWD, and of MAPE for the forecasts, the Rulkov map performs better than the two alternatives, although the JOU approach's results were quite close to the Rulkov and ARIMA-GARCH* results (details of the comparison are available upon request from the authors).
The calibration time of the JOU approach was roughly thrice that of the Rulkov approach; G2++ required only slightly more time than Rulkov.

Conclusions
The low DTWD and the almost coinciding residuals of the Rulkov map and ARIMA-GARCH* exhibit a similar quality of the two approaches, indicating that the deterministic chaotic model performs as well as an ultimately refined stochastic one. Furthermore, the low-dimensional deterministic Rulkov map approach suffers less from residual autocorrelation, and hosts an improved NRMSE and a better DTWD property. A main shortcoming of the ARIMA-GARCH is that it generally does not offer a convenient handle towards understanding its construction and the meaning of the resulting coefficients. The explicitness and simplicity of the Rulkov map opens, in contrast, the door to a deeper understanding of the drivers of the market dynamics. For a full understanding of the consequences that a variation of the involved parameters will entail in the context of the real-world financial markets, a more explicit mapping of the model parameters to financial market parameters will, however, be needed. This would require the building of a kind of conceptual real-world model, helped by the Rulkov equation, but replacing it. Once the essential market parameters have been identified in this way, the degree of precision of such a model can be assessed by using a recent high-level model-testing approach 77 .
Regarding the long-standing debate whether we should consider financial time series as stochastic or rather chaotic 78 , our direct modeling approach yields evidence for a stronger chaotic data component than one would probably expect. While both elements seem to be present, the chaotic component offers several options for understanding and, finally, monitoring financial processes, if the appropriate time-scale is chosen. In particular, our approach may be expected to host an improved explanatory power for events of interest, such as shocks that emerge as endogenous, instead of the exogenous nature supposed by stochastic models 24 . Given this situation, we are happy to show that a substantial contribution of determinism is found in the data 4,79 . For forecasting, the latter is sufficient; the question whether the whole of the data should be termed chaotic or stochastic is, seen in this perspective, of a more 'academic' nature 1,63 , where the contradicting arguments available may indicate a lack of appropriate technical or theoretical descriptors for dealing with this type of data.
More generally, the situation we deal with here is similar to that encountered in neuroscience, where initial ideas of simple chaotic or of power-law behavior characterizing the data had to be replaced by more complicated and, in particular, scale-dependent characterizations 80 . This shifts the question towards determining exactly at what time-scales the two processes, low-dimensional chaos and stochasticity, respectively, are dominant, and towards identifying the horizon over which a deterministic prediction will yield optimal forecasting results 32 . In this way, our work may open a general avenue towards a better understanding and monitoring of the essential drivers in the data, not only in financial market dynamics, but also in similar processes beyond.