Attractor Ranked Radial Basis Function Network: A Nonparametric Forecasting Approach for Chaotic Dynamic Systems

The curse of dimensionality has long been a hurdle in the analysis of complex data in areas such as computational biology, ecology, and econometrics. In this work, we present a forecasting algorithm that exploits the dimensionality of data in a nonparametric autoregressive framework. The main idea is that the dynamics of a chaotic dynamical system consisting of multiple time series can be reconstructed using combinations of different variables. This nonlinear autoregressive algorithm uses reconstructed multivariate attractors as the inputs of a neural network to predict the future. We show that our approach, the attractor ranked radial basis function network (AR-RBFN), provides better forecasts than other model-free approaches as well as univariate and multivariate autoregressive models using radial basis function networks. We demonstrate this for simulated ecosystem models and a mesocosm experiment. By taking advantage of dimensionality, we show that AR-RBFN overcomes the shortcomings of noisy and short time-series data.


A. Simulated Data
The simulated data used in this work is generated from ecosystem simulations of a three-species food chain 1, a three-species coupled logistic model 2, a flour beetle model 3, and a five-species model 4.

Three-species coupled logistic model
The

Flour beetle model
The chaotic behavior of an insect population, Tribolium castaneum, is modeled through the following equations for the different life stages (larvae, pupae, and adults) of the flour beetle, as suggested by Dennis et al. 3:
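A sketch of the larvae-pupae-adult (LPA) model in its commonly cited form is given below. The symbol names are assumptions and may differ from the notation of Dennis et al. 3: $L_t$, $P_t$, and $A_t$ denote the numbers of larvae, pupae, and adults at time step $t$, $b$ is the larval recruitment rate per adult, $c_{el}$ and $c_{ea}$ are the rates of egg cannibalism by larvae and adults, $c_{pa}$ is the rate of pupal cannibalism by adults, and $\mu_l$ and $\mu_a$ are the larval and adult mortality fractions. No parameter values are implied here.

```latex
\begin{aligned}
L_{t+1} &= b\,A_t \exp\!\left(-c_{el} L_t - c_{ea} A_t\right),\\
P_{t+1} &= L_t \left(1 - \mu_l\right),\\
A_{t+1} &= P_t \exp\!\left(-c_{pa} A_t\right) + A_t \left(1 - \mu_a\right).
\end{aligned}
```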

Mesocosm plankton community data
The data drawn from an 8-year mesocosm experiment on a plankton community isolated from the Baltic Sea have been shown to represent the dynamics of a chaotic system. We use the transformed abundance data for Rotifers, Calanoid Copepods, Picocyanobacteria, and Nanoflagellates from the supplementary material of Benincà et al. 5. In Benincà et al., the raw data are interpolated by cubic Hermite interpolation to obtain equidistant time intervals of 3.35 days and then rescaled by a fourth-root transformation to suppress sharp peaks. The transformed series are 794 samples long.
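The two preprocessing steps described above (resampling onto an equidistant grid, then a fourth-root rescaling) can be illustrated with a minimal sketch. It is not the code used to produce the published data: the function names `hermite_resample` and `fourth_root` are hypothetical, and the tangents of the cubic Hermite interpolant are estimated with simple finite differences, which is an assumption because the text does not specify how they were chosen.

```python
import bisect


def hermite_resample(times, values, step):
    """Resample irregularly spaced samples onto an equidistant grid
    using piecewise cubic Hermite interpolation."""
    n = len(times)
    # Tangent at each sample from a finite difference over its neighbours
    # (assumed choice; one-sided at the two ends).
    slopes = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        slopes.append((values[hi] - values[lo]) / (times[hi] - times[lo]))

    count = int((times[-1] - times[0]) / step + 1e-9) + 1
    grid = [times[0] + j * step for j in range(count)]
    out = []
    for t in grid:
        # Index of the interval [times[i], times[i + 1]] containing t.
        i = min(max(bisect.bisect_right(times, t) - 1, 0), n - 2)
        h = times[i + 1] - times[i]
        s = (t - times[i]) / h
        # Cubic Hermite basis functions on the unit interval.
        h00 = 2 * s**3 - 3 * s**2 + 1
        h10 = s**3 - 2 * s**2 + s
        h01 = -2 * s**3 + 3 * s**2
        h11 = s**3 - s**2
        out.append(h00 * values[i] + h10 * h * slopes[i]
                   + h01 * values[i + 1] + h11 * h * slopes[i + 1])
    return grid, out


def fourth_root(values):
    """Fourth-root rescaling; negative interpolation undershoots are clipped to 0."""
    return [max(v, 0.0) ** 0.25 for v in values]
```

With sampling times in days, a call such as `grid, series = hermite_resample(times, abundance, 3.35)` followed by `fourth_root(series)` mirrors the order of operations described in the text.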

C. Manifold Reconstruction
As described in Ye et al. 2, the number of possible 3-dimensional manifold reconstructions from combinations of the variables and their time lags of 0, 1, and 2 is

$$m = \binom{nl}{E} - \binom{n(l-1)}{E},$$

where $n$ is the number of variables in the dynamic system, $l$ is the number of possible lags for each variable, and $E$ is the embedding dimension. The first term is the number of manifolds formed by choosing $E$ of the $nl$ possible coordinates, and the second term is subtracted to eliminate the invalid manifolds that consist only of lagged coordinates. A valid manifold is one with at least one unlagged coordinate. For example, the number of valid manifold reconstructions for a 3-variable and a 4-variable system is 64 and 164, respectively.

Unlike Ye et al. 2, who suggest $k = \sqrt{m}$, we found that for the multiview radial basis function network (MV-RBFN) the best number of top reconstructions to incorporate is $k = n$, where $n$ is the number of variables in the interconnected dynamic system. This is because for any $n$-variate system, setting $k$ equal to $\sqrt{m}$ ($\sqrt{m} \geq n$) yields too many hidden units in the hidden layer of the radial basis function network. Particularly when the time series is noisy, too many hidden units in the hidden layer of the neural network lead to overfitting of the training samples and poor generalization 6. In this work, we choose = 1 and = 3 for the ecosystem simulated data and mesocosm experiment data.
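The counting argument above (all ways of choosing the embedding coordinates, minus the choices that use only lagged coordinates) can be checked numerically. The sketch below is illustrative only: the function names and the argument names `n` (variables), `l` (lags per variable), and `E` (embedding dimension) are assumptions, and the brute-force enumeration is included purely as a cross-check of the closed-form count.

```python
from itertools import combinations
from math import comb


def count_valid_manifolds(n, l, E):
    """Closed-form count: C(n*l, E) minus the combinations with no unlagged coordinate."""
    return comb(n * l, E) - comb(n * (l - 1), E)


def enumerate_valid_manifolds(n, l, E):
    """List every E-coordinate combination containing at least one lag-0 coordinate.

    Each coordinate is a (variable index, lag) pair.
    """
    coords = [(var, lag) for var in range(n) for lag in range(l)]
    return [c for c in combinations(coords, E)
            if any(lag == 0 for _, lag in c)]


if __name__ == "__main__":
    for n in (3, 4):
        print(n, count_valid_manifolds(n, 3, 3), len(enumerate_valid_manifolds(n, 3, 3)))
```

For three lags and a 3-dimensional embedding, both functions give 64 for a 3-variable system and 164 for a 4-variable system, matching the values quoted in the text.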

D. Multiview Embedding (MVE)
Multiview Embedding (MVE) is a forecasting algorithm that is based on Simplex Projection 7.
Simplex projection is a nearest neighbor forecasting technique that involves tracking the forward evolution of nearby points in an embedding, i.e., similar past events are used to forecast the future.
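A minimal sketch of this nearest-neighbor idea for a single time series is shown below. It follows the commonly described form of Simplex Projection (a delay embedding, the E + 1 nearest neighbors, and exponential distance weights); the weighting scheme, the function names, and the default parameters are assumptions for illustration and are not taken from this paper.

```python
import math


def delay_embed(series, E, tau=1):
    """Return (time index, delay vector) pairs for a scalar series."""
    start = (E - 1) * tau
    return [(t, [series[t - j * tau] for j in range(E)])
            for t in range(start, len(series))]


def simplex_forecast(series, E=3, tau=1):
    """One-step-ahead forecast of the value that follows the last sample."""
    points = delay_embed(series, E, tau)
    _, target = points[-1]
    # Library: every earlier state whose successor has been observed.
    library = points[:-1]
    dists = sorted((math.dist(vec, target), t) for t, vec in library)
    neighbors = dists[:E + 1]
    # Exponential weights scaled by the distance to the closest neighbor
    # (assumed weighting; the floor avoids division by zero).
    d_min = max(neighbors[0][0], 1e-12)
    weights = [math.exp(-d / d_min) for d, _ in neighbors]
    # Track each neighbor one step forward and average the outcomes.
    total = sum(w * series[t + 1] for w, (_, t) in zip(weights, neighbors))
    return total / sum(weights)
```

The forecast is simply the weighted average of where the most similar past states went next, which is the sense in which similar past events are used to forecast the future.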
MVE also involves reconstructing valid manifolds from combinations of variables and time lags 2. In contrast to Simplex Projection, where the forecast is based on a weighted average of nearest neighbors, MVE examines the top $k$ reconstructions and uses the single nearest neighbor from each to perform forecasting. For instance, the forecast of variable $y$ is as follows:

$$\hat{y}(t+1) = \frac{1}{k} \sum_{i=1}^{k} y\big(nn_i(t) + 1\big),$$

where $nn_i(t)$ is the time index of the nearest neighbor of the current state in the $i$-th reconstruction.

[Figure caption, partial] (d-f) Same as (a-c) but for 100 randomly sampled libraries of length 50. (g-i) Same as (a-c) but for 100 randomly sampled libraries of length 100. The solid black curves are the average mean absolute errors for the attractor ranked RBFN approach for the top manifold reconstructions. The solid green curves are the average mean absolute errors for the univariate RBFN approach, and the solid pink curves are the average mean absolute errors using the multivariate model constructed from the variable combination with the best in-sample prediction skill in the RBFN autoregressive approach. The dotted lines are the upper and lower quartiles.
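The multiview embedding forecast described in this section can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: all function names are hypothetical, the reconstructions ("views") are ranked by a leave-one-out in-sample mean absolute error (the ranking criterion is an assumption), and `data` is assumed to be a list of equally long series, one per variable.

```python
import math
from itertools import combinations


def build_views(n_vars, n_lags, E):
    """All E-coordinate views with at least one unlagged (lag 0) coordinate."""
    coords = [(v, lag) for v in range(n_vars) for lag in range(n_lags)]
    return [c for c in combinations(coords, E)
            if any(lag == 0 for _, lag in c)]


def state(data, view, t):
    """Point on the reconstructed manifold at time t for the given view."""
    return [data[v][t - lag] for v, lag in view]


def nearest_neighbor(data, view, t, candidates):
    """Time index among `candidates` whose state is closest to the state at t."""
    target = state(data, view, t)
    return min(candidates, key=lambda s: math.dist(state(data, view, s), target))


def in_sample_error(data, view, target_var, times):
    """Leave-one-out mean absolute error of the single-neighbor forecast."""
    errs = []
    for t in times:
        nn = nearest_neighbor(data, view, t, [s for s in times if s != t])
        errs.append(abs(data[target_var][nn + 1] - data[target_var][t + 1]))
    return sum(errs) / len(errs)


def mve_forecast(data, target_var, k, n_lags=3, E=3):
    """Average the successors of the nearest neighbors in the top-k views."""
    n_vars, T = len(data), len(data[0])
    times = list(range(n_lags - 1, T - 1))  # states with an observed successor
    views = build_views(n_vars, n_lags, E)
    ranked = sorted(views, key=lambda v: in_sample_error(data, v, target_var, times))
    preds = []
    for view in ranked[:k]:
        nn = nearest_neighbor(data, view, T - 1, times)
        preds.append(data[target_var][nn + 1])
    return sum(preds) / k
```

A call such as `mve_forecast([x, y, z], target_var=0, k=3)` would forecast the next value of `x` from the three best-ranked views. The brute-force neighbor search is quadratic in the library length, which is acceptable for short libraries such as those of length 50 or 100 mentioned in the figure caption.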