Introduction

Many systems in nature and society possess critical thresholds at which the system undergoes an abrupt and significant change in dynamics1,2. In physiology, the heart can spontaneously transition from a healthy to a dangerous rhythm3; in economics, financial markets can form a ‘bubble’ and crash into a recession4; and in ecology, ecosystems can collapse as a result of their interplay with human behaviour5,6. These events, characterised by a sudden switch to a different dynamical regime, are referred to as critical transitions.

Critical transitions can be better understood with bifurcation theory7,8, a branch of mathematics that studies how dynamical systems can undergo sudden qualitative changes as a parameter crosses a threshold (a bifurcation). Many bifurcations are accompanied by critical slowing down—a diminishing of the local stability of the system—which results in systematic changes to properties of a noisy time series, such as its variance, autocorrelation and power spectrum9,10,11. These properties can be approximated analytically in the presence of different bifurcations10,12,13,14, and a corresponding observation in data can be used as an early warning signal (EWS) for the bifurcation11. Systematic changes in variance and lag-1 autocorrelation have been observed prior to transitions in climate15,16,17, geological18, ecological19,20 and cardiac21 systems, suggesting the presence of a bifurcation. However, these EWS have limited ability to predict the type of bifurcation14,22 and can fail in systems with nonsmooth potentials23 or noise-induced transitions24.

More recently, deep learning techniques have been employed to provide EWS for bifurcations25,26,27. This involves training a neural network to classify a time series based on the type of bifurcation it is approaching, as well as appropriate controls25,26,27. Unlike many applications of deep learning, this approach does not require abundant data from the study system, which, in the context of critical transitions, is often unavailable. (Unfortunately, we do not have data from thousands of ecosystems or climate systems that have gone through a bifurcation.) Instead, the approach generates a massive library of simulation data from generic models that possess each type of bifurcation. The neural network then learns generic features associated with each type of bifurcation that can be recognised in an unseen time series of the study system. This is enabled by the existence of universal properties of bifurcations that are manifested in time series as a dynamical system gets close to a bifurcation7,9. In our previous work, we trained a deep learning classifier to provide an EWS for continuous-time bifurcations, and found it was effective at predicting transitions in real thermoacoustic, climate and geological systems25.

Bifurcations can be partitioned according to whether they occur in continuous- or discrete-time dynamical systems7,8. This distinction is important, since discrete-time dynamical systems (difference equations) can display very different behaviour from their continuous-time counterparts (differential equations). As an example, consider the logistic model for population growth. When set up in continuous time (appropriate for populations with overlapping generations, e.g. humans), the population grows smoothly as the reproduction rate increases. When set up in discrete time (appropriate for populations with non-overlapping generations, e.g. insects), by contrast, the population displays a spectrum of dynamics across parameter values, including stable points, stable cycles, and chaos28, as the sketch below illustrates. It is therefore important to develop EWS suitable for both continuous- and discrete-time bifurcations. While indicators like variance and lag-1 autocorrelation can provide EWS for discrete-time bifurcations, the ability of a deep learning classifier to do so has not been investigated.
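To make the contrast concrete, here is a minimal sketch (ours, not part of the original study) that iterates the discrete logistic map xt+1 = r xt(1 − xt) and prints its long-run behaviour at a few growth rates:

```python
import numpy as np

def logistic_orbit(r, x0=0.5, n_burn=500, n_keep=8):
    """Iterate x_{t+1} = r x_t (1 - x_t); return the final n_keep iterates."""
    x = x0
    orbit = []
    for t in range(n_burn + n_keep):
        x = r * x * (1 - x)
        if t >= n_burn:
            orbit.append(x)
    return np.round(orbit, 4)

for r in [2.8, 3.2, 3.5, 3.9]:
    print(r, logistic_orbit(r))
```

At r = 2.8 the orbit settles to a fixed point, at 3.2 to a 2-cycle, at 3.5 to a 4-cycle, and at 3.9 it remains chaotic.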

As well as in ecology, discrete-time bifurcations arise naturally in physiology3, epidemiology29, and economics30, where events can take place on a discrete timeline. To illustrate our approach, we use model simulations from ecology, physiology and economics, as well as experimental data from spontaneously beating chick heart aggregates21,31. Following administration of a drug, in some aggregates the time interval between two heart beats begins to alternate, i.e. there is a period-doubling bifurcation (Fig. 1). Such transitions can also occur in the human heart in the form of T-wave alternans, which increases a patient’s risk of sudden cardiac death32. The period-doubling bifurcation is accompanied by critical slowing down, so systematic changes in variance and lag-1 autocorrelation are expected and have been shown to provide an EWS in this system21. The chick heart aggregates serve as a good study system to test the performance of EWS since we have multiple recordings, not all of which underwent a transition, allowing us to test for false positives.

Fig. 1: Period-doubling bifurcation in a spontaneously beating aggregate of embryonic chick heart cells following treatment with a potassium channel blocker (E-4031, 1.5 μmol).

a Interbeat intervals (IBI) for consecutive beats. A period-doubling bifurcation occurs at approximately beat 230. Arrows mark intervals plotted in lower panels. b–d Normalised signal from optical imaging of the aggregate’s motion. Traces are from a section well before (b), just before (c), and after (d) the period-doubling bifurcation.

Among discrete-time bifurcations, there are many types, each with an associated change in dynamics7. For this study, we focus on the five local bifurcations of codimension one (Supplementary Note 1). Being ‘local’, these bifurcations are accompanied by critical slowing down, so systematic changes in variance and autocorrelation are expected. However, not all of these bifurcations result in a critical transition22. They can instead involve a smooth transition to an intersecting steady state (transcritical) or to oscillations with gradually increasing amplitude (supercritical Neimark–Sacker). Predicting the type of bifurcation provides information on the nature of the dynamics following the bifurcation, something variance and autocorrelation alone do not provide.

Here, we train a deep learning classifier to provide a specific EWS for bifurcations of discrete-time dynamical systems. We train the classifier using simulation data of normal form equations appended with higher-order terms and noise. We then test the classifier on simulations of five discrete-time models used in cardiology, ecology and economics, and assess its performance relative to variance and lag-1 autocorrelation. We vary the noise amplitude and rate of forcing in model simulations to assess the robustness of the EWS. Finally, we test the classifier on experimental data of spontaneously beating chick heart aggregates that go through a period-doubling bifurcation. A reproducible run of all analyses may be performed on Code Ocean (https://codeocean.com/capsule/2209652/tree/v2), where the code is accompanied by the necessary software environment.

Results

Performance of classifiers on withheld test data

We train two different types of classifiers and use their ensemble average to make predictions. Classifier 1 is trained to recognise bifurcation trajectories based on middle portions of the time series, whereas Classifier 2 is trained on end portions (see Methods). In this way, Classifier 1 provides an earlier signal of a bifurcation and Classifier 2 provides a more specific signal, as more information is revealed closer to the bifurcation. To quantify performance of the classifiers, we use the F1 score, which combines sensitivity (how many of the true positives were predicted correctly) and precision (how many of the positive predictions were actually true positives). On the withheld test data, Classifiers 1 and 2 achieved F1 scores of 0.66 and 0.85, respectively. On the simpler, binary classification problem of predicting whether or not there will be any bifurcation, the classifiers achieved F1 scores of 0.79 and 0.97, respectively. Classifier 2 performs better as it has the easier task of classifying data closer to the bifurcation, where fluctuations are more pronounced. Performance on individual bifurcation classes is shown by confusion matrices (Supplementary Fig. 1). The period-doubling, Neimark–Sacker and fold bifurcations are correctly classified with high sensitivity and specificity. On the other hand, the transcritical and pitchfork bifurcations are often mistaken for one another, likely because their normal forms are very similar (identical linear terms). Despite this, Classifier 2 can distinguish them better than chance, suggesting it is capable of recognising the different higher-order terms in the data. From here onward, we report results using the ensemble prediction of the two classifiers, referred to collectively as the deep learning classifier.

Performance of EWS on theoretical models

We monitor variance, lag-1 autocorrelation and the deep learning classifier as progressively more of the time series is revealed. Variance and lag-1 autocorrelation are considered to provide an EWS if they display a strong trend, which we quantify using the Kendall tau statistic. For each of the five theoretical models (Fig. 2a–e), we observe an increasing trend in variance (Fig. 2f–j), and an increasing or decreasing trend in lag-1 autocorrelation (Fig. 2k–o). The direction of the trend in lag-1 autocorrelation prior to a bifurcation depends on the frequency of oscillations (θ) at the bifurcation—equivalently, the angle of the dominant eigenvalue in the complex plane (Supplementary Note 1). For θ ∈ [0, π/2), lag-1 autocorrelation increases, whereas for θ ∈ (π/2, π] it decreases—insights that can be obtained from analytical expressions of the autocorrelation function10,14. The period-doubling bifurcation is characterised by θ = π, and the Neimark–Sacker bifurcation shown here has θ ≈ π/4. The trends in variance and lag-1 autocorrelation therefore behave as expected and can be used as an EWS.
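As an illustration of this indicator workflow, the following minimal sketch (ours; window conventions assumed) computes rolling variance and lag-1 autocorrelation of residuals and their Kendall tau trends on a synthetic series exhibiting critical slowing down:

```python
import numpy as np
import pandas as pd
from scipy.stats import kendalltau

def ews_trends(residuals, rw=0.5):
    """Kendall tau trends of rolling variance and lag-1 autocorrelation."""
    s = pd.Series(residuals)
    w = int(rw * len(s))                              # rolling window length
    var = s.rolling(w).var().dropna()
    ac1 = s.rolling(w).apply(lambda seg: seg.autocorr(lag=1)).dropna()
    return kendalltau(var.index, var.values)[0], kendalltau(ac1.index, ac1.values)[0]

# Synthetic AR(1) residuals whose memory (and hence variance) slowly grows,
# mimicking critical slowing down on the approach to a bifurcation
rng = np.random.default_rng(42)
x = np.zeros(500)
for t in range(1, 500):
    phi = 0.2 + 0.7 * t / 500                         # slowly increasing memory
    x[t] = phi * x[t - 1] + 0.1 * rng.standard_normal()
print(ews_trends(x))                                  # both taus strongly positive
```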

Fig. 2: Trends in indicators prior to five different bifurcations in the theoretical models.

a–e Trajectory (grey) and smoothing (black) of a simulation going through a period-doubling, Neimark–Sacker, fold, transcritical and pitchfork bifurcation, respectively. f–j Variance of residual dynamics after smoothing, computed over a rolling window (arrow) of size 0.5 times the length of the pre-transition data. k–o Lag-1 autocorrelation. p–t Probabilities assigned by the deep learning (DL) classifier when given all preceding data. Orange shows probability assigned to the true bifurcation. Grey shows probabilities assigned to the other bifurcations. Blue shows the sum of the five bifurcation probabilities.

The deep learning classifier assigns a probability to each of the six possible outcomes (null, period-doubling, fold, Neimark–Sacker, transcritical and pitchfork). It is considered to provide an EWS when the sum of the five bifurcation probabilities becomes elevated (blue line, Fig. 2p–t). The predicted bifurcation type is then taken as the one with the highest individual probability. For each simulation, the classifier becomes more confident of an approaching bifurcation as time goes on, and the probability it assigns to the true bifurcation increases. The period-doubling, Neimark–Sacker and fold bifurcations are identified with high confidence well before the transition. The transcritical and pitchfork bifurcations are assigned roughly equal probability on their respective time series, suggesting they are difficult to tell apart—an observation consistent with the classifier’s performance on the withheld test data.
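The ensemble logic can be sketched as follows (hypothetical helper and made-up probability vectors; the class ordering follows the text):

```python
import numpy as np

# Class order follows the text: null first, then the five bifurcations
CLASSES = ["null", "period-doubling", "fold", "Neimark-Sacker",
           "transcritical", "pitchfork"]

def ensemble_prediction(p1, p2):
    """Average the two classifiers' probability vectors over the six classes."""
    p = (np.asarray(p1) + np.asarray(p2)) / 2
    ews_score = p[1:].sum()                           # summed bifurcation probability
    bif_type = CLASSES[1 + int(np.argmax(p[1:]))]     # most probable bifurcation
    return ews_score, bif_type

# Hypothetical outputs from Classifiers 1 and 2 on the same input
score, kind = ensemble_prediction([0.15, 0.55, 0.10, 0.10, 0.05, 0.05],
                                  [0.05, 0.70, 0.10, 0.05, 0.05, 0.05])
print(f"P(any bifurcation) = {score:.2f}, predicted type: {kind}")
```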

To obtain a measure of performance for the EWS, we need to test their predictions on both ‘forced’ time series (where a bifurcation is approached) and ‘null’ time series (where no bifurcation is approached). For the theoretical models, we generate null time series by keeping the bifurcation parameter fixed. Sample null time series and their EWS are shown in Supplementary Fig. 3. We also test the robustness of the EWS to the rate of forcing and the noise amplitude of the simulations—two factors that have been shown to influence the performance of variance and lag-1 autocorrelation as EWS33,34. To this end, we simulate 100 forced and null time series at five different values of noise intensity and five different values of rate of forcing, resulting in a total of 5000 time series for each theoretical model. Sample trajectories illustrating the different noise amplitudes and rates of forcing are shown in Supplementary Fig. 4. We compute the probabilities assigned by the classifier and the Kendall tau values for variance and lag-1 autocorrelation at 80% of the way through the pre-transition time series, and use these values as discrimination thresholds to construct ROC curves (Fig. 3a–e).
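A sketch of the ROC construction, assuming scikit-learn and using synthetic stand-in scores (the study’s code capsule may implement this differently):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
labels = np.r_[np.ones(100), np.zeros(100)]           # 1 = forced, 0 = null
# Stand-in EWS scores (e.g. summed bifurcation probability at 80% of the
# pre-transition series); forced runs tend to score higher than null runs
scores = np.clip(np.r_[rng.normal(0.7, 0.2, 100),
                       rng.normal(0.3, 0.2, 100)], 0, 1)

fpr, tpr, _ = roc_curve(labels, scores)               # sweep the threshold
print(f"AUC = {auc(fpr, tpr):.2f}")
```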

Fig. 3: ROC curves for predictions of an upcoming transition in model and experimental data.

ROC curves compare the performance of the deep learning classifier (DL, blue), variance (Var, red) and lag-1 autocorrelation (AC, green). For models (a–e), performance is assessed on 2500 forced and 2500 null simulations with different noise amplitudes and rates of forcing, with predictions made 80% of the way through the pre-transition data. For experimental data (f), performance is assessed on 46 experimental runs, with 10 equally spaced predictions made between 60 and 100% of the way through the pre-transition data. The area under the curve (AUC), abbreviated to A, is a measure of performance. Insets show box plots for the probabilities assigned by the classifier to each type of bifurcation (orange being the true bifurcation) among the trajectories approaching a transition. Box centre line is the median, box limits are the upper and lower quartiles, and whiskers capture the range up to 1.5 times the interquartile range. a Fox model going through a period-doubling bifurcation49. b Westerhoff model going through a Neimark–Sacker bifurcation30. c Ricker model going through a fold bifurcation50. d Lotka–Volterra model going through a transcritical bifurcation51. e Lorenz model going through a pitchfork bifurcation53. f Chick heart aggregates going through a period-doubling bifurcation31. PD: period-doubling. NS: Neimark–Sacker. TC: transcritical. PF: pitchfork.

Using the AUC score (area under the ROC curve) as a measure of performance, we find that the classifier outperforms variance and lag-1 autocorrelation for each theoretical model. When evaluated for each combination of noise amplitude and rate of forcing separately, the classifier has the highest AUC score in 100% of cases for the Neimark–Sacker, fold, and pitchfork models, 84% of cases for the period-doubling model, and 80% of cases for the transcritical model (Supplementary Fig. 5). As with variance and lag-1 autocorrelation, the performance of the classifier is lower at higher rates of forcing. Noise amplitude affects performance differently depending on the model. In terms of predicting the correct bifurcation, the classifier typically performs better at slower rates of forcing (Supplementary Fig. 6) and classifies the period-doubling and Neimark–Sacker bifurcations with high accuracy at all noise amplitudes and rates of forcing considered. Finally, we evaluate the EWS for a range of parameter values in the period-doubling model that yield period-doubling bifurcations with different locations and morphologies (Supplementary Fig. 7). We find that the deep learning classifier outperforms variance and lag-1 autocorrelation in each case and correctly identifies the period-doubling bifurcation.

Performance of EWS on chick heart data

In the chick heart data, we mostly observe an increasing trend in variance and a decreasing trend in lag-1 autocorrelation prior to the period-doubling bifurcation, as previously reported21. This is consistent with analytical approximations for variance and lag-1 autocorrelation prior to a period-doubling bifurcation in a noisy dynamical system10,14. The 46 records and their EWS are shown in Supplementary Figs. 8–11, and a sample of five period-doubling records is shown in Fig. 4. The classifier correctly predicts a period-doubling bifurcation in 16 of the 23 period-doubling records. In other cases, it incorrectly predicts a Neimark–Sacker bifurcation (e.g. Fig. 4e). This appears to be linked to an early increase in lag-1 autocorrelation, perhaps caused by a non-monotonic approach to the period-doubling bifurcation. For predictions made at 60–100% of the way through the chick heart data, the classifier obtains the highest AUC score (Fig. 3f), a slight improvement on variance, with the advantage of also providing the bifurcation type. We find this result is robust to the smoothing method (Gaussian or Lowess), a range of different smoothing parameters, different rolling window sizes for variance and lag-1 autocorrelation, and sampling error in the experimental data (Supplementary Figs. 12–15). However, smoothing with a bandwidth that is too small diminishes the ability of the classifier to identify a period-doubling bifurcation, presumably because fluctuations that enable identification of the bifurcation type are removed.

Fig. 4: Trends in indicators prior to a period-doubling bifurcation in five chick heart aggregates treated with a potassium channel blocker.

a–e Inter-beat interval (IBI) trajectory (grey) and smoothing (black). f–j Variance of residual dynamics after smoothing, computed over a rolling window (arrow) of size 0.5 times the length of the pre-transition data. k–o Lag-1 autocorrelation. p–t Probabilities assigned by the deep learning (DL) classifier to the period-doubling bifurcation (orange) and the other bifurcations (grey). Blue shows the sum of the five bifurcation probabilities.

Discussion

Many systems that evolve on a discrete timeline can undergo a sudden change in dynamics via a discrete-time bifurcation. We have found that a deep learning classifier is an effective tool for predicting discrete-time bifurcations in systems with a range of noise levels and rates of approach to the bifurcation. The classifier provides higher sensitivity and specificity than variance and lag-1 autocorrelation—two commonly used EWS for bifurcations. Moreover, the classifier provides early indication of the type of bifurcation—an important piece of information given the qualitatively different dynamics associated with each bifurcation. A reliable early warning signal that specifies the type of bifurcation will help us prevent harmful bifurcations (e.g. dangerous heart rhythms3) and promote favourable transitions (e.g. ecosystem recovery35).

It may be possible to design a deep learning classifier that achieves a higher performance on our test data. First, there are many neural network architectures that could be investigated. For example, transformers, which are the current state-of-the-art for language models like GPT36, may also be useful for time series classification37. Second, the hyperparameters of the classifier could be systematically tuned to optimise performance. Third, there may be benefit to reframing bifurcation prediction as a hierarchical classification problem38. One classifier could address the binary problem of flagging an approaching bifurcation, and a second classifier could address the multi-class problem of classifying the type of bifurcation, given that a bifurcation is approaching (in the same way that one might first distinguish images of dogs from cats before attempting to classify dog breeds). Finally, performance could be improved by training a larger ensemble of classifiers39.

For a classifier to be effective, it must be trained on sufficiently diverse training data. As such, the method by which training data is obtained needs careful consideration. Our previous work on continuous-time bifurcations obtained training data from randomly generated dynamical systems with polynomial terms25 and labelled the data using the bifurcation continuation software AUTO. This approach is appealing as it imposes relatively few restrictions on the models that are generated, and may include features associated with higher-order terms. Here, we opted for a more restricted approach that uses normal form models to generate the training data. This method has the advantage of being faster computationally, since the location and type of bifurcation in the model is known a priori. It also alleviates the need to detrend the training data, which can make the classifier reliant on receiving data that has been detrended using a specific method40. We have found that even with this more restricted training data, a classifier can generalise to detecting bifurcations in more complex model and empirical systems.

An important consideration in building a training library for a bifurcation predictor is how to define a ‘null’ trajectory. We opted for a simple approach that uses model simulations with a fixed bifurcation parameter, where the bifurcation parameter is sampled randomly from values that yield λ < 0.8, where λ is the eigenvalue of the Jacobian matrix. Larger values of λ result in a significant portion of simulations going through noise-induced transitions, and were therefore not deemed appropriate. Upon investigating how the classifier performs on null trajectories specifically, we find that it is more confident in its prediction for null trajectories that are longer and further away from the bifurcation (Supplementary Fig. 16), as seems logical. A useful extension to the training data could be a richer set of null trajectories, where the bifurcation parameter is allowed to move around, perhaps stochastically, as one would expect in real systems.

We trained a classifier to provide EWS for a subset of bifurcations, namely local, codimension-one, discrete-time bifurcations. While these bifurcations are present in many systems of interest, the real world presents many other classes of bifurcation in both continuous and discrete-time, including global bifurcations (e.g. homoclinic and heteroclinic), codimension-two bifurcations (e.g. cusp and Bogdanov-Takens), and bifurcations of attractors. For systems on attractors that explore a large portion of their phase space, empirical dynamical modelling41, reservoir computing42,43 and deep neural networks44 can be used to make forecasts that may help predict critical transitions. In cases where spatial information is available, concepts from statistical physics may be useful45, particularly in combination with deep learning27.

A limitation of the present classifier is that it is only trained to predict discrete-time bifurcations. Therefore, one needs to know ahead of time whether continuous or discrete time is a better description for the system. In the case of the chick heart cells, we had prior knowledge that they are well described by a discrete-time dynamical system46, and therefore appropriate for the classifier. An interesting avenue for future research is to build a classifier that works for both continuous and discrete-time bifurcations. This may be achieved by generating a training library from models with a range of discretised timesteps, from very large steps that generate discrete-time bifurcations, down to the limit of a discrete timestep of zero, where continuous-time bifurcations occur. With a large enough training set, one would not need to assume ahead of time whether continuous or discrete time is a better description for the system.

Our results demonstrate that combining dynamical systems and deep learning methodologies can provide EWS for critical transitions that are both more reliable and more descriptive than non-hybrid approaches. This study has set a baseline for prediction performance across a variety of popular discrete-time models and an experimental data set. In providing a code capsule that reproduces this study, we hope to facilitate the development, testing and comparison of related methods. In particular, the development of interpretable (as opposed to ‘black box’) models that achieve a similar performance would be highly desirable, especially in safety-critical domains47, although this will likely require new analytical and algorithmic insights. Techniques such as deconvolutional networks48 make it possible to map the learned space of a deep learning algorithm back onto the original temporal dataset. This allows one to visualise the features that the algorithm is using to make its decision, which could serve as a starting point for an interpretable model. Building a universal predictor for critical transitions is not a job for a single research team44, and will benefit from a variety of approaches and open-source code. Depending on context, critical transitions can be devastating or highly desirable. Improved EWS would allow us to better prevent or promote such transitions.

Methods

Generation of training data for the deep learning classifier

Training data consists of simulation data from a library of 50,000 models. The models are generated at random from five different model frameworks, each possessing one of the bifurcations studied (period-doubling, Neimark–Sacker, fold, transcritical, pitchfork). The models are composed of the normal form of the bifurcation7, higher-order polynomial terms up to degree 10 with coefficients drawn from a normal distribution, and additive Gaussian white noise (ϵt) with amplitude (σ) drawn from a uniform distribution. In each case, the bifurcation occurs at μ = 0.

The model for the period-doubling bifurcation is

$${x}_{t+1}=-(1+\mu ){x}_{t}\pm {x}_{t}^{3}+\mathop{\sum }\limits_{i=4}^{10}{\alpha }_{i}{x}_{t}^{i}+\sigma {\epsilon }_{t},$$
(1)

where \({\alpha }_{i} \sim {{{{{{{\mathcal{N}}}}}}}}(0,\,1)\). The positive (negative) cubic term yields a supercritical (subcritical) bifurcation, and is chosen at random. The model for the Neimark–Sacker bifurcation is

$$\left(\begin{array}{r}{x}_{t+1}\\ {y}_{t+1}\end{array}\right)= (1+\mu )R(\theta )\left(\begin{array}{r}{x}_{t}\\ {y}_{t}\end{array}\right)\pm ({x}_{t}^{2}+{y}_{t}^{2})R(\theta )\left(\begin{array}{r}{x}_{t}\\ {y}_{t}\end{array}\right) \\ +\mathop{\sum }\limits_{i=4}^{10}\mathop{\sum }\limits_{j=0}^{i}\left(\begin{array}{r}{\alpha }_{ij}\\ {\beta }_{ij}\end{array}\right){x}_{t}^{i-j}{y}_{t}^{j}+\sigma \left(\begin{array}{r}{\epsilon }_{t}^{(1)}\\ {\epsilon }_{t}^{(2)}\end{array}\right)$$
(2)

where αij, \({\beta }_{ij} \sim {{{{{{{\mathcal{N}}}}}}}}(0,\,1)\), R(θ) is the rotation matrix

$$R(\theta )=\left(\begin{array}{rc}\cos \theta &-\sin \theta \\ \sin \theta &\cos \theta \end{array}\right),$$
(3)

and \(\theta \sim {{{{{{{\mathcal{U}}}}}}}}[0,\,\pi ]\) is the angular frequency of oscillations at the bifurcation. The positive (negative) cubic term yields a subcritical (supercritical) bifurcation, and is chosen at random. The model for the fold bifurcation is

$${x}_{t+1}=-\!\mu+{x}_{t}-{x}_{t}^{2}+\mathop{\sum }\limits_{i=3}^{10}{\alpha }_{i}{({x}_{t}-\sqrt{-\mu })}^{i}+\sigma {\epsilon }_{t},$$
(4)

where \({\alpha }_{i} \sim {{{{{{{\mathcal{N}}}}}}}}(0,\,1)\). The model for the transcritical bifurcation is

$${x}_{t+1}=(1+\mu ){x}_{t}-{x}_{t}^{2}+\mathop{\sum }\limits_{i=3}^{10}{\alpha }_{i}{x}_{t}^{i}+\sigma {\epsilon }_{t},$$
(5)

where \({\alpha }_{i} \sim {{{{{{{\mathcal{N}}}}}}}}(0,\,1)\). Finally, the model for the pitchfork bifurcation is

$${x}_{t+1}=(1+\mu ){x}_{t}\pm {x}_{t}^{3}+\mathop{\sum }\limits_{i=4}^{10}{\alpha }_{i}{x}_{t}^{i}+\sigma {\epsilon }_{t},$$
(6)

where \({\alpha }_{i} \sim {{{{{{{\mathcal{N}}}}}}}}(0,\,1)\). The positive (negative) cubic term yields a subcritical (supercritical) bifurcation, and is chosen at random.

The library is composed of 10,000 models from each framework. For each model, we run a ‘forced’ simulation where the bifurcation parameter μ is increased linearly across the interval [μ0, 0], and a ‘null’ simulation where μ is fixed at μ0. The initial value for the bifurcation parameter μ0 is drawn from a uniform distribution across all values that correspond to λ < 0.8, where λ is the eigenvalue of the Jacobian matrix in the model. This ensures that the training data contains simulations that start close to and far from a bifurcation. For the period-doubling, Neimark–Sacker, transcritical and pitchfork models, this means drawing μ0 from \({{{{{{{\mathcal{U}}}}}}}}[-1.8,-0.2]\). For the fold bifurcation, this means drawing μ0 from \({{{{{{{\mathcal{U}}}}}}}}[-0.9,-0.1]\). After a burn-in period of 100 iterations, we simulate each model for 600 iterations and keep the last 500 data points, or the 500 data points immediately preceding a transition if one occurs. We define a transition as a time when the deviation from equilibrium exceeds ten times the noise amplitude σ. We simulate one forced and one null simulation from each model, resulting in 50,000 forced and 50,000 null trajectories. To balance the number of entries for each class, we take 10,000 null simulations at random, resulting in a total of 60,000 entries in the training data set. Example trajectories for each class are shown in Supplementary Fig. 2.
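A minimal sketch of this protocol for the period-doubling model, Eq. (1) (our paraphrase of the procedure above, not the code capsule’s implementation; each call draws a fresh random model):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_pd(mu0, sigma, forced=True, n_burn=100, n_steps=600):
    """Forced or null run of the period-doubling model, Eq. (1)."""
    alpha = rng.normal(0, 1, size=7)              # coefficients for degrees 4..10
    sign = rng.choice([-1.0, 1.0])                # super- vs subcritical cubic term
    mu_burn = np.full(n_burn, mu0)
    mu_main = np.linspace(mu0, 0.0, n_steps) if forced else np.full(n_steps, mu0)
    x, xs = 0.0, []
    for mu in np.concatenate([mu_burn, mu_main]):
        higher = sum(a * x**i for a, i in zip(alpha, range(4, 11)))
        x = -(1 + mu) * x + sign * x**3 + higher + sigma * rng.normal()
        if abs(x) > 10 * sigma:                   # transition: deviation > 10 sigma
            break
        xs.append(x)
    return np.array(xs[n_burn:])[-500:]           # last 500 pre-transition points

mu0 = rng.uniform(-1.8, -0.2)                     # eigenvalue magnitude < 0.8
forced_series = simulate_pd(mu0, sigma=0.01, forced=True)
null_series = simulate_pd(mu0, sigma=0.01, forced=False)
```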

Architecture and training of the deep learning classifier

We use a neural network with a CNN-LSTM architecture and hyperparameters as in ref. 25. This consists of a single convolutional layer with max pooling followed by two LSTM layers with dropout followed by a dense layer that maps to a vector of probabilities over the six possible classes. For training, we use Adam optimisation with a learning rate of 0.0005, a batch size of 1024, and sparse categorical cross entropy as the loss function. We use a training/validation/test split of 0.95/0.025/0.025. We found 200 epochs was sufficient to obtain optimal accuracy on the validation set.
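A sketch of this architecture in Keras follows. The layer sequence, optimiser, learning rate, loss and batch size are as stated above; the filter count, kernel size, LSTM units and dropout rates are illustrative assumptions (the exact hyperparameters are those of ref. 25):

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(500, 1)),                  # padded univariate time series
    layers.Conv1D(50, kernel_size=10, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.LSTM(50, return_sequences=True),
    layers.Dropout(0.1),
    layers.LSTM(10),
    layers.Dropout(0.1),
    layers.Dense(6, activation="softmax"),         # six output classes
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(X_train, y_train, batch_size=1024, epochs=200,
#           validation_data=(X_val, y_val))
```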

To expose the classifier to time series of different lengths, we censor each time series in the training data. We train two classifiers independently using different censoring techniques. Classifier 1 is trained on time series censored at the beginning and the end, forcing it to learn from data in the middle of the time series. Classifier 2 is trained on time series only censored at the beginning, allowing it to learn from data right up to the bifurcation. The length L of each censored time series is drawn from \({{{{{{{\mathcal{U}}}}}}}}[50,\,500]\). Then, for Classifier 1, the start time of the censored time series is drawn as \({t}_{0} \sim {{{{{{{\mathcal{U}}}}}}}}[0,\,500-L]\), and for Classifier 2, t0 = 500 − L. The censored time series are then normalised by their mean absolute value and prepended with zeros to make them 500 points in length. We report results using the average prediction of the two classifiers.
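A sketch of the censoring scheme (hypothetical helper; array conventions assumed):

```python
import numpy as np

rng = np.random.default_rng(2)

def censor(series, classifier=1, t_max=500):
    """Censor, normalise by mean absolute value, and left-pad with zeros."""
    L = rng.integers(50, 501)                        # censored length ~ U[50, 500]
    t0 = rng.integers(0, t_max - L + 1) if classifier == 1 else t_max - L
    segment = series[t0:t0 + L]
    segment = segment / np.mean(np.abs(segment))     # normalise
    return np.concatenate([np.zeros(t_max - L), segment])
```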

Theoretical models

To test the deep learning classifier on out-of-sample data, we simulate a variety of nonlinear, discrete-time models, each containing one of the studied bifurcations. To account for stochasticity, we include additive Gaussian white noise. We run forced simulations, where the bifurcation parameter is increased linearly up to the bifurcation point, and null simulations, where the bifurcation parameter remains fixed. To create a diverse set of test data, we vary the noise amplitude (σ) and rate of forcing (rate of change of the bifurcation parameter). We run 100 forced and null simulations at five different noise amplitudes and five different rates of forcing, resulting in 5000 simulations of each model. Values for the noise amplitude are on a logarithmic scale and values for the rate of forcing result in time series of length 100, 200, 300, 400, and 500. Sample simulations for each model at different noise amplitude and rate of forcing are shown in Supplementary Fig. 4. The transition time for each forced simulation is taken as the moment when the bifurcation parameter crosses the bifurcation, or the moment when the state variable crosses a threshold, if specified.

Fox model

To test the detection of a period-doubling bifurcation, we use a model of cardiac alternans49 with additive Gaussian white noise. This is given by

$${D}_{n+1}=(1-\alpha {M}_{n+1})\left(A+\frac{B}{1+{e}^{-({I}_{n}-C)/D}}\right)+\sigma {\epsilon }_{n},$$
(7)
$${M}_{n+1}={e}^{-{I}_{n}/\tau }[1+({M}_{n}-1){e}^{-{D}_{n}/\tau }],$$
(8)
$${I}_{n}=T-{D}_{n},$$
(9)

where Dn is the action potential duration of the nth beat, Mn is a memory variable, In is the rest duration following the action potential, T is the stimulation period, τ is the time constant of accumulation and dissipation of memory, α is the influence of memory on the action potential duration, and A, B, C and D are parameters governing the shape of the restitution curve. Following ref. 49, we take A = 88, B = 122, C = 40, D = 28, τ = 180, α = 0.2, which give dynamics in good agreement with a complex ionic model. This yields a period-doubling bifurcation at approximately T = 200. Forced simulations are run with T decreasing linearly on the interval [300, 150] and null simulations are run with T = 300. Values for noise amplitude are \(0.1\times \{{2}^{0},\,{2}^{-1},\,{2}^{-2},\,{2}^{-3},\,{2}^{-4}\}\).
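A minimal sketch of a forced Fox-model simulation, Eqs. (7)–(9) (ours; initial conditions are assumptions):

```python
import numpy as np

A, B, C, D_ = 88, 122, 40, 28      # D_ is the restitution parameter D (renamed
tau, alpha = 180, 0.2              # to avoid clashing with the state Dn)
sigma = 0.1

rng = np.random.default_rng(3)
T_vals = np.linspace(300, 150, 500)        # linear forcing of the period T
Dn, Mn = 200.0, 0.5                        # initial conditions (assumed)
apd = []
for T in T_vals:
    In = T - Dn                            # rest interval, Eq. (9)
    Mn = np.exp(-In / tau) * (1 + (Mn - 1) * np.exp(-Dn / tau))  # Eq. (8)
    Dn = (1 - alpha * Mn) * (A + B / (1 + np.exp(-(In - C) / D_))) \
         + sigma * rng.normal()            # Eq. (7)
    apd.append(Dn)                         # alternans emerge near T = 200
```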

To test the robustness of EWS to different model parameter values, we simulate trajectories with different values of α and a multiplicative scaling factor of A, B, C and D (Supplementary Fig. 7). In each case, we simulate 100 forced and null trajectories for 300 time steps and a noise amplitude of 0.1. Forced simulations are run with T decreasing linearly from 300 to the bifurcation point and null simulations are run with T = 300.

Westerhoff model

To test detection of a Neimark–Sacker bifurcation, we use a simple model of business cycles based on consumer sentiment30 with additive Gaussian noise. This is given by

$${Y}_{t}=a+(b-d){Y}_{t-1}+d{Y}_{t-2}+\frac{c{Y}_{t-1}}{1+\exp \left[-({Y}_{t-1}-{Y}_{t-2})\right]}+\sigma {\epsilon }_{t},$$
(10)

where Yt is the national income at time step t, a is the level of autonomous expenditures of agents, b and c govern a curve that determines the fraction of income consumed by the agents, and d is the policy-maker’s control parameter to offset income trends. We take b = 0.45, c = 0.1, and d = 0.2, which yields a Neimark–Sacker bifurcation at a = 24, corresponding to the onset of business cycles. Forced simulations are run with a increasing linearly on the interval [10, 27] and null simulations are run with a = 10. Values for noise amplitude are \(0.1\times \{{2}^{0},\,{2}^{-1},\,{2}^{-2},\,{2}^{-3},\,{2}^{-4}\}\).

Ricker model

To test detection of a fold bifurcation, we use the Ricker model50 with a sigmoidal harvesting term and additive Gaussian noise. This is given by

$${x}_{t+1}={x}_{t}{e}^{r(1-{x}_{t}/k)}-F\frac{{x}_{t}^{2}}{{x}_{t}^{2}+{h}^{2}}+\sigma {\epsilon }_{t},$$
(11)

where xt is the population size at time step t, r is the intrinsic growth rate, k is the carrying capacity, F is the harvesting rate, and h governs the steepness of the sigmoidal harvesting term. We take r = 0.75, k = 10, h = 0.75, which yields a fold bifurcation at F = 2.36. Forced simulations are run with F increasing linearly on the interval [0, 3.54] and null simulations are run with F = 0. We define a transition as the time when xt drops below 0.45. Values for noise amplitude are \(0.2\times \{{2}^{0},\,{2}^{-1},\,{2}^{-2},\,{2}^{-3},\,{2}^{-4}\}\).
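A forced Ricker run with the transition threshold can be sketched as follows (ours; the initial population is an assumption):

```python
import numpy as np

r, k, h, sigma = 0.75, 10, 0.75, 0.2
rng = np.random.default_rng(4)
x, xs = 8.0, []                                  # initial population (assumed)
for F in np.linspace(0, 3.54, 500):              # linear forcing of harvesting F
    x = x * np.exp(r * (1 - x / k)) - F * x**2 / (x**2 + h**2) \
        + sigma * rng.normal()                   # Eq. (11)
    if x < 0.45:                                 # transition: population collapse
        break
    xs.append(x)
pretransition = np.array(xs)                     # data preceding the collapse
```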

Lotka–Volterra model

To test detection of a transcritical bifurcation, we use the discrete-time analogue of the Lotka–Volterra model, first studied by Maynard Smith51. This system is especially relevant to arthropod predator–prey and host–parasitoid interactions. The rescaled equations52 are

$${x}_{t+1}=(r+1){x}_{t}-r{x}_{t}^{2}-c{x}_{t}{y}_{t}+\sigma {\epsilon }_{t}^{(1)},$$
(12)
$${y}_{t+1}=c{x}_{t}{y}_{t}+\sigma {\epsilon }_{t}^{(2)},$$
(13)

where r relates to the growth rate of the prey (xt) and c relates to the foraging efficiency of the predator (yt). We take r = 0.5, which yields a transcritical bifurcation at c = 1, the critical foraging efficiency beyond which the predator population can sustain itself. Forced simulations are run with c increasing linearly on the interval [0.5, 1.25] and null simulations are run with c = 0.5. We look for early warning signals in the prey population. Values for noise amplitude are \(0.01\times \{{2}^{0},\,{2}^{-1},\,{2}^{-2},\,{2}^{-3},\,{2}^{-4}\}\).

Lorenz model

To test detection of a pitchfork bifurcation, we use the reduced discrete Lorenz system, which was first introduced as a demonstration of computational chaos53. This is given by

$${x}_{t+1}=(1+ah){x}_{t}-h{x}_{t}{y}_{t}+\sigma {\epsilon }_{t}^{(1)}$$
(14)
$${y}_{t+1}=(1-h){y}_{t}+h{x}_{t}^{2}+\sigma {\epsilon }_{t}^{(2)}$$
(15)

where state variables and parameters are derived from the full Lorenz equations53. We take h = 0.5, which yields a pitchfork bifurcation at a = 0. Forced simulations are run with a increasing linearly over the interval [−1, 0.25] and null simulations are run with a = −1. We look for early warning signals in xt. Values for noise amplitude are \(0.01\times \{{2}^{0},\,{2}^{-1},\,{2}^{-2},\,{2}^{-3},\,{2}^{-4}\}\).

Experiments with embryonic chick heart cell aggregates

Experiments were carried out in accordance with the ethical and Health and Safety regulations at McGill University. Aggregates were prepared using the method proposed by DeHaan54. Ventricles of 7-day-old White Leghorn chicken embryo hearts were dissected and dissociated into single cells by trypsinisation. The cells were then added to Erlenmeyer flasks containing a culture medium (818A) gassed with 5% CO2, 10% O2, 85% N2 (pH = 7.4), and placed on a gyratory shaker for 24–48 h at 37 °C. This generated aggregates with a diameter of approximately 100–200 μm that displayed a beating pattern with a period of approximately 1–2 s. Experiments were conducted 2–6 h after the aggregates were plated, in dishes maintained at 37 °C.

The aggregates were treated with 0.5–2.5 μmol of E-4031, a drug that blocks the human Ether-à-go-go-Related Gene (hERG) potassium channel55. The beating of the aggregates was recorded using phase-contrast imaging sampled at 40 Hz with a CCD camera (NeuroCD-SM; RedShirtImaging, LLC) at an 80 × 80 pixel spatial resolution, focussing on light-intensity variation at the edge of the aggregate31. There are periodic stops in the recording (for 2–3 min) for data storage purposes. The light signal for each aggregate is processed through a band-pass filter (cutoff frequencies: 0.1–6.5 Hz). The timing of each beat is then determined as the moment when the signal passes a threshold (the mean of the record plus 0.7 times the standard deviation) with positive slope. The interbeat intervals are computed as the time between consecutive beats and used in the analysis of this study.
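A sketch of this beat-detection pipeline, assuming scipy and a second-order Butterworth band-pass filter (the filter order is not specified above):

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 40.0                                        # sampling rate (Hz)

def interbeat_intervals(signal):
    """Band-pass filter, threshold at mean + 0.7*SD, take upward crossings."""
    b, a = butter(2, [0.1, 6.5], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, signal)
    thresh = filtered.mean() + 0.7 * filtered.std()
    above = filtered > thresh
    crossings = np.where(~above[:-1] & above[1:])[0] + 1   # positive slope only
    beat_times = crossings / fs                  # beat times in seconds
    return np.diff(beat_times)                   # interbeat intervals (s)
```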

We define the onset of the period-doubling bifurcation as the first time when the slope of a linear regression of the return map, composed of a sliding window of interbeat intervals, is below −0.95 for the next 10 beats. According to this definition, 43 of the 119 aggregates underwent period-doubling bifurcations. The remaining aggregates either went through no qualitative change in dynamics (18), or underwent a transition to more complex dynamics including irregular rhythms and bursting oscillations (58). Of the period-doubling aggregates, we captured the onset of the period-doubling bifurcation for 23 of them (Supplementary Figs. 8, 9). The other period-doubling bifurcations were missed due to pauses in the recording. From the 18 aggregates that undergo no qualitative change, we extract 23 segments at random, each with a random length between 100 and 500 beats, to serve as null time series (Supplementary Figs. 10, 11). Predictions are made at 10 equally spaced time points between 60 and 100% of the way through the 23 period-doubling (pre-bifurcation) and 23 null time series.
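The onset criterion can be sketched as follows (ours; the exact windowing convention is an assumption):

```python
import numpy as np

def pd_onset(ibi, window=10, slope_thresh=-0.95):
    """Return the beat index where the return-map slope first drops below
    slope_thresh over the next `window` beats, or None if it never does."""
    for n in range(len(ibi) - window):
        x = ibi[n:n + window - 1]                # IBI_n
        y = ibi[n + 1:n + window]                # IBI_{n+1} (return map)
        slope = np.polyfit(x, y, 1)[0]           # linear regression slope
        if slope < slope_thresh:
            return n
    return None
```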

Computing and assessing the performance of EWS

EWS are computed using the Python package ewstools56. This involves first detrending the (pre-transition) time series. For the model simulations, we use a Lowess filter with a span of 0.25 times the length of the data. For the heart cell data, we use a Gaussian filter with a bandwidth of 20 beats. Variance and lag-1 autocorrelation are then computed over a rolling window of 0.5 times the length of the pre-transition data, which had higher performance than a rolling window of 0.25. The deep learning predictions at a given point in the time series are obtained by taking the preceding data from the time series, normalising it, prepending it with zeros to make it 500 points in length, and feeding it into the classifier.
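A minimal usage sketch of ewstools for the heart-cell settings described above (method and argument names follow the package’s TimeSeries interface; exact conventions may vary between versions):

```python
import numpy as np
import ewstools

# Synthetic stand-in for a pre-transition interbeat-interval record
rng = np.random.default_rng(5)
series = 1.0 + 0.01 * np.cumsum(rng.standard_normal(300))

ts = ewstools.TimeSeries(data=series)
ts.detrend(method="Gaussian", bandwidth=20)   # bandwidth of 20 beats
ts.compute_var(rolling_window=0.5)            # rolling variance
ts.compute_auto(rolling_window=0.5, lag=1)    # lag-1 autocorrelation
ts.compute_ktau()                             # Kendall tau trend values
print(ts.ktau)
```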

To compare performance of variance, lag-1 autocorrelation and the deep learning classifier, we use the AUC (area under curve) score of the ROC curve. The ROC curve plots the true positive rate vs. the false positive rate as a discrimination threshold is varied. We use the Kendall τ value as the discrimination threshold for variance and lag-1 autocorrelation, and the sum of the bifurcation probabilities for the discrimination threshold of the deep learning classifier.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.