On the predictability of infectious disease outbreaks

Infectious disease outbreaks recapitulate biology: they emerge from the multi-level interaction of hosts, pathogens, and their shared environment. As a result, predicting when, where, and how far diseases will spread requires a complex systems approach to modeling. Recent studies have demonstrated that predicting different components of outbreaks (e.g., the expected number of cases, the pace and tempo of cases needing treatment, and importation probability) is feasible. Therefore, advancing both the science and practice of disease forecasting now requires testing for the presence of fundamental limits to outbreak prediction. To investigate the question of outbreak prediction, we study the information-theoretic limits to forecasting across a broad set of infectious diseases using permutation entropy as a model-independent measure of predictability. Studying the predictability of a diverse collection of historical outbreaks (including gonorrhea, influenza, Zika, measles, polio, whooping cough, and mumps), we identify a fundamental entropy barrier for time series forecasting. However, we find that for most diseases this barrier to prediction is often well beyond the time scale of single outbreaks, implying prediction is likely to succeed. We also find that the forecast horizon varies by disease and demonstrate that both shifting model structures and social network heterogeneity are the most likely mechanisms for the observed differences in predictability across contagions. Our results highlight the importance of moving beyond time series forecasting by embracing dynamic modeling approaches to prediction, and suggest challenges for performing model selection across long disease time series. We further anticipate that our findings will contribute to the rapidly growing field of epidemiological forecasting and may relate more broadly to the predictability of complex adaptive systems.

"If we don't have a vaccine-yes, we are all going to get it. 16 " This dire assessment by a Canadian nurse in 2003 reflected the global public health community's best understanding of the ongoing SARS outbreak 17,1 . This understanding, for perhaps the first time in history, was partially derived from mathematical and computational models, which were developed in near real-time during the outbreak to forecast SARS transmission risk 10,1 . However, the predictions for SARS failed to match the data 18,1 . Over the subsequent fifteen years, the scientific community developed a rich understanding of how complex social contact networks, variation in health care infrastructure, the spatial distribution of prior immunity, etc., drive complex patterns of disease transmission 19,13 and demonstrated that data-driven, dynamic and/or agent-based models can produce actionable forecasts 20,6,21,22 . What remains an open question is whether the existing barriers to forecasting stem from gaps in our mechanistic understanding of disease transmission and low-quality data or from fundamental limits to the predictability of complex, sociobiological systems, i.e., outbreaks 19,23,13 .
In order to study the predictability of diseases in a comparative framework, which also permits stochasticity and model non-stationarity, we employ permutation entropy as a model-free measure of time series predictability 25,14 . This measure is ideal because, in addition to being a model-independent metric of predictability, recent work has demonstrated that it correlates strongly with known limits to forecasting in dynamical systems, e.g., models where we can measure Lyapunov stability 25,14 .
Permutation entropy is conceptually similar to the well-known Shannon entropy 25 . However, instead of being based on the probability of observing a system in a particular state, it utilizes the frequency of discrete motifs, i.e., symbols, associated with the growth, decay, and stasis of a time series. For example, in a binary time series the permutation entropy in over which the permutation entropy will be defined. A time series that visits all the possible symbols with equal frequency will have maximal entropy and minimal predictability, whereas a time series that only samples a few of the possible symbols will have lower entropy and hence be more predictable.
More formally, for a given time series $\{x_t\}_{t=1,\ldots,N}$ indexed by positive integers, an embedding dimension $d$, and a temporal delay $\tau$, we consider the set of all sequences of values $s$ of the type $s = \{x_t, x_{t+\tau}, \ldots, x_{t+(d-1)\tau}\}$. To each $s$, we then associate the permutation $\pi$ of order $d$ that makes $s$ totally ordered, that is, $\tilde{s} = \pi(s) = [x_{t_i}, \ldots, x_{t_N}]$ such that $x_{t_i} < x_{t_j}\ \forall\, t_i < t_j$, hence generating the symbolic alphabet. Ties in neighboring values, i.e., $x_{t_i} = x_{t_j}$, were broken both by keeping them in their original order in the time series and/or by adding a small amount of noise; the method of tie breaking did not affect the results.
The permutation entropy of the time series $\{x_t\}$ is then given by the Shannon entropy on the permutation orders, that is, $H^p_{d,\tau}(\{x_t\}) = -\sum_{\pi} p_\pi \log p_\pi$, where $p_\pi$ is the probability of encountering the pattern associated with permutation $\pi$. In this study, we select a conservative value of $H^p$ by searching over a wide range of possible $(d, \tau)$ pairs and setting $H^p(\{x_t\}) = \min_{d,\tau} H^p_{d,\tau}(\{x_t\})$. To control for differences in dimension and for the effect of time series length on the entropy estimation, we normalize the entropy by $\log(d!)$ (following 26 ), ensure that each window is greater in length than $d!$, and confirm that the estimate of $H^p$ has stabilized (specifically, that the marginal change in $H^p$ as data are added is less than 1%). To facilitate interpretation, we present results from continuous intervals by fixing $\tau = 1$. However, our results generalize to the case where we fix both $d$ and $\tau$ across all diseases and where we minimize over a range of $(d, \tau)$ pairs (see Supplement).
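As an illustration of the definition above, the following is a minimal Python sketch, not the authors' code. The function names `permutation_entropy` and `conservative_entropy`, the default range of embedding dimensions, and the choice to break ties by original order (one of the two options mentioned in the text) are assumptions made for this example.

```python
import math
from collections import Counter


def permutation_entropy(x, d=3, tau=1, normalize=True):
    """Shannon entropy of the ordinal patterns of order d and delay tau."""
    n_patterns = len(x) - (d - 1) * tau
    if n_patterns < 1:
        raise ValueError("time series too short for this (d, tau)")
    counts = Counter()
    for t in range(n_patterns):
        window = [x[t + i * tau] for i in range(d)]
        # sorted() is stable, so tied values keep their original order.
        pattern = tuple(sorted(range(d), key=lambda i: window[i]))
        counts[pattern] += 1
    probs = [c / n_patterns for c in counts.values()]
    h = sum(p * math.log(1.0 / p) for p in probs)
    # Dividing by log(d!) maps the entropy onto [0, 1].
    return h / math.log(math.factorial(d)) if normalize else h


def conservative_entropy(x, dims=range(2, 6), tau=1):
    """Minimum normalized entropy over d, using only d with len(x) > d!."""
    candidates = [
        permutation_entropy(x, d, tau)
        for d in dims
        if len(x) > math.factorial(d)
    ]
    if not candidates:
        raise ValueError("time series too short for every requested d")
    return min(candidates)
```

A strictly increasing series produces a single ordinal pattern and therefore zero entropy, while a series that alternates up and down visits both patterns of order two equally often and reaches the normalized maximum of one. The stabilization check described in the text (marginal change below 1%) is not implemented here.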
As defined above, permutation entropy does not require the a priori specification of a mechanistic or generating model, which allows us to study the predictability of potentially very different systems within a unified framework. What is not explicit in the above formulation is that the permutation entropy can be accurately measured with far shorter time series than Lyapunov exponents and that it is robust to both stochasticity and linear/nonlinear monotonic transformations of the data, i.e., it is equivalent for time series with different magnitudes 25,27 .
Consider, for example, two opposite cases with respect to their known predictability: pure white noise and a perfectly periodic signal. We expect the former, being essentially random, to display a very high entropy compared with the latter, which we expect to show a rather low entropy given its simple periodic structure.
In Figure 1 we demonstrate that this is indeed the case, even when we allow the periodic signal to be corrupted by a small amount of noise. We track the short scale predictability of the time series by calculating the permutation entropy in moving windows (with width = one year, although the results are robust to variation in window size). For comparison, we calculate the same moving window estimate of the permutation entropy for the time series of measles cases in Texas prior to the introduction of the first vaccine. The critical observation is that the moving-window entropy for the measles time series fluctuates between values comparable with that of pure random noise and, at times, values closer to the more predictable periodic signal, which suggests alternating intervals with different dynamical regimes and, thus, predictability.
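The comparison between white noise and a noisy periodic signal can be reproduced in spirit with a short simulation. The sketch below is illustrative only: the ten-year weekly series, the noise level of 0.01, the 52-week period, the embedding dimension d = 3, and the helper names are all assumptions for this example, not values taken from the study.

```python
import math
import random
from collections import Counter


def permutation_entropy(x, d=3, tau=1):
    """Normalized permutation entropy (0 = fully ordered, 1 = maximal)."""
    n = len(x) - (d - 1) * tau
    counts = Counter(
        tuple(sorted(range(d), key=lambda i: x[t + i * tau])) for t in range(n)
    )
    h = sum((c / n) * math.log(n / c) for c in counts.values())
    return h / math.log(math.factorial(d))


def rolling_entropy(x, width=52, d=3, tau=1):
    """Permutation entropy in every window of `width` consecutive points."""
    return [
        permutation_entropy(x[s:s + width], d, tau)
        for s in range(len(x) - width + 1)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
weeks = range(520)      # ten years of weekly observations (assumed)
noise = [rng.gauss(0, 1) for _ in weeks]
periodic = [
    math.sin(2 * math.pi * t / 52) + 0.01 * rng.gauss(0, 1) for t in weeks
]

noise_entropy = rolling_entropy(noise)
periodic_entropy = rolling_entropy(periodic)
mean_noise = sum(noise_entropy) / len(noise_entropy)
mean_periodic = sum(periodic_entropy) / len(periodic_entropy)
print(f"white noise: {mean_noise:.2f}, noisy sine: {mean_periodic:.2f}")
```

On average the white-noise windows sit near the normalized maximum, while the periodic signal stays well below it because most of its windows contain only monotone rising or falling patterns.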
The magnitude of the entropy fluctuations for measles in Texas is statistically significant by permutation test, p < 0.001, as compared with simulated fluctuations obtained by building an estimated multinomial distribution over the symbols and repeatedly calculating the expected Kullback-Leibler divergence from simulations.
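To make the comparison of symbol frequency distributions concrete, here is a hedged Python sketch of the Kullback-Leibler divergence between the ordinal-pattern distributions of two windows. It does not reproduce the permutation test itself. The additive pseudocount used to avoid zero probabilities is an assumption of this example (the text does not say how unobserved symbols were handled), as are the function names.

```python
import math
from collections import Counter
from itertools import permutations


def symbol_distribution(x, d=3, tau=1, pseudocount=0.5):
    """Smoothed frequency of each of the d! ordinal patterns in x."""
    n = len(x) - (d - 1) * tau
    counts = Counter(
        tuple(sorted(range(d), key=lambda i: x[t + i * tau])) for t in range(n)
    )
    alphabet = list(permutations(range(d)))
    # Assumed smoothing: add a pseudocount to every symbol so that
    # the divergence below is always finite.
    total = n + pseudocount * len(alphabet)
    return [(counts[s] + pseudocount) / total for s in alphabet]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


rising = list(range(52))   # one window that only grows
falling = rising[::-1]     # one window that only decays
p_rising = symbol_distribution(rising)
p_falling = symbol_distribution(falling)
divergence = kl_divergence(p_rising, p_falling)
```

Two windows with the same symbol frequencies give a divergence of zero, whereas windows dominated by different patterns, such as pure growth versus pure decay, give a large positive value.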
We now turn our attention to a broader set of diseases and ask how the predictability, Zooming out, what is also conspicuous about the relationship between time series length and predictability is that diseases cluster together and show disease-specific slopes, i.e., predictability vs. time series length, which suggests that permutation entropy is indeed detecting temporal features specific to each disease, Figure 3A. After re-normalizing time for each disease by its corresponding $R_0$ (we used the mean of all reported values found in a literature review; see Supplement), we find that the best-fit mixed-effect slope on a log scale is one and that the residual effect is well predicted by the time series' embedding dimension $d$ (see supplemental figures S1 and S2). Moreover, because the embedding dimension $d$ of a time series is the length of the basic blocks used in the calculation of the permutation entropy, it encodes the fundamental temporal unit of predictability in the form of an entropy production rate, thus implying that predictability decreases with time series data at a disease-specific rate determined to first order by $R_0$, which is further modulated by $d$. This result, that predictability depends on scale, also suggests that the permutation entropy could be an approach for justifying the utility of different data sets, i.e., one could determine the optimal granularity of data by selecting the dimension that maximizes predictability.

Figure 3. Permutation entropy and time series length show regularity by disease. A.) The predictability ($1 - H^p$) for chlamydia, gonorrhea, hepatitis A, influenza, Zika, measles, polio, whooping cough, and mumps is plotted as a function of time series length in weeks. Although the slopes are different for each disease, in all cases, longer time series, i.e., more data, result in lower predictability. However, we again find that single outbreaks should be predictable and that diseases show a remarkable degree of clustering based on the slope of entropy gain. B.) We rescaled the time series length based on the mean published basic reproductive number, i.e., $R_0$, from the literature (see Supplement) and plot the log of this quantity against the log of the permutation entropy.
One might assume that this phenomenon, i.e., decreasing predictability with increasing time series length, could be driven purely by random walks on the symbolic alphabet used in the permutation entropy estimation. However, n-dimensional Markov chain models built from the time series embeddings (n = d, the time series' embedding dimension) consistently produced stable and smaller predictability values in comparison with those obtained from data, corroborating that the predictability behaviour we observe does not stem from random fluctuations but is an actual fundamental feature of spreading processes. This observation, that Markov chain models of the same embedding order do not reproduce the observed predictability, indicates that either the model structure is changing in time and/or the system has a very long memory, which is consistent with our current understanding of the entanglement between mobility and disease 30,1 . That the best-fit n-dimensional Markov chain models overpredict the amount of entropy in real systems also supports our earlier results that predictable structure does exist in most outbreak time series.

Figure 4. A.) Shortly after the vaccine was introduced the permutation entropy increased significantly, which is expected after a system experiences a change to its model structure, in this case vaccination. B.) The Kullback-Leibler divergence of the symbol frequency distribution was calculated between all pairs of rolling one-year (52 week) windows for the measles time series in panel A (dark blue points) and for a noise-free sine wave (black points). The best-fit loess regression is plotted for both time series. Significance (red line) was determined by permutation test.
To gain insight into what mechanisms might be driving changes in the predictability, we take advantage of the repeated, "natural" experiment of vaccine introduction. For diseases, such as measles, where we have data from both the pre- and post-vaccine era, we ask whether the permutation entropy changes after vaccination begins. We consistently observe that predictability decreases after vaccination, again with significance determined by permutation test (see Figure 4A). We also find that the Kullback-Leibler divergence of the symbol frequency distribution changes significantly from year to year across the entire measles time series (see Figure 4B). Critically, because, as stated earlier, permutation entropy is not affected by changes in magnitude, the difference in entropy cannot simply be accounted for by a reduction in cases. Instead, it means that the temporal pattern of cases changes. This leads us to the hypothesis that the distribution of secondary infections, its first moment or $R_0$ and its higher moments, drives predictable changes in the permutation entropy, a phenomenon originally discovered in synthetic directed networks by Meyers et al. 31 .

Figure 5. Permutation entropy and contact network heterogeneity. We simulated outbreaks on social networks with increasing contact heterogeneity, as measured by the first two moments of the degree distribution, and calculated the resulting permutation entropy. We plot the permutation entropy on the ordinate axis as a function of the distance between the actual transmissibility and the critical transmissibility for each simulation, $T_c = \frac{\langle k \rangle}{\langle k^2 \rangle - \langle k \rangle}$, where $\langle k \rangle$ is the mean degree and $\langle k^2 \rangle$ is the mean square degree. The increase in transmissibility is dual to the network heterogeneity and distance from criticality, which are both decreasing along the x-axis, as indicated in the cartoon. We find a significant, non-linear relationship between contact heterogeneity and permutation entropy, which qualitatively matches the pattern seen in the real-world disease time series. Specifically, lower heterogeneity (or, conversely, larger distance from the critical transition point to an epidemic, i.e., a large-scale outbreak) leads to higher permutation entropy (i.e., lower predictability).
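The critical transmissibility depends only on the first two moments of the degree distribution, so it can be computed directly from a list of node degrees. The following Python sketch is an illustration; the function name and the two example degree sequences are hypothetical and are not the networks used in the simulations.

```python
def critical_transmissibility(degrees):
    """T_c = <k> / (<k^2> - <k>) for a list of node degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    if mean_k2 == mean_k:
        raise ValueError("T_c is undefined when <k^2> equals <k>")
    return mean_k / (mean_k2 - mean_k)


# Two hypothetical networks with the same mean degree (4).
homogeneous = [4] * 100                # every node has four contacts
heterogeneous = [1] * 90 + [31] * 10   # a few highly connected nodes

t_c_homogeneous = critical_transmissibility(homogeneous)
t_c_heterogeneous = critical_transmissibility(heterogeneous)

# Distance from criticality for an assumed transmissibility T = 0.2.
T = 0.2
delta_homogeneous = t_c_homogeneous - T
delta_heterogeneous = t_c_heterogeneous - T
```

With the mean degree held fixed, the heterogeneous sequence has a much larger mean square degree and therefore a much lower critical transmissibility, so the same per-contact transmission probability places the two networks at different distances from the epidemic transition.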

To further evaluate the hypothesis that heterogeneity in social networks, and thus in the number of secondary infections, produces predictable changes in permutation entropy, we simulated data on social contact networks with varying degree distributions or, equivalently, critical transmissibilities $T_c$, the per-contact probability of transmission required to cause a large-scale outbreak, i.e., an epidemic 32 . We find that the distance of the actual transmissibility from the critical transmissibility ($\delta = T_c - T$) predicts the entropy of the outbreak time series, Figure 5. This result demonstrates that, as the system gets closer to the critical transition point, from localized outbreaks to a large-scale epidemic, the predictability goes up exponentially fast. From this, we can draw three conclusions. First, coupled with our earlier results comparing diseases with different average reproductive numbers, heterogeneity in the number of secondary infections can drive differences in predictability, which is related to results on predicting disease arrival time on networks 33 and to recurrent epidemics in hierarchical metapopulations 34 . Second, the permutation entropy could provide a model-free approach for detecting epidemics, which is related to a recent model-based approach based on bifurcation delays 35 . Finally, as outbreaks grow and transition to large-scale epidemics, they should become more predictable, which, as seen in Figures 1 and 3, appears to be true for real-world diseases as well and agrees with earlier results on how permutation entropy relates to predictability of non-linear systems 14 .
Research in dynamical systems over the past 30 years has demonstrated that prediction error increases with increasing forecast length 29 . However, across that same body of work, researchers typically find that predictions improve when they are trained on longer time series, even for chaotic systems 29 .
Our data-driven results suggest that for infectious diseases the opposite is true: more time series data should most often lead to lower predictability. Then, by integrating our biological understanding of each pathogen and simulated outbreaks, we found that changing dynamics, e.g., the shifting number of secondary infections as a disease moves through a heterogeneous social network, can cause the prediction error to increase with increasing data, which is related to earlier findings on the role of airline travel networks and disease forecasting 15 . What this implies is that different "models" generate data at different time points and suggests that the optimal coarse-graining of complex systems might change with scale and/or time 36 .
The global community of scientists, public health officials, and medical professionals studying infectious diseases has placed a high value on predicting when and where outbreaks will occur, along with how severe they will be 37,1,38,39 . Our results demonstrate that outbreaks should be predictable. However, as outbreaks spread (and spatiotemporally separated waves become entangled with the substrate, human mobility, behavioural changes, pathogen evolution, etc.), the system is driven through a space of diverse model structures, driving down predictability despite increasing time series lengths. Taken together, our results agree with observations that accurate long-range forecasts for complex adaptive systems, e.g., contagions beyond a single outbreak, may be impossible to achieve due to the emergence of entropy barriers. However, they also support the utility and accuracy of dynamical modeling approaches for infectious disease forecasting, especially those that leverage myriad data streams and are iteratively calibrated as outbreaks evolve. Lastly, our results also suggest that cross-validation over long infectious disease time series cannot guarantee that the correct model for any individual window of time will be favored, which would imply a no free lunch theorem for infectious disease model selection, and perhaps for sociobiological systems more generally 40 .

Data Availability Statement
Empirical data for all diseases, aside from Zika, were obtained from the U.S.A. National Notifiable Diseases Surveillance System as digitized by Project Tycho 24 . Zika data were obtained from public health reports from Colombia and Mexico as digitized by 28 . All other data and code that support the plots and findings of this study will be available on GitHub soon. In the interim, they are available from the corresponding authors upon reasonable request. The supplement is available at http://scarpino.github.io/files/supplementary-information-predictability.pdf.