North to south gradient and local waves of influenza in Chile

Influenza seasonality is caused by complex interactions between environmental factors, viral mutations, population crowding, and human travel. To date, no studies have estimated the seasonality and latitudinal patterns of seasonal influenza in Chile. We obtained influenza-like illness (ILI) surveillance data from 29 Chilean public health networks to evaluate seasonality using wavelet analysis. We assessed the relationship between the start, peak, and latitude of the ILI epidemics using linear and piecewise regression. To estimate the presence of incoming and outgoing traveling waves (timing vs distance) between networks and to assess the association with population size, we used linear and logistic regression. We found a north to south gradient of influenza and traveling waves that were present in the central, densely populated region of Chile. Our findings suggest that larger populations in central Chile drive seasonal influenza epidemics.


Results
Hospitals included. From the 78 large and medium size hospitals in Chile, we excluded three hospitals from Santiago and one from Valparaiso with < 75% of daily ED data reported for the period between 2011 and 2016. Six pairs of hospitals from Santiago were combined, four corresponded to the merging of a pediatric and adult hospital (S2). We included 65 hospitals from the 15 regions and the 29 Chilean public health networks (Fig. 1).
A total of 242,982 ILI cases were reported from January 2010 to December 2016. Sixty-five percent of the days (n = 1.782.470) reported zero cases and 34% (n = 959,792) had ≥ 1 case reported. One percent (n = 1861) were days with missing data that ranged from 0 to 5% of the time series in different hospitals.
Start and peak day of seasonal influenza in health networks. The median start and peak of the epidemic for all health networks and years was May 11th and August 4th, respectively (start day 131 of each year; IQR 117, 152 and peak day 217; IQR 200, 237, respectively). The median duration from start to peak was 84 days (IQR 82, 88), equivalent of 12 weeks or three months.
Despite the variability of start and peak day between and within health networks, there was consistency of timing of ILI across Chile. The duration of the epidemic was constant across health networks and latitudes (S4).
Influenza-like illness epidemics tended to start earlier than the previous year between 2010 and 2013, but from 2014 the start was delayed in each successive year. No significant association was found between the peak of the epidemic and predominant strain in the country for each year (S5).
Population and a north to south pattern associated to early annual epidemics. The univariate analysis showed that as latitude increased from north to south, the start and peak of the influenza season were delayed (Fig. 4). While there was a significant association between latitude and timing in the central zone of Chile ( Fig. 4c and d), no association was present in the north or south of the country. There was a positive association between population size and earlier start and peak time of influenza. Given that influenza seasons vary annually, years was analyzed as a categorical variable and showed a significant association between start and peak day. Similarly, each year had a predominant strain in the country. The H3N2 strain had an earlier peak day and a shorter start-peak time compared to H1N1 strain. The presence of an airport in the territory of a health network was not associated with influenza timing (S6). The best multivariate model for the outcome start day included nine variables: latitude, population, presence of airport, and years 2010 − 2016 encoded as dummy variables (R 2 = 0.24; P < 0.01) (  Dark red represent higher power, yellow represents low power and fit, white spaces denote non-significant power compared to white noise. (b) Cumulative power z-score to calculate the periods with best fit across hospitals, represented as the 95th percentile of cumulative power. Power for each hospital were normalized, log transform, subtracted from the mean, and divided by the standard deviation. The sum of z-score for each period was plotted with the 95th percentile (dotted lines in both a and b) 24 www.nature.com/scientificreports/ association was found between start date and population size (β = − 0.3; 95% CI, − 0.5 − −0.1; P < 0.01). Health networks with a nearby airport was associated with an estimated delay of 9.5 days for the start of an ILI epidemic (95% CI, 0.84 − 18.19; P < 0.05). A north to south pattern was found across the country, for one degree increase in latitude there was one day delay on the start of ILI (β = 0.903; 95% CI 0.315 − 1.491; P < 0.01) (Eq. 1 and Table 2).   www.nature.com/scientificreports/ where S d = Start day from January 1st from each year; Lat = South Latitude; Airport = Presence of an airport; Y = Dummy for the year noted. The best model for ILI peak included eight variables: latitude, population and year as a dummy variable (R2 = 0.40; P < 0.01) ( Table 1). A north to south pattern was found for latitude (β = 0.780; 95% CI, 0.346 − 1.215; P < 0.01) and a negative association between ILI peak and population size (β = − 0.16, 95% CI − 0.366 to − 0.016; P < 0.05) (Eq. 2 and Table 2).
where P d = Peak day from January 1st from each year; Lat = South Latitude; Y = Dummy for the year noted.
Local travelling waves. We used a second-degree local polynomial regression model (LOESS) to determine the distance limit for the local traveling wave models. We found a descending Spearman's correlation of phase angles up to 1250 km that disappeared for larger distances (Fig. 5a) in both the reconstructed wavelet analysis and in the original rate time series data (S7). As a result, we included health networks closer than 1250 km in each model of local traveling waves.

Discussion
We found a predominant annual seasonality of influenza in Chile with a periodicity of 50 weeks (95% CI, 43 − 53). This period coincided with the annual seasonality of influenza reported from other temperate territories and from the Chilean Ministry of Health influenza surveillance reports 48 . To our knowledge, this is the first study that confirms a cycle of seasonal influenza in Chile.
There was a north-south latitudinal gradient for the start and peak of influenza related to the central zone between latitudes 31 and 40 degrees south. Chowell and colleagues found a south to north hospitalization gradient for the H1N1 2009 pandemic 22 . The difference between Chowell results and our findings can be explained by the annual variability of influenza epidemics due to weather, population immunity, or human movement 49 . We did not include 2009 data in our analysis; however, we found a south-north latitudinal gradient in 2015 for the start date of the ILI season (S9). In the 2009 and 2015 epidemics, the first influenza cases were reported in southern cities with populations greater than 200,000. While health networks consisting of large populations from central Chile have a higher probability of reporting the first cases of influenza, southern zones with relatively large populations can also experience epidemics.
Population size was also associated with the start and peak of influenza epidemics as described previously in Perú, Hong Kong, Brazil and the USA 5,7,12,13,21 . The patterns found were mostly influenced by the center region of Chile which includes Santiago, the nation's capital and most populated city. The outward, rapid spread of influenza from Santiago to the rest of the country has been previously reported by Burger et al. in Chile and are similar to the findings described by Viboud et al. in the USA 5,23 . These findings are supported by the association between population and the presence of local incoming and outgoing wave. Health networks with significant local outgoing and incoming traveling waves were primarily located in central Chile; two networks with significant  www.nature.com/scientificreports/ incoming waves were also located in the southern region. Our findings suggest that larger populations located in the center of the country drive seasonal influenza epidemics. Health networks in the central zone spread influenza to neighboring networks through outgoing traveling waves. Networks receive incoming waves in the same zone and in southern Chile. This study has several limitations. We assumed that all hospitals had the same age distribution and did not adjust for age structure of health networks. Controlling for age may provide more detailed information on influenza transmission in future studies. We did not include ED data from private insurance companies, hospitals, or practices because they were not available; however, public insurance covers 74% of the population in Chile 26 . Additionally, we included data from hospitals located in a range of latitudes that are collected through a centralized reporting system with national guidelines 24 . While we used secondary ILI case data from ED reports that were based on clinical diagnosis without laboratory confirmation, laboratory surveillance is available only for 25 hospitals with partial coverage of the countries´ territory. ILI has been reported as a useful measure to describe influenza seasonality and identify influenza population patterns 21 . On the other hand, ILI surveillance operates in all Chilean hospital EDs year-round resulting in more complete seasonality data. We assumed a constant proportion of emergency consultation to other providers. Private hospitals, private practices, and public primary and 95% CI (grey). There is a change in the descending relation of correlation and distance at 1250 km that we used as the upper distance limit to define local wave. (b) Phase difference in weeks vs distance (km) for Health Networks with incoming (red) and outgoing (blue) waves. In grey linear models with a significance level of 0.05. In red and blue, linear models with a Bonferroni corrected significance level < 0.001. The distance sign was inverted for negative lag times for a more intuitive display of incoming waves. (c) Geographic location of health networks with significant traveling waves. Incoming and outgoing waves are clustered in the center of the country. Colors correspond to panel (b) 24 www.nature.com/scientificreports/ healthcare centers could attract patients with public insurance, especially in seasons of high demand. We assessed this limitation by estimating the population with public insurance that lives in a 5 km radius that are more likely to visit a hospital ED compared to a public or private provider. Instead of assuming an annual seasonality, we estimated the best seasonality before fitting our models. We selected the best models for the start and peak of influenza by comparing all possible models. This is the first study to estimate seasonal influenza seasonality in Chile and its patterns across latitudes. Novel strains should be considered when different patterns are detected. If a novel strain is detected early, patterns could differ from the ones found here.
Our findings can help decision-makers to prepare for the influenza season by prioritizing zones for early vaccination campaigns, reallocating hospital beds for annual epidemics, and distributing resources according to patterns of influenza. Zones with larger populations and those located in the center of the country would require earlier implementation of interventions to reduce the spread of influenza to regions across the country.

Methods
Data sources. We obtained daily ILI case data collected by the Chilean Ministry of Health between 2010 and 2016 from public hospitals EDs. ILI is an acceptable measure that is correlated to laboratory surveillance regarding population patterns 21 . We used ILI from ED rather than sentinel laboratory surveillance data given that influenza surveillance is limited to 25 hospitals in the country with limited territorial coverage. We supplemented gridded population data from WorldPop with official governmental data on hospital locations, census information, and population coverage to estimate the population with access to each hospital [25][26][27][28] .
We excluded community hospitals, hospitals without an ED, and hospitals that reported < 75% of daily ED data between 2011 and 2016 (< 1920 days). Hospitals that had ≥ 5% missing data in groups of 20 consecutive days were also excluded to reduce imputation errors. We assumed missing data were missing at random. Twelve different imputation methods were tested using randomly generated missing data compared to the original data. Time series imputation with Kalman Smoothing showed the best results for estimating seasonality compared to the original data 29,30 . Data preparation. Given that patients experience a decline in ED access with increasing distance, we chose a five kilometer radius to estimate the population covered by each hospital 31,32 . We obtained the population estimates per 100 × 100 m grid from WorldPop for 2010 and 2015 and adjusted by census data for each municipality and year 28 . We used linear interpolation to impute population estimates for 2011 − 2014, and 2016. For each hospital coverage area, we adjusted for the proportion of individuals with public insurance for each municipality per year and calculated daily rates of ILI over the estimated population 26 . Data from hospitals in the same health network and within 5 km from each other were combined into one time series. To determine seasonality across health networks, we calculated the ILI rate by network by adding ILI and population data from all hospitals within each health network. We used a mid-point location from hospitals within each network to determine the health network's latitude.
Data transformation using wavelets. We used wavelet analysis to determine the best periodicity (e.g. seasonality) of influenza across 29 Chilean health networks. Wavelet transformation is an adequate method to determine seasonality for non-stationary time series and to determine seasonality of a periodic time series without assuming a priori a specific seasonality 33 . Wavelet transformation corresponds to the cross-correlation of ILI rate's time series with wavelets of different "widths" or "scales" across time 33 . Morlet wavelet has been used to calculate infectious disease rates for dengue, pertussis, measles, and influenza [34][35][36][37][38][39] . We computed the Morlet wavelet transformation for the entire country, each health network, and each individual hospital 34,35,40 .
We conducted wavelet transformations for periods (scales) from 14 − 144 weeks. Statistical significance was tested by comparing the wave signal against 1000 randomly generated time series (i.e. white noise) with a significance level of 0.05 41 . The local wavelet power spectrum was used to visualize the cross-correlation between rates and each group of wavelets (Fig. 2b, S1 and Equation s1) 33,42 .
To determine the best seasonality (periods with higher power), we computed the average power for each period, over all time points, from the local wavelet power spectrum for Chile, each health network, and for individual hospitals, according to the work of Torrence 42 : where N represents the number of observations.
We defined the best seasonality of influenza as the range of periods within the 95th percentile of average power across periods.

Association between epidemic timing and latitude in health networks.
To determine the association between influenza timing and latitude, we selected the wavelet transform using the range corresponding to the best seasonality for each health network. We reconstructed the epidemic cycles using the filter defined by Torrence and Campo (S1 and Equation s2) 42 . For each health network, the start date of an annual ILI epidemic was defined as the date when the reconstructed cycles reached the midpoint between the lowest point and the peak of the signal for each year from 2010 to 2016. We calculated the peak day for the corresponding period along with the median start and peak date for each health network. www.nature.com/scientificreports/ We developed univariate, piecewise, and multivariate linear regression models to assess the relationship between the predictor, latitude, and start and peak outcomes of the epidemic across the health networks. For the piecewise models, we separated the country into three zones: north, center, and south with south latitude ranges of < 31, 31-40, and > 41, respectively. For multivariate analysis, we included latitude and population as continuous variables, a binary variable to describe the presence of an airport within the health network territory, and dummy variables for years 2010 to 2016 with 2013 as the reference. The predominant influenza strain was excluded due to linear dependencies with years 2011 and 2014. The final multivariate model was selected by testing all possible predictor combinations with the regression sum of squares method and by comparing the BIC and Mallow's Cp 43 . Local traveling waves of influenza. A Morlet wavelet is a complex function with a real and an imaginary part. The real part enables separation of the amplitude and extraction of the phase of a signal as the timing within a specific period, regardless of the amplitude 33,42,44 . Phases are continuous cycles that are presented as phase angles in radians from -π to + π 41 . We extracted phase angles using a band of periods within the 95th percentile of power as described previously by Torrence (S1 and Eq. 3).
In order to define a local traveling wave, we used a second-degree local polynomial regression model (LOESS) with Spearman's correlation between pairs of health networks phase angle time series to distance with parameter α = 0.75. Then, we determined the distance were the smallest correlation after an initial linear segment of the LOESS model was found. All pairs of health networks within this distance limit were included for each of the 29 linear models for traveling waves.
We calculated the phase angle (timing) time-series median difference for all possible pairs of health networks and tested the association to distance. The sign of the phase difference represented the relative timing of ILI. A negative difference indicated a delayed timing of influenza (incoming); positive difference indicated outgoing travelling waves. We fitted two linear models for each health network to all others, one for positive phase differences and the second for negative phase difference: where θ p,q is the lag between health networks p and q and d p,q is the distance in kilometers between health networks. The distance sign was inverted for negative lag times for a more intuitive display of incoming waves. We considered a local traveling wave as a significant linear association between phase difference and distance using a corrected Bonferroni significance level of 0.0017.
We developed separate logistic regression models for incoming and outgoing waves to determine the association between traveling waves and population estimates, adjusting for each health network latitude. To evaluate model discrimination, we used the receiver operating characteristic curve and the area under the curve (AUC) 45 .
We used R version 3.6.2 R package Leaps for model selection, and R package WaveletComp for wavelet transformation and phases extraction 46,47 .