The impact of climate and antigenic evolution on seasonal influenza virus epidemics in Australia

Although seasonal influenza viruses circulate globally, prevention and treatment occur at the level of regions, cities, and communities. At these scales, the timing, duration and magnitude of epidemics vary substantially, but the underlying causes of this variation are poorly understood. Here, based on analyses of a 15-year city-level dataset of 18,250 laboratory-confirmed and antigenically-characterised influenza virus infections from Australia, we investigate the effects of previously hypothesised environmental and virological drivers of influenza epidemics. We find that anomalous fluctuations in temperature and humidity do not predict local epidemic onset timings. We also find that virus antigenic change has no consistent effect on epidemic size. In contrast, epidemic onset time and heterosubtypic competition have substantial effects on epidemic size and composition. Our findings suggest that the relationship between influenza population immunity and epidemiology is more complex than previously supposed and that the strong influence of short-term processes may hinder long-term epidemiological forecasts.

absolute humidity AH' prior to and after epidemic onset across all five cities. Epidemic onset is marked by the vertical line at 0. For the earliest-onset epidemic in each season and city (8 years x 5 cities = 40 epidemics), T' and AH' for each time point are represented by grey points: a point below the horizontal line denotes that the value is lower than the 31-year city-specific mean. Blue points show the mean T' and AH' for that two-week period for all epidemics within the study period. Time periods with statistically significant (p<0.05; Wilcoxon one-sample test) reductions in mean T' or AH' from the 31-year average are shown in orange.

the horizontal line denotes that the value is lower than the 31-year city-specific mean. Blue points show the mean T' and AH' for that two-week period for all epidemics within the study period in a particular city. Time periods with statistically significant (p<0.05) reductions in mean T' or AH' from the 31-year average are shown in orange. In the two-week period immediately prior to epidemic onset, there is a statistically significant reduction in AH' of 0.522 g m⁻³ in Perth (p=0.039, Wilcoxon one-sample test), which is roughly equivalent to a 3.13% reduction in relative humidity. This result was not statistically significant after correcting for multiple testing (Holm correction).

activity were present or absent in each of the seasons from its initial detection to its replacement by the next variant. Antigenic variant-specific cumulative incidence was measured relative to the city-specific mean epidemic size, where 1 is equivalent to the mean epidemic incidence.
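
The Holm correction mentioned for the Perth humidity result can be illustrated with a minimal sketch. The function name `holm_adjust` and the example p-values are assumptions made for illustration: only p=0.039 comes from the text, and the number of tests actually corrected for is not stated here, so a family of five is purely hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the ordering
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical family of five tests; only 0.039 is taken from the text.
raw = [0.039, 0.21, 0.48, 0.65, 0.90]
adj = holm_adjust(raw)
print(adj)
```

Under this assumed family of five tests, the smallest raw p-value is multiplied by five and no longer falls below 0.05, which is the kind of outcome the caption describes.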

Binary logistic regression models were fitted for each subtype (n = 81, 65, 13 and 72 for B/Vic, B/Yam, A/H1sea and A/H3 respectively). The 95% confidence interval is denoted by the grey shaded area. See Supplementary Table 5 for odds ratios (OR) from the binary logistic regressions.

Here, we make no such assumptions. Epidemic incidence was compared between seasons with and without epidemic-level circulation of a new major antigenic variant. Within each subtype, incidence for individual epidemics was log-transformed and centred by subtracting the city-specific mean of log incidence, to allow for comparison between cities.

to delays in updating vaccine strain nomenclature. Here, we make no such assumptions. Within each subtype, incidence for individual epidemics was log-transformed and centred by subtracting the city-specific mean of log incidence, to allow for comparison between cities. Cumulative incidence was measured relative to the city-specific mean epidemic size, where 1 is equivalent to the mean epidemic incidence. r and p values are from Pearson's correlation tests (n = 37, 20, 9 and 45 for B/Vic, B/Yam, A/H1sea and A/H3 respectively). Note that antigenic variants of B/Yam and A/H1sea rarely initiated multiple epidemics during the study period.

Here, we make no such assumptions. For each antigenic variant, we examined whether epidemic levels of activity were present or absent in each of the seasons from its initial detection to its replacement by the next variant. Antigenic variant-specific cumulative incidence was measured relative to the city-specific mean epidemic size, where 1 is equivalent to the mean epidemic incidence. Binary logistic regression models were fitted for each subtype (n = 81, 65, 13 and 72 for B/Vic, B/Yam, A/H1sea and A/H3 respectively). The 95% confidence interval is denoted by the grey shaded area. See Supplementary Table 9 for odds ratios (OR) from the binary logistic regressions.
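
The log transformation and city-specific centring described in these captions can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `centre_log_incidence`, the record layout and the incidence values are hypothetical, and the sketch handles a single subtype.

```python
import math
from collections import defaultdict


def centre_log_incidence(records):
    """Centre log incidence on each city's mean log incidence.

    records: list of (city, incidence) pairs for ONE subtype, incidence > 0.
    Returns a list of (city, centred_log_incidence) in the same order.
    """
    logs_by_city = defaultdict(list)
    for city, incidence in records:
        logs_by_city[city].append(math.log(incidence))
    # City-specific mean of log incidence
    city_mean = {c: sum(v) / len(v) for c, v in logs_by_city.items()}
    return [(city, math.log(inc) - city_mean[city]) for city, inc in records]


# Hypothetical incidences: the second city is ten times larger throughout.
records = [("Perth", 10.0), ("Perth", 40.0), ("Sydney", 100.0), ("Sydney", 400.0)]
centred = centre_log_incidence(records)
for city, value in centred:
    print(city, round(value, 3))
```

In this made-up example both cities yield the same centred values, although their raw incidences differ tenfold, which is why centring allows epidemics to be compared between cities.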

Robustness of inferences derived from our estimates of epidemic onset timings
Our antigenically characterised dataset is relatively small: 18,250 cases. Especially in seasons with fewer cases, it can be difficult to differentiate epidemic from baseline activity. This limits the accuracy with which the timing of epidemic onset can be estimated. To check the robustness of our results to errors in estimated onset, we re-ran our analysis using estimated influenza A epidemic onset timings from a large-scale study of >450,000 Australian influenza cases (ref. 23). Whilst the dataset utilised by Geoghegan et al. (ref. 23) has many more cases than ours and thus might produce more accurate timing estimates, the lack of subtype-level resolution means that the city-level epidemic activity recorded was the sum of the underlying A/H3, A/H1sea and A/H1pdm09 virus-specific activity. Nevertheless, we investigated whether or not, more [...] Figures 4-5).

To further assess the robustness of our analyses to potential inaccuracies in our estimates of epidemic onset timings arising from our relatively small dataset, we augmented our estimated epidemic onset timings with those of Geoghegan et al. (ref. 23)

Sensitivity analysis of the epidemic onset and end detection algorithm
The threshold value which, if exceeded, marks the onset of an epidemic, and thus the sensitivity of the detection algorithm, are determined by the quantile parameter (see Eqn 4; Methods). In the main text, we set the quantile parameter to 0.12, since this value identified epidemic onset and end timings that corresponded well with visual inspection of the raw time series. We repeated the estimation of onset and end timings with the quantile parameter set to 0.05 and 0.2, which increased and reduced the threshold values respectively. Overall, these alternative timings were similar to those originally estimated with a value of 0.12. However, timing estimates were systematically earlier at lower threshold values, owing to the increase in sensitivity. At the same time, lower thresholds reduced the specificity of the detection algorithm: spurious non-epidemic activity early in the calendar year was misclassified as above-baseline epidemic activity and recorded as small epidemics on multiple occasions (Supplementary Figure 16).

We reran our bootstrap analyses on the effects of climatic factors with these alternative epidemic timings. The estimated epidemic onset and end timings remained largely invariant to changes in the quantile parameter, so it was unsurprising that we did not identify any fluctuations in anomalous temperature and absolute humidity in the two-, four- and six-week periods immediately prior to the onset of the earliest epidemics from 2000-2015.

At a lower value of 0.05, which increased the threshold value for detection and the specificity of the algorithm, we similarly found no evidence of consistent effects of antigenic change on epidemic size (Wilcoxon two-sample test). In contrast, at a value of 0.2, epidemics of B/Yam and A/H3 appeared to be larger in seasons associated with the emergence of a new antigenic variant (Wilcoxon two-sample test; Supplementary Figure 17).
However, this is likely to be an artefact of the reduced specificity of the detection algorithm, which inflated the number of seasons in which small so-called epidemics were detected.

We aggregated the data by week and by two-week periods and found that the latter produced smoother time series: this reduced the effect of stochastic noise and made the series more amenable to use with our detection algorithm. Aggregation by two-week periods could, however, obscure fluctuations in local weather, which are likely to occur at shorter timescales. Reassuringly, we found that the detection of epidemics and the estimated timings corresponded well between data aggregated by two-week periods and by week: 239/320 instances had identical results, whilst in only 43/320 instances did timing estimates differ by more than 14 days. These relatively minor differences in timing estimates did not affect our results: we did not identify any fluctuations in anomalous temperature and absolute humidity in the two-, four- and six-week periods (two-week aggregation) or in the one-, two- and three-week periods (weekly aggregation) immediately prior to the onset of the earliest epidemics from 2000-2015.
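
The comparison between weekly and two-week aggregation can be summarised with a small tally such as the one below. This is a hedged sketch only: the function name `compare_onsets`, the category labels, the use of `None` for "no epidemic detected" and the example dates are all assumptions for illustration, and how mismatched detections were actually counted is not stated in the text.

```python
from datetime import date


def compare_onsets(weekly, fortnightly, tolerance_days=14):
    """Tally agreement between paired onset estimates.

    Each element is a date, or None if no epidemic was detected (assumed convention).
    """
    if len(weekly) != len(fortnightly):
        raise ValueError("onset lists must be paired one-to-one")
    counts = {"identical": 0, "within_tolerance": 0,
              "beyond_tolerance": 0, "detection_mismatch": 0}
    for w, f in zip(weekly, fortnightly):
        if w is None or f is None:
            # Both undetected counts as agreement; one undetected is a mismatch
            key = "identical" if w is None and f is None else "detection_mismatch"
        else:
            diff = abs((w - f).days)
            if diff == 0:
                key = "identical"
            elif diff <= tolerance_days:
                key = "within_tolerance"
            else:
                key = "beyond_tolerance"
        counts[key] += 1
    return counts


# Hypothetical onset estimates for four city-season-subtype instances.
weekly = [date(2010, 6, 7), date(2011, 5, 30), None, date(2012, 7, 2)]
fortnightly = [date(2010, 6, 7), date(2011, 6, 6), None, date(2012, 6, 4)]
summary = compare_onsets(weekly, fortnightly)
print(summary)
```

For reference, the reported 239/320 identical instances correspond to about 75% of instances, and the 43/320 that differ by more than 14 days to about 13%.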

Overall, our detection algorithm and the downstream results from the analyses of the effects of climatic factors and antigenic change remain robust to the choice of time period for aggregating case counts and to the selection of alternative parameters, which alter the sensitivity and specificity of the algorithm (analyses can be reproduced from code included in the project GitHub repository).