Cross-reactive immunity potentially drives global oscillation and opposed alternation patterns of seasonal influenza A viruses

Several human pathogens exhibit distinct patterns of seasonality and circulate as pairs. For instance, influenza A virus subtypes oscillate and peak during winter seasons of the world’s temperate climate zones. Alternation of dominant strains in successive influenza seasons makes epidemic forecasting a major challenge. From the start of the 2009 influenza pandemic we enrolled influenza A virus infected patients (n = 2980) in a global prospective clinical study. Complete hemagglutinin sequences were obtained from 1078 A/H1N1 and 1033 A/H3N2 viruses. We used phylodynamics to construct high resolution spatio-temporal phylogenetic hemagglutinin trees and estimated global influenza A effective reproductive numbers (R) over time (2009–2013). We demonstrate that R oscillates around R = 1 with a clear opposed alternation pattern between phases of the A/H1N1 and A/H3N2 subtypes. Moreover, we find a similar alternation pattern for the number of global viral spread between the sampled geographical locations. Both observations suggest a between-strain competition for susceptible hosts on a global level. Extrinsic factors that affect person-to-person transmission are a major driver of influenza seasonality. The data presented here indicate that cross-reactive host immunity is also a key intrinsic driver of influenza seasonality, which determines the influenza A virus strain at the onset of each epidemic season.

Several human respiratory viruses circulate as groups of discrete pathogenic entities exhibiting distinct patterns of seasonality 1,2 . For influenza virus such patterns have been studied extensively [3][4][5] . In the world's temperate climate zones influenza activity oscillates and synchronizes with winter periods, while in tropical regions activity appears to be year-around or split into different seasons 4 . They have been attributed largely to 'extrinsic' factors driving efficient virus spread 6 , like air humidity variations 7 , seasonal influences on host susceptibility 8 , and societal structure and behavioural patterns 9 . However, several other mechanisms focussed on host and population immunity have been proposed that explain the seasonality of viruses. Firstly, during an epidemic or pandemic season the viral spread of a specific viral subtype will increase up to a point when herd immunity will prevent further transmission. The loss of specific immunity within a population can lead to a large peak of infections, as was seen for respiratory syncytial virus (RSV) in children after an almost absent RSV-season due to the restrictions of the COVID19-pandemic 10,11 . Apart from subtype specific immunity, Susceptible-Infection-Recovery epidemiological modelling predicted that also cross-reactive immunity plays a role in seasonality of viruses. These models postulate that the presence of a short-term cross-reactive host immune-phenomenon could explain a temporary reduction in host susceptibility to other viruses and cause seasonal infection patterns at a population level 12,13 . Thirdly, in recent years mathematical simulations that mimic virus dynamics suggest the existence of virus-virus interactions. These interactions show that the rise in incidence of infections caused by one virus directly influences the amount of other virus infections at both population as well as individual host level. This phenomenon was also found in clinical cohort where they found that the presence of rhinovirus interfered with the presence of Influenza A virus in patients 14,15 . This direct interaction between viruses caused one virus to block the other and thereby affect the seasonality of viruses.
All of these 'intrinsic' factors may also be attributed to other aspects of influenza epidemiology, like the replacement of a seasonal strain by a pandemic virus. This occurred for the last time during the 2009 influenza pandemic when the seasonal A/H1N1 was replaced by the pandemic A/H1N1 virus. Interestingly, like the other known pandemics this virus did not spread in winter, but during the 2009 northern hemisphere summer 16 .
To date, the newly introduced pandemic 2009 A/H1N1 virus continues to co-circulate with the A/H3N2 subtype causing seasonal epidemics in humans. Both influenza A viruses are under intense selective pressure by the host immune system and they continuously evolve to persist in humans. Viruses escape from pre-existing immunity through mutation at antigenic sites at the globular head of the hemagglutinin (HA). This is a major virus surface glycoprotein and primary target of host neutralizing antibodies. Continual viral presence in the population on the other hand results in a 'landscape of immunity' 12 , which new 'antigenic drift' viruses need to overcome to fuel new epidemics. A typical phylogenetic tree of HA is shaped, as a result of this cat-and-mouse game, into a single trunk tree with short-lived branches 17 (Fig. 1).
Virus strains that are antigenically similar cluster along the trunk of the tree with only a limited number of amino acid positions involved in the jump from an existing into a new antigenic cluster 18 . These positions were previously identified with data obtained from the hemagglutination inhibition assay, a serological test to assess neutralizing antibody responses to HA.
Besides these long-lived and predominantly strain-specific antibody-mediated immune responses, a shorterlived, non-specific component has been proposed in particular to explain the limited virus genealogical diversity www.nature.com/scientificreports/ (single trunk) and lifespan (short-lived branches) of the vast majority of circulating dead-end virus lineages 12 . Evidence for such component has first come from in vitro and animal studies showing that pre-infection with one subtype induces partial cross-protection from infection with another subtype 19 . Observational studies addressing the potential role of cross-reactive immunity in global influenza seasonality has so far failed to show a clear pattern 20,21 . However, recent and marked observations to support a major role of cross-immunity were related to the fast disappearance of the A/H1N1 subtype from 1977, shortly after the introduction of the pandemic influenza A/H1N1 virus in 2009, while the A/H3N2 subtype managed to continue its circulation 13 . Another example was seen in the winter of 2018-2019 where in a matter of weeks A/H1N1 dominance was almost completely replaced by A/H3N2 dominance 22 .

Results
In search for the existence of such component we first followed a phylodynamic approach to jointly resolve spatio-temporal phylogenetic HA trees of A/H1N1 and A/H3N2 subtypes and to infer underlying host population dynamics 23,24 ( Fig. 1; Tables S1 and S2). The dataset used here had been collected globally during the first 5 years after the onset of the 2009 influenza pandemic. It enrolled patients year-around (> 1 year of age), the vast majority (> 97%) with uncomplicated and PCR-confirmed influenza, who had been admitted-within 48 h after symptom onset-to primary care centres and hospitals in Asia (Hong Kong; n = 6), Europe (n = 37), the US (n = 36) and the Pacific (Australia; n = 8) (Fig. S1). From these samples 2111 influenza A viruses were isolated, which allowed us to obtain complete HA sequences from 1078 A/H1N1 and 1033 A/H3N2 viruses. The extent of sampling, directly after the pandemic outbreak, in combination with an unprecedented resolution regarding quality-controlled Sanger sequencing, resulted in a high-resolution dataset. This offered us an unique window of opportunity to study the dynamics of the estimated effective reproductive number R over time (R-skylines).
R is a parameter of host immunity 25 , and is computed here as the rate at which an infected individual gives rise to a new infection in a defined period of time.
We observed that R-skylines estimated from the A/H1N1 and A/H3N2 trees showed alternate phases of increasing and declining R, with R < 1 and R > 1 respectively (Fig. 3). There was a significant negative correlation between phases (Pearson's ρ = − 0.511, P = 3.0e−07; D = 0.202, P = 0.052) with an average endogenous oscillation period estimated to be approximately 1.67 ± 0.01 years for A/H1N1 and 1.13 ± 0.02 years for A/H3N2 6 as computer by periodograms (Fig. S4). Of note, these periods were similar to the average lifespan of the dead-end virus lineages on the HA trees (1.7 ± 0.4 for A/H1N1 and 2.2 ± 0.5 for A/H3N2) (Fig. 1).
Given the finite nature of susceptible hosts, virus persistence relies on the availability of new susceptible ones, which forces viruses to migrate between geographical locations 26 . The interplay between antigenic drift and preexisting immunity may then determine the outcome of the competition between these viruses at the onset of each influenza season 27 . As the observed pattern of R-skylines indicates that a relatively short-lived cross-reactive immunity component drives subtype competition at the host scale, we wondered whether this competition also could determine the dynamics of global viral spread.
To build on existing global viral spread data and to study its patterns for the A/H1N1 virus after 2009 we expanded our dataset with complete HA sequences deposited in the IRD from viruses isolated prior to (2008-2009) and after (2013-2015) our study period (Dataset S1). We then inferred the number of geographical location changes at each internal node of these trees to identify all virus movements from one geographical location (source) to another location (sink). Again, and similar to the R-skylines, influenza A/H1N1 and A/ H3N2 viral spread alternated globally within our study period (Fig. 4). Global influenza A/H1N1 viral spread dominated in the first half of the study period, while A/H3N2 viral spread were more prevalent between 2012 and 2013. The observation of oscillation between influenza A/H1N1 and influenza A/H3N2 throughout the study period supports the evidence that inter-subtype competition presented here exists on both the population and individual host level, contributes to influenza seasonality and may determine the virus subtype that will dominate in a given influenza season.
Finally, previous work on global circulation had shown that East and South-East Asia (E-SEA) played a pivotal role in global dissemination of A/H3N2 viruses. Here, A/H3N2 virus activity was found year-round (between 2000 and 2012), from where new antigenic drift variants fuelled in the temperate climate zone epidemics 9 . In contrast, E-SEA did not seem to have a major role in the dissemination of pre-pandemic A/H1N1 viruses 9 . To study global virus spread after 2009 we constructed the networks of viral spread trajectories between the sampled geographical locations using measures of connectivity (Fig. S2) in a one-year time window (Figs. 2 and 4) and found that, in contrast to the pre-pandemic period 9 , E-SEA was equally important for the dissemination of both influenza A viruses. Within our dataset we examined the distribution of migration events between different locations and counted the amount of location switches on the tree. We counted 56 A/H1N1 and 58 A/H3N2 dissemination events from E-SEA to the other sampled regions in the world (Fig. S3). In addition, global viral spread patterns showed a similar degree of global network complexity (Fig. S2, max. graph density/diameter of 1.08/12.09 for A/H1N1 and 1.16/10.64 for A/H3N2) and similar patterns of virus circulation across the sampled geographic regions (Fig. S2, max. number of islands and graph reciprocity of 2 and 0.75 for A/H1N1 and 2 and 0.86 for A/H3N2).

Discussion
Extrinsic factors probably play a role in forcing influenza epidemics into the winter seasons in the global temperate climate zones [3][4][5]7 . The oscillating and alternating pattern of the global skylines of R we present here are consistent with the notion that cross-reactive host immunity is an important intrinsic driver of influenza seasonal patterns (Fig. 3). www.nature.com/scientificreports/ Global influenza dissemination dynamics also reveals alternation of global virus spread and complexity of viral spread trajectories between subtypes with two phases (Figs. 2 and 4). In the first phase (2009-2011) we observe that these parameters are high for the A/H1N1 and low for the A/H3N2 subtype (Figs. 2 and 4). This pattern is reversed during the second phase (2011-2013). These parameters are indicators of virus persistence and depend, therefore, on the availability of susceptible hosts within a defined geographical location 26 . This implies that an intrinsic correlation exists between change of cross-reactive host immunity landscapes and global viral spread. In the Netherlands patterns and duration for influenza epidemic periods similar to our results were previously found by were previously found. Additionally, statistical regression techniques showed that depletion of susceptible hosts was the most important factor in determining transmission of influenza virus during these epidemic 28 as well as in seasonal corona outbreaks 29 . Seasonal patterns are also described between influenza virus and other respiratory viruses. In mathematical models that studied the co-circulation of seasonal influenza A virus and non-influenza respiratory viruses, a short-term protection of one virus on the other induced seasonal   14 . Taking into account the strong cross-reactive immune responses previously observed between the seasonal and pandemic A/H1N1 virus subtypes, cross-reactive immunity may well have forced the 2009 pandemic into the northern hemisphere summer and mitigated the disease burden associated with this virus 16,30,31 . A great deal of the current understandings about influenza antigenic drift is based on data generated by the hemagglutination inhibition serological test. This assay is largely dependent on serum immunoglobulin G (IgG) content 32 , but it does not account for other immunity components of the immune system. The observed short endogenous oscillation periods observed (1.7 years for A/H1N and 1.1 years for A/H3N2) suggest that this is most likely the result of short-lived inter-subtypic immune responses rather than systemic antibody-mediated immunity from which major antigenic drift variants arise every few years ( Fig. 3 and Fig. S4) 9,12,18 . T-cells have been suggested as key players in mediating inter-subtypic immunity through several mechanisms, including direct cytolytic activity and interactions with B cells 30,31 . Inter-subtypic short-lasting immune responses likely also exist for other viruses. Seasonal coronaviruses induce a short-lasting immunity of 12 months in individuals 29,33 . Furthermore, SARS-CoV-2 reactive CD4+ T cells were found in healthy SARS-CoV-2 seronegative individuals, most likely due to cross-reactive T cells that stemmed from previous infections with endemic seasonal coronaviruses 34 .
Finally, secretory immunoglobulin A (IgA), the most abundant antibody isotype, is an interesting candidate meeting all criteria: IgA mediated-immunity is relatively short-lived 35 , is (partially) cross-reactive 36 , and thought to be the main driver of upper respiratory tract mucosal immunity against influenza 37 . Moreover, intramuscular administration of IgA, and not IgG, prevented airborne virus transmission in the ferret and guinea pig model 32 , indicating that cross-reactive IgA levels may directly impact person-to-person spread of the virus 38 . Crossreactive immunity to different influenza subtypes can be the result of antibodies that are able to bind to the highly conserved stalk of the HA protein. These antibodies get produced most upon sequential exposition to diverse HA subtypes 39 and in this, IgA has a more potent neutralizing effect against influenza than IgG 40 . Further studies elucidating the contribution of host immunity to seasonality of influenza and other multi-strain viruses, such as the coronavirus, paramyxoviruses, respiratory syncytial virus and human metapneumovirus are warranted 41 . This would further support the establishment and exploitation of global virus and bio-banks 42 , which will lead to a better understanding of the contribution of host immunity landscapes to the dynamic epidemiological circulation patterns of (multi-strain) pathogens.

Methods and materials
Study conduct. IRIS (NCT00884117) is a prospective, multicentre, global observational study offering unprecedented resolution with regard to quality-controlled Sanger sequencing 43,44 . This report summarizes the results from 87 centers in, Australia (n = 8), China (Hong Kong, n = 6), Europe (n = 37; France, Germany, Norway) and the United States (n = 36) from December 2008 to March 2013, comprising five Northern and four Southern Hemisphere seasons, and including the 2009-2010 pandemic. Centers were selected to achieve the widest geographic coverage possible within each country (Fig. S1).  Ethics approval. The study was performed in compliance with the principles of the Declaration of Helsinki and its amendments, and in accordance with Good Clinical Practice. Independent ethics committees and institutional review boards at each centre approved the study protocol and amendments. All patients or legal guardians provided written informed consent at the time of enrollment.

Patient selection. Adults
Virus identification. Throat and posterior nasal swab specimens were obtained on day 1, 3, 6 and 10 and shipped on dry ice to a central laboratory for analysis (Erasmus MC, Rotterdam, The Netherlands). Influenza A subtypes were identified using semi-quantitative real-time reverse transcription polymerase chain reaction (RT-PCR) 45 . Day 1 samples with cycle threshold (Ct) values of < 32 were cultured on Madin-Darby canine kidney cells. Virus-containing supernatants were cleared from cell debris by centrifugation (10 min at 1000×g) and stored at − 80 °C until further processing. For this study, A/H1N1 (n = 1078) and A/H3N2 (n = 1033) virus isolates were included, which were obtained upon patient admission (day 1).  46 . Numbers of additional HA sequences were 443 for A/H1N1 and 462 for A/H3N2 respectively. The complete list of IRD sequences is provided (Dataset S1).

Datasets and nucleotide sequence accession numbers.
Data pre-processing and alignments. Each expanded dataset was aligned using ProGraphMSA using default parameters 47 . Sequences were renamed to include sampled geographical locations, sampling dates (continuous values) and corresponding influenza season (when available).

Phylodynamics inference. The birth death serial skyline (BDSKY) phylodynamics model implemented in
BEAST v.2.3.1 was applied to the IRIS and expanded datasets to infer spatio-temporal resolved phylogenies and epidemiological parameters 23,26,48 . Phylogenetic trees were estimated under the general-time-reversible model (GTR+) with Γ-distribution to model among-site rate variation 49 (Fig. 1). A molecular clock rate prior was set to follow an uncorrelated log-normal distribution 50 . Internal node calibration was performed using tip sampling dates 51 . The BDSKY-model was set with the following parameter: the sampling rate prior for the influenza infected population/the real sampled population followed a Beta (1, 999) distribution; the prior probability of sampling an individual upon becoming non-infectious followed a LogNorm (4.5, 1.0) distribution. In addition, tree dating was performed using tip dates while an uncorrelated log-normal clock rate prior was applied to handle uncertainties in the sample collection dates. Finally, the analysis was run long enough to obtain a sufficient effective sample size ESS > 200 for all parameters. The converged parameters of the BDSKY-model are listed in Supporting Information Tables S1 and S2. To assess global model robustness, we performed two independent runs of each analysis (for a total of 20 runs). Markov chain Monte Carlo (MCMC) parameter convergences were diagnosed with Tracer 1.6. Thinning of BEAST2 output files (tree files and parameter files) was done using in-house bash scripts. After accurate MCMC trace monitoring, the first 10% of MCMC steps were discarded as burn-in resulting in around 6000 trees per each dataset. TreeAnnotator v2.3.1 was used to produce Maximum Clade Credibility (MCC) trees 23 .
Statistical analyses effective reproductive number. We estimated the effective reproductive number R using phylodynamics modelling as described above. The estimates of R allowed us to study the dynamics of virus spread within the population 52 . Values R < 1 indicate a decline of infections, while R > 1 indicates that the infection has increased its spreading in a more susceptible population. The skyline of R is used here to picture the underlying dynamics 'shaping' a phylogenetic tree 23 Viral spread routes and evolutionary rates. Datasets were partitioned according to the geographical sampling locations pooled by continent (North America, Europe, Asia, Pacific area). Viral spread rates were estimated using a discrete phylogeographic trait model with the Γ-distribution as substitution rate prior between geographical demes 56 . The influenza virus dissemination process was fitted to a discrete trait model using the Bayesian Stochastic Search Variable Selection method, by inferring the most parsimonious description of the phylogeographic diffusion process [57][58][59][60] . Counts of viral spread events were quantified by traversing the fully spatio-temporal resolved phylogenetic trees in post-order and by counting the number of most probable Markov chain jumps along the branches of the posterior set of trees 61,62 .
Branch geographical persistence. Geographical persistence was quantified by summing the phylogenetic branch lengths (measured in expected substitutions per site) grouped by their inferred geographical location on the phylogenetic tree trunk (inferred traversing the phylogenetic tree from 'leaf-to-root' and summing the number of branch traversals. The tree trunk was defined as the path on the phylogenetic tree that has been traversed more than 10 times. Number of seeding events was defined as the number of switches on the phylogenetic tree trunk per season.

Viral spread graphs.
Trajectory networks were reconstructed per each strain variant, pooling viral spread events occurred within a one-year time window (Fig. 2). Trajectory complexity was computed estimating graph density, number of islands (nodes), diameter, and reciprocity 63,64 . In addition, geographical location connections were estimated by computing the graph centrality measures (specifically: degree centrality and betweenness centrality) (Figs. S2 and S3) [65][66][67] . The Quade and correspondent post-hoc procedures were applied to test whether viral spread trends were significantly different between strains and whether preferred viral spread trajectories were selected 68 . The significance level was set at 5%.

Data availability
Full-length HA sequences obtained as part of the IRIS study have been deposited in Genbank with the primary accession codes mentioned above. All other HA sequences downloaded for this study are listed in the Supportive information.