A naturally protective epitope of limited variability as an influenza vaccine target

Current antigenic targets for influenza vaccine development are either highly immunogenic epitopes of high variability or conserved epitopes of low immunogenicity. This requires continuous update of the variable epitopes in the vaccine formulation or boosting of immunity to invariant epitopes of low natural efficacy. Here we identify a highly immunogenic epitope of limited variability in the head domain of the H1 haemagglutinin protein. We show that a cohort of young children exhibit natural immunity to a set of historical influenza strains which they could not have previously encountered and that this is partially mediated through the epitope. Furthermore, vaccinating mice with these epitope conformations can induce immunity to human H1N1 influenza strains that have circulated since 1918. The identification of epitopes of limited variability offers a mechanism by which a universal influenza vaccine can be created; these vaccines would also have the potential to protect against newly emerging influenza strains.

Seasonal influenza is estimated to cause between 1 and 4 million cases of severe illness and 200,000 to 500,000 deaths per year 1 . The best way to protect against influenza infection is through vaccination. Currently, a trivalent (TIV) or quadrivalent (QIV) influenza vaccine is given each year, targeting the circulating H1N1 and H3N2 influenza A strains and one or two lineages of the circulating influenza B strains. However, the vaccine has to be formulated at least 6 months prior to the influenza season, so the strains that are subsequently prevalent in the actual flu season do not always match the strains used in the vaccine 2 .
The antigenic evolution of influenza is known to occur through mutations in surface glycoproteins, principally haemagglutinin (HA), allowing strains to escape the pre-existing host immunity [3][4][5] . Epitopes within HA are commonly assumed to be either highly variable due to strong immune selection (and typically located in the head domain of HA) or conserved due to the absence of immune selection (for example, in the stalk of HA) 6 . Together, these form the backbone of the theory of antigenic drift, whereby the virus population slowly and incrementally acquires mutations in protective highly variable epitopes. However, the antigenic drift model can only explain the epidemiology and limited genetic diversity observed among influenza virus populations when very specific constraints are placed on the mode and tempo of mutation or by invoking short-term strain-transcending immunity 7,8 . An alternative model known as antigenic thrift successfully models the epidemiology and genetic diversity of influenza by assuming that immune responses against epitopes of limited variability drive the antigenic evolution of influenza [9][10][11] . Within this framework, new strains may be generated constantly through mutation, but most of these cannot expand in the host population due to pre-existing immune responses against epitopes of limited variability. This creates the conditions for the sequential appearance of antigenically distinct strains and provides a solution to the long-standing conundrum of why the virus population exhibits such limited antigenic and genetic diversity within an influenza epidemic. An important translational corollary of this model is that a universal influenza vaccine may be constructed by targeting such protective epitopes of limited variability.
We show that studies of sera from young children taken in 2006/7 using neutralisation assays and ELISAs reveal a periodic pattern of cross-reactivity to historical isolates consistent with the recycling of epitopes of limited variability. We identify one epitope of limited variability responsible for this pattern through a structural bioinformatics analysis. We demonstrate that mutagenesis of the epitope removes the cross-reactivity to historical strains, and that vaccination of mice with the 2006 conformation of the epitope reproduces the cross-reactivity pattern identified in the serology studies. We further show that vaccination with other epitope conformations induces similar but asynchronous cross-reactivity to historical strains. Finally, we demonstrate that vaccination with either the 2006 or 1977 epitope conformation protects mice from challenge with an H1N1 influenza strain that last circulated in 1934. By establishing that the antigenic space within which influenza evolves is much smaller than previously thought, we show that there are epitopes in the major influenza antigen, HA, which if vaccinated against would allow us to avoid the requirement for yearly influenza vaccination necessitated by the current TIV and QIV vaccines.

Results
Periodic cross-reactivity to historical isolates. We tested the prediction that HA epitopes of limited variability exist by performing microneutralisation assays using pseudotyped lentiviruses displaying the H1 HA proteins from a panel of historical influenza isolates (hereafter described as pMN assays) 12,13 . ELISA analysis using the HA1 domain of the same seven strains as an antigen was consistent with the pMN data and also identified broadly cross-reactive non-neutralising antibodies that bind the HA1 region of various H1 influenza strains (Supplementary Fig 1). These results are in agreement with a number of recent studies suggesting that antibody responses show some degree of periodic cross-reactivity to historical strains [14][15][16][17][18][19][20][21][22] , counter to the view of antigenic drift within which antigenic distance accumulates linearly with time.
We noted that the A/  Fig. 1c and p-value = 0.0056, Fig. 1d; p-values were determined using Student's t tests). In the case of the A/PR/8/1934 −147K mutant, there was a total loss of neutralisation in four samples and a reduction in two samples, indicating that the bulk of cross-reactivity between the A/Solomon Islands/3/2006 and the A/PR/8/1934 strains is mediated through an epitope located in the vicinity of the deletion (p-value = 0.0004 using Student's t test, Fig. 1c). Therefore, it seems that the absence of a positively charged lysine at position 147 mediates much of the observed cross-reactivity to historical strains induced by the 2006/2007 cohort sera in Fig. 1a.
Analysis of historical strain data shows that the deletion, the only one to occur in the H1 HA, appears periodically over the course of the antigenic evolution of the H1N1 subtype of influenza, occurring in 1933, 1934, 1943, 1957 and between 1995 and 2008. To ascertain whether the absence of an amino acid at position 147 would also mediate cross-reactivity with other historical strains possessing the deletion, A/WSN/1933 neutralisation-positive samples were run against the WT and −147K mutant A/Iowa/1943 and A/Denver/1957 pseudotyped lentiviruses (Fig. 1e, f). A statistically significant reduction in neutralisation was observed for the −147K A/Iowa/1943 and A/Denver/1957 HA mutants (p-value = 0.012, Fig. 1e; p-value = 0.011, Fig. 1f; Student's t test). Furthermore, three samples that neutralised the WT A/Denver/1957 HA failed to neutralise the −147K mutant entirely. These results imply that at least part of the cross-reactive neutralising immune response within this cohort is mediated through the recognition of an epitope that contains a deletion at position 147. Moreover, the existence of a lysine at position 147 may contribute to the overall lack of neutralisation of A/California/4/2009 and A/South Carolina/1/1918, in addition to other variation across the HA.
Identification of an epitope of limited variability. Several previous studies have highlighted the importance of position 147. Although not included within any of the canonical antigenic sites defined by Caton et al. 3 (being absent in the A/PR/8/1934 Mt. Sinai strain), position 147 has recently been assigned to a new antigenic site, denoted Pa in Matsuzaki et al. 23 , where it was shown to be responsible for several A/Narita/1/2009 escape mutants 3,23 . Position 147 is also important for the binding of several known broadly neutralising antibodies: for example, the 5j8 antibody requires a lysine to be present at position 147, whilst the CH65 antibody cannot bind if a lysine is present at position 147 24,25 . Furthermore, Li et al. 18 demonstrated that certain demographics, such as individuals born between 1983 and 1996, possess antibodies that bind to an epitope containing a lysine residue at position 147 18 .
We next employed a structural bioinformatic approach to identify an epitope of limited variability that contained position 147. In silico analysis was used to determine how the accessibility and binding site area contributed to the variability of hypothetical antibody binding sites (ABSs) (Fig. 2a) [26][27][28][29][30] . The ABS of lowest variability containing position 147 was consistently represented by the site shown in Fig. 2b 31 and could be shown to locate to an exposed loop in the head domain of the H1 HA, not covered by N-linked glycosylation (Fig. 2b, c), and encompassing additional residues in the Ca2 antigenic site. Analysis of this site (hereafter called OREO) suggested that various conformational epitope variants could be defined on the basis of variation and structural proximity of positions 147, 158 and 159. Combining these analyses with the site-directed mutagenesis (SDM) results, we arrived at a maximum of five conformations of the epitope (Fig. 3 and Supplementary Fig 3), which arise and disappear in a cyclical manner during the known evolutionary history of the pre-pandemic and post-pandemic H1N1 lineages (Fig. 2d). This analysis demonstrates that there are numerous potential sites of limited variability in the head domain of the H1 HA, represented by the local minima in Fig. 2a, in addition to a range of highly variable sites; the antigenic trajectory of the latter has been tracked in detail by several previous studies 32, 33 .
Vaccination induces cross-reactivity to historical isolates. We next substituted the five proposed conformations of OREO (Fig. 3 and Supplementary Table 1) into H5, H6 and H11 HAs, which have not circulated in the human population, to maintain the conformations of the epitope and allow the immune response to be focused on the epitope via a prime-boost-boost vaccination regimen (Fig. 4a). Mice were vaccinated using a DNA-DNA-pseudotyped lentivirus regimen alternating between different HA scaffolds (Fig. 4a). Analysis of sera obtained from the final bleed at 21 weeks prior to the influenza challenge demonstrated that vaccinating with these epitopes produces antibodies that are cross-reactive to a number of historical strains. Notably, the 2006-like epitope conformation (red) produces cross-reactive antibodies that mirror the neutralisation profile of sera taken in 2006/2007 from young children aged 6-12 years (Fig. 1a and Supplementary Fig 1; Fig. 4b-g). Intriguingly, the 1977-like conformation (green), containing an arginine at position 147, also displays similar cross-reactivity to that of the 2006-like epitope (red), containing a deletion at position 147. Furthermore, the 2009-like (blue) and 1991-like (orange) conformations showed periodic cross-reactivity to historical strains, demonstrating the chronological reoccurrence of epitopes of limited variability (Fig. 4b-g).
Vaccination protects against heterologous challenge. To test whether antibodies directed against these epitopes conferred protective immunity, the 2006-like (red) and 1977-like (green) epitope-vaccinated groups were challenged with a strain collected in 1934 (A/PR/8/1934) (Fig. 5a, c). The 2009-like (blue), 1995-like (orange) and 1940-like (pink) groups were challenged with a 2009 pandemic strain (A/California/04/2009; Fig. 5b, d). In each challenge experiment, an unvaccinated group (n = 6) was included as well as a group vaccinated via the DNA-DNA-pseudotyped lentivirus regimen with the H6, H5 and H11 HAs without the substituted epitope conformations (n = 6). Vaccination with the 2006-like (red) and 1977-like (green) epitope conformations conferred immunity to challenge with the A/PR/8/1934 virus (Fig. 5a, c). As expected, vaccination with the 2009-like epitope also conferred immunity to challenge with the A/California/04/2009 strain, which last circulated in 2009 (Fig. 5b, d). These results demonstrate that epitopes that circulated in the A/Solomon Islands/8/2006 and A/USSR/90/1977 strains, which last circulated in 2006 and 1977, respectively, were able to produce antibodies that confer protection against challenge with the A/PR/8/1934 strain, which last circulated in 1934.

Discussion
Our results demonstrate the existence of a highly immunogenic epitope of limited variability in the head domain of the H1 HA, which has been theorised by mathematical modelling studies to drive the antigenic evolution of influenza 9,10 . Sera from children aged 6-12 years taken in 2006/7 were shown to cross-react with a panel of historical isolates, the majority of which they will not have experienced (Fig. 1a). This cross-reactivity was removed by mutagenesis of an epitope of limited variability identified through a structural bioinformatic analysis (Fig. 1b-f and Fig. 2). We were further able to reproduce the cross-reactivity exhibited in the serology studies in a mouse model, and demonstrated that vaccination with the epitope conformations circulating in 2006 or 1977 induced protective immunity to challenge with a strain that last circulated in 1934 (Fig. 4). Vaccination with other conformations of the epitope produced complementary but asynchronous cross-reactivity to historical strains (Figs. 4 and 5). Furthermore, between 1918 and the present day, the 2006-like 147-deleted conformation of the epitope has occurred five times. In two instances, when circulating strains contained the 147-deleted conformation of the epitope, lineage replacement of the H1N1 strain occurred (in 1957 and 2008). This suggests that possession of a conformation of the epitope in which position 147 is present confers a very significant selective advantage once population immunity has built up against the 147-deleted conformation.
Another site of limited variability identified by our analysis in the head domain of the H1 HA appears to be centred on position 180 (linear numbering; position 166 in H3 numbering and position 163 in WHO numbering). Linderman et al 19 19,20,22 . However, as this site is periodically covered by glycosylation, the OREO epitope is likely to be a better vaccine target.
Currently available influenza vaccines are believed to target epitopes of very high variability on the haemagglutinin and neuraminidase surface glycoproteins. This requires them to be continuously updated, with the only alternative being seen as the artificial boosting of immunity to conserved epitopes of low immunogenicity. By identifying such epitopes, we have established an alternate method of producing improved influenza vaccines: targeting highly immunogenic epitopes of limited variability as opposed to targeting highly immunogenic epitopes of high variability or conserved epitopes of low immunogenicity. Through vaccination against the various conformations of the epitope of limited variability identified in this study, it is possible to induce immunity to all previous and future H1N1 strains.
The OREO site has been under selective pressure over a period of 80 years, providing us with the opportunity to observe and document its variation: the site has historically cycled between four conformations when position 147 is present and one conformation where it has undergone a deletion. Once the deletion has occurred, instead of the epitope varying further, on two occasions the circulating seasonal H1N1 strain died out: once in 1957 and again in 2009. In 2009, the seasonal influenza strain was replaced through zoonotic spillover with a pandemic H1N1 strain displaying a previously seen conformation of OREO, to which immunity had not yet built up in 2009 34,35 . In 1957, the seasonal influenza strain was replaced with H2N2 influenza (Fig. 2d) 36 . These observations suggest that the antibodies generated against these five conformations cover all possible variations within the OREO epitope that are found in evolutionarily fit H1N1 influenza viruses. This limited variation exhibited by the epitope is determined by the composition of the site. Position 147 is next to, and affects the polarity of, the receptor binding site, and therefore there is a limited repertoire of amino acids it can cycle between. Furthermore, the OREO site is composed of a smaller number of residues compared with other sites containing position 147, which intrinsically limits its overall variability (Supplementary Fig 4a, b).
All the conformations of the OREO epitope could be displayed in a single vaccine, either on their own or in concert with other epitopes of limited variability, to create a single universal influenza vaccine. Alternatively, the epitopes could be displayed in individual vaccines and deployed when the circulating conformation changes (currently the OREO epitope changes roughly every 10 years, see Supplementary Fig 3). The critical feature of both approaches is that the OREO epitope does not drift to the same extent as the currently targeted highly variable epitopes. Hence, both of these approaches could provide longer-lasting vaccines in comparison to the trivalent and quadrivalent vaccines. Moreover, the longer cycling period of OREO might help reduce or avoid mistakes in the formulation of the vaccines, which sometimes occur because formulation decisions have to be made at least 6 months prior to the influenza season 2 .
The evolutionary framework on which these studies are based 9,37 applies generally to other subtypes of influenza A such as H3N2 37 and also to influenza B, suggesting that epitopes of limited variability can also be identified in these viruses. Indeed, Zinder et al. 37 have shown that the phylodynamics of H3N2 influenza is easily reproduced using the antigenic thrift framework. Consequently, the same strategy could be used to produce vaccines for other subtypes of human influenza, as well as swine and avian influenza viruses and potentially other viruses.

Methods
Enzyme-linked immunosorbent assay (ELISA). Anti-HA1 antibody responses were measured using ELISAs. In brief, Nunc-Immuno 96-well plates (Thermo Fisher Scientific, USA) were coated with 1.0 μg ml −1 of HA1 protein (Sino Biological Ltd, China) in PBS buffer and left overnight at 4°C. Plates were washed six times with PBS-Tween (PBS/T), then blocked with casein in PBS for 1 hour at room temperature. Serum or plasma was diluted in casein-PBS solution at dilutions ranging from 1:50 to 1:1000 before being added to Nunc-Immuno 96-well plates in triplicate. Plates were incubated at 4°C overnight before being washed six times with PBS/T. Secondary antibody, rabbit anti-human whole IgG conjugated to alkaline phosphatase (Sigma, USA), was added at a dilution of 1:3000 in casein-PBS solution and incubated for 1.5 hours at room temperature. After a final wash, plates were developed by adding 4-nitrophenyl phosphate substrate in diethanolamine buffer (Pierce, Loughborough, UK), and optical density (OD) was read at 405 nm using an ELx800 microplate reader (Cole Parmer, London, UK). A reference standard of pooled cross-reactive serum and naïve serum were included on each plate as positive and negative controls, respectively.
The positive reference standard was used on each plate to produce a standard curve. The standard was made from cross-reactive serum against each HA1 protein. It was added in duplicate at an initial dilution of 1:100 in casein-PBS solution and diluted twofold 10 times, starting with an arbitrary value of antibody units determined using the NIH standard calculator 38 . Three blank wells containing casein-PBS solution only and a further three wells containing naïve human sera or plasma were used as negative controls. The mean of the OD values of the naïve sera was then subtracted from all OD values on each plate before triplicates were fitted to a four-parameter standard curve using the positive reference standard 38 . At least two technical replicates were performed to ensure reproducibility.

The p8.91, pCSFLW and TMPRSS4-expressing constructs were gifts from Dr Nigel Temperton, whilst the HA-expressing plasmids were either gifts from Dr Temperton or produced through the cloning of GeneArt Strings (Thermo Fisher Scientific, USA) into the pI.18 expression vector (also a gift from Dr Temperton). All plasmids are available on request.

Pseudotype microneutralisation assay. Neutralising antibodies were quantified using a pseudotype microneutralisation assay. Serially diluted sera were added to Corning Costar 96-well plates (Promega, USA) before being incubated with 10^6 RLU of pseudotyped influenza virus for 1 hour at 37°C. Each dilution was made in duplicate. A total of 2 μl of sera was used per replicate in Fig. 1a. For comparison of WT and SDM-pseudotyped influenza viruses, 10 μl of sera was used per replicate in Fig. 1b-f. A total of 1 μl of sera per replicate was used in the vaccination experiment shown in Fig. 4 due to the small amounts of blood collected from the mice. HEK 293T/17 cells (2.0 × 10^5 cells ml −1) were subsequently added to each well and incubated for 3 days at 37°C.
The cells were lysed with BrightGlo reagent (Promega), and the relative light units of the cell lysate were determined using a Varioscan luminometer microplate reader (Thermo Fisher Scientific, USA). The reduction of infectivity was determined by comparing the RLU in the presence and absence of antibodies and expressed as percentage neutralisation. The 50% inhibitory dose, IC50, was defined as the sample concentration at which RLU were reduced by 50% compared with virus control wells after subtraction of background RLU in cell-only control wells. At least two technical replicates were performed for each biological sample to ensure replicability.

Supplementary Table 1). Two further control groups were sequentially vaccinated with H6, H5 and H11 constructs without any sequence substituted into the HAs (vaccinated controls). Two further groups were mock vaccinated (unvaccinated controls). b-g Pseudotype microneutralisation assays using 0.5 μl of sera from the bleed at 21 weeks. Error bars are mean ± s.e.m. n = 6 for experimental groups and control groups. The values provided are an average of two replicates.

The C179 antibody used as a control in Fig. 1

HAs using Swiss-Pdb viewer. Areas of 600, 800 and 1000 Å^2 were mapped onto the surface of the crystal structures by determining the distances between the α carbon of a given amino acid and all others within a structure. Residues whose α carbons fell within the specified area were recorded and used to produce disrupted peptide sequences for a given binding site. ABS variability was calculated as the mean pairwise Hamming distance between the consensus sequences collected between 1918 and 2016. The sequences were aligned using MUSCLE before being manually curated using AliView.
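As an illustration of the variability measure described above, the following is a minimal sketch of a mean pairwise Hamming distance over the residues of one candidate binding site. It is not the code used in the study: the function names, the zero-based site indices and the toy sequences are all assumptions made for the example, and the input is assumed to be already aligned so that every sequence has the same length.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def extract_site(aligned_seq: str, positions: list[int]) -> str:
    """Concatenate the residues at the given zero-based alignment columns.

    The result is a discontinuous peptide representing one hypothetical
    antibody binding site (ABS).
    """
    return "".join(aligned_seq[p] for p in positions)


def mean_pairwise_hamming(site_seqs: list[str]) -> float:
    """Mean Hamming distance over all unordered pairs of site sequences."""
    pairs = list(combinations(site_seqs, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy aligned sequences (invented for illustration; '-' marks a deletion).
    toy_alignment = ["MKAGT", "M-AGT", "MRAGT", "M-AKT"]
    site_columns = [1, 2, 3]  # hypothetical columns making up one ABS
    site_seqs = [extract_site(s, site_columns) for s in toy_alignment]
    print(f"ABS variability: {mean_pairwise_hamming(site_seqs):.3f}")
```

Repeating the calculation for every candidate site and plotting the result against site area would give a variability profile of the kind summarised in Fig. 2a, under the assumptions stated here.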
Vaccination of mice. Animal studies were approved by the University of Oxford Ethical Review Committee and were performed in strict accordance with the terms of a licence granted by the UK Home Office under the Animals (Scientific Procedures) Act 1986 (licence number: 30/2889).
Female BALB/c mice (n = 6; bought from Envigo, UK) were sequentially vaccinated with the OREO sequences substituted into H6, H5 and H11 HA backbones in a prime-boost-boost regime at intervals of 3-4 weeks. As a backbone control, two groups were vaccinated with native H6, H5 and H11 constructs. Two further groups were mock vaccinated and used as unvaccinated controls. The prime and first boost were administered as a 100-µg intramuscular injection of DNA into the musculus tibialis, whilst the final vaccination was administered as an intramuscular injection into the musculus tibialis of eight HI units of lentivirus pseudotype displaying the chimeric H11 HA in Alum adjuvant Alhydrogel (Invitrogen, USA) at a 1:1 volume ratio.
Individuals carrying out the mouse vaccination experiments were blinded regarding the vaccines being administered to the mice; vaccines and cages were numbered and administered as outlined above. The individuals carrying out the experiments were only notified of the vaccine identities after completion of the experiment. No exclusion criteria were applied to the mice and no randomisation was applied to the experiments as inbred mice were used.
Haemagglutinin inhibition assay. Pseudotyped lentivirus displaying influenza HA was diluted twofold down a 96-well plate and mixed with 50 µl of 4% chicken red blood cells. After an hour, the coagulation of red blood cells was assessed visually to determine the point at which coagulation could no longer be observed.

a, b The graphs denote daily weight loss of the mice during the challenge. Mice of the same age, which were not vaccinated or challenged, are shown for reference and denoted unchallenged and unvaccinated. c, d Survival curves denoting the number of mice in each group. Mice were euthanised at 20% weight loss. Area under the curve was calculated for the mouse weight loss data and analysed in a single-factor ANOVA. Between-group comparisons were then performed using Tukey's post hoc method for pairwise comparison correction to provide corrected p-values. ****p-value < 0.0001 and **p-value < 0.010. Error bars are mean ± s.e.m. n = 6 for experimental groups and control groups.

Statistical analysis. Student's t tests were performed to determine all p-values shown in Fig. 1. Area under the curve was calculated for the mouse weight loss data (Fig. 5a and c, main text) and analysed in a single-factor ANOVA. Between-group comparisons were then performed using Tukey's post hoc method for pairwise comparison correction to provide the corrected p-values. Fisher's exact test was used to determine survival differences in the experimental groups after 7 days (Fig. 5b and d). All p-values were adjusted for multiple comparisons using the Bonferroni-Holm correction.
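For readers unfamiliar with the Bonferroni-Holm step-down procedure named above, a minimal sketch is given below. It is a generic textbook implementation rather than the authors' analysis script; the function name and the example p-values are invented for illustration.

```python
def holm_bonferroni(p_values: list[float]) -> list[float]:
    """Return Holm-adjusted p-values in the same order as the input.

    The smallest raw p-value is multiplied by m, the next by m - 1, and so
    on; a running maximum keeps the adjusted values monotone, and each
    value is capped at 1.
    """
    m = len(p_values)
    # Indices of the raw p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03]  # made-up p-values, not taken from the study
    print(holm_bonferroni(raw))
```

A comparison is then called significant when its adjusted p-value falls below the chosen threshold, which is equivalent to the usual sequential rejection rule.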
Phylogenetic analysis. RAxML version 8.2.11 was used to build a maximum likelihood tree based on the strain HA amino acid sequences, using a gamma distributed site heterogeneity model and the amino acid FLU substitution model. Tip-to-root distance was regressed against sequence dates, using a best fitting root, in TempEst v1.5.1. This yielded an R-squared of 0.886 and 0.834 for the ≤ 2008 and ≥ 2009 data, respectively, indicating a good fit between the genetic distance and the time of sampling. The colour of branches was determined by the identity of the amino acids at positions 147 and 158, which are the variable amino acids at the centre of the antibody binding site. Blue OREO was defined as a lysine at position 147 and no lysine at position 158; orange OREO as a lysine at position 147 and a lysine at position 158; green OREO as an arginine at position 147; red OREO as a deletion at position 147; and pink OREO as an isoleucine at position 147.
Accession numbers can be found in the Supplementary Tables 1 and 2.
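The colour rules given in the phylogenetic analysis can be written as a small lookup, sketched below. This is an illustrative reading of the stated definitions, not the authors' code: the function name, the use of one-letter amino acid codes, the '-' symbol for a deletion and the 'unclassified' fallback are assumptions.

```python
def classify_oreo(res147: str, res158: str) -> str:
    """Map the residues at positions 147 and 158 to an OREO colour label.

    Residues are one-letter amino acid codes; '-' denotes a deletion
    (an alignment gap) at that position.
    """
    if res147 == "-":
        return "red"      # deletion at position 147
    if res147 == "R":
        return "green"    # arginine at position 147
    if res147 == "I":
        return "pink"     # isoleucine at position 147
    if res147 == "K":
        # Lysine at 147: orange if 158 is also lysine, otherwise blue.
        return "orange" if res158 == "K" else "blue"
    return "unclassified"  # fallback for residues not covered by the rules


if __name__ == "__main__":
    for pair in [("-", "G"), ("R", "G"), ("K", "K"), ("K", "G"), ("I", "G")]:
        print(pair, classify_oreo(*pair))
```

Applying such a function to each tip of the tree would reproduce the branch colouring scheme, provided the alignment columns for positions 147 and 158 are identified correctly.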
Code availability. The code will be made available from the corresponding authors to anybody on request.
Data availability