Credit: Image is reproduced, with permission, from Nelson, M. I. & Holmes, E. C. © (2007) Macmillan Publishers Ltd. All rights reserved.

A study of how the sequence of the human influenza A virus has evolved over the past 40 years could help efforts to construct effective flu vaccines and drugs by improving predictions of which new mutations will arise and spread.

The sequences encoding the surface proteins haemagglutinin (HA) and neuraminidase (NA), which determine whether a virus can enter and exit human cells, are the most rapidly evolving portions of the viral genome. These genes evolve through a complex series of mutations whose potentially meaningful interactions (epistasis) had not, until now, been studied systematically.

To detect such patterns, Plotkin and colleagues developed a new statistic that captures the temporal relationship between the emergence of two consecutive substitutions: the closer in time the second mutation appears relative to the first, in multiple lineages, the larger the statistic. A key strength of the method lies in explicitly considering the phylogenetic relationship between the successive mutations and allowing the two mutations to occur on separate phylogenetic branches.
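
The intuition behind such a statistic can be illustrated with a toy calculation. The sketch below is not the published statistic: it ignores the phylogeny entirely, and the function name, the exponential weighting and the `decay_years` parameter are all assumptions made for illustration. It only shows the stated behaviour, that shorter lags between the first and second substitution, seen in more lineages, give a larger score.

```python
import math


def toy_epistasis_score(lags_in_years, decay_years=2.0):
    """Toy score for one ordered pair of sites (hypothetical, not the authors' method).

    lags_in_years: one entry per lineage in which the second substitution
    followed the first, giving the waiting time between the two in years.
    decay_years: assumed time scale controlling how quickly a long lag
    loses weight.
    """
    if decay_years <= 0:
        raise ValueError("decay_years must be positive")
    score = 0.0
    for lag in lags_in_years:
        if lag < 0:
            raise ValueError("the second substitution must follow the first")
        # Each lineage contributes between 0 and 1; short lags contribute more.
        score += math.exp(-lag / decay_years)
    return score


# Three lineages with quick follow-up substitutions...
quick = toy_epistasis_score([0.5, 1.0, 0.8])
# ...versus three lineages with slow follow-up substitutions...
slow = toy_epistasis_score([6.0, 9.0, 7.5])
# ...versus a single lineage with a quick follow-up.
single = toy_epistasis_score([0.5])
```

In this toy version, `quick` is larger than both `slow` and `single`, mirroring the two properties described above: proximity in time and recurrence across lineages.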

In the two viral subtypes studied, the number of epistatically interacting pairs of mutations far exceeds the number expected under a non-epistatic null model (the greatest number of interactions, 333, was seen in type-3 HA). Further analysis of the location of the sites and their evolutionary and functional relationship reveals that positive epistasis plays a significant part in the evolution of HA and NA.

As well as generating a large catalogue of interacting sites, the authors validated their findings against known functional interactions. For example, they identified a pair of interacting sites in type-1 NA that had already been shown experimentally to confer resistance to the drug oseltamivir, and they predicted another pair that was then validated in vitro. The predictive power of the model marries well with the timeframe for vaccine production, as the second mutation in an epistatic pair takes more than 2 years, on average, to emerge after the first.

The statistic developed in this study detects sites with more subtle signals of positive selection than are detected in typical surveys, and so offers a practical system for studying not only public health but molecular evolution in general.