Correspondence | Published:

Reconciling disparate estimates of viral genetic diversity during human influenza infections

Nature Genetics (2019)


Data availability

We downloaded sequencing data generated by the Hong Kong study (ref. 4) from Synapse (syn8033988), following the methods of a study that reanalyzed data from the Hong Kong study to estimate transmission-bottleneck sizes by using a new analytical method (ref. 10). We obtained sequencing data for the Wisconsin study (ref. 3) by contacting the corresponding authors of that study. We downloaded sequencing data for the other studies from SRA BioProjects PRJNA344659 (ref. 5) and PRJNA412631 (ref. 6). More details are provided in the Nature Research Reporting Summary.


References

  1. Xue, K. S., Moncla, L. H., Bedford, T. & Bloom, J. D. Trends Microbiol. 26, 781–793 (2018).

  2. McCrone, J. T. & Lauring, A. S. Curr. Opin. Virol. 28, 20–25 (2018).

  3. Dinis, J. M. et al. J. Virol. 90, 3355–3365 (2016).

  4. Poon, L. L. M. et al. Nat. Genet. 48, 195–200 (2016).

  5. Debbink, K. et al. PLoS Pathog. 13, e1006194 (2017).

  6. McCrone, J. T. et al. eLife 7, e35962 (2018).

  7. Xue, K. S., Greninger, A. L., Pérez-Osorio, A. & Bloom, J. D. mSphere 3, e00552–17 (2018).

  8. McCrone, J. T. & Lauring, A. S. J. Virol. 90, 6884–6895 (2016).

  9. Illingworth, C. J. R. et al. Virus Evol. 3, vex030 (2017).

  10. Sobel Leonard, A., Weissman, D. B., Greenbaum, B., Ghedin, E. & Koelle, K. J. Virol. 91, e00171–17 (2017).



Acknowledgements

We thank P. Green for helpful comments on the manuscript. K.S.X. is supported by the Hertz Foundation Myhrvold Family Fellowship. The work of J.D.B. was supported by grant R01AI127893 from the NIAID of the NIH. J.D.B. is supported as an Investigator of the Howard Hughes Medical Institute.

Author information


  1. Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

    • Katherine S. Xue & Jesse D. Bloom

  2. Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

    • Katherine S. Xue & Jesse D. Bloom

  3. Department of Genome Sciences, University of Washington, Seattle, WA, USA

    • Katherine S. Xue & Jesse D. Bloom

  4. Howard Hughes Medical Institute, Seattle, WA, USA

    • Jesse D. Bloom




Contributions

K.S.X. and J.D.B. conceptualized the study and wrote the report. K.S.X. performed the analyses, some of which were independently reimplemented by J.D.B.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Jesse D. Bloom.

Integrated supplementary information

  1. Supplementary Figure 1 High within-host genetic diversity of human influenza virus in our reanalysis of sequencing data from the Hong Kong study.

    This figure mimics the format of the second figure of Poon et al. (2016) and shows that our reanalysis recapitulates the main reported result of high-frequency genetic diversity shared between epidemiologically unrelated individuals. (A) Viral genetic diversity at hemagglutinin codon 335 in H3N2 human influenza infections. At this codon, both variants encode the same amino acid. This plot shows within-host variants that were present at a frequency of at least 1% in both sequencing replicates at sites with minimum sequencing coverage of 200 reads. Shaded regions indicate individuals from the same household. (B) Viral genetic diversity in the HA1 domain of hemagglutinin in H3N2 human influenza infections in our reanalysis. Each panel represents a separate site in the genome and is labeled by the codon it represents. Sites shown harbored within-host variation at a frequency of at least 3% in both sequencing replicates for at least two samples in the study.

  2. Supplementary Figure 2 Comparison of within-host genetic diversity across studies and within different sequencing reads in a study.

    (A) Number of within-host variants identified in each sample in each study, normalized to the length of the genome sequenced in each study. For each sample, we identified within-host variants that were present at a frequency of at least 3% at sites with minimum sequencing coverage of 200 reads. The center line of each box plot displays the median value; the box limits display upper and lower quartiles; and the whiskers extend up to 1.5 times the interquartile range. The number of samples in each study is listed in Supplementary Table 1. (B) Number of within-host variants identified in the 46 H3N2 samples when analyzing both members of each sequenced read pair, just read 1, or just read 2. Variants were called and data plotted as in (A).
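The thresholds described in the two captions above (a minimum variant frequency, a minimum coverage of 200 reads, agreement between sequencing replicates, and normalization to genome length) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the input layout (per-site base counts in a dictionary), the function names `call_variants`, `shared_variants` and `variants_per_kb`, the use of the most common base as the consensus, and the per-kilobase scaling are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input layout: site position -> {base: read count}
SiteCounts = Dict[int, Dict[str, int]]


def call_variants(
    site_counts: SiteCounts,
    min_freq: float = 0.03,
    min_coverage: int = 200,
) -> List[Tuple[int, str, float]]:
    """Return (site, base, frequency) for each non-consensus base that
    reaches min_freq at a site covered by at least min_coverage reads."""
    variants = []
    for site in sorted(site_counts):
        counts = site_counts[site]
        coverage = sum(counts.values())
        if coverage < min_coverage:
            continue  # too few reads to estimate frequencies at this site
        # Assumption: the most common base is treated as the consensus.
        consensus = max(counts, key=counts.get)
        for base, n in counts.items():
            freq = n / coverage
            if base != consensus and freq >= min_freq:
                variants.append((site, base, freq))
    return variants


def shared_variants(
    replicate_1: SiteCounts,
    replicate_2: SiteCounts,
    min_freq: float = 0.03,
    min_coverage: int = 200,
) -> Set[Tuple[int, str]]:
    """Keep only (site, base) pairs that pass the thresholds in both replicates."""
    calls_1 = {(s, b) for s, b, _ in call_variants(replicate_1, min_freq, min_coverage)}
    calls_2 = {(s, b) for s, b, _ in call_variants(replicate_2, min_freq, min_coverage)}
    return calls_1 & calls_2


def variants_per_kb(
    site_counts: SiteCounts,
    genome_length: int,
    min_freq: float = 0.03,
    min_coverage: int = 200,
) -> float:
    """Variant count normalized to sequenced genome length.
    Scaling to variants per 1,000 nucleotides is an illustrative choice."""
    n_variants = len(call_variants(site_counts, min_freq, min_coverage))
    return 1000 * n_variants / genome_length


if __name__ == "__main__":
    # Toy counts, not real data.
    sample = {
        1: {"A": 190, "G": 10},
        2: {"A": 100, "C": 5},
        3: {"C": 960, "T": 40},
    }
    print(call_variants(sample))
    print(variants_per_kb(sample, genome_length=2000))
```

Passing `min_freq=0.01` to `shared_variants` would correspond to the 1% threshold used for panel (A) of Supplementary Figure 1, whereas the default of 0.03 matches the 3% threshold used elsewhere in the captions.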

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1 and 2, Supplementary Methods and Supplementary Table 1

  2. Reporting Summary
