Reconciling disparate estimates of viral genetic diversity during human influenza infections


Fig. 1: Comparison of shared within-host viral genetic diversity in four large-scale deep-sequencing studies of human influenza virus.
Fig. 2: Paired-end-sequencing reads are frequently split between samples that were run on the same sequencing lane.

Data availability

We downloaded sequencing data generated by the Hong Kong study (ref. 4) from Synapse (syn8033988), following the methods of a study that reanalyzed data from the Hong Kong study to estimate transmission-bottleneck sizes by using a new analytical method (ref. 10). We obtained sequencing data for the Wisconsin study (ref. 3) by contacting the corresponding authors of that study. We downloaded sequencing data for the other studies from SRA BioProjects PRJNA344659 (ref. 5) and PRJNA412631 (ref. 6). More details are provided in the Nature Research Reporting Summary.


References

1. Xue, K. S., Moncla, L. H., Bedford, T. & Bloom, J. D. Trends Microbiol. 26, 781–793 (2018).
2. McCrone, J. T. & Lauring, A. S. Curr. Opin. Virol. 28, 20–25 (2018).
3. Dinis, J. M. et al. J. Virol. 90, 3355–3365 (2016).
4. Poon, L. L. M. et al. Nat. Genet. 48, 195–200 (2016).
5. Debbink, K. et al. PLoS Pathog. 13, e1006194 (2017).
6. McCrone, J. T. et al. eLife 7, e35962 (2018).
7. Xue, K. S., Greninger, A. L., Pérez-Osorio, A. & Bloom, J. D. mSphere 3, e00552–17 (2018).
8. McCrone, J. T. & Lauring, A. S. J. Virol. 90, 6884–6895 (2016).
9. Illingworth, C. J. R. et al. Virus Evol. 3, vex030 (2017).
10. Sobel Leonard, A., Weissman, D. B., Greenbaum, B., Ghedin, E. & Koelle, K. J. Virol. 91, e00171–17 (2017).



Acknowledgements

We thank P. Green for helpful comments on the manuscript. K.S.X. is supported by the Hertz Foundation Myhrvold Family Fellowship. The work of J.D.B. was supported by grant R01AI127893 from the NIAID of the NIH. J.D.B. is supported as an Investigator of the Howard Hughes Medical Institute.

Author information




Contributions

K.S.X. and J.D.B. conceptualized the study and wrote the report. K.S.X. performed the analyses, some of which were independently reimplemented by J.D.B.

Corresponding author

Correspondence to Jesse D. Bloom.

Ethics declarations

Competing interests

The authors declare no competing interests.

Integrated supplementary information

Supplementary Figure 1 High within-host genetic diversity of human influenza virus in our reanalysis of sequencing data from the Hong Kong study.

This figure mimics the format of the second figure of Poon et al. (2016) and shows that our reanalysis recapitulates the main reported result of high-frequency genetic diversity shared between epidemiologically unrelated individuals. (A) Viral genetic diversity at hemagglutinin codon 335 in H3N2 human influenza infections. At this codon, both variants encode the same amino acid. This plot shows within-host variants that were present at a frequency of at least 1% in both sequencing replicates at sites with minimum sequencing coverage of 200 reads. Shaded regions indicate individuals from the same household. (B) Viral genetic diversity in the HA1 domain of hemagglutinin in H3N2 human influenza infections in our reanalysis. Each panel represents a separate site in the genome and is labeled by the codon it represents. Sites shown harbored within-host variation at a frequency of at least 3% in both sequencing replicates for at least two samples in the study.
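The filtering criteria in this legend (a minimum variant frequency in both sequencing replicates, at sites with at least 200 reads of coverage) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the input layout (per-site base counts in a dictionary), the function names `call_variants` and `shared_between_replicates`, and the choice to treat the most common base at a site as the consensus are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: base counts per genome position for one
# sequencing replicate, e.g. {101: {"A": 950, "G": 50}}.
SiteCounts = Dict[int, Dict[str, int]]


def call_variants(
    counts: SiteCounts, min_freq: float, min_cov: int = 200
) -> Dict[Tuple[int, str], float]:
    """Return {(position, base): frequency} for non-consensus bases
    that pass the coverage and frequency thresholds."""
    variants = {}
    for pos, bases in counts.items():
        coverage = sum(bases.values())
        if coverage < min_cov:
            continue  # site is too shallow to call variants reliably
        # Assumption: the most common base is the within-host consensus.
        consensus = max(bases, key=bases.get)
        for base, n in bases.items():
            freq = n / coverage
            if base != consensus and freq >= min_freq:
                variants[(pos, base)] = freq
    return variants


def shared_between_replicates(
    rep1: SiteCounts, rep2: SiteCounts, min_freq: float, min_cov: int = 200
) -> List[Tuple[int, str]]:
    """Keep only variants that pass the thresholds in both replicates."""
    v1 = call_variants(rep1, min_freq, min_cov)
    v2 = call_variants(rep2, min_freq, min_cov)
    return sorted(set(v1) & set(v2))


# Toy data, invented for this example.
rep1 = {101: {"A": 950, "G": 50}, 202: {"C": 150, "T": 10}, 303: {"G": 995, "A": 5}}
rep2 = {101: {"A": 960, "G": 40}, 202: {"C": 900, "T": 100}, 303: {"G": 1000, "A": 0}}
shared = shared_between_replicates(rep1, rep2, min_freq=0.03)
```

In the toy data, position 202 is dropped because the first replicate has only 160 reads there, and position 303 is dropped because its minor base falls below the frequency threshold, so only the variant at position 101 survives.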

Supplementary Figure 2 Comparison of within-host genetic diversity across studies and within different sequencing reads in a study.

(A) Number of within-host variants identified in each sample in each study, normalized to the length of the genome sequenced in each study. For each sample, we identified within-host variants that were present at a frequency of at least 3% at sites with minimum sequencing coverage of 200 reads. The center line of each box plot displays the median value; the box limits display upper and lower quartiles; and the whiskers extend up to 1.5 times the interquartile range. The number of samples in each study is listed in Supplementary Table 1. (B) Number of within-host variants identified in the 46 H3N2 samples when analyzing both members of each sequenced read pair, just read 1, or just read 2. Variants were called and data plotted as in (A).
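The normalization and box-plot conventions described in this legend can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, normalization is assumed to be a simple division by the sequenced genome length, and the quartile method (`inclusive`) is one common convention that may differ from the one used to draw the figure.

```python
import statistics
from typing import Dict, List


def variants_per_site(n_variants: int, genome_length_nt: int) -> float:
    """Normalize a sample's variant count by the sequenced genome length
    (assumed here to be a plain per-nucleotide rate)."""
    return n_variants / genome_length_nt


def boxplot_summary(values: List[float]) -> Dict[str, float]:
    """Median, quartiles, and whiskers that extend to the most extreme
    data points within 1.5 times the interquartile range of the box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(v for v in values if v >= low_fence),
        "whisker_high": max(v for v in values if v <= high_fence),
    }


# Toy per-sample variant counts, invented for this example.
summary = boxplot_summary([1, 2, 3, 4, 5, 20])
```

With the toy counts, the value 20 lies beyond the upper fence, so the upper whisker stops at 5 and 20 would be drawn as an outlier.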

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1 and 2, Supplementary Methods and Supplementary Table 1

Reporting Summary


About this article


Cite this article

Xue, K.S., Bloom, J.D. Reconciling disparate estimates of viral genetic diversity during human influenza infections. Nat Genet 51, 1298–1301 (2019).
