Why prokaryotes have pangenomes


The existence of large amounts of within-species genome content variability is puzzling. Population genetics tells us that the fitness effect of a new variant (deleterious, neutral or advantageous), combined with the long-term effective population size of the species, determines the likelihood that the variant is removed, spreads to fixation or remains polymorphic. Consequently, we expect selection and drift to reduce genetic variation, which is what makes the large amount of gene content variation in some species so hard to explain. Here, we amalgamate population genetic theory with models of horizontal gene transfer and assert that pangenomes arise most easily in organisms that have large long-term effective population sizes and the ability to migrate to new niches, as a consequence of acquiring advantageous genes. Therefore, we suggest that pangenomes are the result of adaptive, not neutral, evolution.
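The dependence of a variant's fate on its fitness effect and on effective population size can be illustrated with a short sketch. It uses Kimura's standard diffusion approximation for the fixation probability of a single new mutation, under assumptions that are not taken from the article: a haploid Wright-Fisher population whose census size equals its effective size. The function name and the example values are hypothetical and chosen only for illustration.

```python
import math


def fixation_probability(s: float, n_e: float) -> float:
    """Approximate fixation probability of one new mutant copy.

    Assumes a haploid Wright-Fisher population whose census size equals
    the effective size n_e, so the initial frequency is 1 / n_e.
    Formula (Kimura's diffusion approximation):
        P = (1 - exp(-2 s)) / (1 - exp(-2 n_e s))
    """
    if n_e <= 0:
        raise ValueError("n_e must be positive")
    if s == 0.0:
        # Neutral limit: fixation by drift alone.
        return 1.0 / n_e
    exponent = -2.0 * n_e * s
    if exponent > 700.0:
        # Strongly deleterious relative to drift: exp() would overflow,
        # and the probability is effectively zero.
        return 0.0
    # expm1 keeps precision when s or n_e * s is very small.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


if __name__ == "__main__":
    # Illustrative values only: the same selection coefficients in a
    # small and a large population.
    for n_e in (1e2, 1e8):
        for s in (-1e-3, 0.0, 1e-3):
            p = fixation_probability(s, n_e)
            print(f"n_e={n_e:.0e} s={s:+.0e} P_fix={p:.3e}")
```

Under these assumptions, a slightly deleterious variant behaves almost neutrally in a small population but is removed efficiently in a large one, while a slightly advantageous variant fixes with probability close to 2s once the population is large. This is the sense in which large long-term effective population sizes make selection on individual gene acquisitions effective.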


Figure 1: Schematic representation of pangenomes as Venn diagrams.
Figure 2: Analysis of accessory gene functions in 228 E. coli ST131 genomes.




We wish to thank J. Mallet for commenting on a draft of this manuscript. We would also like to thank the anonymous reviewers. J.O.M. is funded by BBSRC grant no. BB/N018044/1 and the John Templeton Foundation.

Author information

J.O.M., A.M. and M.J.O. collectively conceived and wrote this manuscript.

Correspondence to James O. McInerney.

Ethics declarations

Competing interests

The authors declare no competing financial interests.
