Adaptation by natural selection depends on the rates, effects and interactions of many mutations, making it difficult to determine what proportion of mutations in an evolving lineage are beneficial. Here we analysed 264 complete genomes from 12 Escherichia coli populations to characterize their dynamics over 50,000 generations. The populations that retained the ancestral mutation rate support a model in which most fixed mutations are beneficial, the fraction of beneficial mutations declines as fitness rises, and neutral mutations accumulate at a constant rate. We also compared these populations to mutation-accumulation lines evolved under a bottlenecking regime that minimizes selection. Nonsynonymous mutations, intergenic mutations, insertions and deletions are overrepresented in the long-term populations, further supporting the inference that most mutations that reached high frequency were favoured by selection. These results illuminate the shifting balance of forces that govern genome evolution in populations adapting to a new environment.
References
- Genetic signatures of strong recent positive selection at the lactase gene. Am. J. Hum. Genet. 74, 1111–1120 (2004)
- Comparative population genomics of maize domestication and improvement. Nature Genet. 44, 808–811 (2012)
- Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication. Nature 464, 898–902 (2010)
- Parallel bacterial evolution within multiple patients identifies candidate pathogenicity genes. Nature Genet. 43, 1275–1280 (2011)
- PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007)
- The origins of genome complexity. Science 302, 1401–1404 (2003)
- Did genetic drift drive increases in genome complexity? PLoS Genet. 6, e1001080 (2010)
- Different trajectories of parallel evolution during viral adaptation. Science 285, 422–424 (1999)
- Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461, 1243–1247 (2009)
- The molecular diversity of adaptive convergence. Science 335, 457–461 (2012)
- Pervasive genetic hitchhiking and clonal interference in forty evolving yeast populations. Nature 500, 571–574 (2013)
- Whole genome, whole population sequencing reveals that loss of signaling networks is the major adaptive strategy in a constant environment. PLoS Genet. 9, e1003972 (2013)
- Genome-wide analysis of a long-term evolution experiment with Drosophila. Nature 467, 587–590 (2010)
- Quantitative evolutionary dynamics using high-resolution lineage tracking. Nature 519, 181–186 (2015)
- Negative epistasis between beneficial mutations in an evolving bacterial population. Science 332, 1193–1196 (2011)
- Long-term experimental evolution in Escherichia coli. I. Adaptation and divergence during 2000 generations. Am. Nat. 138, 1315–1341 (1991)
- Long-term dynamics of adaptation in asexual populations. Science 342, 1364–1367 (2013)
- Evolution of high mutation rates in experimental populations of E. coli. Nature 387, 703–705 (1997)
- Genomic analysis of a key innovation in an experimental Escherichia coli population. Nature 489, 513–518 (2012)
- Mutation rate dynamics in a bacterial population reflect tension between adaptation and genetic load. Proc. Natl Acad. Sci. USA 110, 222–227 (2013)
- Tests of parallel molecular evolution in a long-term experiment with Escherichia coli. Proc. Natl Acad. Sci. USA 103, 9107–9112 (2006)
- Long-term experimental evolution in Escherichia coli. VIII. Dynamics of a balanced polymorphism. Am. Nat. 155, 24–35 (2000)
- Epistasis and allele specificity in the emergence of a stable polymorphism in Escherichia coli. Science 343, 1366–1369 (2014)
- Identification of mutations in laboratory-evolved microbes from next-generation sequencing data using breseq. Methods Mol. Biol. 1151, 165–188 (2014)
- Identifying structural variation in haploid microbial genomes from short-read resequencing data using breseq. BMC Genomics 15, 1039 (2014)
- Transposable elements as mutator genes in evolution. Nature 303, 633–635 (1983)
- Second-order selection in bacterial evolution: selection acting on mutation and recombination rates in the course of adaptation. Res. Microbiol. 152, 11–16 (2001)
- The fate of competing beneficial mutations in an asexual population. Genetica 102–103, 127–144 (1998)
- Genome dynamics during experimental evolution. Nature Rev. Genet. 14, 827–839 (2013)
- Deleterious passengers in adapting populations. Genetics 198, 1183–1208 (2014)
- Adaptation, clonal interference, and frequency-dependent interactions in a long-term evolution experiment with Escherichia coli. Genetics 200, 619–631 (2015)
- Genetic drift in an infinite population. The pseudohitchhiking model. Genetics 155, 909–919 (2000)
- Genetic draft and quasi-neutrality in large facultatively sexual populations. Genetics 188, 975–996 (2011)
- The dynamics of genetic draft in rapidly adapting populations. Genetics 195, 1007–1025 (2013)
- The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, 1983)
- Forces that influence the evolution of codon bias. Phil. Trans. R. Soc. Lond. B 365, 1203–1212 (2010)
- Synonymous but not the same: the causes and consequences of codon bias. Nature Rev. Genet. 12, 32–42 (2011)
- Evolutionary developmental biology and the problem of variation. Evolution 54, 1079–1091 (2000)
- Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134, 25–36 (2008)
- Transfer of noncoding DNA drives regulatory rewiring in bacteria. Proc. Natl Acad. Sci. USA 111, 16112–16117 (2014)
- Estimate of the genomic mutation rate deleterious to overall fitness in E. coli. Nature 381, 694–696 (1996)
- Mechanisms causing rapid and parallel losses of ribose catabolism in evolving populations of Escherichia coli B. J. Bacteriol. 183, 2834–2841 (2001)
- The genetic basis of Escherichia coli pathoadaptation to macrophages. PLOS Pathog. 9, e1003802 (2013)
- Adaptation to parasites and costs of parasite resistance in mutator and nonmutator bacteria. Mol. Biol. Evol. 33, 770–782 (2016)
- The rate of adaptive evolution in enteric bacteria. Mol. Biol. Evol. 23, 1348–1356 (2006)
- Bayesian analysis suggests that most amino acid replacements in Drosophila are driven by positive selection. J. Mol. Evol. 57 (suppl. 1), S154–S164 (2003)
- Natural selection on protein-coding genes in the human genome. Nature 437, 1153–1157 (2005)
- Recombination speeds adaptation by reducing competition between beneficial mutations in populations of Escherichia coli. PLoS Biol. 5, e225 (2007)
- Constraints on adaptation of Escherichia coli to mixed-resource environments increase over time. Evolution 69, 2067–2078 (2015)
- Antagonistic coevolution accelerates molecular evolution. Nature 464, 275–278 (2010)
- Tracing ancestors and relatives of Escherichia coli B, and the derivation of B strains REL606 and BL21(DE3). J. Mol. Biol. 394, 634–643 (2009)
- Genome sequences of Escherichia coli B strains REL606 and BL21(DE3). J. Mol. Biol. 394, 644–652 (2009)
- Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28, 491–511 (1943)
- Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc. Natl Acad. Sci. USA 105, 7899–7906 (2008)
- Mutation rate inferred from synonymous substitutions in a long-term evolution experiment with Escherichia coli. G3 (Bethesda) 1, 183–186 (2011)
- Large chromosomal rearrangements during a long-term evolution experiment with Escherichia coli. mBio 5, e01377–14 (2014)
- Synonymous genetic variation in natural isolates of Escherichia coli does not predict where synonymous substitutions occur in a long-term experiment. Mol. Biol. Evol. 32, 2897–2904 (2015)
- Versatile and open software for comparing large genomes. Genome Biol. 5, R12 (2004)
- Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. J. Comput. Biol. 9, 687–705 (2002)
- APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20, 289–290 (2004)
Extended data figures and tables
Extended Data Figures
- Extended Data Figure 1: Changes in genome size during the LTEE. (43 KB)
Box-and-whiskers plot showing the distribution of average genome length (Mb) for each of the 12 LTEE populations based on the two clones sequenced at each time point shown from 500 to 50,000 generations. The red line shows the length of the ancestral genome. The boxes are the interquartile range (IQR), which spans the second and third quartiles of the data (25th to 75th percentiles); the thick black lines are medians; the whiskers extend to the outermost values that are within 1.5 times the IQR; and the points show all outlier values beyond the whiskers.
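The whisker rule in this caption can be illustrated with a short Python sketch. The genome lengths below are hypothetical values chosen for illustration, not data from the study, and the quartile interpolation method (`method="inclusive"`) is an assumption, since the caption does not say how quartiles were computed.

```python
import statistics

def box_stats(values):
    """Quartiles, whisker ends and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    # Quartile interpolation is an assumption; plotting packages differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the outermost observations that lie within the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Hypothetical average genome lengths (Mb) for eight populations at one time point.
example_lengths = [4.63, 4.62, 4.61, 4.60, 4.59, 4.58, 4.57, 4.40]
example_stats = box_stats(example_lengths)
```

With these made-up values, the single population with a much smaller genome falls below the lower fence and would be drawn as an outlier point, while the whiskers stop at the most extreme remaining values.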
- Extended Data Figure 2: Accumulation of synonymous mutations in populations that evolved point-mutation hypermutability. (91 KB)
Each symbol shows a sequenced genome from a hypermutable lineage. Colours are the same as those in Fig. 1. The accumulation of synonymous substitutions serves as a proxy for the underlying point-mutation rate. All four of the populations that became hypermutable before 10,000 generations accumulated synonymous mutations at higher rates between 10,000 and 20,000 generations than between 40,000 and 50,000 generations, indicating the evolution of reduced mutability.
- Extended Data Figure 3: Alternative models fit to trajectory of genome evolution for each LTEE population. (128 KB)
a, Ara−1. b, Ara+1. c, Ara−2. d, Ara+2. e, Ara−3. f, Ara+3. g, Ara−4. h, Ara+4. i, Ara−5. j, Ara+5. k, Ara−6. l, Ara+6. Each symbol shows the total mutations in a sequenced genome; in many cases, the symbols for the two genomes from the same population and generation are not distinguishable because they have the same, or almost the same, number of mutations. For the populations that evolved hypermutability, data are shown only for time points before mutators arose. In each panel, the dashed grey line shows the best fit to the linear model; the solid grey curve shows the best fit to the square-root model; and the solid black curve shows the best fit to the composite model with both linear and square-root terms.
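A minimal sketch of the composite model named in this caption, in which the number of mutations m after t generations is m = a·t + b·√t. The fitting method shown here (ordinary least squares with no intercept) is an assumption made for illustration; the caption does not state how the fits were obtained, and the data points are synthetic.

```python
import math

def fit_composite(times, counts):
    """Least-squares fit of counts ~ a*t + b*sqrt(t), with no intercept.

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    x1 = [float(t) for t in times]          # linear term
    x2 = [math.sqrt(t) for t in times]      # square-root term
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    y1 = sum(u * y for u, y in zip(x1, counts))
    y2 = sum(v * y for v, y in zip(x2, counts))
    det = s11 * s22 - s12 * s12
    a = (y1 * s22 - y2 * s12) / det
    b = (y2 * s11 - y1 * s12) / det
    return a, b

# Synthetic trajectory generated from hypothetical coefficients (not study estimates).
true_a, true_b = 0.0005, 0.2
generations = [500, 1000, 2000, 5000, 10000, 20000, 50000]
mutations = [true_a * t + true_b * math.sqrt(t) for t in generations]
fit_a, fit_b = fit_composite(generations, mutations)
```

Setting b = 0 or a = 0 in the same framework gives the one-parameter linear and square-root models that the caption compares against the composite fit.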
- Extended Data Figure 4: Uncertainty in parameter estimation for the model describing the rates of accumulation for neutral and beneficial mutations. (171 KB)
Contours show relative likelihoods for simultaneously estimating the linear and square-root coefficients from the observed numbers of mutations that accumulated over time in non-mutator and premutator lineages (Fig. 3). The black central point shows the maximum likelihood estimates, and the three black contours show solutions 2, 6 and 10 log units away. The points on the horizontal and vertical axes show values for the best one-parameter models.
- Extended Data Figure 5: Accumulation of synonymous substitutions in non-mutator lineages. (58 KB)
Each filled symbol shows the mean number of synonymous mutations in the (usually two) non-mutator genomes from an LTEE population that were sequenced at that time point; non-integer values can occur if the two genomes have different numbers. Small horizontal offsets were added so that overlapping points are visible. Colours are the same as in Fig. 1. Open triangles show the grand means of the replicate populations. The grey line extends from the intercept to the final grand mean. The slope of that line was used to scale the relative rates of synonymous, nonsynonymous and intergenic point mutations in Fig. 4.
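The averaging described in this caption can be sketched as follows. The counts are hypothetical, and the sketch assumes the grey line starts at zero mutations at generation 0, which the caption does not state explicitly.

```python
def grand_mean_slope(counts_by_population, final_generation):
    """Per-population means, their grand mean, and the slope of a line
    drawn from the origin (assumed intercept) to the final grand mean."""
    pop_means = [sum(c) / len(c) for c in counts_by_population.values()]
    grand_mean = sum(pop_means) / len(pop_means)
    slope = grand_mean / final_generation
    return pop_means, grand_mean, slope

# Hypothetical synonymous-mutation counts for two clones per population.
example_counts = {"Ara-5": [3, 4], "Ara+5": [2, 2], "Ara-6": [4, 4]}
pop_means, grand_mean, slope = grand_mean_slope(example_counts, 50_000)
```

The first population illustrates the non-integer case mentioned in the caption: clones with 3 and 4 synonymous mutations give a population mean of 3.5.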
- Extended Data Figure 6: Temporal trend in accumulation of nonsynonymous mutations relative to the neutral expectation in non-mutator lineages. (78 KB)
Interval-specific accumulation of nonsynonymous mutations calculated from changes in the total number of nonsynonymous mutations between successive samples. As with the cumulative data in Fig. 4b, values are scaled by the average rate of accumulation for synonymous mutations over 50,000 generations, after adjusting for the numbers of genomic sites at risk for nonsynonymous and synonymous mutations. Each point shows the average rate calculated for a non-mutator or premutator population; small horizontal offsets were added so that overlapping points are visible. Note the discontinuous scale; populations with no additional mutations over an interval are plotted below. Colours are the same as in Fig. 1. Black lines connect grand means; the grey shading shows standard errors calculated from the replicate populations.
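One reading of the scaling in this caption, written as a Python sketch: the observed nonsynonymous rate over an interval is divided by the rate expected if nonsynonymous sites mutated neutrally at the long-term synonymous rate. All numbers are hypothetical, and the function and parameter names are introduced here for illustration only.

```python
def scaled_interval_rate(n_start, n_end, gen_start, gen_end,
                         syn_rate, nonsyn_sites, syn_sites):
    """Interval-specific nonsynonymous accumulation relative to neutral expectation.

    syn_rate is the average synonymous accumulation rate per generation;
    nonsyn_sites / syn_sites adjusts for the numbers of sites at risk.
    """
    observed_rate = (n_end - n_start) / (gen_end - gen_start)
    expected_rate = syn_rate * (nonsyn_sites / syn_sites)
    return observed_rate / expected_rate

# Hypothetical example: 3 new nonsynonymous mutations between generations 500 and 1,000,
# a synonymous rate of 1e-4 per generation, and three times as many nonsynonymous sites.
example_ratio = scaled_interval_rate(2, 5, 500, 1000,
                                     syn_rate=1e-4, nonsyn_sites=3.0, syn_sites=1.0)
```

A population with no additional mutations over an interval gives a ratio of zero, which is why such points need a separate position on the discontinuous scale.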
- Extended Data Figure 7: Mutational spectrum for non-mutator lineages in the LTEE. (107 KB)
Shaded bars show the distribution of different types of genetic change for all independent mutations found in the set of non-mutator clones that were sequenced at each generation. The total number of mutations in this set at each time point (N) is shown above each column. Base substitutions are divided into synonymous, nonsynonymous, intergenic, and other categories; the nonsynonymous category includes nonsense mutations, and the ‘other’ category includes rare point mutations in noncoding RNA genes and pseudogenes.
- Extended Data Figure 8: Changes in fitness of MAE lines after 550 single-cell bottlenecks and ~13,750 generations. (68 KB)
Each point shows the mean fitness based on nine competition assays between the MAE ancestor (REL1207) or one of the 15 MAE lineages (JEB807–JEB821) and the Ara− variant of the MAE ancestor (REL1206). One-day competition assays were performed using the standard procedures and same conditions as for the LTEE (refs 16, 17). Error bars show 95% confidence intervals. *P < 0.05, **P < 0.01, based on two-tailed t-tests of the null hypothesis that relative fitness equals 1. Ten of the fifteen MAE lines experienced significant fitness declines, while none had significant gains.
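The test described in this caption can be sketched for a single line as a one-sample, two-tailed t-test against a relative fitness of 1. The nine fitness values below are hypothetical. The critical value 2.306 is the standard two-tailed 5% point of Student's t distribution with 8 degrees of freedom (nine assays minus one).

```python
import math
import statistics

T_CRIT_DF8 = 2.306  # two-tailed 5% critical value, Student's t, 8 degrees of freedom

def fitness_t_test(values, null_mean=1.0, t_crit=T_CRIT_DF8):
    """Mean, t statistic and 95% confidence interval for replicate fitness assays."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    t_stat = (mean - null_mean) / sem
    ci = (mean - t_crit * sem, mean + t_crit * sem)
    return mean, t_stat, ci

# Hypothetical relative fitness values from nine competition assays for one MAE line.
example_assays = [0.95, 0.97, 0.93, 0.96, 0.94, 0.98, 0.95, 0.96, 0.94]
mean_w, t_stat, ci_95 = fitness_t_test(example_assays)
significant = abs(t_stat) > T_CRIT_DF8
```

For these made-up values the whole 95% interval lies below 1, which corresponds to a significant fitness decline of the kind the caption reports for ten of the fifteen lines.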
- Extended Data Figure 9: Trajectories for mutations by class in the LTEE in comparison with neutral expectations based on the MAE. (178 KB)
a–f, Accumulation of nonsynonymous mutations (a), intergenic point mutations (b), IS150 insertions (c), all other IS-element insertions (d), small indels (e) and large indels (f). Colours are the same as in Fig. 1. All values are expressed relative to the rate at which synonymous mutations accumulated in non-mutator LTEE lineages over 50,000 generations (Fig. 4a), and then scaled by the ratio of the number of the indicated class of mutation relative to the number of synonymous mutations in the MAE lines. In all panels, each symbol shows a non-mutator or premutator population. Note the discontinuous scale, in which populations with no mutations of the indicated type are plotted below. Black lines connect grand means over the replicate LTEE populations; the grey shading shows the corresponding standard errors.
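A hedged sketch of the scaling described in this caption, under one reading: the neutral expectation for a mutation class is the synonymous accumulation in the LTEE multiplied by the ratio of that class to synonymous mutations in the MAE lines. All counts and the function name are hypothetical.

```python
def relative_to_mae_expectation(observed_count, generations, syn_rate,
                                mae_class_count, mae_syn_count):
    """Observed mutations of one class divided by the MAE-based neutral expectation."""
    expected_syn = syn_rate * generations               # synonymous mutations expected
    class_ratio = mae_class_count / mae_syn_count       # class : synonymous in MAE lines
    expected_class = expected_syn * class_ratio
    return observed_count / expected_class

# Hypothetical example: 6 insertions of one class observed after 50,000 generations,
# a synonymous rate of 1e-4 per generation, and an MAE ratio of 10 : 25.
example_value = relative_to_mae_expectation(6, 50_000, 1e-4,
                                            mae_class_count=10, mae_syn_count=25)
```

Under this reading, values above 1 indicate that a class accumulated faster in the LTEE than expected from the MAE lines, where selection was minimized.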
- Supplementary Data 1 (65 KB)
This file contains descriptions of column titles (first sheet) and information on the 264 LTEE clones (second sheet) and 15 MAE clones (third sheet) sequenced and analysed in this study.
- Supplementary Data 2 (1.2 MB)
This file contains the analysis of parallel evolution for non-mutator populations and premutator lineages sorted by gene order (first sheet), G score (second sheet), and excluding nonsynonymous and synonymous mutations (third sheet).
- Supplementary Data 3 (1.3 MB)
This file contains the analysis of parallel evolution for populations that evolved hypermutability sorted by gene order (first sheet), G score (second sheet), and excluding nonsynonymous and synonymous mutations (third sheet).
- Supplementary Data 4 (72 KB)
This file contains the numbers of each type of mutation inferred from sequencing the 264 LTEE genomes (first sheet) and the 15 MAE genomes (second sheet).