For many decades, comparative analyses of protein sequences and structures have been used to investigate fundamental principles of molecular evolution (refs 1, 2). In contrast, relatively little is known about the long-term evolution of species’ phenotypic and genetic properties. This represents an important gap in our understanding of evolution, as exactly these properties play key roles in natural selection and adaptation to diverse environments. Here we perform a comparative analysis of bacterial growth and gene deletion phenotypes using hundreds of genome-scale metabolic models. Overall, bacterial phenotypic evolution can be described by a two-stage process with a rapid initial phenotypic diversification followed by a slow long-term exponential divergence. The observed average divergence trend, with approximately similar fractions of phenotypic properties changing per unit time, continues for billions of years. We experimentally confirm the predicted divergence trend using the phenotypic profiles of 40 diverse bacterial species across more than 60 growth conditions. Our analysis suggests that, at long evolutionary distances, gene essentiality is significantly more conserved than the ability to utilize different nutrients, while synthetic lethality is significantly less conserved. We also find that although rapid phenotypic evolution is sometimes observed within the same species, a transition from high to low phenotypic similarity occurs primarily at the genus level.
Zuckerkandl, E. & Pauling, L. in Evolving Genes and Proteins (eds Bryson, V. & Vogel, H. ) 97–166 (Academic, 1965)
Kimura, M. Evolutionary rate at the molecular level. Nature 217, 624–626 (1968)
Darwin, C. The Origin of Species (Barnes & Noble Classics, 2008)
Orth, J. D., Thiele, I. & Palsson, B. O. What is flux balance analysis? Nature Biotechnol. 28, 245–248 (2010)
Oberhardt, M. A., Palsson, B. O. & Papin, J. A. Applications of genome-scale metabolic reconstructions. Mol. Syst. Biol. 5, 320 (2009)
Kim, T. Y., Sohn, S. B., Kim, Y. B., Kim, W. J. & Lee, S. Y. Recent advances in reconstruction and applications of genome-scale metabolic models. Curr. Opin. Biotechnol. 24, 617–623 (2011)
Henry, C. S. et al. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nature Biotechnol. 28, 977–982 (2010)
Bochner, B. R. Global phenotypic characterization of bacteria. FEMS Microbiol. Rev. 33, 191–205 (2009)
Schuetz, R., Zamboni, N., Zampieri, M., Heinemann, M. & Sauer, U. Multidimensional optimality of microbial metabolism. Science 336, 601–604 (2012)
Peregrin-Alvarez, J. M., Sanford, C. & Parkinson, J. The conservation and evolutionary modularity of metabolism. Genome Biol. 10, R63 (2009)
Moran, N. A., Munson, M. A., Baumann, P. & Ishikawa, H. A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts. Proc. R. Soc. Lond. B 253, 167–171 (1993)
Sabarly, V. et al. The decoupling between genetic structure and metabolic phenotypes in Escherichia coli leads to continuous phenotypic diversity. J. Evol. Biol. 24, 1559–1571 (2011)
Gripp, E. et al. Closely related Campylobacter jejuni strains from different sources reveal a generalist rather than a specialist lifestyle. BMC Genomics 12, 584 (2011)
Monk, J. M. et al. Genome-scale metabolic reconstructions of multiple Escherichia coli strains highlight strain-specific adaptations to nutritional environments. Proc. Natl Acad. Sci. USA 110, 20338–20343 (2013)
de Hoon, M. J., Eichenberger, P. & Vitkup, D. Hierarchical evolution of the bacterial sporulation network. Curr. Biol. 20, R735–R745 (2010)
Kirschner, M. W. & Gerhart, J. C. The Plausibility of Life: Resolving Darwin’s Dilemma (Yale Univ. Press, 2005)
Dixon, S. J. et al. Significant conservation of synthetic lethal genetic interaction networks between distantly related eukaryotes. Proc. Natl Acad. Sci. USA 105, 16653–16658 (2008)
Tischler, J., Lehner, B. & Fraser, A. G. Evolutionary plasticity of genetic interaction networks. Nature Genet. 40, 390–391 (2008)
Szappanos, B. et al. An integrated approach to characterize genetic interaction networks in yeast metabolism. Nature Genet. 43, 656–662 (2011)
Barve, A. & Wagner, A. A latent capacity for evolutionary innovation through exaptation in metabolic systems. Nature 500, 203–206 (2013)
Plata, G., Fuhrer, T., Hsiao, T. L., Sauer, U. & Vitkup, D. Global probabilistic annotation of metabolic networks enables enzyme discovery. Nature Chem. Biol. 8, 848–854 (2012)
Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. & Sayers, E. W. GenBank. Nucleic Acids Res. 37, D26–D31 (2009)
Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011)
Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596 (2013)
Felsenstein, J. PHYLIP (Phylogeny Inference Package) version 3.6. Cladistics 5, 164–166 (1989)
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 37, D5–D15 (2009)
Becker, S. A. et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nature Protocols 2, 727–738 (2007)
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997)
Glass, J. I. et al. Essential genes of a minimal bacterium. Proc. Natl Acad. Sci. USA 103, 425–430 (2006)
Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000)
We thank B. Bochner and Biolog for providing the experimental phenotypic growth data. We also thank members of the Vitkup laboratory for discussions. This work was supported in part by the National Institute of General Medical Sciences GM079759 grant to D.V. and the U54CA121852 grant to Columbia University. The work by C.S.H. was supported by the Department of Energy contract DE-ACO2-06CH11357, as part of the SB Knowledgebase.
The authors declare no competing financial interests.
The 322 models used in this study are available at http://vitkuplab.c2b2.columbia.edu/phenotypes.
Extended data figures and tables
Extended Data Figure 1 The evolution of phenotypic similarity in the usage of carbon sources for ATP production.
Genetic distances are based on 16S bacterial rRNA sequences. The colours represent the point density at a given genetic distance for all pairwise comparisons between metabolic models (n = 20,910). The black line shows a three-parameter exponential fit to the computational predictions; the red line shows a moving average of the predictions.
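The caption does not spell out the functional form of the three-parameter exponential fit. One common form, assumed here purely for illustration, expresses phenotypic similarity S as a function of genetic distance x with hypothetical parameter names a (amplitude), b (decay rate) and c (long-distance plateau). Under that assumption, the decay of similarity towards the plateau proceeds at a constant fractional rate, which is one way to read "approximately similar fractions of phenotypic properties changing per unit time":

```latex
S(x) = a\,e^{-b x} + c,
\qquad
\frac{\mathrm{d}S}{\mathrm{d}x} = -a\,b\,e^{-b x} = -b\,\bigl(S(x) - c\bigr)
```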
Extended Data Figure 2
Genetic distances are based on 16S bacterial rRNA sequences. Phenotypic similarities of biomass synthesis are shown for 62, 149, and 446 carbon sources. Solid lines represent moving averages (using a 0.05 genetic distance window) of computational predictions; dashed lines represent exponential fits to the data.
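A moving average over a 0.05 genetic distance window can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `moving_average`, the grid spacing `step`, the choice of a window centred on each grid point, and the example values are all assumptions.

```python
from statistics import mean


def moving_average(points, window=0.05, step=0.01):
    """Average similarity inside a sliding genetic-distance window.

    points: iterable of (genetic_distance, phenotypic_similarity) pairs.
    window: full window width (0.05 in the caption above).
    step:   spacing of window centres (hypothetical; not given in the source).
    Returns a list of (centre, mean_similarity) for non-empty windows.
    """
    pts = sorted(points)
    if not pts:
        return []
    half = window / 2
    max_d = pts[-1][0]
    averaged = []
    i = 0
    # Slide the window centre from 0 until it can no longer contain any point.
    while i * step <= max_d + half:
        centre = i * step
        inside = [s for d, s in pts if centre - half <= d <= centre + half]
        if inside:
            averaged.append((centre, mean(inside)))
        i += 1
    return averaged


# Hypothetical pairwise comparisons: (16S distance, similarity).
example = [(0.0, 1.0), (0.01, 0.8), (0.10, 0.4)]
trend = moving_average(example, window=0.05, step=0.05)
```

Windows that contain no pairwise comparisons are skipped rather than interpolated, so sparse regions of genetic distance leave gaps in the trend line.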
Extended Data Figure 3 The effect of species selection on observed patterns of phenotypic divergence.
The black lines in all panels (marked ‘Original fit’) represent the exponential fit of the phenotypic similarity (carbon source utilization) as a function of genetic distance for all pairs of considered models, that is, models with fewer than 20% auto-completion reactions and more than five predicted carbon sources for growth. The observed trends of phenotypic evolution remain similar when (a) only models with a smaller fraction of auto-completion reactions are considered, (b) models from individual families that include more than ten modelled species are excluded from the analysis, (c) only a maximum number of species per family is considered, and (d) a subset of species is chosen at random from the pool of all considered models. In d, the average values at different genetic distance bins are shown for 1,000 random samples of a given number of species; error bars represent the s.e.m. obtained on the basis of the 1,000 replicates.
Extended Data Figure 4 Evolutionary relationship between 40 species for which experimental phenotype microarray data were considered.
The cladogram is based on 16S bacterial rRNA sequences. Different colours indicate different bacterial classes. The tree was rooted using the M. barkeri rRNA sequence.
Extended Data Figure 5 The effect of cutoff selection on the computational and experimental phenotypic similarity trends.
a, The exponential fits of phenotypic similarity as a function of genetic distance for metabolic models that predict growth on more than a given number of carbon sources. b, The frequency distribution of normalized Biolog scores for 40 species across 62 experimental growth conditions. The figure shows a bimodal pattern (scores of 0 are not plotted). The dashed red line shows the cutoff score of 10 used in the main analysis. c, The effect of different cutoffs used to define positive growth in the Biolog data. Different lines represent exponential fits to the experimental values of phenotypic similarity based on different values of the Biolog cutoff score.
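The role of the Biolog cutoff can be illustrated with a short sketch. It assumes that growth is called positive when a normalized score exceeds the cutoff, and that phenotypic similarity between two species is the fraction of carbon sources used by both out of those used by at least one (a Jaccard index). Both the strict inequality and the Jaccard definition are assumptions for illustration, as are the function names and the example scores.

```python
def growth_profile(scores, cutoff=10):
    """Return the set of carbon sources with a score above the cutoff.

    scores: dict mapping carbon source name -> normalized Biolog score.
    cutoff: 10 is the value quoted in the caption; the strict '>' is assumed.
    """
    return {source for source, score in scores.items() if score > cutoff}


def phenotypic_similarity(profile_a, profile_b):
    """Jaccard index of two growth profiles (assumed similarity measure).

    Returns None when neither species grows on any source, because the
    ratio is undefined in that case.
    """
    union = profile_a | profile_b
    if not union:
        return None
    return len(profile_a & profile_b) / len(union)


# Hypothetical scores for two species on three carbon sources.
species_a = {"glucose": 85, "lactose": 3, "citrate": 40}
species_b = {"glucose": 60, "lactose": 22, "citrate": 0}

sim_default = phenotypic_similarity(
    growth_profile(species_a), growth_profile(species_b)
)
sim_strict = phenotypic_similarity(
    growth_profile(species_a, cutoff=50), growth_profile(species_b, cutoff=50)
)
```

Raising the cutoff shrinks each growth profile, which can move the similarity in either direction; this is why the sensitivity of the trend to the cutoff is worth checking, as panel c does.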
Extended Data Figure 6
The black line shows an exponential fit to the predicted phenotypic similarity in the usage of carbon sources calculated on the basis of the in silico minimal media used in the study of growth phenotypes (see Methods and Supplementary Data 2) (n = 26,106). The dashed red line shows an exponential fit to the predicted phenotypic similarity using the in silico medium in which all carbon sources that could be imported by the models were made available, with the total combined uptake of carbon constrained to a maximum value of 10 mM g−1 dry weight (n = 27,261).
Supplementary Data 1
Biolog phenotype microarray data for 40 species across 62 carbon sources. Phenotypic data for all species were obtained from the Biolog GEN III database (Biolog, Inc., Hayward, CA, USA). (XLSX 21 kb)
Supplementary Data 2
Compounds included in the simulated media in order to test for biomass/ATP production on different carbon and nitrogen sources. (XLSX 12 kb)
About this article
Cite this article
Plata, G., Henry, C. & Vitkup, D. Long-term phenotypic evolution of bacteria. Nature 517, 369–372 (2015). https://doi.org/10.1038/nature13827