Abstract
For many decades, comparative analyses of protein sequences and structures have been used to investigate fundamental principles of molecular evolution1,2. In contrast, relatively little is known about the long-term evolution of species’ phenotypic and genetic properties. This represents an important gap in our understanding of evolution, as exactly these properties play key roles in natural selection and adaptation to diverse environments. Here we perform a comparative analysis of bacterial growth and gene deletion phenotypes using hundreds of genome-scale metabolic models. Overall, bacterial phenotypic evolution can be described by a two-stage process with a rapid initial phenotypic diversification followed by a slow long-term exponential divergence. The observed average divergence trend, with approximately similar fractions of phenotypic properties changing per unit time, continues for billions of years. We experimentally confirm the predicted divergence trend using the phenotypic profiles of 40 diverse bacterial species across more than 60 growth conditions. Our analysis suggests that, at long evolutionary distances, gene essentiality is significantly more conserved than the ability to utilize different nutrients, while synthetic lethality is significantly less conserved. We also find that although rapid phenotypic evolution is sometimes observed within the same species, a transition from high to low phenotypic similarity occurs primarily at the genus level.
References
Zuckerkandl, E. & Pauling, L. in Evolving Genes and Proteins (eds Bryson, V. & Vogel, H.) 97–166 (Academic, 1965)
Kimura, M. Evolutionary rate at the molecular level. Nature 217, 624–626 (1968)
Darwin, C. The Origin of Species (Barnes & Noble Classics, 2008)
Orth, J. D., Thiele, I. & Palsson, B. O. What is flux balance analysis? Nature Biotechnol. 28, 245–248 (2010)
Oberhardt, M. A., Palsson, B. O. & Papin, J. A. Applications of genome-scale metabolic reconstructions. Mol. Syst. Biol. 5, 320 (2009)
Kim, T. Y., Sohn, S. B., Kim, Y. B., Kim, W. J. & Lee, S. Y. Recent advances in reconstruction and applications of genome-scale metabolic models. Curr. Opin. Biotechnol. 24, 617–623 (2011)
Henry, C. S. et al. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nature Biotechnol. 28, 977–982 (2010)
Bochner, B. R. Global phenotypic characterization of bacteria. FEMS Microbiol. Rev. 33, 191–205 (2009)
Schuetz, R., Zamboni, N., Zampieri, M., Heinemann, M. & Sauer, U. Multidimensional optimality of microbial metabolism. Science 336, 601–604 (2012)
Peregrin-Alvarez, J. M., Sanford, C. & Parkinson, J. The conservation and evolutionary modularity of metabolism. Genome Biol. 10, R63 (2009)
Moran, N. A., Munson, M. A., Baumann, P. & Ishikawa, H. A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts. Proc. R. Soc. Lond. B 253, 167–171 (1993)
Sabarly, V. et al. The decoupling between genetic structure and metabolic phenotypes in Escherichia coli leads to continuous phenotypic diversity. J. Evol. Biol. 24, 1559–1571 (2011)
Gripp, E. et al. Closely related Campylobacter jejuni strains from different sources reveal a generalist rather than a specialist lifestyle. BMC Genomics 12, 584 (2011)
Monk, J. M. et al. Genome-scale metabolic reconstructions of multiple Escherichia coli strains highlight strain-specific adaptations to nutritional environments. Proc. Natl Acad. Sci. USA 110, 20338–20343 (2013)
de Hoon, M. J., Eichenberger, P. & Vitkup, D. Hierarchical evolution of the bacterial sporulation network. Curr. Biol. 20, R735–R745 (2010)
Kirschner, M. W. & Gerhart, J. C. The Plausibility of Life: Resolving Darwin’s Dilemma (Yale Univ. Press, 2005)
Dixon, S. J. et al. Significant conservation of synthetic lethal genetic interaction networks between distantly related eukaryotes. Proc. Natl Acad. Sci. USA 105, 16653–16658 (2008)
Tischler, J., Lehner, B. & Fraser, A. G. Evolutionary plasticity of genetic interaction networks. Nature Genet. 40, 390–391 (2008)
Szappanos, B. et al. An integrated approach to characterize genetic interaction networks in yeast metabolism. Nature Genet. 43, 656–662 (2011)
Barve, A. & Wagner, A. A latent capacity for evolutionary innovation through exaptation in metabolic systems. Nature 500, 203–206 (2013)
Plata, G., Fuhrer, T., Hsiao, T. L., Sauer, U. & Vitkup, D. Global probabilistic annotation of metabolic networks enables enzyme discovery. Nature Chem. Biol. 8, 848–854 (2012)
Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. & Sayers, E. W. GenBank. Nucleic Acids Res. 37, D26–D31 (2009)
Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011)
Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596 (2013)
Felsenstein, J. PHYLIP (Phylogeny Inference Package) version 3.6. Cladistics 5, 164–166 (1989)
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 37, D5–D15 (2009)
Becker, S. A. et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nature Protocols 2, 727–738 (2007)
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997)
Glass, J. I. et al. Essential genes of a minimal bacterium. Proc. Natl Acad. Sci. USA 103, 425–430 (2006)
Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000)
Acknowledgements
We thank B. Bochner and Biolog for providing the experimental phenotypic growth data. We also thank members of the Vitkup laboratory for discussions. This work was supported in part by the National Institute of General Medical Sciences GM079759 grant to D.V. and the U54CA121852 grant to Columbia University. The work by C.S.H. was supported by the Department of Energy contract DE-AC02-06CH11357, as part of the SB Knowledgebase.
Author information
Contributions
G.P. and D.V. conceived the study and performed the research and data analysis. C.S.H. built the metabolic models. D.V. directed the research. G.P. and D.V. wrote the manuscript. All authors read and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
The 322 models used in this study are available at http://vitkuplab.c2b2.columbia.edu/phenotypes.
Extended data figures and tables
Extended Data Figure 1 The evolution of phenotypic similarity in the usage of carbon sources for ATP production.
Genetic distances are based on 16S bacterial rRNA sequences. The colours represent the point density at a given genetic distance for all pairwise comparisons between metabolic models (n = 20,910). The black line shows a three-parameter exponential fit to the computational predictions; the red line shows a moving average of the predictions.
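For orientation, one common way to write a three-parameter exponential of this kind is shown below. The parameterization and the symbols $S_0$, $S_\infty$ and $\lambda$ are assumptions made here for illustration; the caption does not specify the functional form that was fitted.

```latex
% Assumed form: similarity S as a function of genetic distance d.
% S_0      : similarity extrapolated to d = 0
% S_\infty : asymptotic similarity at large d
% \lambda  : decay rate per unit genetic distance
S(d) = S_\infty + \left(S_0 - S_\infty\right) e^{-\lambda d},
\qquad
\frac{\mathrm{d}}{\mathrm{d}d}\bigl(S(d) - S_\infty\bigr) = -\lambda \bigl(S(d) - S_\infty\bigr)
```

Under this form, the excess similarity above the asymptote shrinks by a constant fraction per unit distance, which would match the abstract's description of approximately similar fractions of phenotypic properties changing per unit time.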
Extended Data Figure 2 The evolution of phenotypic similarity for different sets of carbon sources.
Genetic distances are based on 16S bacterial rRNA sequences. Phenotypic similarities of biomass synthesis are shown for 62, 149, and 446 carbon sources. Solid lines represent moving averages (using a 0.05 genetic distance window) of computational predictions; dashed lines represent exponential fits to the data.
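A moving average over a fixed genetic distance window, as mentioned in the caption above, might be computed along the following lines. This is a minimal sketch, not the authors' code: the function name, the choice to centre windows on observed distances, and the example data are all hypothetical.

```python
def moving_average(points, window=0.05, centres=None):
    """Average similarity over a sliding window of genetic distance.

    points  -- iterable of (genetic_distance, phenotypic_similarity) pairs
    window  -- full window width (0.05 in the caption above)
    centres -- distances at which to evaluate; defaults to the observed
               distances (an assumption made for this sketch)
    """
    points = list(points)
    half = window / 2.0
    if centres is None:
        centres = sorted({d for d, _ in points})
    averaged = []
    for c in centres:
        inside = [s for d, s in points if c - half <= d <= c + half]
        if inside:  # skip empty windows rather than invent a value
            averaged.append((c, sum(inside) / len(inside)))
    return averaged


# Hypothetical model pairs: (16S genetic distance, phenotypic similarity)
example = [(0.00, 1.0), (0.02, 0.8), (0.10, 0.4), (0.12, 0.2)]
for centre, mean_similarity in moving_average(example):
    print(f"{centre:.2f}\t{mean_similarity:.2f}")
```

Passing an explicit `centres` grid (for instance evenly spaced distances) gives a smoother curve for plotting; windows that contain no pairs are simply omitted.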
Extended Data Figure 3 The effect of species selection on observed patterns of phenotypic divergence.
The black lines in all panels (marked ‘Original fit’) represent the exponential fit of the phenotypic similarity (carbon source utilization) as a function of genetic distance for all pairs of considered models, that is, models with fewer than 20% auto-completion reactions and more than five predicted carbon sources for growth. The observed trends of phenotypic evolution remain similar when (a) only models with a smaller fraction of auto-completion reactions are considered, (b) models from individual families that include more than ten modelled species are excluded from the analysis, (c) only a maximum number of species per family is considered, and (d) a subset of species is chosen at random from the pool of all considered models. In d, the average values at different genetic distance bins are shown for 1,000 random samples of a given number of species; error bars represent the s.e.m. obtained on the basis of the 1,000 replicates.
Extended Data Figure 4 Evolutionary relationship between 40 species for which experimental phenotype microarray data were considered.
The cladogram is based on 16S bacterial rRNA sequences. Different colours indicate different bacterial classes. The tree was rooted in the M. barkeri rRNA sequence.
Extended Data Figure 5 The effect of cutoff selection on the computational and experimental phenotypic similarity trends.
a, The exponential fits of phenotypic similarity as a function of genetic distance for metabolic models that predict growth on more than a given number of carbon sources. b, The frequency distribution of normalized Biolog scores for 40 species across 62 experimental growth conditions. The figure shows a bimodal pattern (scores of 0 are not plotted). The dashed red line shows the cutoff score of 10 used in the main analysis. c, The effect of different cutoffs used to define positive growth in the Biolog data. Different lines represent exponential fits to the experimental values of phenotypic similarity based on different values of the Biolog cutoff score.
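The role of the cutoff in panels b and c can be illustrated with a short sketch that converts normalized scores into growth/no-growth calls and then compares two species. Several choices here are assumptions rather than details taken from the caption: scores at or above the cutoff count as growth, similarity is measured with the Jaccard index, and the species scores and condition names are hypothetical.

```python
def growth_profile(scores, cutoff=10):
    """Return the set of conditions called as positive growth.

    scores -- dict mapping condition name to normalized score
    cutoff -- score threshold (10 in the main analysis, per the caption);
              treating 'score >= cutoff' as growth is an assumption
    """
    return {condition for condition, score in scores.items() if score >= cutoff}


def phenotypic_similarity(profile_a, profile_b):
    """Jaccard index of two growth profiles (assumed similarity measure)."""
    union = profile_a | profile_b
    if not union:
        return 0.0  # undefined when neither species grows; 0.0 by convention here
    return len(profile_a & profile_b) / len(union)


# Hypothetical normalized scores for two species on three carbon sources
species_a = {"glucose": 55, "lactose": 3, "citrate": 12}
species_b = {"glucose": 40, "lactose": 20, "citrate": 0}

for cutoff in (5, 10, 30):
    a = growth_profile(species_a, cutoff)
    b = growth_profile(species_b, cutoff)
    print(cutoff, round(phenotypic_similarity(a, b), 3))
```

Repeating the comparison at several cutoffs, as in the loop above, mirrors the kind of sensitivity check described for panel c.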
Extended Data Figure 6 Effect of growth media on the predicted phenotypic similarity trends.
The black line shows an exponential fit to the predicted phenotypic similarity in the usage of carbon sources calculated on the basis of the in silico minimal media used in the study of growth phenotypes (see Methods and Supplementary Data 2) (n = 26,106). The dashed red line shows an exponential fit to the predicted phenotypic similarity using the in silico medium in which all carbon sources that could be imported by the models were made available, with the total combined uptake of carbon constrained to a maximum value of 10 mM g−1 dry weight (n = 27,261).
Supplementary information
Supplementary Data 1
Biolog phenotype microarray data for 40 species across 62 carbon sources. Phenotypic data for all species were obtained from the Biolog GEN III database (Biolog, Inc., Hayward, CA, USA). (XLSX 21 kb)
Supplementary Data 2
Compounds included in the simulated media in order to test for biomass/ATP production on different carbon and nitrogen sources. (XLSX 12 kb)
About this article
Cite this article
Plata, G., Henry, C. & Vitkup, D. Long-term phenotypic evolution of bacteria. Nature 517, 369–372 (2015). https://doi.org/10.1038/nature13827
DOI: https://doi.org/10.1038/nature13827