Prochlorococcus have low global mutation rate and small effective population size

Abstract

Prochlorococcus are the most abundant free-living photosynthetic carbon-fixing organisms in the ocean. Prochlorococcus show small genome sizes, low genomic G+C content, reduced DNA repair gene pool and fast evolutionary rates, which are typical features of endosymbiotic bacteria. Nevertheless, their evolutionary mechanisms are believed to be different. Evolution of endosymbiotic bacteria is dominated by genetic drift owing to repeated population bottlenecks, whereas Prochlorococcus are postulated to have extremely large effective population sizes (Ne) and thus drift has rarely been considered. However, accurately extrapolating Ne requires measuring an unbiased global mutation rate through mutation accumulation, which is challenging for Prochlorococcus. Here, we managed this experiment over 1,065 days using Prochlorococcus marinus AS9601, sequenced genomes of 141 mutant lines and determined its mutation rate to be 3.50 × 10^−10 per site per generation. Extrapolating Ne additionally requires identifying population boundaries, which we defined using PopCOGenT and over 400 genomes related to AS9601. Accordingly, we calculated its Ne to be 1.68 × 10^7, which is only reasonably greater than that of endosymbiotic bacteria but surprisingly smaller than that of many free-living bacteria extrapolated using the same approach. Our results therefore suggest that genetic drift is a key driver of Prochlorococcus evolution.
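As a rough illustration of the extrapolation described in the abstract, the sketch below assumes the standard neutral expectation for haploid organisms, πS = 2 × Ne × μ, where πS is the nucleotide diversity at synonymous sites. That relationship is an assumption here, since this preview does not state the formula the authors used. The function name is hypothetical, and the implied πS is back-calculated from the two reported values rather than quoted from the paper.

```python
def effective_population_size(pi_s: float, mu: float) -> float:
    """Estimate Ne for a haploid population.

    Assumes the neutral expectation pi_S = 2 * Ne * mu, so Ne = pi_S / (2 * mu).
    pi_s: synonymous nucleotide diversity (per site)
    mu:   mutation rate (per site per generation)
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi_s / (2.0 * mu)


# Values reported in the abstract for Prochlorococcus marinus AS9601.
MU_AS9601 = 3.50e-10   # mutations per site per generation
NE_REPORTED = 1.68e7   # effective population size

# Back-calculate the synonymous diversity implied by the two reported
# numbers under the assumed formula (not a value quoted from the paper).
implied_pi_s = 2.0 * NE_REPORTED * MU_AS9601

# Round trip: feeding the implied diversity back in recovers the reported Ne.
ne_check = effective_population_size(implied_pi_s, MU_AS9601)

print(f"implied pi_S = {implied_pi_s:.5f}")
print(f"Ne from implied pi_S = {ne_check:.3e}")
```

Because Ne scales inversely with μ under this assumption, a mutation rate that is overestimated by some factor lowers the inferred Ne by the same factor, which is why an unbiased rate matters for the estimate.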

Fig. 1: Determining the unbiased global mutation rate of Prochlorococcus marinus AS9601.
Fig. 2: Histogram of effective population size (Ne) estimates.
Fig. 3: Scaling relationships.

Data availability

Source data are provided with this paper. All the datasets generated, analysed and presented in the current study are available in the Supplementary Information. Raw reads of the 141 surviving lines are available at the NCBI SRA under accession no. PRJNA733321. The 589 gene trees are available at https://doi.org/10.6084/m9.figshare.c.5638369.

Code availability

All the scripts are deposited at https://doi.org/10.6084/m9.figshare.c.5638369.

References

  1. Flombaum, P. et al. Present and future global distributions of the marine Cyanobacteria Prochlorococcus and Synechococcus. Proc. Natl Acad. Sci. USA 110, 9824–9829 (2013).

  2. Biller, S. J., Berube, P. M., Lindell, D. & Chisholm, S. W. Prochlorococcus: the structure and function of collective diversity. Nat. Rev. Microbiol. 13, 13–27 (2015).

  3. Johnson, Z. I. et al. Niche partitioning among Prochlorococcus ecotypes along ocean-scale environmental gradients. Science 311, 1737–1740 (2006).

  4. Zinser, E. R. et al. Influence of light and temperature on Prochlorococcus ecotype distributions in the Atlantic Ocean. Limnol. Oceanogr. 52, 2205–2220 (2007).

  5. Kettler, G. C. et al. Patterns and implications of gene gain and loss in the evolution of Prochlorococcus. PLoS Genet. 3, 2515–2528 (2007).

  6. Luo, H., Friedman, R., Tang, J. & Hughes, A. L. Genome reduction by deletion of paralogs in the marine cyanobacterium Prochlorococcus. Mol. Biol. Evol. 28, 2751–2760 (2011).

  7. Batut, B., Knibbe, C., Marais, G. & Daubin, V. Reductive genome evolution at both ends of the bacterial population size spectrum. Nat. Rev. Microbiol. 12, 841–850 (2014).

  8. Fraser, C., Alm, E. J., Polz, M. F., Spratt, B. G. & Hanage, W. P. The bacterial species challenge: making sense of genetic and ecological diversity. Science 323, 741–746 (2009).

  9. Bobay, L.-M. & Ochman, H. Factors driving effective population size and pan-genome evolution in bacteria. BMC Evol. Biol. 18, 153 (2018).

  10. Kuo, C.-H., Moran, N. A. & Ochman, H. The consequences of genetic drift for bacterial genome complexity. Genome Res. 19, 1450–1454 (2009).

  11. Luo, H., Swan, B. K., Stepanauskas, R., Hughes, A. L. & Moran, M. A. Comparing effective population sizes of dominant marine Alphaproteobacteria lineages. Environ. Microbiol. Rep. 6, 167–172 (2014).

  12. Kryazhimskiy, S. & Plotkin, J. B. The population genetics of dN/dS. PLoS Genet. 4, e1000304 (2008).

  13. Rocha, E. P. C. & Feil, E. J. Mutational patterns cannot explain genome composition: are there any neutral sites in the genomes of bacteria? PLoS Genet. 6, e1001104 (2010).

  14. Luo, H., Thompson, L. R., Stingl, U. & Hughes, A. L. Selection maintains low genomic GC content in marine SAR11 lineages. Mol. Biol. Evol. 32, 2738–2748 (2015).

  15. Hellweger, F. L., Huang, Y. & Luo, H. Carbon limitation drives GC content evolution of a marine bacterium in an individual-based genome-scale model. ISME J. 12, 1180–1187 (2018).

  16. Lynch, M. & Conery, J. S. The origins of genome complexity. Science 302, 1401–1404 (2003).

  17. Andreani, N. A., Hesse, E. & Vos, M. Prokaryote genome fluidity is dependent on effective population size. ISME J. 11, 1719–1721 (2017).

  18. Kimura, M. Genetic variability maintained in a finite population due to mutational production of neutral and nearly neutral isoalleles. Genet. Res. 11, 247–270 (1968).

  19. Lynch, M. et al. Genetic drift, selection and the evolution of the mutation rate. Nat. Rev. Genet. 17, 704–714 (2016).

  20. Gu, J. et al. Unexpectedly high mutation rate of a deep-sea hyperthermophilic anaerobic archaeon. ISME J. 15, 1862–1869 (2021).

  21. Charlesworth, B. Effective population size and patterns of molecular evolution and variation. Nat. Rev. Genet. 10, 195–205 (2009).

  22. Kashtan, N. et al. Single-cell genomics reveals hundreds of coexisting subpopulations in wild Prochlorococcus. Science 344, 416–420 (2014).

  23. Osburne, M. S., Holmbeck, B. M., Coe, A. & Chisholm, S. W. The spontaneous mutation frequencies of Prochlorococcus strains are commensurate with those of other bacteria: mutation frequencies in Prochlorococcus. Environ. Microbiol. Rep. 3, 744–749 (2011).

  24. Lee, H., Popodi, E., Tang, H. & Foster, P. L. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc. Natl Acad. Sci. USA 109, E2774–E2783 (2012).

  25. Williams, A. B. Spontaneous mutation rates come into focus in Escherichia coli. DNA Repair 24, 73–79 (2014).

  26. Rocha, E. P. C. Neutral theory, microbial practice: challenges in bacterial population genetics. Mol. Biol. Evol. 35, 1338–1347 (2018).

  27. Marais, G. A. B., Calteau, A. & Tenaillon, O. Mutation rate and genome reduction in endosymbiotic and free-living bacteria. Genetica 134, 205–210 (2008).

  28. Marais, G. A. B., Batut, B. & Daubin, V. Genome evolution: mutation is the main driver of genome size in prokaryotes. Curr. Biol. 30, R1083–R1085 (2020).

  29. Morris, J. J., Lenski, R. E. & Zinser, E. R. The black queen hypothesis: evolution of dependencies through adaptive gene loss. mBio 3, e00036–12 (2012).

  30. Giovannoni, S. J., Cameron Thrash, J. & Temperton, B. Implications of streamlining theory for microbial ecology. ISME J. 8, 1553–1565 (2014).

  31. Wernegreen, J. J. Endosymbiont evolution: predictions from theory and surprises from genomes: endosymbiont genome evolution. Ann. N. Y. Acad. Sci. 1360, 16–35 (2015).

  32. Bobay, L.-M. & Ochman, H. The evolution of bacterial genome architecture. Front. Genet. 8, 72 (2017).

  33. Malmstrom, R. R. et al. Temporal dynamics of Prochlorococcus ecotypes in the Atlantic and Pacific oceans. ISME J. 4, 1252–1264 (2010).

  34. Morris, J. J., Kirkegaard, R., Szul, M. J., Johnson, Z. I. & Zinser, E. R. Facilitation of robust growth of Prochlorococcus colonies and dilute liquid cultures by ‘helper’ heterotrophic bacteria. Appl. Environ. Microbiol. 74, 4530–4534 (2008).

  35. Sun, Y. et al. Spontaneous mutations of a model heterotrophic marine bacterium. ISME J. 11, 1713–1718 (2017).

  36. Dillon, M. M., Sung, W., Sebra, R., Lynch, M. & Cooper, V. S. Genome-wide biases in the rate and molecular spectrum of spontaneous mutations in Vibrio cholerae and Vibrio fischeri. Mol. Biol. Evol. 34, 93–109 (2017).

  37. Strauss, C., Long, H., Patterson, C. E., Te, R. & Lynch, M. Genome-wide mutation rate response to pH change in the coral reef pathogen Vibrio shilonii AK1. mBio 8, e01021–17 (2017).

  38. Xue, C.-X. et al. Ancestral niche separation and evolutionary rate differentiation between sister marine flavobacteria lineages. Environ. Microbiol. 22, 3234–3247 (2020).

  39. Bourguignon, T. et al. Increased mutation rate is linked to genome reduction in prokaryotes. Curr. Biol. 30, 3848–3855.e4 (2020).

  40. Long, H. et al. Background mutational features of the radiation-resistant bacterium Deinococcus radiodurans. Mol. Biol. Evol. 32, 2383–2392 (2015).

  41. Arevalo, P., VanInsberghe, D., Elsherbini, J., Gore, J. & Polz, M. F. A reverse ecology approach based on a biological definition of microbial populations. Cell 178, 820–834 (2019).

  42. Sung, W., Ackerman, M. S., Miller, S. F., Doak, T. G. & Lynch, M. Drift-barrier hypothesis and mutation-rate evolution. Proc. Natl Acad. Sci. USA 109, 18488–18492 (2012).

  43. Daubin, V. & Moran, N. A. Comment on ‘The origins of genome complexity’. Science 306, 978–978 (2004).

  44. Kirchberger, P. C., Schmidt, M. L. & Ochman, H. The ingenuity of bacterial genomes. Annu. Rev. Microbiol. 74, 815–834 (2020).

  45. Qu, L. et al. Metapopulation structure of diatom-associated marine bacteria. Preprint at bioRxiv https://doi.org/10.1101/2021.03.10.434754 (2021).

  46. Wiedenbeck, J. & Cohan, F. M. Origins of bacterial diversity through horizontal genetic transfer and adaptation to new ecological niches. FEMS Microbiol. Rev. 35, 957–976 (2011).

  47. Gillespie, J. H. Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155, 909–919 (2000).

  48. Muñoz-Marín, M. C. et al. Mixotrophy in marine picocyanobacteria: use of organic compounds by Prochlorococcus and Synechococcus. ISME J. 14, 1065–1073 (2020).

  49. Luo, H., Huang, Y., Stepanauskas, R. & Tang, J. Excess of non-conservative amino acid changes in marine bacterioplankton lineages with reduced genomes. Nat. Microbiol. 2, 1–9 (2017).

  50. Shalapyonok, A., Olson, R. J. & Shalapyonok, L. S. Ultradian growth in Prochlorococcus spp. Appl. Environ. Microbiol. 64, 1066–1069 (1998).

  51. Moore, L. R. et al. Culturing the marine cyanobacterium Prochlorococcus. Limnol. Oceanogr. Methods 5, 353–362 (2007).

  52. Lindell, D. in The Prokaryotes: Other Major Lineages of Bacteria and The Archaea (eds Rosenberg, E. et al.) 829–845 (Springer, 2014).

  53. Long, H. et al. Antibiotic treatment enhances the genome-wide mutation rate of target cells. Proc. Natl Acad. Sci. USA 113, E2498 (2016).

  54. Dillon, M. M., Sung, W., Lynch, M. & Cooper, V. S. The rate and molecular spectrum of spontaneous mutations in the GC-rich multichromosome genome of Burkholderia cenocepacia. Genetics 200, 935–946 (2015).

  55. Wahl, L. M. & Gerrish, P. J. The probability that beneficial mutations are lost in populations with periodic bottlenecks. Evolution 55, 2606–2610 (2001).

  56. Hall, D. W., Mahmoudizad, R., Hurd, A. W. & Joseph, S. B. Spontaneous mutations in diploid Saccharomyces cerevisiae: another thousand cell generations. Genet. Res. 90, 229–241 (2008).

  57. Agustí, S. Viability and niche segregation of Prochlorococcus and Synechococcus cells across the central Atlantic Ocean. Aquat. Microb. Ecol. 36, 53–59 (2004).

  58. Frenoy, A. & Bonhoeffer, S. Death and population dynamics affect mutation rate estimates and evolvability under stress in bacteria. PLoS Biol. 16, e2005056 (2018).

  59. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  60. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  61. Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  62. McKenna, A. et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  63. Long, H., Behringer, M. G., Williams, E., Te, R. & Lynch, M. Similar mutation rates but highly diverse mutation spectra in ascomycete and basidiomycete yeasts. Genome Biol. Evol. 8, 3815–3821 (2016).

  64. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).

  65. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinform. 43, 11.10.1–33 (2013).

  66. Singh, V. K., Mangalam, A. K., Dwivedi, S. & Naik, S. Primer premier: program for design of degenerate primers from a protein sequence. BioTechniques 24, 318–319 (1998).

  67. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2021).

  68. Shewaramani, S. et al. Anaerobically grown Escherichia coli has an enhanced mutation rate and distinct mutational spectra. PLoS Genet. 13, e1006570 (2017).

  69. Deatherage, D. E. & Barrick, J. E. in Engineering and Analyzing Multicellular Systems: Methods and Protocols (eds Sun, L. & Shou, W.) 165–188 (Springer, 2014).

  70. Bobay, L.-M., Ellis, B. S.-H. & Ochman, H. ConSpeciFix: classifying prokaryotic species based on gene flow. Bioinformatics 34, 3738–3740 (2018).

  71. Bobay, L.-M. & Ochman, H. Biological species are universal across life’s domains. Genome Biol. Evol. 9, 491–501 (2017).

  72. VanInsberghe, D., Arevalo, P., Chien, D. & Polz, M. F. How can microbial population genomics inform community ecology? Philos. Trans. R. Soc. B 375, 20190253 (2020).

  73. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).

  74. Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).

  75. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  76. Lasken, R. S. & McLean, J. S. Recent advances in genomic DNA sequencing of microbial species from single cells. Nat. Rev. Genet. 15, 577–584 (2014).

  77. Clingenpeel, S., Clum, A., Schwientek, P., Rinke, C. & Woyke, T. Reconstructing each cell’s genome within complex microbial communities—dream or reality? Front. Microbiol. 5, 771 (2015).

  78. Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).

  79. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).

  80. Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).

  81. Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. & Wheeler, D. L. GenBank. Nucleic Acids Res. 35, D21–D25 (2007).

  82. O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–D745 (2016).

  83. Coleman, G. A. et al. A rooted phylogeny resolves early bacterial evolution. Science 372, eabe0511 (2021).

  84. Orme, D. et al. The caper package: comparative analysis of phylogenetics and evolution in R (2013).

  85. Fox, J. & Weisberg, S. An R Companion to Applied Regression (SAGE, 2019).

  86. Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. & Parks, D. H. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics 36, 1925–1927 (2020).

  87. Pagel, M. Inferring the historical patterns of biological evolution. Nature 401, 877–884 (1999).

  88. Aziz, R. K. et al. The RAST server: rapid annotations using subsystems technology. BMC Genom. 9, 75 (2008).

  89. Overbeek, R. et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 42, D206–D214 (2014).

  90. Darling, A. E., Mau, B. & Perna, N. T. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE 5, e11147 (2010).

  91. Darling, A. C. E., Mau, B., Blattner, F. R. & Perna, N. T. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 14, 1394–1403 (2004).

  92. Didelot, X. & Wilson, D. J. ClonalFrameML: efficient inference of recombination in whole bacterial genomes. PLoS Comput. Biol. 11, e1004041 (2015).

Acknowledgements

The authors thank the three reviewers for providing constructive suggestions that substantially improved the manuscript, H. Long and J. Pan for sharing their script to calculate πS based on fourfold degenerate sites and X. Feng for contributing the script to simulate the SAG assemblies based on the isolates’ genomes. Y.Z. was supported by the National Science Fund for Distinguished Young Scholars (42125603) and NSFC project 92051114. H.L. was supported by the Shenzhen Science and Technology Committee (JCYJ20180508161811899), the Hong Kong Branch of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (SMSEGL20SC02) and the Hong Kong Research Grants Council Area of Excellence Scheme (AoE/M-403/16). Z.C. was supported by the PhD Fellowship of the State Key Laboratory of Marine Environmental Science at Xiamen University.

Author information

Contributions

H.L. conceptualized the work and strategy, directed the bioinformatics analyses, interpreted the data and wrote the main manuscript. Y.Z. directed the experimental analyses and related writing, co-interpreted the data and provided comments on the manuscript. Z.C. performed all the experiments with contributions from Y.S., co-interpreted the data, drafted the experimental methods and prepared Fig. 1. X.W. performed all the bioinformatics analyses, co-interpreted the data, drafted the bioinformatics methods and prepared Figs. 2 and 3, and all the supplementary figures. Q.Z. co-directed the experimental work.

Corresponding authors

Correspondence to Yao Zhang or Haiwei Luo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Ecology & Evolution thanks Louis-Marie Bobay, Sébastien Wielgoss and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 The maximum-likelihood phylogenomic tree of 523 high-light (HL) adapted Prochlorococcus genomes (26 isolates’ genomes and 497 high-quality SAGs from HLII, HLI, HLIII/IV), constructed with IQ-TREE based on concatenated single-copy orthologous genes at the amino acid level and rooted with low-light (LL) adapted clade I (LLI) genomes.

To show the fine phylogenetic structure within the HLII clade, all distant relatives (LLI, HLIII/IV, HLI and HLVI) were pruned. Solid circles at the nodes mark groups recovered in more than 90 of the 100 bootstrap replicates. The phylogeny was visualized and annotated with iTOL. From the outer to the inner rings: (1) denotes the 15 HLII isolates used in Bobay and Ochman (2018) for defining the species ‘Prochlorococcus marinus’ with ConSpeciFix; (2) denotes the 93 genomes (13 isolates and 80 high-quality SAGs) used for Ne estimation by Kashtan et al. (2014), among which the 80 HLII SAGs were classified into seven backbone subpopulations and are marked with corresponding color strips; (3)-(6) represent the season, water depth, longitude and latitude at which the samples were collected, respectively; (7) shows the identity of the SAGs or isolates. Colored stars at the tips of the phylogeny differentiate strains from distinct sources (that is, ocean, cruise and station). Strains lacking this information are not marked at the tips. The strain P. marinus AS9601, which was used to determine the unbiased global mutation rate, is highlighted in deep blue. Genome IDs highlighted in blue, pink, green and orange belong to the populations MC0, MC1, MC4 and MC16, respectively, as delineated by PopCOGenT. Other populations delineated by PopCOGenT are highlighted in light yellow, each consisting of at least three non-redundant genomes (Supplementary Table 3). (8) illustrates the progressive extensions of the four main populations (MC0, MC1, MC4 and MC16) defined by PopCOGenT, with arrows marking the most recent common ancestor of each extension. The purpose of this analysis is to estimate the impact of population delineation on Ne estimates. Both πS and Ne were estimated for the extended groups (bottom left).
ConSpeciFix grouped all 418 HLII members (23 isolates and 395 high-quality SAGs) into one species, except for 11 strains (marked with purple triangles), which it inappropriately reported as distant relatives of the defined species.

Extended Data Fig. 2 The SAG simulation analysis evaluates the impact of using error-prone SAG data on population delineation by PopCOGenT and subsequently on the estimates of πS and Ne.

Of the 31 prokaryotic species for which unbiased global mutation rate data are publicly available, 19 are used in the simulation analysis because each has multiple members and is thus amenable to population delineation. SAGs are simulated from the isolates’ genomes in each of these 19 species by incorporating realistic error rates associated with SAG sequences (collected from the literature) and the genomic statistics of all available Prochlorococcus clade HLII SAGs (without quality filtering). For each species, the whole procedure was replicated 10 times. (a-b) Summary of πS and Ne estimates for each of the 19 species based on populations delineated by PopCOGenT using simulated SAGs (box and whisker plot) and the original isolates’ genomes (red solid circles). Within each box of the SAG data, the horizontal line marks the median; boxes extend from the 25th to the 75th percentile of each group’s πS or Ne; whiskers above and below the box indicate the 90th and 10th percentiles. (c-d) Including redundant members from a clonal complex underestimates πS and Ne. Before each simulation, the clonal complex identified in the original genomes by PopCOGenT was preprocessed by excluding redundant strains. This is because these strains can be erroneously identified as non-redundant population members by PopCOGenT when simulated SAG data are used, which underestimates πS and Ne owing to the extremely close relationship between members of a clonal complex. In total, 12 species contain clonal complexes. As expected, including the redundant strains of a clonal complex before the SAG simulations decreases the πS and Ne estimates compared with excluding them.
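The box-and-whisker convention described in this caption (median, 25th to 75th percentile box, 10th and 90th percentile whiskers) can be sketched as follows. This is a minimal illustration, not the authors' plotting script: the helper names are hypothetical, the percentile rule (linear interpolation between order statistics) is an assumption, and the input values are placeholders rather than data from the study.

```python
def percentile(values, q):
    """Percentile q (0-100) of a non-empty sequence, using linear
    interpolation between the two nearest order statistics."""
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)                       # index of the lower neighbour
    hi = min(lo + 1, len(data) - 1)     # index of the upper neighbour
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def box_summary(values):
    """Return the five percentiles drawn in the caption's box plots."""
    return {q: percentile(values, q) for q in (10, 25, 50, 75, 90)}


# Placeholder replicate estimates in arbitrary units (NOT data from the paper).
replicate_estimates = [float(i) for i in range(11)]

summary = box_summary(replicate_estimates)
for q, value in summary.items():
    print(f"{q}th percentile: {value}")
```

With only 10 replicates per species, as in the caption, the 10th and 90th percentiles sit close to the extreme values, so the whiskers are sensitive to individual replicates.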

Extended Data Fig. 3 No scaling relationship was found between the logarithmically transformed estimated effective population size (Ne) and the logarithmically transformed genome size across 21 bacterial and two archaeal species.

The generalized linear model (GLM) regression and the phylogenetic generalized least squares (PGLS) regression of the 23 species are identical to those presented in Fig. 3c. Pagel’s λ among the 23 species is close to 1, suggesting a strong phylogenetic signal (that is, traits evolve in close association with the phylogeny). Thus, the scaling was further investigated at a lower taxonomic rank. Specifically, the 21 bacterial species were divided into two deep lineages, the Terrabacteria clade (seven species marked by green dots) and the Gracilicutes clade (14 species marked by blue dots), and both GLM and PGLS regression analyses were applied to each. Again, no scaling relationship between Ne and genome size was found for either Terrabacteria or Gracilicutes.
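To make the non-phylogenetic part of this test concrete, the sketch below fits an ordinary least-squares line to log10-transformed values, which is what a Gaussian GLM with an identity link reduces to. It is an illustration only: the function name is hypothetical, the input numbers are synthetic, and it does not implement PGLS, which additionally models the covariance among species implied by the phylogeny.

```python
import math


def loglog_fit(x, y):
    """Least-squares fit of log10(y) = intercept + slope * log10(x).

    Returns (slope, intercept, r_squared). Inputs must be positive.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    if n < 2 or n != len(ly):
        raise ValueError("need two equally long sequences with >= 2 points")
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic example (NOT data from the paper): y = 100 * x^-0.5 exactly,
# so the fitted slope should be -0.5 and the intercept log10(100) = 2.
x_values = [1.0, 10.0, 100.0, 1000.0]
y_values = [100.0 * v ** -0.5 for v in x_values]

slope, intercept, r_squared = loglog_fit(x_values, y_values)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
```

A slope indistinguishable from zero in such a fit corresponds to the "no scaling relationship" outcome reported in the caption. When Pagel's λ is close to 1, as stated above, the plain fit treats related species as independent points, which is the reason a phylogenetically aware method such as PGLS is also applied.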

Extended Data Fig. 4 The global distribution of 418 HLII members (23 isolates and 395 high-quality SAGs).

Members of the four main populations (MC0, MC1, MC4 and MC16) defined by PopCOGenT are shown in blue, pink, green and orange, respectively. These members were largely sampled from two sites (BATS and GA03) in the North Atlantic Ocean, marked by white dots, where pie charts illustrate the proportion of each population among the sampled cells. Two additional pink circles with numbers denote the sites from which four members of MC1 were sampled. The sampling sites of members of other minor populations defined by PopCOGenT are marked as black dots; at least one cell was sampled at each of these sites. The table at the bottom right lists the available environmental factors of the two sites. Genomes from BATS were derived from four independent samples, and thus the range of each environmental factor is provided.

Extended Data Fig. 5 The phylogenetic trees used for phylogenetic generalized least squares (PGLS) regression analyses.

Species affiliated with Terrabacteria, Gracilicutes and Archaea are shaded in green, blue and pink, respectively. a, The maximum-likelihood phylogeny built from the 16S rRNA gene sequences of 21 prokaryotic species, which was generated in a recently published study. Solid circles at the nodes mark groups recovered in more than 90 of the 100 bootstrap replicates. b, A phylogeny whose topology was pruned from GTDB release 95, with branch lengths then estimated on the fixed topology under a maximum-likelihood framework.

Supplementary information

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

About this article

Cite this article

Chen, Z., Wang, X., Song, Y. et al. Prochlorococcus have low global mutation rate and small effective population size. Nat Ecol Evol 6, 183–194 (2022). https://doi.org/10.1038/s41559-021-01591-0

  • DOI: https://doi.org/10.1038/s41559-021-01591-0