  • Resource

A standardized archaeal taxonomy for the Genome Taxonomy Database

Abstract

The accrual of genomic data from both cultured and uncultured microorganisms provides new opportunities to develop systematic taxonomies based on evolutionary relationships. Previously, we established a bacterial taxonomy through the Genome Taxonomy Database. Here, we propose a standardized archaeal taxonomy that is derived from a 122-concatenated-protein phylogeny that resolves polyphyletic groups and normalizes ranks based on relative evolutionary divergence. The resulting archaeal taxonomy, which forms part of the Genome Taxonomy Database, is stable for a range of phylogenetic variables including marker gene selection, inference methods, corrections for rate heterogeneity and compositional bias, tree rooting scenarios and expansion of the genome database. Rank normalization is shown to robustly correct for substitution rates varying up to 30-fold using simulated datasets. Taxonomic curation follows the rules of the International Code of Nomenclature of Prokaryotes while taking into account proposals to formally recognize the rank of phylum and to use genome sequences as type material. This taxonomy is based on 2,392 archaeal genomes, 93.3% of which required one or more changes to their existing taxonomy, mainly owing to incomplete classification. We identify 16 archaeal phyla and reclassify 3 major monophyletic units from the former Euryarchaeota and one phylum that unites the Thaumarchaeota–Aigarchaeota–Crenarchaeota–Korarchaeota (TACK) superphylum into a single phylum.

Fig. 1: Comparison of rank-normalized archaeal GTDB and NCBI taxonomies.
Fig. 2: Comparison of marker sets, inference methods and models.
Fig. 3: Impact of different rooting scenarios on RED intervals.
Fig. 4: Rank-normalized archaeal GTDB taxonomy.
Fig. 5: Reclassification of the Thaumarchaeota.

Data availability

The GTDB taxonomy is available at the GTDB website (https://gtdb.ecogenomic.org/), including the ar122.r89 tree and the GTDB and NCBI taxonomic assignments for all 2,392 archaeal genomes in GTDB 04-RS89. Genome assemblies are available from the NCBI Assembly database (BioProject: PRJNA593905). All GTDB decorated phylogenetic trees are provided as Newick files in Supplementary Data 1. The SR4 model used for data recoded inferences is provided in Supplementary Data 2. Source data are provided with this paper.

Code availability

The standalone tool GTDB-Tk, which enables researchers to classify their own genomes according to the GTDB taxonomy, is available from GitHub (https://github.com/Ecogenomics/GTDBTk/) and through KBase (https://kbase.us/applist/apps/kb_gtdbtk/run_kb_gtdbtk/release). Taxonomic assignment and rank standardization were carried out based on the RED calculated using PhyloRank v0.0.37, which is available from GitHub (https://github.com/dparks1134/PhyloRank/).
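
To make the idea of RED-based rank normalization more concrete, the following is a minimal sketch of a RED calculation on a toy rooted tree. It is not PhyloRank's code. It assumes the interpolation rule described for the earlier GTDB bacterial taxonomy (Parks et al., 2018): the root has RED 0, every leaf has RED 1, and an internal node n has RED = p + (d/u)(1 − p), where p is the RED of its parent, d is the branch length to its parent and u is the mean distance from the parent to the leaves below n. The `Node` class, the function names and the toy tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical tree node; `length` is the branch length to the parent."""
    name: str
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def leaf_distances(node: Node) -> List[float]:
    """Distances from `node` to every leaf below it."""
    if not node.children:
        return [0.0]
    return [c.length + d for c in node.children for d in leaf_distances(c)]


def red_values(root: Node) -> Dict[str, float]:
    """Assign RED to every node: root = 0, leaves = 1, internal nodes interpolated."""
    red: Dict[str, float] = {}

    def visit(node: Node, parent_red: Optional[float]) -> None:
        if parent_red is None:
            red[node.name] = 0.0              # root
        elif not node.children:
            red[node.name] = 1.0              # extant taxon
        else:
            d = node.length
            below = leaf_distances(node)
            # Mean distance from the parent, through this node, to its leaves.
            u = d + sum(below) / len(below)
            # Guard against zero-length subtrees (assumption: inherit parent RED).
            red[node.name] = parent_red if u == 0 else parent_red + (d / u) * (1.0 - parent_red)
        for child in node.children:
            visit(child, red[node.name])

    visit(root, None)
    return red


# Toy ultrametric tree: ((a1:1,a2:1)A:1,b:2)root
toy = Node("root", 0.0, [
    Node("A", 1.0, [Node("a1", 1.0), Node("a2", 1.0)]),
    Node("b", 2.0),
])
red = red_values(toy)
print(red["A"])  # 0.5: node A sits halfway between the root and its leaves
```

Because each internal node is placed relative to the mean distance to its own descendant leaves, a lineage with uniformly longer branches is not pushed towards the tips, which is the intuition behind using RED rather than raw branch length to normalize ranks.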

Acknowledgements

We thank B. Kemish and D. Senanayake for system administration support, P. Yilmaz for stimulating discussions on archaeal taxonomy and the GTDB user community for their feedback. We also thank the Australian Centre for Ecogenomics (ACE) at The University of Queensland and the New Zealand eScience Infrastructure (NeSI) for providing high-performance computing facilities. The project was supported by an Australian Research Council (ARC) Future Fellowship (FT170100213) awarded to C.R. and by an Australian Research Council Laureate Fellowship (FL150100038) awarded to P.H. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

P.H., C.R. and D.H.P. conceived the archaeal GTDB study and designed experiments. C.R., A.J.M., D.H.P. and D.W.W. performed phylogenetic inferences. D.H.P. and P.-A.C. calculated rank normalizations. D.H.P., P.-A.C. and A.J.M. created the GTDB web interface and underlying databases. M.C., C.R., P.H. and W.B.W. curated the GTDB taxonomy with input from all co-authors and the scientific community. A.A.D. performed the RED simulation analysis. C.R. and P.H. wrote the manuscript with contributions from all co-authors.

Corresponding authors

Correspondence to Christian Rinke or Philip Hugenholtz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Microbiology thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Implementing the shifting substitution rate model (SSR).

a, Timetree, that is, a tree scaled to time. A grid has been overlaid on the tree to delineate the length of the individual branch segments. b, An array of substitution rate multipliers, ordered from lowest to highest. In this specific example, the slowest evolving lineage evolves 16 times more slowly than the fastest evolving lineage. c, The result of running the SSR model on the tree in a. Every circle represents a shift in the substitution rate, and its colour corresponds to the substitution rate multipliers shown in b. The simulation starts with the grey circle. How quickly the changes take place depends on the shifting substitution rate parameter. d, The result of taking the tree in a and scaling it according to the active substitution rate multiplier at every segment of every branch. For instance, the single segment in dark red (×4) has the same length as the two prior segments in light red (×2) on the same branch.
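
The scaling step in panel d can be illustrated with a short sketch. It assumes only what the caption states: each segment of a branch in the timetree is multiplied by the substitution rate multiplier active in that segment. The function name and the segment representation are hypothetical, and the sketch does not simulate where rate shifts occur.

```python
from typing import List, Tuple


def scale_branch(segments: List[Tuple[float, float]]) -> float:
    """Return the scaled length of one branch.

    Each segment is a (duration, multiplier) pair: the time spanned by the
    segment and the substitution rate multiplier active during it.
    """
    return sum(duration * multiplier for duration, multiplier in segments)


# Two unit-time segments evolving at x2 versus one unit-time segment at x4.
two_light_red = scale_branch([(1.0, 2.0), (1.0, 2.0)])
one_dark_red = scale_branch([(1.0, 4.0)])
print(two_light_red, one_dark_red)  # 4.0 4.0
```

This reproduces the example in the caption: one segment at ×4 ends up as long as two consecutive segments at ×2.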

Extended Data Fig. 2 Impact of variable evolutionary rates on the RED approach.

Each panel shows the true rank of every internal node of the species tree (obtained directly from the simulation) on the x-axis and the inferred rank of every node on the y-axis. Inferred ranks were obtained by modifying the branch lengths of the tree with the SSR model and then applying the RED algorithm to recover an ultrametric tree. The main diagonal therefore gives, for the rank in each column, the proportion of nodes that were correctly classified. Each panel shows the results for a different shifting substitution rate: 0.1 to 0.5, and 1 (events/speciation). As expected, more frequent changes in the substitution rate reduce the performance of the RED approach to a greater degree. However, accuracy remains high even in the most extreme cases.
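
A matrix of this kind can be tabulated as in the following minimal sketch, which assumes that each column (true rank) is normalized to sum to 1 so that the diagonal holds the proportion of correctly classified nodes. The function name and the example rank labels are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def column_normalised_confusion(
    true_ranks: Sequence[str], inferred_ranks: Sequence[str]
) -> Dict[Tuple[str, str], float]:
    """Map (true rank, inferred rank) to the fraction of nodes of that true rank."""
    if len(true_ranks) != len(inferred_ranks):
        raise ValueError("one inferred rank is needed per node")
    pair_counts = Counter(zip(true_ranks, inferred_ranks))
    column_totals = Counter(true_ranks)
    return {
        (true, inferred): count / column_totals[true]
        for (true, inferred), count in pair_counts.items()
    }


# Made-up ranks for six internal nodes.
true_ranks = ["phylum", "phylum", "class", "class", "class", "order"]
inferred_ranks = ["phylum", "class", "class", "class", "order", "order"]
matrix = column_normalised_confusion(true_ranks, inferred_ranks)
print(matrix[("phylum", "phylum")])  # 0.5: half of the true phylum nodes were recovered as phylum
```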

Extended Data Fig. 3 Average number of monophyletic (green bars), operationally monophyletic (yellow bars) and polyphyletic (orange bars) taxa across higher ranks (phylum, class, and order) in percent.

Shown are GTDB taxonomy decorated phylogenetic trees inferred with different methods, from a range of markers, from alignments trimmed to reduce compositional bias and fast evolving sites, and from alignments created as part of the simulated database expansion. Note that only taxa with two or more genomes were included, and that the data set (order representatives) used for PhyloBayes restricts the analysis to the ranks of phylum and class. Details for each inferred tree are provided in Supplementary Table 10. Percentages for all ranks are shown in Supplementary Fig. 11. Monophyly and operational monophyly were determined based on the F measure of decorated internal nodes.

Extended Data Fig. 4 Higher rank GTDB lineages not resolved with alternative inference methods.

Shown are taxa that were not recovered as monophyletic or operationally monophyletic (green) and hence were polyphyletic (red; F measure < 0.95) in at least one of the different alignments and inference methods. The bootstrap support (BS) for each taxon in the ar.122.r89 reference tree is given in the last column, ‘BS in ar.122.r89 tree’.
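
The F measure classification used in these comparisons can be sketched as follows. The caption gives only the cutoff for polyphyly (F measure < 0.95). The sketch assumes that the F measure is the harmonic mean of precision and recall of a taxon's genomes within a clade, that a taxon is scored at its best-matching internal node, and that F = 1 means monophyletic while 0.95 ≤ F < 1 means operationally monophyletic. Function names and genome identifiers are hypothetical.

```python
from typing import Iterable, Set


def f_measure(taxon: Set[str], clade: Set[str]) -> float:
    """Harmonic mean of precision and recall of `taxon` within `clade`."""
    shared = len(taxon & clade)
    if shared == 0:
        return 0.0
    precision = shared / len(clade)   # fraction of the clade that belongs to the taxon
    recall = shared / len(taxon)      # fraction of the taxon found in the clade
    return 2 * precision * recall / (precision + recall)


def classify_taxon(taxon: Set[str], clades: Iterable[Set[str]], cutoff: float = 0.95) -> str:
    """Classify a taxon by its best F measure over candidate internal nodes."""
    best = max((f_measure(taxon, clade) for clade in clades), default=0.0)
    if best == 1.0:
        return "monophyletic"
    if best >= cutoff:
        return "operationally monophyletic"
    return "polyphyletic"


# A made-up taxon of 20 genomes and three candidate clades (leaf sets of internal nodes).
taxon = {f"g{i}" for i in range(1, 21)}
exact_clade = set(taxon)
clade_with_intruder = taxon | {"x1"}              # F = 40/41, about 0.976
half_clade = {f"g{i}" for i in range(1, 11)}      # F = 2/3

print(classify_taxon(taxon, [exact_clade]))          # monophyletic
print(classify_taxon(taxon, [clade_with_intruder]))  # operationally monophyletic
print(classify_taxon(taxon, [half_clade]))           # polyphyletic
```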

Extended Data Fig. 5 Examples of application of names using the manual curation workflow.

Provided are five examples of taxon names that have been updated in GTDB following the manual curation workflow. Each example is shown in a distinct colour: Ca. Thaumarchaeota (red), Ca. Diapherotrites (blue), Ca. Bathyarchaeota (green), Ca. Verstraetearchaeota (purple) and Ca. Korarchaeota (orange). For example, Ca. Thaumarchaeota (red) has no designated nomenclature type and no lower-ranking taxon based on the same stem. Furthermore, it has been united with another taxon of the same rank in GTDB, so a name was chosen based on priority, in this case Thermoproteota. *The nomenclature type of a taxon (for ranks above genus) is defined as one of its subordinate taxa with which the name is permanently associated.

Supplementary information

Supplementary Information

Supplementary Figs. 1–21, Tables 15–18 and Notes 1–7.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–14 and 19–29.

Supplementary Data 1

Newick files of all GTDB decorated trees.

Supplementary Data 2

SR4 model used for data recoded inferences.

Source data

Source Data Fig. 1

Tab delimited table of the data used to create the plots.

Source Data Fig. 2

Tab delimited table of the data used to create the plots.

Source Data Fig. 3

Newick files of the phylogenetic trees.

Source Data Fig. 4

Newick file of the phylogenetic tree.

Source Data Fig. 5

Newick file of the phylogenetic tree.

Cite this article

Rinke, C., Chuvochina, M., Mussig, A.J. et al. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol 6, 946–959 (2021). https://doi.org/10.1038/s41564-021-00918-8

  • DOI: https://doi.org/10.1038/s41564-021-00918-8
