A standardized archaeal taxonomy for the Genome Taxonomy Database

Abstract

The accrual of genomic data from both cultured and uncultured microorganisms provides new opportunities to develop systematic taxonomies based on evolutionary relationships. Previously, we established a bacterial taxonomy through the Genome Taxonomy Database. Here, we propose a standardized archaeal taxonomy that is derived from a 122-concatenated-protein phylogeny that resolves polyphyletic groups and normalizes ranks based on relative evolutionary divergence. The resulting archaeal taxonomy, which forms part of the Genome Taxonomy Database, is stable for a range of phylogenetic variables including marker gene selection, inference methods, corrections for rate heterogeneity and compositional bias, tree rooting scenarios and expansion of the genome database. Rank normalization is shown to robustly correct for substitution rates varying up to 30-fold using simulated datasets. Taxonomic curation follows the rules of the International Code of Nomenclature of Prokaryotes while taking into account proposals to formally recognize the rank of phylum and to use genome sequences as type material. This taxonomy is based on 2,392 archaeal genomes, 93.3% of which required one or more changes to their existing taxonomy, mainly owing to incomplete classification. We identify 16 archaeal phyla and reclassify 3 major monophyletic units from the former Euryarchaeota and one phylum that unites the Thaumarchaeota–Aigarchaeota–Crenarchaeota–Korarchaeota (TACK) superphylum into a single phylum.

Fig. 1: Comparison of rank-normalized archaeal GTDB and NCBI taxonomies.
Fig. 2: Comparison of marker sets, inference methods and models.
Fig. 3: Impact of different rooting scenarios on RED intervals.
Fig. 4: Rank-normalized archaeal GTDB taxonomy.
Fig. 5: Reclassification of the Thaumarchaeota.

Data availability

The GTDB taxonomy is available at the GTDB website (https://gtdb.ecogenomic.org/), including the ar122.r89 tree and the GTDB and NCBI taxonomic assignments for all 2,392 archaeal genomes in GTDB 04-RS89. Genome assemblies are available from the NCBI Assembly database (BioProject: PRJNA593905). All GTDB decorated phylogenetic trees are provided as Newick files in Supplementary Data 1. The SR4 model used for data recoded inferences is provided in Supplementary Data 2. Source data are provided with this paper.
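GTDB taxonomic assignments are typically distributed as rank-prefixed lineage strings, one genome per line. The following is a minimal sketch of reading such a line. It assumes a tab-separated accession and lineage, with the prefixes `d__`, `p__`, `c__`, `o__`, `f__`, `g__` and `s__` separated by semicolons. The function names, the accession and the taxon names in the example are hypothetical placeholders, not entries from the release.

```python
from typing import Dict, Tuple

# Assumed mapping from GTDB-style rank prefixes to rank names.
RANK_PREFIXES = {
    "d": "domain", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}


def parse_lineage(lineage: str) -> Dict[str, str]:
    """Split a 'd__X;p__Y;...' string into a {rank: name} mapping."""
    ranks = {}
    for field in lineage.strip().split(";"):
        prefix, sep, name = field.strip().partition("__")
        if not sep or prefix not in RANK_PREFIXES:
            raise ValueError(f"unrecognised rank field: {field!r}")
        ranks[RANK_PREFIXES[prefix]] = name
    return ranks


def parse_taxonomy_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Parse one 'accession<TAB>lineage' line (assumed layout)."""
    accession, lineage = line.rstrip("\n").split("\t", 1)
    return accession, parse_lineage(lineage)


if __name__ == "__main__":
    # Hypothetical accession and placeholder taxon names, for illustration only.
    example = "GENOME_0001\td__Archaea;p__PhylumX;c__ClassX;o__OrderX;f__FamilyX;g__GenusX;s__GenusX speciesx"
    accession, ranks = parse_taxonomy_line(example)
    print(accession, ranks["phylum"])
```

Keeping the result as a rank-to-name mapping makes it straightforward to compare two classifications of the same genome rank by rank.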

Code availability

The standalone tool GTDB-Tk, which enables researchers to classify their own genomes according to the GTDB taxonomy, is available from GitHub (https://github.com/Ecogenomics/GTDBTk/) and through KBase (https://kbase.us/applist/apps/kb_gtdbtk/run_kb_gtdbtk/release). Taxonomic assignment and rank standardization were carried out based on the RED calculated using PhyloRank v0.0.37, which is available from GitHub (https://github.com/dparks1134/PhyloRank/).
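For orientation, RED as described for the bacterial GTDB taxonomy (Parks et al., 2018) assigns 0 to the root and 1 to every leaf, and interpolates each internal node n as RED(n) = p + (d/u)(1 − p), where p is the RED of the parent, d is the branch length from the parent to n, and u is the mean distance from the parent to the leaves descended from n. The following is a minimal sketch of that interpolation on a hand-built toy tree. The `Node` class and function names are hypothetical, positive branch lengths and unique node names are assumed, and this is not PhyloRank's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    length: float = 0.0  # branch length to the parent (unused for the root)
    children: List["Node"] = field(default_factory=list)


def leaf_distances(node: Node) -> List[float]:
    """Distances from `node` to each of its descendant leaves."""
    if not node.children:
        return [0.0]
    return [child.length + d
            for child in node.children
            for d in leaf_distances(child)]


def assign_red(node: Node,
               parent_red: Optional[float] = None,
               out: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Return {node name: RED}, with root = 0 and leaves = 1."""
    if out is None:
        out = {}
    if parent_red is None:
        red = 0.0                      # root
    elif not node.children:
        red = 1.0                      # leaf
    else:
        dists = leaf_distances(node)
        # u: mean distance from the parent to the leaves below this node
        u = node.length + sum(dists) / len(dists)
        red = parent_red + (node.length / u) * (1.0 - parent_red)
    out[node.name] = red
    for child in node.children:
        assign_red(child, red, out)
    return out


if __name__ == "__main__":
    # Toy tree: ((a1:1,a2:1)A:1,B:2)root
    tree = Node("root", children=[
        Node("A", 1.0, [Node("a1", 1.0), Node("a2", 1.0)]),
        Node("B", 2.0),
    ])
    print(assign_red(tree)["A"])  # 0.5
```

Because u is averaged over the leaves below each node, lineages with long branches are not automatically pushed towards the tips, which is the property that rank normalization relies on.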

Acknowledgements

We thank B. Kemish and D. Senanayake for system administration support, P. Yilmaz for stimulating discussions on archaeal taxonomy and the GTDB user community for their feedback. We also thank the Australian Centre for Ecogenomics (ACE) at The University of Queensland and the New Zealand eScience Infrastructure (NeSI) for providing high-performance computing facilities. The project was supported by an Australian Research Council (ARC) Future Fellowship (FT170100213) awarded to C.R. and by an Australian Research Council Laureate Fellowship (FL150100038) awarded to P.H. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

P.H., C.R. and D.H.P. conceived the archaeal GTDB study and designed experiments. C.R., A.J.M., D.H.P. and D.W.W. performed phylogenetic inferences. D.H.P. and P.-A.C. calculated rank normalizations. D.H.P., P.-A.C. and A.J.M. created the GTDB web interface and underlying databases. M.C., C.R., P.H. and W.B.W. curated the GTDB taxonomy with input from all co-authors and the scientific community. A.A.D. performed the RED simulation analysis. C.R. and P.H. wrote the manuscript with contributions from all co-authors.

Corresponding authors

Correspondence to Christian Rinke or Philip Hugenholtz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Microbiology thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Implementing the shifting substitution rate model (SSR).

a, Timetree, that is, a tree scaled to time. A grid has been overlaid on the tree to delineate the length of the individual branch segments. b, An array of substitution rate multipliers, ordered from lowest to highest. In this specific example, the slowest evolving lineage evolves 16 times more slowly than the fastest evolving lineage. c, The result of running the SSR model on the tree in a. Every circle represents a shift in the substitution rate, and its colour corresponds to the substitution rate multipliers shown in b. The simulation starts with the grey circle. How quickly the changes take place depends on the shifting substitution rate parameter. d, The result of taking the tree in a and scaling it according to the active substitution rate multiplier in every branch at every branch segment. For instance, the single segment in dark red (×4) has the same length as the two prior segments in light red (×2) on the same branch.
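The scaling step in panel d can be sketched as follows. This is a minimal illustration, assuming each branch is given as a list of time segments with one active multiplier per segment. The function name and the example multiplier array are assumptions, and the random process that places the rate shifts is not reproduced here because the caption does not specify it.

```python
from typing import Sequence


def scaled_branch_length(segment_times: Sequence[float],
                         multipliers: Sequence[float]) -> float:
    """Sum of time segments, each stretched by its active rate multiplier."""
    if len(segment_times) != len(multipliers):
        raise ValueError("need exactly one multiplier per segment")
    return sum(t * m for t, m in zip(segment_times, multipliers))


if __name__ == "__main__":
    # Assumed multiplier array spanning a 16-fold range, lowest to highest.
    multiplier_array = [0.25, 0.5, 1.0, 2.0, 4.0]
    print(max(multiplier_array) / min(multiplier_array))

    # One unit-time segment at x4 versus two unit-time segments at x2.
    print(scaled_branch_length([1.0], [4.0]),
          scaled_branch_length([1.0, 1.0], [2.0, 2.0]))
```

With equal time segments, one segment at ×4 and two segments at ×2 give the same scaled length, which matches the example described for panel d.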

Extended Data Fig. 2 Impact of variable evolutionary rates on the RED approach.

Each panel shows the true ranking of every inner node of the species tree (obtained directly from the simulation) on the x-axis and the inferred ranking of every node on the y-axis. The inferred rankings result from modifying the branch lengths of the tree using the SSR model and subsequently applying the RED algorithm to recover an ultrametric tree. The main diagonal therefore corresponds to the proportion of nodes of the rank given by the column that have been correctly classified. Each panel shows the results for a different shifting substitution rate, from 0.1 to 0.5, and 1 (events/speciation). These results show that, as expected, more frequent changes in the substitution rate impair the performance of the RED approach to a greater degree. However, accuracy remains high even in the most extreme cases.
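The column-wise proportions described above can be sketched as a column-normalized confusion matrix. This is a minimal illustration, assuming one true and one inferred rank label per inner node. The function name and the toy labels are hypothetical and are not data from the simulation.

```python
from collections import Counter
from typing import Dict, Sequence


def column_normalized_confusion(true_ranks: Sequence[str],
                                inferred_ranks: Sequence[str],
                                ranks: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """matrix[inferred][true] = share of nodes with that true rank
    that received that inferred rank (each column sums to 1)."""
    if len(true_ranks) != len(inferred_ranks):
        raise ValueError("need one inferred rank per true rank")
    counts = Counter(zip(inferred_ranks, true_ranks))
    totals = Counter(true_ranks)
    return {
        inferred: {
            true: (counts[(inferred, true)] / totals[true]) if totals[true] else 0.0
            for true in ranks
        }
        for inferred in ranks
    }


if __name__ == "__main__":
    ranks = ["phylum", "class", "order"]
    # Hypothetical toy labels, for illustration only.
    true = ["phylum", "phylum", "class", "class", "class", "order"]
    inferred = ["phylum", "class", "class", "class", "order", "order"]
    matrix = column_normalized_confusion(true, inferred, ranks)
    for rank in ranks:
        print(rank, matrix[rank][rank])  # diagonal: proportion correctly classified
```

Normalizing by column rather than by row keeps each diagonal entry interpretable as the fraction of nodes of a given true rank that were recovered at that rank.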

Extended Data Fig. 3 Average number of monophyletic (green bars), operationally monophyletic (yellow bars) and polyphyletic (orange bars) taxa across higher ranks (phylum, class, and order) in percent.

Shown are GTDB taxonomy decorated phylogenetic trees inferred with different methods, from a range of markers, from alignments trimmed to reduce compositional bias and fast evolving sites, and from alignments created as part of the simulated database expansion. Note that only taxa with two or more genomes were included, and that the data set (order representatives) used for PhyloBayes restricts the analysis to the ranks of phylum and class. Details for each inferred tree are provided in Supplementary Table 10. Percentages for all ranks are shown in Supplementary Fig. 11. Monophyly and operational monophyly were determined based on the F measure of decorated internal nodes.

Extended Data Fig. 4 Higher rank GTDB lineages not resolved with alternative inference methods.

Shown are taxa that were not recovered as monophyletic or operationally monophyletic (green) and hence were polyphyletic (red; F measure < 0.95) in at least one of the different alignments and inference methods. The bootstrap support for each taxon in the ar.122.r89 reference tree is given in the last column ‘BS in ar.122.r89 tree’.
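The F measure used for these calls can be sketched as the harmonic mean of precision and recall of a clade with respect to a taxon. This is a minimal illustration, not the authors' code. The caption states only the polyphyly threshold (F measure < 0.95); treating F = 1 as monophyletic and 0.95 ≤ F < 1 as operationally monophyletic is an assumption here, and the function names and genome identifiers are hypothetical.

```python
from typing import Set


def f_measure(clade_genomes: Set[str], taxon_genomes: Set[str]) -> float:
    """Harmonic mean of precision and recall of a clade for a taxon."""
    shared = len(clade_genomes & taxon_genomes)
    if shared == 0:
        return 0.0
    precision = shared / len(clade_genomes)   # clade members that belong to the taxon
    recall = shared / len(taxon_genomes)      # taxon members captured by the clade
    return 2 * precision * recall / (precision + recall)


def classify(f: float, threshold: float = 0.95) -> str:
    """Assumed three-way call; only the 0.95 cut-off comes from the caption."""
    if f >= 1.0:
        return "monophyletic"
    if f >= threshold:
        return "operationally monophyletic"
    return "polyphyletic"


if __name__ == "__main__":
    # Hypothetical taxon of 40 genomes and a clade with one extra genome.
    taxon = {f"g{i}" for i in range(40)}
    clade = taxon | {"outsider"}
    f = f_measure(clade, taxon)
    print(round(f, 4), classify(f))
```

In practice the score would be evaluated at every internal node and the best-scoring node taken as the placement of the taxon label; that search is omitted here.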

Extended Data Fig. 5 Examples of application of names using the manual curation workflow.

Provided are five examples of taxon names that have been updated in GTDB following the manual curation workflow. Each example is shown in a distinct colour: Ca. Thaumarchaeota (red), Ca. Diapherotrites (blue), Ca. Bathyarchaeota (green), Ca. Verstraetearchaeota (purple), and Ca. Korarchaeota (orange). For example, Ca. Thaumarchaeota (red) has no designated nomenclature type and has no lower-ranking taxon based on the same stem as the taxon. Furthermore, it has been united with another taxon of the same rank in GTDB, which resulted in a name being chosen based on priority, in this case Thermoproteota. *Nomenclature type of the taxon (for ranks above genus) is defined as one of its subordinate taxa with which the name is permanently associated.

Supplementary information

Supplementary Information

Supplementary Figs. 1–21, Tables 13–18 and Notes 1–7.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–14 and 19–29.

Supplementary Data 1

Newick files of all GTDB decorated trees.

Supplementary Data 2

SR4 model used for data recoded inferences.

Source data

Source Data Fig. 1

Tab delimited table of the data used to create the plots.

Source Data Fig. 2

Tab delimited table of the data used to create the plots.

Source Data Fig. 3

Newick files of the phylogenetic trees.

Source Data Fig. 4

Newick file of the phylogenetic tree.

Source Data Fig. 5

Newick file of the phylogenetic tree.

About this article

Cite this article

Rinke, C., Chuvochina, M., Mussig, A.J. et al. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol 6, 946–959 (2021). https://doi.org/10.1038/s41564-021-00918-8
