Expanded diversity of Asgard archaea and their relationships with eukaryotes

Abstract

Asgard is a recently discovered superphylum of archaea that appears to include the closest archaeal relatives of eukaryotes (refs. 1–5). Debate continues as to whether the archaeal ancestor of eukaryotes belongs within the Asgard superphylum or whether this ancestor is a sister group to all other archaea (that is, a two-domain versus a three-domain tree of life) (refs. 6–8). Here we present a comparative analysis of 162 complete or nearly complete genomes of Asgard archaea, including 75 metagenome-assembled genomes that, to our knowledge, have not previously been reported. Our results substantially expand the phylogenetic diversity of Asgard and lead us to propose six additional phyla that include a deep branch that we have provisionally named Wukongarchaeota. Our phylogenomic analysis does not resolve unequivocally the evolutionary relationship between eukaryotes and Asgard archaea, but instead, depending on the choice of species and conserved genes used to build the phylogeny, supports either the origin of eukaryotes from within Asgard (as a sister group to the expanded Heimdallarchaeota–Wukongarchaeota branch) or a deeper branch for the eukaryote ancestor within archaea. Our comprehensive protein domain analysis using the 162 Asgard genomes results in a major expansion of the set of eukaryotic signature proteins. The Asgard eukaryotic signature proteins show variable phyletic distributions and domain architectures, which is suggestive of dynamic evolution through horizontal gene transfer, gene loss, gene duplication and domain shuffling. The phylogenomics of the Asgard archaea points to the accumulation of the components of the mobile archaeal ‘eukaryome’ in the archaeal ancestor of eukaryotes (within or outside Asgard) through extensive horizontal gene transfer.

Fig. 1: Phylogenetic analysis of Asgard archaea and their relationships with eukaryotes.
Fig. 2: Domain architectures of selected ESPs in Asgard archaea.
Fig. 3: Reconstruction and evolution of key metabolic processes in Asgard archaea.

Data availability

Asgard archaea genomes generated in this study have been deposited in the eLibrary of Microbial Systematics and Genomics (https://www.biosino.org/elmsg/index) and are also available from the NCBI under BioProject identifier PRJNA680430. Publicly available genomes were retrieved from NCBI GenBank, MG-RAST and the figshare repository. The accession numbers of the newly generated and the public genomes are available in Supplementary Table 1. Supplementary data file 1 comprises the complete Asgard COG data archive (supplementary_data_file_1.tgz), and supplementary data file 2 contains the phylogenetic trees and alignments archive (supplementary_data_file_2.tgz); these files are available without restriction from https://doi.org/10.5281/zenodo.4624280 or https://ftp.ncbi.nih.gov/pub/wolf/_suppl/asgard20/. Any other relevant data are available from the corresponding authors upon reasonable request.

References

  1. Spang, A. et al. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179 (2015).

  2. Zaremba-Niedzwiedzka, K. et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541, 353–358 (2017).

  3. MacLeod, F., Kindler, G. S., Wong, H. L., Chen, R. & Burns, B. P. Asgard archaea: diversity, function, and evolutionary implications in a range of microbiomes. AIMS Microbiol. 5, 48–61 (2019).

  4. Cai, M. et al. Diverse Asgard archaea including the novel phylum Gerdarchaeota participate in organic matter degradation. Sci. China Life Sci. 63, 886–897 (2020).

  5. Williams, T. A., Cox, C. J., Foster, P. G., Szöllősi, G. J. & Embley, T. M. Phylogenomics provides robust support for a two-domains tree of life. Nat. Ecol. Evol. 4, 138–147 (2020).

  6. Williams, T. A., Foster, P. G., Cox, C. J. & Embley, T. M. An archaeal origin of eukaryotes supports only two primary domains of life. Nature 504, 231–236 (2013).

  7. Cox, C. J., Foster, P. G., Hirt, R. P., Harris, S. R. & Embley, T. M. The archaebacterial origin of eukaryotes. Proc. Natl Acad. Sci. USA 105, 20356–20361 (2008).

  8. Yutin, N., Makarova, K. S., Mekhedov, S. L., Wolf, Y. I. & Koonin, E. V. The deep archaeal roots of eukaryotes. Mol. Biol. Evol. 25, 1619–1630 (2008).

  9. Da Cunha, V., Gaia, M., Gadelle, D., Nasir, A. & Forterre, P. Lokiarchaea are close relatives of Euryarchaeota, not bridging the gap between prokaryotes and eukaryotes. PLoS Genet. 13, e1006810 (2017).

  10. Da Cunha, V., Gaia, M., Nasir, A. & Forterre, P. Asgard archaea do not close the debate about the universal tree of life topology. PLoS Genet. 14, e1007215 (2018).

  11. Spang, A. et al. Asgard archaea are the closest prokaryotic relatives of eukaryotes. PLoS Genet. 14, e1007080 (2018).

  12. Forterre, P. The universal tree of life: an update. Front. Microbiol. 6, 717 (2015).

  13. Lombard, J., López-García, P. & Moreira, D. The early evolution of lipid membranes and the three domains of life. Nat. Rev. Microbiol. 10, 507–515 (2012).

  14. Akıl, C. & Robinson, R. C. Genomes of Asgard archaea encode profilins that regulate actin. Nature 562, 439–443 (2018).

  15. Akıl, C. et al. Insights into the evolution of regulated actin dynamics via characterization of primitive gelsolin/cofilin proteins from Asgard archaea. Proc. Natl Acad. Sci. USA 117, 19904–19913 (2020).

  16. Lu, Z. et al. Coevolution of eukaryote-like Vps4 and ESCRT-III subunits in the Asgard archaea. mBio 11, e00417-20 (2020).

  17. Imachi, H. et al. Isolation of an archaeon at the prokaryote–eukaryote interface. Nature 577, 519–525 (2020).

  18. Zhang, J.-W. et al. Newly discovered Asgard archaea Hermodarchaeota potentially degrade alkanes and aromatics via alkyl/benzyl-succinate synthase and benzoyl-CoA pathway. ISME J. https://doi.org/10.1038/s41396-020-00890-x (2021).

  19. Ciccarelli, F. D. et al. Toward automatic reconstruction of a highly resolved tree of life. Science 311, 1283–1287 (2006).

  20. Rinke, C. et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499, 431–437 (2013).

  21. Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021).

  22. Balasingam, N., Brandon, H. E., Ross, J. A., Wieden, H.-J. & Thakor, N. Cellular roles of the human Obg-like ATPase 1 (hOLA1) and its YchF homologs. Biochem. Cell Biol. 98, 1–11 (2020).

  23. Rinke, C. et al. Resolving widespread incomplete and uneven archaeal classifications based on a rank-normalized genome-based taxonomy. Preprint at https://doi.org/10.1101/2020.03.01.972265 (2020).

  24. Eme, L., Spang, A., Lombard, J., Stairs, C. W. & Ettema, T. J. G. Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 15, 711–723 (2017).

  25. Klinger, C. M., Spang, A., Dacks, J. B. & Ettema, T. J. G. Tracing the archaeal origins of eukaryotic membrane-trafficking system building blocks. Mol. Biol. Evol. 33, 1528–1541 (2016).

  26. Puigbò, P., Lobkovsky, A. E., Kristensen, D. M., Wolf, Y. I. & Koonin, E. V. Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes. BMC Biol. 12, 66 (2014).

  27. Christ, L., Raiborg, C., Wenzel, E. M., Campsteijn, C. & Stenmark, H. Cellular functions and molecular mechanisms of the ESCRT membrane-scission machinery. Trends Biochem. Sci. 42, 42–56 (2017).

  28. Su, M.-Y., Fromm, S. A., Zoncu, R. & Hurley, J. H. Structure of the C9orf72 ARF GAP complex that is haploinsufficient in ALS and FTD. Nature 585, 251–255 (2020).

  29. de Martín Garrido, N. & Aylett, C. H. S. Nutrient signaling and lysosome positioning crosstalk through a multifunctional protein, folliculin. Front. Cell Dev. Biol. 8, 108 (2020).

  30. Shen, K. et al. Architecture of the human GATOR1 and GATOR1–Rag GTPases complexes. Nature 556, 64–69 (2018).

  31. López-García, P. & Moreira, D. The syntrophy hypothesis for the origin of eukaryotes revisited. Nat. Microbiol. 5, 655–667 (2020).

  32. Martin, W. & Müller, M. The hydrogen hypothesis for the first eukaryote. Nature 392, 37–41 (1998).

  33. Moreira, D. & López-García, P. Symbiosis between methanogenic archaea and δ-proteobacteria as the origin of eukaryotes: the syntrophic hypothesis. J. Mol. Evol. 47, 517–530 (1998).

  34. López-García, P. & Moreira, D. Cultured Asgard archaea shed light on eukaryogenesis. Cell 181, 232–235 (2020).

  35. Spang, A. et al. Proposal of the reverse flow model for the origin of the eukaryotic cell based on comparative analyses of Asgard archaeal metabolism. Nat. Microbiol. 4, 1138–1148 (2019).

  36. Koonin, E. V. & Yutin, N. The dispersed archaeal eukaryome and the complex archaeal ancestor of eukaryotes. Cold Spring Harb. Perspect. Biol. 6, a016188 (2014).

  37. Liu, Y. et al. Comparative genomic inference suggests mixotrophic lifestyle for Thorarchaeota. ISME J. 12, 1021–1031 (2018).

  38. Cheng, L. et al. Isolation and characterization of Methanoculleus receptaculi sp. nov. from Shengli oil field, China. FEMS Microbiol. Lett. 285, 65–71 (2008).

  39. Peng, J., Lü, Z., Rui, J. & Lu, Y. Dynamics of the methanogenic archaeal community during plant residue decomposition in an anoxic rice field soil. Appl. Environ. Microbiol. 74, 2894–2901 (2008).

  40. Uritskiy, G. V., DiRuggiero, J. & Taylor, J. MetaWRAP–a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 158 (2018).

  41. Li, D., Liu, C. M., Luo, R., Sadakane, K. & Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).

  42. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  43. Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).

  44. Sieber, C. M. K. et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat. Microbiol. 3, 836–843 (2018).

  45. Peng, Y., Leung, H. C. M., Yiu, S. M. & Chin, F. Y. L. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428 (2012).

  46. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  47. Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834 (2017).

  48. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).

  49. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).

  50. Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).

  51. Chan, P. P. & Lowe, T. M. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol. Biol. 1962, 1–14 (2019).

  52. Schäffer, A. A. et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29, 2994–3005 (2001).

  53. Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).

  54. Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).

  55. Söding, J. Protein homology detection by HMM–HMM comparison. Bioinformatics 21, 951–960 (2005).

  56. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  57. Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).

  58. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  59. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).

  60. Larsson, A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics 30, 3276–3278 (2014).

  61. Esterman, E. S., Wolf, Y. I., Kogay, R., Koonin, E. V. & Zhaxybayeva, O. Evolution of DNA packaging in gene transfer agents. Virus Evol. 7, veab015 (2021).

  62. Puigbò, P., Wolf, Y. I. & Koonin, E. V. Search for a ‘Tree of Life’ in the thicket of the phylogenetic forest. J. Biol. 8, 59 (2009).

  63. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).

  64. Zimmermann, L. et al. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237–2243 (2018).

  65. Yang, M., Derbyshire, M. K., Yamashita, R. A. & Marchler-Bauer, A. NCBI’s conserved domain database and tools for protein domain analysis. Curr. Protoc. Bioinformatics 69, e90 (2020).

  66. Søndergaard, D., Pedersen, C. N. S. & Greening, C. HydDB: a web tool for hydrogenase classification and analysis. Sci. Rep. 6, 34212 (2016).

  67. Criscuolo, A. & Gribaldo, S. BMGE (block mapping and gathering with entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210 (2010).

  68. Swofford, D. L. & Maddison, W. P. Reconstructing ancestral character states under Wagner parsimony. Math. Biosci. 87, 199–229 (1987).

Acknowledgements

We thank P. Forterre and T. Williams for critical reading of the manuscript and helpful suggestions; J. Chen, H. Li, P. Du and D. Zou for support with sampling and preliminary analysis of Changjiang estuary sediments; Z. Zhou and J.-D. Gu for support with sampling and preliminary analysis of Mai Po Nature Reserve sediments; S. Zheng and F. Liu for support with sampling and preliminary analysis of Rongcheng Swan Lake Nature Reserve sediments; and the crew and scientific team of RV Xiangyanghong 09, the pilots and the supporting team of Jiaolong manned submersible in the 37th Dayang Cruise for the sampling. M.L., Y.L., X.Z., W.X., Z.L. and L.C. are supported by National Natural Science Foundation of China (grant no. 91851105, 31970105, 92051102, 31700430, 91951102, 41776170 and 92051108), the Innovation Team Project of Universities in Guangdong Province (no. 2020KCXTD023), the Shenzhen Science and Technology Program (grant no. JCYJ20200109105010363 and JCYJ20190808152403587), the Scientific Research Foundation of Third Institute of Oceanography, MNR (2019022), the China Ocean Mineral Resources R&D Association (COMRA) Program (DY135-B2-09) and the National Key Basic Research Program of China (‘973’-Program, 2015CB755903). K.S.M., Y.I.W., A.N. and E.V.K. are supported by the Intramural Research Program of the National Institutes of Health of the USA (National Library of Medicine).

Author information

Contributions

M.L., E.V.K., K.S.M. and Y.L. initiated the study; Y.L., W.-C.H., M.C., C.-J.Z., W.X., Z.L. and L.C. participated in sample collections; Y.L., X.Z., M.C., C.-J.Z., W.X., Z.L. and L.C. performed metagenomic assembly and binning analysis; Y.L. performed metabolism analysis; K.S.M., A.N.N. and Y.I.W. performed comparative genomic analysis; Y.L., K.S.M., Y.I.W. and W.-C.H. performed phylogenetic analysis; K.S.M. and Y.I.W. constructed Asgard COGs; K.S.M., Y.I.W., Y.L., M.L. and E.V.K. analysed the data; Y.L., K.S.M., W.-C.H., X.Z., M.C., C.-J.Z., W.X., Z.L., L.C., E.V.K. and M.L. wrote the manuscript, which was read, edited and approved by all authors.

Corresponding authors

Correspondence to Eugene V. Koonin or Meng Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Global distribution of the Asgard genomes analysed in this Article.

The world map was generated using the R package rnaturalearth v.0.1.0 in R v.3.6.3 (ref. 63). The pie chart shows the proportion of Asgard genomes that were found in a given biotope. The numbers of these genomes per biotope are as follows: coastal sediment, 94; freshwater sediment, 15; hot spring, 1; hydrothermal vent, 13; hypersaline lake sediment, 1; marine sediment, 26; marine water, 26; petroleum seep (marine), 6; and petroleum field, 1. Boldface in the map indicates the sampling locations.
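
As an illustration only (this is not the authors' R code), the pie-chart proportions follow directly from the per-biotope counts in this legend. The listed counts sum to 183; the function name `biotope_proportions` is a hypothetical helper.

```python
# Counts per biotope, copied from the legend of Extended Data Fig. 1.
biotope_counts = {
    "coastal sediment": 94,
    "freshwater sediment": 15,
    "hot spring": 1,
    "hydrothermal vent": 13,
    "hypersaline lake sediment": 1,
    "marine sediment": 26,
    "marine water": 26,
    "petroleum seep (marine)": 6,
    "petroleum field": 1,
}


def biotope_proportions(counts):
    """Return each biotope's share of the listed genomes as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


for name, pct in biotope_proportions(biotope_counts).items():
    print(f"{name}: {pct:.1f}%")
```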

Extended Data Fig. 2 Completeness and contamination for 75 Asgard MAGs.

These MAGs were assessed using CheckM v.1.0.12. a, Distribution of completeness and contamination for 75 Asgard MAGs assessed by CheckM v.1.0.12. b, c, Distribution of depth coverage (b) and N50 statistics (c) for Asgard MAGs reconstructed in this Article. The numbers in parentheses indicate the number of Asgard genomes recovered from a given sampling location. In cases in which fewer than three samples were recovered, these are presented as individual points. Thick black bar, median; upper and lower bounds of the box plot, first and third quartile, respectively; upper and lower whiskers, largest and smallest values less than 1.5× interquartile range, respectively; black points, values greater than 1.5× interquartile range. Data for this plot are given in Supplementary Table 1.
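
The box-plot conventions described in this legend can be sketched as follows. This is a minimal illustration, not the plotting code used for the figure: the quartile rule (Python's "inclusive" method) is an assumption and may differ slightly from the hinges used by R, and `box_plot_summary` is a hypothetical name.

```python
import statistics


def box_plot_summary(values):
    """Summarize values using the box-plot conventions of the legend.

    Groups with fewer than three values are returned as individual points.
    """
    data = sorted(values)
    if len(data) < 3:
        return {"points": data}
    # Quartile convention is an assumption; plotting packages differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "median": median,               # thick black bar
        "q1": q1,                       # lower bound of the box
        "q3": q3,                       # upper bound of the box
        "whisker_low": inside[0],       # smallest value within 1.5x IQR
        "whisker_high": inside[-1],     # largest value within 1.5x IQR
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }
```

For example, `box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])` places the whiskers at 1 and 8 and reports 100 as an outlying point.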

Extended Data Fig. 3 Gene commonality plot for Asgard archaea and the TACK superphylum.

Gene commonality plot showing the number of Asgard COGs (log scale) (y axis) that include the given fraction of analysed genomes (x axis). The Asgard plot is compared with the TACK superphylum plot on the basis of the assignment of TACK genomes to archaeal COGs.
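
A gene commonality curve of this kind can be tabulated from a presence–absence mapping. The sketch below assumes that each COG is counted at the exact fraction of genomes in which it occurs (the binning used for the published plot is not stated here); `gene_commonality` and the toy identifiers are hypothetical.

```python
from collections import Counter


def gene_commonality(cog_to_genomes, n_genomes):
    """Map fraction of genomes -> number of COGs present in exactly that fraction.

    cog_to_genomes: dict of COG identifier -> set of genome identifiers.
    n_genomes: total number of genomes analysed.
    """
    sizes = Counter(len(genomes) for genomes in cog_to_genomes.values())
    return {k / n_genomes: sizes[k] for k in sorted(sizes)}


# Toy input (not real data): four COGs across four genomes.
toy = {
    "cogA": {"g1", "g2", "g3", "g4"},
    "cogB": {"g1", "g2"},
    "cogC": {"g3", "g4"},
    "cogD": {"g1"},
}
print(gene_commonality(toy, n_genomes=4))
```

The y axis of the plot would then show these counts on a log scale.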

Extended Data Fig. 4 Comparison of the mean amino acid identity of Asgard and TACK superphyla.

In this figure, -archaeota is omitted from the phylum names. Sample sizes of less than three are presented as individual points. a, Shared amino acid identity across Asgard and TACK lineages. Comparison of representative genomes from all Asgard and TACK lineages analysed in this Article (excluding the six putative phyla proposed in this Article), which characterizes the distribution of amino acid identities that is typical of a phylum. b–m, Amino acid identity comparisons between Thorarchaeota (b), Hermodarchaeota (c), Odinarchaeota (d), Baldrarchaeota (e), Lokiarchaeota (f), Helarchaeota (g), Borrarchaeota (h), Heimdallarchaeota (i), Kariarchaeota (j), Gerdarchaeota (k), Hodarchaeota (l) and Wukongarchaeota (m) and other Asgard and TACK lineages. Thick black bar, median; upper and lower bounds of the box plot, first and third quartile, respectively; upper and lower whiskers, largest and smallest values less than 1.5× interquartile range, respectively; black points, values greater than 1.5× interquartile range; number in the parentheses, number of genomes in the lineage. Data for this plot are given in Supplementary Table 2.

Extended Data Fig. 5 Comparison of the 16S rRNA gene sequence identity of Asgard and TACK lineages.

In this figure, -archaeota is omitted from the phylum names. Sample sizes of less than three are presented as individual points. a, 16S rRNA gene sequence identity across Asgard and TACK lineages. Comparison of 16S rRNA gene sequences from representative genomes of all Asgard and TACK lineages analysed in this Article (excluding the six putative phyla proposed in this Article), which characterizes the distribution of 16S rRNA sequence identity that is typical of a phylum. b–k, Comparison of 16S rRNA gene sequence identity between Thorarchaeota (b), Hermodarchaeota (c), Odinarchaeota (d), Lokiarchaeota (e), Helarchaeota (f), Heimdallarchaeota (g), Kariarchaeota (h), Gerdarchaeota (i), Hodarchaeota (j) and Wukongarchaeota (k) and other Asgard and TACK lineages. Thick black bar, median; upper and lower bounds of the box plot, first and third quartile, respectively; upper and lower whiskers, largest and smallest values less than 1.5× interquartile range, respectively; black points, values greater than 1.5× interquartile range; number in the parentheses, number of genomes in the lineage. Data for this plot are given in Supplementary Table 3.

Extended Data Fig. 6 Classification of Asgard archaea by the phyletic patterns and the core gene set of Asgard archaea.

a, Classical multidimensional scaling analysis of binary presence–absence phyletic patterns for 13,939 Asgard COGs that are represented in at least two genomes (Methods). b, Functional breakdown of Asgard core genes (378 Asgard COGs) compared with TACK-superphylum core genes (489 archaeal COGs). Values were normalized as described in the Methods. Functional classes of genes: J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; D, cell cycle control, cell division and chromosome partitioning; V, defence mechanisms; T, signal transduction mechanisms; M, biogenesis of the cell wall, membrane or envelope; N, cell motility; U, intracellular trafficking, secretion and vesicular transport; O, posttranslational modification, protein turnover and chaperones; X, mobilome (prophages, plasmids and transposons); C, energy production and conversion; G, carbohydrate transport and metabolism; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general function prediction only; S, function unknown. c, Presence–absence of orthologues of Asgard core genes in other archaea, bacteria and eukaryotes.
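
Classical multidimensional scaling, as named in panel a, embeds items so that their coordinates reproduce a given distance matrix as closely as possible. The sketch below is a generic, dependency-free illustration and not the authors' implementation: the distance measure between phyletic patterns is described in the Methods and is not assumed here (the function simply takes a distance matrix), power iteration stands in for a full eigendecomposition, and all function names are hypothetical.

```python
import math


def double_center(d):
    """Return B = -1/2 * J * D^2 * J for a symmetric distance matrix d."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    grand = sum(row) / n
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
            for i in range(n)]


def top_eigenpair(b, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    n = len(b)
    v = [1.0 + i for i in range(n)]  # arbitrary non-constant start vector
    lam = 0.0
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def classical_mds(d, dims=2):
    """Embed n items in `dims` dimensions from an n x n distance matrix."""
    n = len(d)
    b = double_center(d)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        lam, v = top_eigenpair(b)
        if lam <= 1e-12:  # no further positive axis (tolerance is an assumption)
            break
        for i in range(n):
            coords[i][k] = math.sqrt(lam) * v[i]
        # Deflate so that the next pass finds the next eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords
```

For three points lying on a line at positions 0, 3 and 7, the recovered coordinates reproduce the pairwise distances 3, 4 and 7, with the second axis left at zero.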

Extended Data Fig. 7 Phylogenetic trees.

a, Phylogenetic tree of bacteria, archaea and eukaryotes (inferred with IQ-tree using the LG + R10 model) that was constructed from the concatenated alignments of the protein sequences of 30 universally conserved genes (Methods). The tree shows the relationships between the major clades. b, Phylogenetic tree of COG0012 (ribosome-binding ATPase YchF). The tree was reconstructed using IQ-tree with LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). c, Phylogenetic tree of COG0201 (preprotein translocase subunit SecY). The tree was reconstructed using IQ-tree with LG + F + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). d, Phylogenetic tree of the reduced set of bacteria, archaea and eukaryotes (excluding the genomes of derived parasites), constructed from concatenated alignments of the protein sequences of 29 universal markers (excluding COG0012) using IQ-tree with LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit). The tree shows the relationships between the major clades. e, Phylogenetic analysis of the evolutionary relationship between archaea and eukaryotes, excluding the Asgard superphylum. The tree was reconstructed from a concatenated alignment of the 29 universal markers (excluding COG0012) using IQ-tree with LG + R10 evolutionary model (selected by IQ-tree ModelFinder as the best fit).

Extended Data Fig. 8 Phyletic patterns of ESPs in Asgard genomes.

All 505 Asgard COGs that correspond to ESPs are grouped by distance between binary presence–absence phyletic patterns. For a given pair of Asgard COGs A and B that are present in the sets of genomes $\{G_A\}$ and $\{G_B\}$, respectively, we calculate the similarity between the patterns as $S_{A,B} = |\{G_A\} \cap \{G_B\}| / |\{G_A\} \cup \{G_B\}|$, and the distance between the patterns as $D_{A,B} = -\ln(S_{A,B})$. A dendrogram was reconstructed using the unweighted-pair group method with arithmetic mean, from the distance matrix $D$; the order of leaves in the tree determines the order of Asgard COGs in the figure. Top, patterns are shown schematically by pale blue lines, in which the respective Asgard COG is present and mapped to the 12 major Asgard lineages (as shown by the coloured bar above). The Asgard COGs that correspond to the most highly conserved ESP protein families are shown within the red rectangle. Bottom, plot of the number of Asgard COGs that correspond to ESPs in each of 76 genomes is shown. Complete data are provided in Supplementary Table 7. The colour code for the plot is the same as for the bar graph.
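
The pattern similarity, the log distance and the leaf ordering described in this legend can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the similarity is read as the ratio of shared genomes to all genomes covered by either COG, disjoint patterns are given an infinite distance, ties are broken by input order, and the function names are hypothetical.

```python
import math
from itertools import combinations


def pattern_similarity(ga, gb):
    """Shared genomes divided by genomes covered by either pattern."""
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def pattern_distance(ga, gb):
    """D = -ln(S); infinite when the patterns share no genome (assumption)."""
    s = pattern_similarity(ga, gb)
    return -math.log(s) if s > 0 else math.inf


def upgma_leaf_order(patterns):
    """Return COG names in the leaf order of a UPGMA dendrogram.

    patterns: dict of COG name -> set of genome identifiers.
    """
    names = list(patterns)
    if not names:
        return []
    leaves = {i: [name] for i, name in enumerate(names)}
    dist = {(i, j): pattern_distance(patterns[names[i]], patterns[names[j]])
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(leaves) > 1:
        # Join the closest pair of clusters.
        a, b = min(combinations(sorted(leaves), 2), key=lambda p: dist[p])
        na, nb = len(leaves[a]), len(leaves[b])
        for c in leaves:
            if c in (a, b):
                continue
            dac = dist[tuple(sorted((a, c)))]
            dbc = dist[tuple(sorted((b, c)))]
            # Size-weighted mean keeps every leaf equally weighted (UPGMA).
            dist[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        leaves[next_id] = leaves.pop(a) + leaves.pop(b)
        next_id += 1
    return next(iter(leaves.values()))
```

With toy patterns in which two COGs share three of four genomes and two others share two of three, the similar pairs end up adjacent in the returned order.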

Extended Data Fig. 9 Metabolic features of Asgard archaea.

Schematic of the presence and absence of selected metabolic features in all phyla and putative phyla of Asgard archaea.

Extended Data Fig. 10 Phylogenetic analysis of [NiFe] hydrogenases in Asgard archaea.

a, Phylogenetic analysis of group-4 [NiFe] hydrogenases in Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 425 sequences that included 110 sequences of Asgard archaea, with 308 amino acid positions. b, Phylogenetic analysis of group-3 [NiFe] hydrogenases in Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 813 sequences that included 335 sequences of Asgard archaea, with 331 amino acid positions. c, Phylogenetic analysis of group-1 [NiFe] hydrogenases in the Asgard archaea. The unrooted maximum-likelihood phylogenetic tree was built from an alignment of 541 sequences that included 2 sequences of Wukongarchaeota, with 376 amino acid positions.

Supplementary information

Supplementary Information

This file contains (1) Description of new taxa; (2) Clusters of orthologous genes of Asgard archaea; (3) The core gene set of Asgard archaea; (4) Phylogenomic analysis of the Asgard superphylum and Asgard-eukaryote evolutionary relationship; (5) Eukaryotic Signature Proteins in Asgard archaea; and (6) Reconstruction of metabolic pathways in Asgard archaea.

Reporting Summary

Supplementary Table 1

Genome information, proposed taxonomy and isolation data.

Supplementary Table 2

Mean amino-acid identity values (%) among 66 TACK genomes and 184 Asgard genomes (162 high-quality and 22 low-quality).

Supplementary Table 3

The 16S rRNA gene sequence identity (%) among TACK and Asgard lineages. Identity was calculated using sequences longer than 1,300 bp.

Supplementary Table 4

Species and phylogenetic markers used for the tree of life reconstruction.

Supplementary Table 5

Data for phylogenetic trees: methods, markers, bootstrap data and comments. The trees in the Newick format and the underlying alignments are provided in supplementary data file 2.

Supplementary Table 6

The core asCOGs list.

Supplementary Table 7

Eukaryotic signature proteins in Asgard archaea.

Supplementary Table 8

The presence-absence of metabolic enzymes in Asgard archaea.

Cite this article

Liu, Y., Makarova, K.S., Huang, WC. et al. Expanded diversity of Asgard archaea and their relationships with eukaryotes. Nature 593, 553–557 (2021). https://doi.org/10.1038/s41586-021-03494-3

  • DOI: https://doi.org/10.1038/s41586-021-03494-3
