Microbial predators form a new supergroup of eukaryotes


Molecular phylogenetics of microbial eukaryotes has reshaped the tree of life by establishing broad taxonomic divisions, termed supergroups, that supersede the traditional kingdoms of animals, fungi and plants, and encompass a much greater breadth of eukaryotic diversity [1]. The vast majority of newly discovered species fall into a small number of known supergroups. Recently, however, a handful of species with no clear relationship to other supergroups have been described [2,3,4], raising questions about the nature and degree of undiscovered diversity, and exposing the limitations of strictly molecular-based exploration. Here we report ten previously undescribed strains of microbial predators isolated through culture that collectively form a diverse new supergroup of eukaryotes, termed Provora. The Provora supergroup is genetically, morphologically and behaviourally distinct from other eukaryotes, and comprises two divergent clades of predators, Nebulidia and Nibbleridia, that are superficially similar to each other, but differ fundamentally in ultrastructure, behaviour and gene content. These predators are globally distributed in marine and freshwater environments, but are numerically rare and have consequently been overlooked by molecular-diversity surveys. In the age of high-throughput analyses, investigation of eukaryotic diversity through culture remains indispensable for the discovery of rare but ecologically and evolutionarily important eukaryotes.

Fig. 1: Cell morphology.
Fig. 2: Phylogeny of eukaryotes reconstructed with a concatenated 320-gene dataset.
Fig. 3: Estimated gene family diversity in Provora.
Fig. 4: Mitochondrial genomes support the distinctness and diversity of Provora.

Data availability

Raw transcriptome reads from Provora are deposited in GenBank (PRJNA866092), along with the SSU rRNA gene sequences of species (OP101998–OP102010). Assembled transcriptomes, mitochondrial genomes, materials of orthogroup and phylogenetic analyses, along with individual gene alignments, concatenated and trimmed alignments, and maximum-likelihood and Bayesian tree files for the phylogenomic dataset are available at Figshare. The following databases were used in this study: NCBI nt, NCBI non-redundant database, Swiss-Prot, EukProt, KEGG and Pfam. The following environmental sequencing datasets were used for 18S rRNA gene analysis: Tara Oceans, protists in European coastal waters and sediments, Autonomous Reef Monitoring Structures (ARMS) in the Red Sea, stream biofilm eukaryotic assemblages, deep-sea basin sediments, eukaryotic plankton in reef environments in Panama, eukaryote communities in a high-alpine lake, mountain lake microbial communities and microbial eukaryotes in Lake Baikal. A 320-gene dataset was used for constructing alignments for phylogenomic analyses. The new taxa have been registered with the Zoobank database.

References


  1. Keeling, P. J. & Burki, F. Progress towards the tree of eukaryotes. Curr. Biol. 29, R808–R817 (2019).

  2. Gawryluk, R. M. R. et al. Non-photosynthetic predators are sister to red algae. Nature 572, 240–243 (2019).

  3. Janouškovec, J. et al. A new lineage of eukaryotes illuminates early mitochondrial genome reduction. Curr. Biol. 27, 3717–3724 (2017).

  4. Lax, G. et al. Hemimastigophora is a novel supra-kingdom-level lineage of eukaryotes. Nature 564, 410–414 (2018).

  5. Oren, A. Prokaryote diversity and taxonomy: current status and future challenges. Philos. Trans. R. Soc. Lond. B 359, 623–638 (2004).

  6. Shu, W. S. & Huang, L. N. Microbial diversity in extreme environments. Nat. Rev. Microbiol. 20, 219–235 (2022).

  7. Massana, R., del Campo, J., Sieracki, M. E., Audic, S. & Logares, R. Exploring the uncultured microeukaryote majority in the oceans: reevaluation of ribogroups within stramenopiles. ISME J. 8, 854–866 (2014).

  8. de Vargas, C. et al. Eukaryotic plankton diversity in the sunlit ocean. Science 348, 1261605 (2015).

  9. Flegontova, O. et al. Extreme diversity of diplonemid eukaryotes in the ocean. Curr. Biol. 26, 3060–3065 (2016).

  10. Ahlering, M. A. & Carrel, J. E. Predators are rare even when they are small. Oikos 95, 471–475 (2001).

  11. Hehenberger, E. et al. Novel predators reshape holozoan phylogeny and reveal the presence of a two-component signaling system in the ancestor of animals. Curr. Biol. 27, 2043–2050 (2017).

  12. Tikhonenkov, D. V. et al. Description of Colponema vietnamica sp. n. and Acavomonas peruviana n. gen. n. sp., two new alveolate phyla (Colponemidia nom. nov. and Acavomonidia nom. nov.) and their contributions to reconstructing the ancestral state of alveolates and eukaryotes. PLoS ONE 9, e95467 (2014).

  13. Tikhonenkov, D. V. et al. New lineage of microbial predators adds complexity to reconstructing the evolutionary origin of animals. Curr. Biol. 30, 4500–4509 (2020).

  14. Mylnikov, A. P. & Tikhonenkov, D. V. The new alveolate carnivorous flagellate Colponema marisrubri sp. n. (Colponemida, Alveolata) from the Red Sea. Zool. Zh. 88, 1163–1169 (2009).

  15. Strassert, J. F. H., Irisarri, I., Williams, T. A. & Burki, F. A molecular timescale for eukaryote evolution with implications for the origin of red algal-derived plastids. Nat. Commun. 12, 1879 (2021).

  16. Rodriguez-Ezpeleta, N. et al. Detecting and overcoming systematic errors in genome-scale phylogenies. Syst. Biol. 56, 389–399 (2007).

  17. Strassert, J. F. H., Jamy, M., Mylnikov, A. P., Tikhonenkov, D. V. & Burki, F. New phylogenomic analysis of the enigmatic phylum Telonemia further resolves the eukaryote tree of life. Mol. Biol. Evol. 36, 757–765 (2019).

  18. Lanfear, R., Kokko, H. & Eyre-Walker, A. Population size and the rate of evolution. Trends Ecol. Evol. 29, 33–41 (2014).

  19. Bahler, M. & Rhoads, A. Calmodulin signaling via the IQ motif. FEBS Lett. 513, 107–113 (2002).

  20. Schaffer, D. E., Iyer, L. M., Burroughs, A. M. & Aravind, L. Functional innovation in the evolution of the calcium-dependent system of the eukaryotic endoplasmic reticulum. Front. Genet. 11, 34 (2020).

  21. Morita-Yamamuro, C. et al. The Arabidopsis gene CAD1 controls programmed cell death in the plant immune system and encodes a protein containing a MACPF domain. Plant Cell Physiol. 46, 902–912 (2005).

  22. Rosado, C. J. et al. The MACPF/CDC family of pore-forming toxins. Cell. Microbiol. 10, 1765–1774 (2008).

  23. Ishino, T., Chinzei, Y. & Yuda, M. A Plasmodium sporozoite protein with a membrane attack complex domain is required for breaching the liver sinusoidal cell layer prior to hepatocyte infection. Cell. Microbiol. 7, 199–208 (2005).

  24. Satoh, H., Oshiro, N., Iwanaga, S., Namikoshi, M. & Nagai, H. Characterization of PsTX-60B, a new membrane-attack complex/perforin (MACPF) family toxin, from the venomous sea anemone Phyllodiscus semoni. Toxicon 49, 1208–1210 (2007).

  25. Tikhonenkov, D. V., Mazei, Y. A. & Embulaeva, E. A. Degradation succession of heterotrophic flagellate communities in microcosms. Zh. Obs. Biol. 69, 57–64 (2008).

  26. Tikhonenkov, D. V. et al. On the origin of TSAR: morphology, diversity and phylogeny of Telonemia. Open Biol. 12, 210325 (2022).

  27. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).

  28. Keeling, P. J., Poulson, N. & McFadden, G. I. Phylogenetic diversity of parabasalian symbionts from termites, including the phylogenetic position of Pseudotrypanosoma and Trichonympha. J. Eukaryot. Microbiol. 45, 643–650 (1998).

  29. Medlin, L., Elwood, H. J., Stickel, S. & Sogin, M. L. The characterization of enzymatically amplified eukaryotic 16S-like rRNA-coding regions. Gene 71, 491–499 (1988).

  30. Tikhonenkov, D. V., Janouškovec, J., Keeling, P. J. & Mylnikov, A. P. The morphology, ultrastructure and SSU rRNA gene sequence of a new freshwater flagellate, Neobodo borokensis n. sp. (Kinetoplastea, Excavata). J. Eukaryot. Microbiol. 63, 220–232 (2016).

  31. Andrews, S. FastQC: a quality control tool for high throughput sequence data (Babraham Bioinformatics, 2010).

  32. Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2013).

  33. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  34. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  35. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  36. Laetsch, D. R. & Blaxter, M. L. BlobTools: interrogation of genome assemblies. F1000Research 6, 1287 (2017).

  37. Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).

  38. Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).

  39. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).

  40. Shen, W. & Ren, H. TaxonKit: a practical and efficient NCBI taxonomy toolkit. J. Genet. Genomics 48, 844–850 (2021).

  41. Richter, D. J. et al. EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Community Journal 2, e56 (2022).

  42. Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).

  43. Kanehisa, M., Furumichi, M., Sato, Y., Ishiguro-Watanabe, M. & Tanabe, M. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 49, D545–D551 (2021).

  44. Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A. C. & Kanehisa, M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 35, W182–W185 (2007).

  45. Burki, F. The eukaryotic tree of life from a global phylogenomic perspective. Cold Spring Harb. Perspect. Biol. 6, a016147 (2014).

  46. Waskom, M. et al. mwaskom/Seaborn: v0.8.1 (September 2017). Zenodo (2017).

  47. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).

  48. Finn, R. D. et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44, D279–D285 (2016).

  49. Letunic, I. & Bork, P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 46, D493–D496 (2018).

  50. Almagro Armenteros, J. J. et al. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37, 420–423 (2019).

  51. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  52. Burns, J. A., Pittis, A. A. & Kim, E. Gene-based predictive models of trophic modes suggest Asgard archaea are not phagocytotic. Nat. Ecol. Evol. 2, 697–704 (2018).

  53. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).

  54. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).

  55. Hall, T. A. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98 (1999).

  56. Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).

  57. Whelan, S., Irisarri, I. & Burki, F. PREQUAL: detecting non-homologous characters in sets of unaligned homologous sequences. Bioinformatics 34, 3929–3930 (2018).

  58. Capella-Gutierrez, S., Silla-Martinez, J. M. & Gabaldon, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  59. Roure, B., Rodriguez-Ezpeleta, N. & Philippe, H. SCaFoS: a tool for selection, concatenation and fusion of sequences for phylogenomics. BMC Evol. Biol. 7, S2 (2007).

  60. Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).

  61. Dayhoff, M., Schwartz, R. & Orcutt, B. in Atlas of Protein Sequence and Structure (ed. Dayhoff, M.) 345–352 (National Biomedical Research Foundation, 1978).

  62. Susko, E. & Roger, A. J. On reduced amino acid alphabets for phylogenetic inference. Mol. Biol. Evol. 24, 2139–2150 (2007).

  63. Lartillot, N. & Philippe, H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109 (2004).

  64. Quang le, S., Gascuel, O. & Lartillot, N. Empirical profile mixture models for phylogenetic reconstruction. Bioinformatics 24, 2317–2323 (2008).

  65. Wang, H. C., Minh, B. Q., Susko, E. & Roger, A. J. Modeling site heterogeneity with posterior mean site frequency profiles accelerates accurate phylogenomic estimation. Syst. Biol. 67, 216–235 (2018).

  66. Kück, P. & Struck, T. H. BaCoCa—a heuristic software tool for the parallel assessment of sequence biases in hundreds of gene and taxon partitions. Mol. Phylogenet. Evol. 70, 94–98 (2014).

  67. Shimodaira, H. An approximately unbiased test of phylogenetic tree selection. Syst. Biol. 51, 492–508 (2002).

  68. Kumar, S., Stecher, G. & Tamura, K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870–1874 (2016).

  69. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).

  70. Dierckxsens, N., Mardulyn, P. & Smits, G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 45, e18 (2017).

  71. Kuznetsov, A. & Bollin, C. J. in Multiple Sequence Alignment (ed. Katoh, K.) 261–295 (Springer, 2021).

  72. Lohse, M., Drechsel, O., Kahlau, S. & Bock, R. OrganellarGenomeDRAW—a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 41, W575–W581 (2013).

  73. Johnson, P. Z., Kasprzak, W. K., Shapiro, B. A. & Simon, A. E. RNA2Drawer: geometrically strict drawing of nucleic acid structures with graphical structure editing and highlighting of complementary subsequences. RNA Biol. 16, 1667–1671 (2019).

  74. Burger, G., Gray, M. W., Forget, L. & Lang, B. F. Strikingly bacteria-like and gene-rich mitochondrial genomes throughout jakobid protists. Genome Biol. Evol. 5, 418–438 (2013).

  75. Criscuolo, A. & Gribaldo, S. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210 (2010).

  76. Zhang, D. et al. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol. Ecol. Resour. 20, 348–355 (2020).

  77. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).

  78. Ibarbalz, F. M. et al. Global trends in marine plankton diversity across kingdoms of life. Cell 179, 1084–1097 (2019).

  79. Massana, R. et al. Marine protist diversity in European coastal waters and sediments as revealed by high-throughput sequencing. Environ. Microbiol. 17, 4035–4049 (2015).

  80. Gendron, E. M. S., Darcy, J. L., Hell, K. & Schmidt, S. K. Structure of bacterial and eukaryote communities reflect in situ controls on community assembly in a high-alpine lake. J. Microbiol. 57, 852–864 (2019).

  81. Minerovic, A. D. et al. 18S-V9 DNA metabarcoding detects the effect of water-quality impairment. Ecol. Indic. 113, 106225 (2020).

  82. Pearman, J. K. et al. Cross-shelf investigation of coral reef cryptic benthic organisms reveals diversity patterns of the hidden majority. Sci. Rep. 8, 8090 (2018).

  83. Rodas, A. M. et al. Eukaryotic plankton communities across reef environments in Bocas del Toro Archipelago, Panamá. Coral Reefs 39, 1453–1467 (2020).

  84. Schoenle, A. et al. High and specific diversity of protists in the deep-sea basins dominated by diplonemids, kinetoplastids, ciliates and foraminiferans. Commun. Biol. 4, 501 (2021).

  85. Schulhof, M. A. et al. Sierra Nevada mountain lake microbial communities are structured by temperature, resources and geographic location. Mol. Ecol. 29, 2080–2093 (2020).

  86. Yi, Z. et al. High-throughput sequencing of microbial eukaryotes in Lake Baikal reveals ecologically differentiated communities and novel evolutionary radiations. FEMS Microbiol. Ecol. 93, fix073 (2017).


Acknowledgements

We thank M. Vermeij and the staff at the CARMABI research station for field sampling support, and N. Kosolapova for help with sample collection in the Arctic. This research was supported by grants from the Russian Foundation for Basic Research (to D.V.T., grant no. 20-34-70049), the Tyumen Oblast Government, as part of the West-Siberian Interregional Science and Education Center’s project no. 89-DON (2) (to D.V.T.), the Ministry of Science and Higher Education of the Russian Federation within the framework of the Federal Scientific and Technical Program for the Development of Genetic Technologies for 2019–2027 (agreement no. 075-15-2021-1345, unique identifier RF-193021X0012), the Gordon and Betty Moore Foundation (to P.J.K.), GenomeBC and the Natural Sciences and Engineering Research Council of Canada (to P.J.K., grant no. 2019-03994), and was carried out within the framework of state assignment no. 121051100102-2.

Author information

Contributions

D.V.T., K.V.M., R.M.R.G. and P.J.K. designed the study. D.V.T. and A.P.M. discovered the organisms and isolated the cultures. D.V.T. generated material for sequencing. A.O.B., S.A.K., D.G.Z., A.S.B., K.I.P. and D.V.T. performed light and electron microscopy and cultured the cells. K.V.M. and R.M.R.G. performed transcriptomic analyses and phylogenetic analyses. V.M. and V.V.A. performed the environmental distribution analysis and phylogenetic analysis of the SSU rRNA. D.V.T., K.V.M., R.M.R.G. and P.J.K. wrote the manuscript with input from all of the authors.

Corresponding author

Correspondence to Denis V. Tikhonenkov.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Thijs Ettema, James McInerney and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Extended data figures and tables

Extended Data Fig. 1 Outline of tree topologies obtained in the phylogenomic analyses and the geographical distribution of Provora.

(a) Maximum-likelihood tree topology obtained with the 320-gene dataset; nodes with support values below 100% (PMSF model, 100 replicates) are labelled red, and the corresponding values are provided next to the tree nodes; established eukaryotic groups with full support in the analysis are collapsed and shown in the tree schematically with triangles. (b) PhyloBayes consensus tree topology obtained using four analysis chains with the native 320-gene dataset; posterior probabilities are shown for tree nodes that fail to achieve full support in the analysis. (c) PhyloBayes consensus tree topology obtained with the Dayhoff 6-recoded 320-gene dataset; the low posterior probability (0.58 pp) for the union of Provora and Haptista reflects the marginal support for this group in all four analysis chains, rather than the lack of convergence between the chains (maxdiff = 0.27). (d) PhyloBayes consensus tree topology obtained with the SR4-recoded 320-gene dataset. (e) Geographical distribution of environmental sequences of 18S rRNA belonging to Provora.
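The maxdiff value quoted for panel (c) is a convergence diagnostic: it is commonly computed as the largest discrepancy, over all bipartitions, between the posterior frequencies observed in different chains. The following Python sketch illustrates that calculation only. The function name, the input layout (one dictionary of bipartition frequencies per chain) and the toy bipartition labels are assumptions made for illustration; they are not taken from the study or from the PhyloBayes code.

```python
def maxdiff(chain_freqs):
    """Largest between-chain discrepancy in bipartition frequency.

    chain_freqs: non-empty list with one dict per chain, mapping a
    bipartition label to its posterior frequency in that chain
    (hypothetical input layout). A bipartition missing from a chain
    is treated as having frequency 0.
    """
    # Collect every bipartition seen in at least one chain.
    bipartitions = set().union(*chain_freqs)
    return max(
        max(chain.get(b, 0.0) for chain in chain_freqs)
        - min(chain.get(b, 0.0) for chain in chain_freqs)
        for b in bipartitions
    )


# Toy example with two chains and two competing bipartitions.
toy_chains = [
    {"AB|CD": 1.0, "AC|BD": 0.0},
    {"AB|CD": 0.73, "AC|BD": 0.27},
]
print(maxdiff(toy_chains))
```

A low posterior probability can therefore coexist with an acceptable maxdiff when every chain assigns similarly marginal support to the same grouping, which is the situation the caption describes.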

Extended Data Fig. 2 Phylogenies with variable regions of 18S rRNA featuring identified environmental sequences belonging to Provora.

(a) Phylogenetic tree based on the V4 region of the 18S rRNA gene showing the diversity of environmental lineages of Provora. (b) Phylogenetic tree based on the V9 region of the 18S rRNA gene. The 18S rRNA of Provora described in this paper are shown in red. Environmental sequences related to the members of Provora are labelled in blue. Bootstrap values ≥ 90% are indicated with black circles at the tree nodes.

Extended Data Fig. 3 Conservation of functional categories and trophic mode prediction for the transcriptomes of Provora.

(a) Heatmap of annotated KEGG orthology entry counts (presence/absence data) for functional categories defined by BRITE in the transcriptomic data of Provora isolates and the genomic data of eukaryotic organisms; the counts only include entries inferred to be ancestral for eukaryotes by the Dollo parsimony principle: entries that only have hits in one of the major eukaryotic subdivisions (Diaphoretickes, Discoba or Amorphea) were excluded; the counts were normalized to the inferred ancestral eukaryotic KEGG orthologs. (b) Principal component analysis plot with gene ontology category scores for categories associated with free-living phagocytic organisms; (c) Prediction probabilities of trophic modes (phagocytosis, prototrophy, photosynthesis) in Provora isolates, conducted by the Trophic Mode Prediction Tool.
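The filtering and normalization described for panel (a) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the input layout (a mapping from each KEGG orthology entry to the set of major subdivisions in which it has hits, and a mapping from each entry to a single BRITE category) and the toy identifiers are all hypothetical.

```python
# Major eukaryotic subdivisions named in the caption.
SUBDIVISIONS = {"Diaphoretickes", "Discoba", "Amorphea"}


def ancestral_entries(hits_by_entry):
    """Keep entries with hits in at least two major subdivisions.

    Entries with hits in only one subdivision are excluded, following
    the exclusion rule stated in the caption.
    """
    return {
        entry
        for entry, subdivisions in hits_by_entry.items()
        if len(subdivisions & SUBDIVISIONS) >= 2
    }


def normalized_counts(organism_entries, ancestral, category_of):
    """Per category, the fraction of ancestral entries present in one organism."""
    totals, present = {}, {}
    for entry in ancestral:
        category = category_of[entry]
        totals[category] = totals.get(category, 0) + 1
        if entry in organism_entries:
            present[category] = present.get(category, 0) + 1
    return {c: present.get(c, 0) / totals[c] for c in totals}


# Toy data: identifiers and categories are invented for illustration.
hits = {
    "K_a": {"Diaphoretickes", "Amorphea"},
    "K_b": {"Discoba"},  # single subdivision, so excluded
    "K_c": {"Diaphoretickes", "Discoba", "Amorphea"},
}
categories = {"K_a": "Category 1", "K_b": "Category 1", "K_c": "Category 1"}
ancestral = ancestral_entries(hits)
print(normalized_counts({"K_a"}, ancestral, categories))
```

Normalizing by the ancestral set rather than by the total number of entries keeps the heatmap values comparable between organisms with very different gene counts.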

Extended Data Fig. 4 Maximum likelihood phylogenetic tree with eukaryotic members of the inositol trisphosphate receptor family, identified by the presence of a RyR and IP3R homology associated domain (RIHa, PF08454) and an ion channel domain (PF00520).

The phylogeny was reconstructed by IQ-TREE using an alignment with 396 eukaryotic sequences, spanning the RIHa and ion channel regions of the proteins; reconstruction was done under the best-fitting LG+F+R10 model of evolution, and node support was evaluated with 1000 UFBoot replicates; nodes with over 95% support are marked with black circles; clades uniting members of a single taxon are collapsed in the tree and labelled in accordance with their taxonomy; branches that belong to Provora are coloured red; protein domain architectures are displayed for the IP3R family sequences in Provora: Ins145_P3_rec (PF08709), MIR (PF02815), RIH (PF01365), RIHa (PF08454), Ion channel (PF00520).

Extended Data Fig. 5 Maximum likelihood phylogenetic tree with MACPF domain-containing proteins in Provora.

The phylogeny was reconstructed by IQ-TREE with the best-fitting WAG+F+R5 model of evolution; node support was evaluated with 1000 UFBoot replicates, and nodes with over 95% support are marked with black circles; clades uniting putatively orthologous MACPF sequences in Nibbleromonas species are collapsed; species name abbreviations: At – Ancoracysta twista, Nm – Nebulomonas marisrubri, Uf – Ubysseya fretuma, Na – Nibbleromonas arcticus, Nk – Nibbleromonas kosolapovi, Nc – Nibbleromonas curacaus, Nq – Nibbleromonas quarantinus; the domain architectures of MACPF proteins identified using SMART searches are shown; MACPF domains outlined with dotted lines correspond to findings below the default detection threshold.

Extended Data Fig. 6 Proportions of shared to total orthogroup counts in pairwise comparisons of eukaryotic organisms.

Arithmetic means of the proportions of shared orthogroups between pairs of genomes or transcriptomes are shown using a heatmap; the organisms are grouped using a tree, which summarizes the current concept of eukaryotic phylogeny; orthogroup inference for members of the Provora lineage relied on the transcriptomic data; the Provora species are labelled in red, and the corresponding intragroup comparisons are outlined with a red square in the heatmap.
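One plausible reading of the "arithmetic means of the proportions" is sketched below in Python: for each pair of organisms, the number of shared orthogroups is divided by each organism's own orthogroup count, and the two proportions are averaged. This interpretation, the function names and the toy orthogroup identifiers are assumptions made for illustration; the caption does not spell out the exact formula.

```python
from itertools import combinations


def shared_proportion(groups_a, groups_b):
    """Mean of |A and B| / |A| and |A and B| / |B| for two orthogroup sets."""
    if not groups_a or not groups_b:
        raise ValueError("each organism needs at least one orthogroup")
    shared = len(groups_a & groups_b)
    return (shared / len(groups_a) + shared / len(groups_b)) / 2


def pairwise_proportions(orthogroups):
    """Shared-orthogroup proportion for every unordered pair of organisms."""
    names = sorted(orthogroups)
    return {
        (x, y): shared_proportion(orthogroups[x], orthogroups[y])
        for x, y in combinations(names, 2)
    }


# Toy input: organism names and orthogroup identifiers are invented.
toy = {
    "organism_1": {"OG1", "OG2", "OG3", "OG4"},
    "organism_2": {"OG1", "OG2"},
    "organism_3": {"OG5"},
}
for pair, value in pairwise_proportions(toy).items():
    print(pair, value)
```

Averaging the two directional proportions gives a symmetric value, so a single heatmap cell can represent each pair regardless of which organism has the larger gene set.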

Extended Data Fig. 7 Mitochondrial genome maps of nibblerids.

Nibblerid mitochondrial genomes are typically circular-mapping, and gene-rich. All maps were edited to arbitrarily start at the ccmA gene. Genes are colour-coded according to their functional classification, as shown in the legend.

Extended Data Fig. 8 Mitochondrial genome maps of the nebulids, Ancoracysta twista and Nebulomonas marisrubri.

Nebulid mitochondrial genomes are circular-mapping, but are presented in a linear format to facilitate comparison of gene order. Mitochondrial genomes of A. twista (NC_036491.1) and N. marisrubri each contain duplications due to the presence of inverted repeats. All maps were edited to arbitrarily start at the ccmA gene. Genes are colour-coded according to their functional classification, as shown in the legend.

Extended Data Fig. 9 Provoran mitochondrial genomes retain ancestral features, but their sizes are variable due to group-I intron accumulation.

(a) Secondary structure predictions of mitochondrion-encoded RNase P RNAs from Ubysseya fretuma, Nibbleromonas quarantinus, and N. curacaus; rnpB genes have been identified in a small and phylogenetically disparate collection of eukaryotes, and are often very dissimilar from their counterparts in Alphaproteobacteria. All nibblerid mitochondrial genomes described here encode rnpB, and the encoded RNAs bear a strong resemblance to bacterial and jakobid rnpB homologs. Nucleotides with black borders indicate positions that are found in eubacterial consensus and jakobid rnpB homologs, and conserved helices are noted (P1-19). (b) Group-I introns that encode LAGLIDADG homing endonucleases are present in mitochondrial genomes in the genus Nibbleromonas; phylogenetic relationships between intron-encoded homing endonucleases of cox1 are shown as an example of intron presence in nibblerid mitochondrial genomes. Some homologous homing endonucleases are present in the same position of N. kosolapovi, N. quarantinus and N. curacaus cox1 (e.g., intron 1 of each species), indicating that they were present in their common ancestor and have been broadly retained. Other introns are found only in N. kosolapovi and in one of N. quarantinus or N. curacaus, suggesting lineage-specific intron loss. In contrast, the endonuclease encoded in intron 6 of N. kosolapovi cox1 was likely gained via lateral transfer from fungi, where the endonuclease is also encoded by cox1 introns.

Extended Data Fig. 10 Maximum likelihood phylogenetic tree of nucleus-encoded holocytochrome c synthase (HCCS) from diverse eukaryotes (140 sites, LG+R7 model, 1000 ultrafast bootstraps).

A prior report demonstrated that the nebulid Ancoracysta twista retains both mitochondrion-encoded type-I and nucleus-encoded type-III cytochrome c maturation systems. Although nibblerids retain only the former, multiple strains of the newly described nebulid, Nebulomonas marisrubri, also have both types of cytochrome c maturation systems. In our phylogenetic reconstruction, N. marisrubri and A. twista HCCS proteins are monophyletic, though with only moderate statistical support. One thousand ultrafast bootstrap replicates were performed as a measure of statistical support. For clarity, bipartitions receiving full statistical support are represented by black circles and values less than 70 are not presented.

Supplementary information

Supplementary Discussion

Formal taxonomic descriptions and details of morphology. This file provides formal taxonomic descriptions of the new supergroup of eukaryotes Provora and all its taxa.

Reporting Summary

Supplementary Table 1

Support values (PMSF bootstrap with 100 replicates) for important tree bipartitions in the maximum-likelihood analysis with datasets generated by progressive removal of fast-evolving sites.
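The idea of generating datasets by progressive removal of fast-evolving sites can be sketched in Python as follows. This is a generic illustration, not the procedure used in the study: how per-site rates were estimated and how many sites were removed at each step are not stated here, so the rate values, the 10% step and the 50% ceiling are hypothetical, as is the function name.

```python
def progressive_site_removal(site_rates, step_percent=10, max_percent=50):
    """Return kept site indices after removing the fastest sites in steps.

    site_rates: per-site evolutionary rates, one value per alignment column
    (hypothetical input). The result maps each removed percentage to the
    sorted list of column indices that remain.
    """
    n_sites = len(site_rates)
    # Rank column indices from the fastest-evolving to the slowest.
    fastest_first = sorted(
        range(n_sites), key=lambda i: site_rates[i], reverse=True
    )
    datasets = {}
    for percent in range(step_percent, max_percent + 1, step_percent):
        n_drop = n_sites * percent // 100
        dropped = set(fastest_first[:n_drop])
        datasets[percent] = [i for i in range(n_sites) if i not in dropped]
    return datasets


# Toy rates for a ten-column alignment (invented values).
toy_rates = [0.1, 2.0, 0.5, 3.0, 0.2, 1.0, 0.3, 0.4, 0.05, 0.6]
for percent, kept in progressive_site_removal(toy_rates).items():
    print(percent, kept)
```

Each reduced dataset would then be analysed separately, so that support for a bipartition can be tracked as the sites most prone to saturation are stripped away.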

Supplementary Table 2

Approximately unbiased test P values for hypotheses, summarizing possible phylogenetic placements of Provora and Hemimastigophora, evaluated in a series of datasets generated by progressive removal of fast-evolving sites.

Supplementary Table 3

Mitochondrial genome characteristics across Provora. Statistics relating to the size, composition and content of nibblerid and nebulid mitochondrial genomes are presented. All genomes encode a similar number of proteins and RNAs, and are compositionally biased towards A + T, as frequently observed in mitochondria. The conformation of the N. kosolapovi mitochondrial genome could not be determined due to incomplete coverage.

Supplementary Data 1

List of 18S rRNA surveys from diverse environments used for analysis.

Supplementary Data 2

Table of KEGG orthologues identified by the KEGG automatic annotation server in the transcriptomic data of Provora isolates and the proteomes of other eukaryotic species; the KEGG orthology entries are defined and arranged according to the KEGG BRITE classification system; a measure of over- or under-representation is provided for each entry to highlight entries enriched (positive) or depleted (negative) in Provora relative to other eukaryotes.

Supplementary Data 3

Table of Pfam domains identified in the transcriptomic data of Provora isolates and the collection of proteomes in the EukProt database; the numbers in the table represent the counts of proteins with the corresponding Pfam domain rather than the total counts of the domain itself; a measure of over- or under-representation is provided for each Pfam entry to highlight entries enriched (positive) or depleted (negative) in Provora relative to the EukProt proteomes.

Supplementary Data 4

List of OTUs used in the analyses and the corresponding sources of sequencing data.

Supplementary Video 1

Feeding of N. kosolapovi on stramenopile prey. The video is accelerated by a factor of 2.

Supplementary Video 2

N. quarantinus biting off and engulfing part of a prey cell (P. sorokini). The video is accelerated tenfold from the 20-second mark.

About this article

Cite this article

Tikhonenkov, D.V., Mikhailov, K.V., Gawryluk, R.M.R. et al. Microbial predators form a new supergroup of eukaryotes. Nature 612, 714–719 (2022).
