Molecular phylogenetics of microbial eukaryotes has reshaped the tree of life by establishing broad taxonomic divisions, termed supergroups, that supersede the traditional kingdoms of animals, fungi and plants, and encompass a much greater breadth of eukaryotic diversity [1]. The vast majority of newly discovered species fall into a small number of known supergroups. Recently, however, a handful of species with no clear relationship to other supergroups have been described [2–4], raising questions about the nature and degree of undiscovered diversity, and exposing the limitations of strictly molecular-based exploration. Here we report ten previously undescribed strains of microbial predators isolated through culture that collectively form a diverse new supergroup of eukaryotes, termed Provora. The Provora supergroup is genetically, morphologically and behaviourally distinct from other eukaryotes, and comprises two divergent clades of predators—Nebulidia and Nibbleridia—that are superficially similar to each other, but differ fundamentally in ultrastructure, behaviour and gene content. These predators are globally distributed in marine and freshwater environments, but are numerically rare and have consequently been overlooked by molecular-diversity surveys. In the age of high-throughput analyses, investigation of eukaryotic diversity through culture remains indispensable for the discovery of rare but ecologically and evolutionarily important eukaryotes.
Raw transcriptome reads from Provora are deposited in GenBank (PRJNA866092), along with the SSU rRNA gene sequences of species (OP101998–OP102010). Assembled transcriptomes, mitochondrial genomes, materials from the orthogroup and phylogenetic analyses, along with individual gene alignments, concatenated and trimmed alignments, and maximum-likelihood and Bayesian tree files for the phylogenomic dataset are available at Figshare (https://doi.org/10.6084/m9.figshare.20497143). The following databases were used in this study: NCBI nt (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nt.gz), NCBI non-redundant database (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz), Swiss-Prot (https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz), EukProt (https://figshare.com/articles/dataset/EukProt_a_database_of_genome-scale_predicted_proteins_across_the_diversity_of_eukaryotic_life/12417881/2), KEGG (https://www.genome.jp/kegg/) and Pfam (http://ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam32.0/). The following environmental sequencing datasets were used for 18S rRNA gene analysis: Tara Oceans (https://zenodo.org/record/3768510#.Y1ZtKuzMI1I), protists in European coastal waters and sediments (https://doi.org/10.1111/1462-2920.12955), Autonomous Reef Monitoring Structures (ARMS) in the Red Sea (https://doi.org/10.1038/s41598-018-26332-5), stream biofilm eukaryotic assemblages (https://doi.org/10.1016/j.ecolind.2020.106225), deep-sea basin sediments (https://doi.org/10.1038/s42003-021-02012-5), eukaryotic plankton in reef environments in Panama (https://doi.org/10.1007/s00338-020-01979-7), eukaryote communities in a high-alpine lake (https://doi.org/10.1007/s12275-019-8668-8), mountain lake microbial communities (https://doi.org/10.1111/mec.15469) and microbial eukaryotes in Lake Baikal (https://doi.org/10.1093/femsec/fix073).
A 320-gene dataset was used for constructing alignments for phylogenomic analyses (https://static-content.springer.com/esm/art%3A10.1038%2Fs41467-021-22044-z/MediaObjects/41467_2021_22044_MOESM5_ESM.zip). The new taxa have been registered with the Zoobank database (http://zoobank.org/) under the following accession codes: urn:lsid:zoobank.org:act:9EE01A01-E294-415B-A36F-0FB4373183D0, urn:lsid:zoobank.org:act:A54BD0FB-7FA3-42CB-9D3D-2211FA657DC0, urn:lsid:zoobank.org:act:F6395E20-7BDF-4CBE-95FB-E4CE1E7B8185, urn:lsid:zoobank.org:act:F1E8545D-BAC1-44FF-9B6B-8FEE4AC028BB, urn:lsid:zoobank.org:act:66A5C066-890F-4F25-AAB6-5CDCE2028034, urn:lsid:zoobank.org:act:830A4372-62D9-4CE1-BFD8-9FE9EED67FED, urn:lsid:zoobank.org:act:DFE7080B-6201-455A-99CE-903103CBB049, urn:lsid:zoobank.org:act:A230EC14-DC4B-4F05-8D69-8FE0BAB3DE09, urn:lsid:zoobank.org:act:B8894608-40D4-4D16-A4D9-6F448614F22C and urn:lsid:zoobank.org:act:97B89F6F-72D6-482A-9EA7-88E5C63E6EB6.
Keeling, P. J. & Burki, F. Progress towards the tree of eukaryotes. Curr. Biol. 29, R808–R817 (2019).
Gawryluk, R. M. R. et al. Non-photosynthetic predators are sister to red algae. Nature 572, 240–243 (2019).
Janouškovec, J. et al. A new lineage of eukaryotes illuminates early mitochondrial genome reduction. Curr. Biol. 27, 3717–3724 (2017).
Lax, G. et al. Hemimastigophora is a novel supra-kingdom-level lineage of eukaryotes. Nature 564, 410–414 (2018).
Oren, A. Prokaryote diversity and taxonomy: current status and future challenges. Philos. Trans. R. Soc. Lond. B 359, 623–638 (2004).
Shu, W. S. & Huang, L. N. Microbial diversity in extreme environments. Nat. Rev. Microbiol. 20, 219–235 (2022).
Massana, R., del Campo, J., Sieracki, M. E., Audic, S. & Logares, R. Exploring the uncultured microeukaryote majority in the oceans: reevaluation of ribogroups within stramenopiles. ISME J. 8, 854–866 (2014).
de Vargas, C. et al. Eukaryotic plankton diversity in the sunlit ocean. Science 348, 1261605 (2015).
Flegontova, O. et al. Extreme diversity of diplonemid eukaryotes in the ocean. Curr. Biol. 26, 3060–3065 (2016).
Ahlering, M. A. & Carrel, J. E. Predators are rare even when they are small. Oikos 95, 471–475 (2001).
Hehenberger, E. et al. Novel predators reshape holozoan phylogeny and reveal the presence of a two-component signaling system in the ancestor of animals. Curr. Biol. 27, 2043–2050 (2017).
Tikhonenkov, D. V. et al. Description of Colponema vietnamica sp. n. and Acavomonas peruviana n. gen. n. sp., two new alveolate phyla (Colponemidia nom. nov. and Acavomonidia nom. nov.) and their contributions to reconstructing the ancestral state of alveolates and eukaryotes. PLoS ONE 9, e95467 (2014).
Tikhonenkov, D. V. et al. New lineage of microbial predators adds complexity to reconstructing the evolutionary origin of animals. Curr. Biol. 30, 4500–4509 (2020).
Mylnikov, A. P. & Tikhonenkov, D. V. The new alveolate carnivorous flagellate Colponema marisrubri sp. n. (Colponemida, Alveolata) from the Red Sea. Zool. Zh. 88, 1163–1169 (2009).
Strassert, J. F. H., Irisarri, I., Williams, T. A. & Burki, F. A molecular timescale for eukaryote evolution with implications for the origin of red algal-derived plastids. Nat. Commun. 12, 1879 (2021).
Rodriguez-Ezpeleta, N. et al. Detecting and overcoming systematic errors in genome-scale phylogenies. Syst. Biol. 56, 389–399 (2007).
Strassert, J. F. H., Jamy, M., Mylnikov, A. P., Tikhonenkov, D. V. & Burki, F. New phylogenomic analysis of the enigmatic phylum Telonemia further resolves the eukaryote tree of life. Mol. Biol. Evol. 36, 757–765 (2019).
Lanfear, R., Kokko, H. & Eyre-Walker, A. Population size and the rate of evolution. Trends Ecol. Evol. 29, 33–41 (2014).
Bahler, M. & Rhoads, A. Calmodulin signaling via the IQ motif. FEBS Lett. 513, 107–113 (2002).
Schaffer, D. E., Iyer, L. M., Burroughs, A. M. & Aravind, L. Functional innovation in the evolution of the calcium-dependent system of the eukaryotic endoplasmic reticulum. Front. Genet. 11, 34 (2020).
Morita-Yamamuro, C. et al. The Arabidopsis gene CAD1 controls programmed cell death in the plant immune system and encodes a protein containing a MACPF domain. Plant Cell Physiol. 46, 902–912 (2005).
Rosado, C. J. et al. The MACPF/CDC family of pore-forming toxins. Cell. Microbiol. 10, 1765–1774 (2008).
Ishino, T., Chinzei, Y. & Yuda, M. A Plasmodium sporozoite protein with a membrane attack complex domain is required for breaching the liver sinusoidal cell layer prior to hepatocyte infection. Cell. Microbiol. 7, 199–208 (2005).
Satoh, H., Oshiro, N., Iwanaga, S., Namikoshi, M. & Nagai, H. Characterization of PsTX-60B, a new membrane-attack complex/perforin (MACPF) family toxin, from the venomous sea anemone Phyllodiscus semoni. Toxicon 49, 1208–1210 (2007).
Tikhonenkov, D. V., Mazei, Y. A. & Embulaeva, E. A. Degradation succession of heterotrophic flagellate communities in microcosms. Zh. Obs. Biol. 69, 57–64 (2008).
Tikhonenkov, D. V. et al. On the origin of TSAR: morphology, diversity and phylogeny of Telonemia. Open Biol. 12, 210325 (2022).
Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
Keeling, P. J., Poulson, N. & McFadden, G. I. Phylogenetic diversity of parabasalian symbionts from termites, including the phylogenetic position of Pseudotrypanosoma and Trichonympha. J. Eukaryot. Microbiol. 45, 643–650 (1998).
Medlin, L., Elwood, H. J., Stickel, S. & Sogin, M. L. The characterization of enzymatically amplified eukaryotic 16S-like rRNA-coding regions. Gene 71, 491–499 (1988).
Tikhonenkov, D. V., Janouškovec, J., Keeling, P. J. & Mylnikov, A. P. The morphology, ultrastructure and SSU rRNA gene sequence of a new freshwater flagellate, Neobodo borokensis n. sp. (Kinetoplastea, Excavata). J. Eukaryot. Microbiol. 63, 220–232 (2016).
Andrews, S. FastQC: a quality control tool for high throughput sequence data (Babraham Bioinformatics, 2010); https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2013).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
Laetsch, D. R. & Blaxter, M. L. BlobTools: interrogation of genome assemblies. F1000Research 6, 1287 (2017).
Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).
Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Shen, W. & Ren, H. TaxonKit: a practical and efficient NCBI taxonomy toolkit. J. Genet. Genomics 48, 844–850 (2021).
Richter, D. J. et al. EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Community Journal 2, e56 (2022).
Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Kanehisa, M., Furumichi, M., Sato, Y., Ishiguro-Watanabe, M. & Tanabe, M. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 49, D545–D551 (2021).
Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A. C. & Kanehisa, M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 35, W182–W185 (2007).
Burki, F. The eukaryotic tree of life from a global phylogenomic perspective. Cold Spring Harb. Perspect. Biol. 6, a016147 (2014).
Waskom, M. et al. mwaskom/Seaborn: v0.8.1 (September 2017). Zenodo https://doi.org/10.5281/zenodo.883859 (2017).
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
Finn, R. D. et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44, D279–D285 (2016).
Letunic, I. & Bork, P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 46, D493–D496 (2018).
Almagro Armenteros, J. J. et al. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37, 420–423 (2019).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Burns, J. A., Pittis, A. A. & Kim, E. Gene-based predictive models of trophic modes suggest Asgard archaea are not phagocytotic. Nat. Ecol. Evol. 2, 697–704 (2018).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
Hall, T. A. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98 (1999).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Whelan, S., Irisarri, I. & Burki, F. PREQUAL: detecting non-homologous characters in sets of unaligned homologous sequences. Bioinformatics 34, 3929–3930 (2018).
Capella-Gutierrez, S., Silla-Martinez, J. M. & Gabaldon, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Roure, B., Rodriguez-Ezpeleta, N. & Philippe, H. SCaFoS: a tool for selection, concatenation and fusion of sequences for phylogenomics. BMC Evol. Biol. 7, S2 (2007).
Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).
Dayhoff, M., Schwartz, R. & Orcutt, B. in Atlas of Protein Sequence and Structure (ed. Dayhoff, M.) 345–352 (National Biomedical Research Foundation, 1978).
Susko, E. & Roger, A. J. On reduced amino acid alphabets for phylogenetic inference. Mol. Biol. Evol. 24, 2139–2150 (2007).
Lartillot, N. & Philippe, H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109 (2004).
Le, S. Q., Gascuel, O. & Lartillot, N. Empirical profile mixture models for phylogenetic reconstruction. Bioinformatics 24, 2317–2323 (2008).
Wang, H. C., Minh, B. Q., Susko, E. & Roger, A. J. Modeling site heterogeneity with posterior mean site frequency profiles accelerates accurate phylogenomic estimation. Syst. Biol. 67, 216–235 (2018).
Kück, P. & Struck, T. H. BaCoCa—a heuristic software tool for the parallel assessment of sequence biases in hundreds of gene and taxon partitions. Mol. Phylogenet. Evol. 70, 94–98 (2014).
Shimodaira, H. An approximately unbiased test of phylogenetic tree selection. Syst. Biol. 51, 492–508 (2002).
Kumar, S., Stecher, G. & Tamura, K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870–1874 (2016).
Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).
Dierckxsens, N., Mardulyn, P. & Smits, G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 45, e18 (2017).
Kuznetsov, A. & Bollin, C. J. in Multiple Sequence Alignment (ed. Katoh, K.) 261–295 (Springer, 2021).
Lohse, M., Drechsel, O., Kahlau, S. & Bock, R. OrganellarGenomeDRAW—a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 41, W575–W581 (2013).
Johnson, P. Z., Kasprzak, W. K., Shapiro, B. A. & Simon, A. E. RNA2Drawer: geometrically strict drawing of nucleic acid structures with graphical structure editing and highlighting of complementary subsequences. RNA Biol. 16, 1667–1671 (2019).
Burger, G., Gray, M. W., Forget, L. & Lang, B. F. Strikingly bacteria-like and gene-rich mitochondrial genomes throughout jakobid protists. Genome Biol. Evol. 5, 418–438 (2013).
Criscuolo, A. & Gribaldo, S. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210 (2010).
Zhang, D. et al. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol. Ecol. Resour. 20, 348–355 (2020).
Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
Ibarbalz, F. M. et al. Global trends in marine plankton diversity across kingdoms of life. Cell 179, 1084–1097 (2019).
Massana, R. et al. Marine protist diversity in European coastal waters and sediments as revealed by high-throughput sequencing. Environ. Microbiol. 17, 4035–4049 (2015).
Gendron, E. M. S., Darcy, J. L., Hell, K. & Schmidt, S. K. Structure of bacterial and eukaryote communities reflect in situ controls on community assembly in a high-alpine lake. J. Microbiol. 57, 852–864 (2019).
Minerovic, A. D. et al. 18S-V9 DNA metabarcoding detects the effect of water-quality impairment. Ecol. Indic. 113, 106225 (2020).
Pearman, J. K. et al. Cross-shelf investigation of coral reef cryptic benthic organisms reveals diversity patterns of the hidden majority. Sci. Rep. 8, 8090 (2018).
Rodas, A. M. et al. Eukaryotic plankton communities across reef environments in Bocas del Toro Archipelago, Panamá. Coral Reefs 39, 1453–1467 (2020).
Schoenle, A. et al. High and specific diversity of protists in the deep-sea basins dominated by diplonemids, kinetoplastids, ciliates and foraminiferans. Commun. Biol. 4, 501 (2021).
Schulhof, M. A. et al. Sierra Nevada mountain lake microbial communities are structured by temperature, resources and geographic location. Mol. Ecol. 29, 2080–2093 (2020).
Yi, Z. et al. High-throughput sequencing of microbial eukaryotes in Lake Baikal reveals ecologically differentiated communities and novel evolutionary radiations. FEMS Microbiol. Ecol. 93, fix073 (2017).
We thank M. Vermeij and the staff at the CARMABI research station for field sampling support; and N. Kosolapova for help with sample collection in the Arctic. This research was supported by grants from the Russian Foundation for Basic Research (to D.V.T., grant no. 20-34-70049), the Tyumen Oblast Government, as part of the West-Siberian Interregional Science and Education Center’s project no. 89-DON (2) (to D.V.T.), the Ministry of Science and Higher Education of the Russian Federation within the framework of the Federal Scientific and Technical Program for the Development of Genetic Technologies for 2019-2027 (agreement no. 075-15-2021-1345, unique identifier RF-193021X0012), the Gordon and Betty Moore Foundation (to P.J.K., https://doi.org/10.37807/GBMF9201), GenomeBC and the Natural Sciences and Engineering Research Council of Canada (to P.J.K., grant no. 2019-03994), and was carried out within the framework of state assignment no. 121051100102-2.
The authors declare no competing interests.
Peer review information
Nature thanks Thijs Ettema, James McInerney and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Extended data figures and tables
Extended Data Fig. 1 Outline of tree topologies obtained in the phylogenomic analyses and the geographical distribution of Provora.
(a) Maximum-likelihood tree topology obtained with the 320-gene dataset; nodes with support values below 100% (PMSF model, 100 replicates) are labelled red, and the corresponding values are provided next to the tree nodes; established eukaryotic groups with full support in the analysis are collapsed and shown in the tree schematically with triangles. (b) PhyloBayes consensus tree topology obtained using four analysis chains with the native 320-gene dataset; posterior probabilities are shown for tree nodes that fail to achieve full support in the analysis. (c) PhyloBayes consensus tree topology obtained with the Dayhoff 6-recoded 320-gene dataset; the low posterior probability (0.58 pp) for the union of Provora and Haptista reflects the marginal support for this group in all four analysis chains, rather than the lack of convergence between the chains (maxdiff = 0.27). (d) PhyloBayes consensus tree topology obtained with the SR4-recoded 320-gene dataset. (e) Geographical distribution of environmental sequences of 18S rRNA belonging to Provora.
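As a reading aid for the maxdiff value quoted in panel (c): in comparisons of PhyloBayes chains, maxdiff is the largest between-chain difference in the posterior frequency of any bipartition. The Python sketch below illustrates that definition only. The function name, the bipartition labels and the frequencies are hypothetical and are not taken from the analyses reported here.

```python
def maxdiff(chains):
    """Largest between-chain difference in bipartition posterior frequency.

    `chains` is a list of dicts, one per chain, mapping a bipartition label
    to its posterior frequency in that chain. A bipartition that a chain
    never sampled is treated as having frequency 0.0.
    """
    bipartitions = set()
    for chain in chains:
        bipartitions.update(chain)

    worst = 0.0
    for split in bipartitions:
        freqs = [chain.get(split, 0.0) for chain in chains]
        # The largest pairwise difference equals max minus min.
        worst = max(worst, max(freqs) - min(freqs))
    return worst


# Illustrative values only: one weakly supported node, one fully supported node.
example_chains = [
    {"weak_node": 0.50, "strong_node": 1.0},
    {"weak_node": 0.60, "strong_node": 1.0},
    {"weak_node": 0.55, "strong_node": 1.0},
    {"weak_node": 0.70, "strong_node": 1.0},
]
print(round(maxdiff(example_chains), 2))
```

In this toy case the weakly supported node has a modest frequency in every chain, so the low consensus support reflects the node itself rather than one chain disagreeing sharply with the others.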
Extended Data Fig. 2 Phylogenies with variable regions of 18S rRNA featuring identified environmental sequences belonging to Provora.
(a) Phylogenetic tree based on the V4 region of the 18S rRNA gene showing the diversity of environmental lineages of Provora. (b) Phylogenetic tree based on the V9 region of the 18S rRNA gene. The 18S rRNA sequences of Provora described in this paper are shown in red. Environmental sequences related to the members of Provora are labelled in blue. Bootstrap values ≥ 90% are indicated with black circles at the tree nodes.
Extended Data Fig. 3 Conservation of functional categories and trophic mode prediction for the transcriptomes of Provora.
(a) Heatmap of annotated KEGG orthology entry counts (presence/absence data) for functional categories defined by BRITE in the transcriptomic data of Provora isolates and the genomic data of eukaryotic organisms; the counts only include entries inferred to be ancestral for eukaryotes by the Dollo parsimony principle: entries that only have hits in one of the major eukaryotic subdivisions (Diaphoretickes, Discoba or Amorphea) were excluded; the counts were normalized to the inferred ancestral eukaryotic KEGG orthologs. (b) Principal component analysis plot with gene ontology category scores for categories associated with free-living phagocytic organisms. (c) Prediction probabilities of trophic modes (phagocytosis, prototrophy, photosynthesis) in Provora isolates, as estimated by the Trophic Mode Prediction Tool.
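The filtering and normalization described for panel (a) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names and the placeholder entry labels (`KO_a`, `KO_b`, `KO_c`) are hypothetical, and the normalization is read here as the fraction of a category's inferred ancestral entries that an organism has, which may differ in detail from the procedure actually used.

```python
MAJOR_GROUPS = ("Diaphoretickes", "Discoba", "Amorphea")


def ancestral_entries(hits_by_group):
    """Return entries with hits in at least two major subdivisions.

    `hits_by_group` maps a subdivision name to the set of KEGG orthology
    entries with at least one hit in that subdivision. Entries seen in only
    one subdivision are excluded, following the rule stated in the caption.
    """
    seen_in = {}
    for group in MAJOR_GROUPS:
        for entry in hits_by_group.get(group, set()):
            seen_in[entry] = seen_in.get(entry, 0) + 1
    return {entry for entry, n in seen_in.items() if n >= 2}


def normalized_category_count(organism_entries, category_entries, ancestral):
    """Fraction of a category's ancestral entries present in one organism."""
    reference = category_entries & ancestral
    if not reference:
        return 0.0
    return len(organism_entries & reference) / len(reference)


# Placeholder data (assumed for illustration only).
hits = {
    "Diaphoretickes": {"KO_a", "KO_b"},
    "Discoba": {"KO_a"},
    "Amorphea": {"KO_a", "KO_b", "KO_c"},
}
ancestral = ancestral_entries(hits)  # KO_c is dropped: seen in Amorphea only
score = normalized_category_count({"KO_a", "KO_c"}, {"KO_a", "KO_b", "KO_c"}, ancestral)
print(sorted(ancestral), score)
```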
Extended Data Fig. 4 Maximum likelihood phylogenetic tree with eukaryotic members of the inositol trisphosphate receptor family, identified by the presence of a RyR and IP3R homology associated domain (RIHa, PF08454) and an ion channel domain (PF00520).
The phylogeny was reconstructed by IQ-TREE using an alignment with 396 eukaryotic sequences, spanning the RIHa and ion channel regions of the proteins; reconstruction was done under the best-fitting LG+F+R10 model of evolution, and node support was evaluated with 1000 UFBoot replicates; nodes with over 95% support are marked with black circles; clades uniting members of a single taxon are collapsed in the tree and labelled in accordance with their taxonomy; branches that belong to Provora are coloured red; protein domain architectures are displayed for the IP3R family sequences in Provora: Ins145_P3_rec (PF08709), MIR (PF02815), RIH (PF01365), RIHa (PF08454), Ion channel (PF00520).
Extended Data Fig. 5 Maximum likelihood phylogenetic tree with MACPF domain-containing proteins in Provora.
The phylogeny was reconstructed by IQ-TREE with the best-fitting WAG+F+R5 model of evolution; node support was evaluated with 1000 UFBoot replicates, and nodes with over 95% support are marked with black circles; clades uniting putatively orthologous MACPF sequences in Nibbleromonas species are collapsed; species name abbreviations: At – Ancoracysta twista, Nm – Nebulomonas marisrubri, Uf – Ubysseya fretuma, Na – Nibbleromonas arcticus, Nk – Nibbleromonas kosolapovi, Nc – Nibbleromonas curacaus, Nq – Nibbleromonas quarantinus; the domain architectures of MACPF proteins identified using SMART searches are shown; MACPF domains outlined with dotted lines correspond to findings below the default detection threshold.
Extended Data Fig. 6 Proportions of shared to total orthogroup counts in pairwise comparisons of eukaryotic organisms.
Arithmetic means of the proportions of shared orthogroups between pairs of genomes or transcriptomes are shown using a heatmap; the organisms are grouped using a tree, which summarizes the current concept of eukaryotic phylogeny; orthogroup inference for members of the Provora lineage relied on the transcriptomic data; the Provora species are labelled in red, and the corresponding intragroup comparisons are outlined with a red square in the heatmap.
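One way to read the quantity plotted in this heatmap is sketched below in Python. It assumes that, for each pair of organisms, the shared orthogroup count is divided by each organism's own total and the two proportions are then averaged. The function names and the toy orthogroup sets are hypothetical and are not output from the orthogroup inference itself.

```python
from itertools import combinations


def mean_shared_proportion(orthogroups_a, orthogroups_b):
    """Arithmetic mean of shared/total orthogroup proportions for one pair."""
    if not orthogroups_a or not orthogroups_b:
        return 0.0
    shared = len(orthogroups_a & orthogroups_b)
    return (shared / len(orthogroups_a) + shared / len(orthogroups_b)) / 2


def pairwise_matrix(orthogroups_by_organism):
    """Return {(name_1, name_2): mean shared proportion} for all pairs."""
    return {
        (a, b): mean_shared_proportion(
            orthogroups_by_organism[a], orthogroups_by_organism[b]
        )
        for a, b in combinations(sorted(orthogroups_by_organism), 2)
    }


# Toy sets of orthogroup identifiers (assumed for illustration only).
toy = {
    "organism_1": {"OG1", "OG2", "OG3", "OG4"},
    "organism_2": {"OG3", "OG4"},
}
print(pairwise_matrix(toy))
```

Averaging the two proportions keeps the value symmetric for a pair even when one dataset, such as a transcriptome, contains far fewer orthogroups than the other.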
Extended Data Fig. 7 Mitochondrial genome maps of nibblerids.
Nibblerid mitochondrial genomes are typically circular-mapping and gene-rich. All maps were edited to arbitrarily start at the ccmA gene. Genes are colour-coded according to their functional classification, as shown in the legend.
Extended Data Fig. 8 Mitochondrial genome maps of the nebulids, Ancoracysta twista and Nebulomonas marisrubri.
Nebulid mitochondrial genomes are circular-mapping, but are presented in a linear format to facilitate comparison of gene order. Mitochondrial genomes of A. twista (NC_036491.1) and N. marisrubri each contain duplications due to the presence of inverted repeats. All maps were edited to arbitrarily start at the ccmA gene. Genes are colour-coded according to their functional classification, as shown in the legend.
Extended Data Fig. 9 Provoran mitochondrial genomes retain ancestral features, but their sizes are variable due to group-I intron accumulation.
(a) Secondary structure predictions of mitochondrion-encoded RNase P RNAs from Ubysseya fretuma, Nibbleromonas quarantinus and N. curacaus; rnpB genes have been identified in a small and phylogenetically disparate collection of eukaryotes, and are often very dissimilar from their counterparts in Alphaproteobacteria. All nibblerid mitochondrial genomes described here encode rnpB, and the encoded RNAs bear a strong resemblance to bacterial and jakobid rnpB homologs. Nucleotides with black borders indicate positions that are found in the eubacterial consensus and in jakobid rnpB homologs, and conserved helices are noted (P1–19). (b) Group-I introns that encode LAGLIDADG homing endonucleases are present in mitochondrial genomes in the genus Nibbleromonas; phylogenetic relationships between intron-encoded homing endonucleases of cox1 are shown as an example of intron presence in nibblerid mitochondrial genomes. Some homologous homing endonucleases are present in the same position of N. kosolapovi, N. quarantinus and N. curacaus cox1 (e.g., intron 1 of each species), indicating that they were present in their common ancestor and have been broadly retained. Other introns are found only in N. kosolapovi and in one of N. quarantinus or N. curacaus, suggesting lineage-specific intron loss. In contrast, the endonuclease encoded in intron 6 of N. kosolapovi cox1 was likely gained via lateral transfer from fungi, where the endonuclease is also encoded by cox1 introns.
Extended Data Fig. 10 Maximum likelihood phylogenetic tree of nucleus-encoded holocytochrome c synthase (HCCS) from diverse eukaryotes (140 sites, LG+R7 model, 1000 ultrafast bootstraps).
A prior report demonstrated that the nebulid Ancoracysta twista retains both mitochondrion-encoded type-I and nucleus-encoded type-III cytochrome c maturation systems. Although nibblerids retain only the former, multiple strains of the newly described nebulid, Nebulomonas marisrubri, also have both types of cytochrome c maturation systems. In our phylogenetic reconstruction, N. marisrubri and A. twista HCCS proteins are monophyletic, though with only moderate statistical support. One thousand ultrafast bootstrap replicates were performed as a measure of statistical support. For clarity, bipartitions receiving full statistical support are represented by black circles and values less than 70 are not presented.
Formal taxonomic descriptions and details of morphology. This file provides formal taxonomic descriptions of the new supergroup of eukaryotes Provora and all its taxa.
Supplementary Table 1
Support values (PMSF bootstrap with 100 replicates) for important tree bipartitions in the maximum-likelihood analysis with datasets generated by progressive removal of fast-evolving sites.
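The idea of generating datasets by progressive removal of fast-evolving sites can be sketched in Python as below. This is a hedged illustration, not the procedure used for the table: the per-site rates, the step size and the number of steps are all assumed values, and the function name is hypothetical.

```python
def progressive_site_removal(site_rates, step, n_steps):
    """Return the retained column indices for each successive dataset.

    `site_rates` holds one estimated evolutionary rate per alignment column.
    Dataset k (1-based) drops the k * step fastest columns and keeps the
    remaining columns in their original order.
    """
    fastest_first = sorted(
        range(len(site_rates)), key=lambda i: site_rates[i], reverse=True
    )
    datasets = []
    for k in range(1, n_steps + 1):
        removed = set(fastest_first[: k * step])
        datasets.append([i for i in range(len(site_rates)) if i not in removed])
    return datasets


# Assumed rates for a six-column toy alignment.
toy_rates = [0.1, 2.5, 0.7, 1.9, 0.3, 1.2]
for kept in progressive_site_removal(toy_rates, step=2, n_steps=2):
    print(kept)
```

Each retained-column list would then define one trimmed alignment, on which tree inference and support estimation are repeated to see whether a bipartition depends on the fastest sites.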
Supplementary Table 2
Approximately unbiased test P values for hypotheses, summarizing possible phylogenetic placements of Provora and Hemimastigophora, evaluated in a series of datasets generated by progressive removal of fast-evolving sites.
Supplementary Table 3
Mitochondrial genome characteristics across Provora. Statistics relating to the size, composition and content of nibblerid and nebulid mitochondrial genomes are presented. All genomes encode a similar number of proteins and RNAs, and are compositionally biased towards A + T, as frequently observed in mitochondria. The conformation of the N. kosolapovi mitochondrial genome could not be determined due to incomplete coverage.
Supplementary Data 1
List of 18S rRNA surveys from diverse environments used for analysis.
Supplementary Data 2
Table of KEGG orthologues identified by the KEGG automatic annotation server in the transcriptomic data of Provora isolates and the proteomes of other eukaryotic species; the KEGG orthology entries are defined and arranged according to the KEGG BRITE classification system; a measure of over- or under-representation is provided for each entry to highlight entries enriched (positive) or depleted (negative) in Provora relative to other eukaryotes.
Supplementary Data 3
Table of Pfam domains identified in the transcriptomic data of Provora isolates and the collection of proteomes in the EukProt database; the numbers in the table represent the counts of proteins with the corresponding Pfam domain rather than the total counts of the domain itself; a measure of over- or under-representation is provided for each Pfam entry to highlight entries enriched (positive) or depleted (negative) in Provora relative to the EukProt proteomes.
Supplementary Data 4
List of OTUs used in the analyses and the corresponding sources of sequencing data.
Supplementary Video 1
Feeding of N. kosolapovi on stramenopile prey. The video is accelerated by a factor of 2.
Supplementary Video 2
A cell of N. quarantinus biting off and engulfing part of a prey cell (P. sorokini). The video is accelerated tenfold from the 20th second onwards.
About this article
Cite this article
Tikhonenkov, D.V., Mikhailov, K.V., Gawryluk, R.M.R. et al. Microbial predators form a new supergroup of eukaryotes. Nature 612, 714–719 (2022). https://doi.org/10.1038/s41586-022-05511-5