Unexpected mitochondrial genome diversity revealed by targeted single-cell genomics of heterotrophic flagellated protists


Most eukaryotic microbial diversity is uncultivated, under-studied and lacks nuclear genome data. Mitochondrial genome sampling is more comprehensive, but many phylogenetically important groups remain unsampled. Here, using a single-cell sorting approach combining tubulin-specific labelling with photopigment exclusion, we sorted flagellated heterotrophic unicellular eukaryotes from Pacific Ocean samples. We recovered 206 single amplified genomes, predominantly from underrepresented branches on the tree of life. Seventy single amplified genomes contained unique mitochondrial contigs, among them 21 complete or near-complete mitochondrial genomes from formerly under-sampled phylogenetic branches such as telonemids, katablepharids, cercozoans and marine stramenopiles, effectively doubling the number of available samples of heterotrophic flagellate mitochondrial genomes. Collectively, these data identify a dynamic history of mitochondrial genome evolution including intron gain and loss, extensive patterns of genetic code variation and complex patterns of gene loss. Surprisingly, we found that stramenopile mitochondrial content is highly plastic, resembling patterns of variation previously observed only in plants.


Fig. 1: V9-nSSU phylogenetic mapping of Monterey Bay SAGs.
Fig. 2: Clade-specific maximum-likelihood subtrees showing subsections of the eukaryotic diversity sampled.
Fig. 3: Distribution and groupings of mitochondrial sequence coverage relative to estimated nuclear genome completeness.
Fig. 4: Uncharacterized mtDNAs from underrepresented eukaryotic groups.
Fig. 5: Comparison between mtDNA gene repertoires.
Fig. 6: Phylogenetic reconstruction of representative stramenopiles using concatenated conserved mitochondria-encoded electron transport chain proteins.

Data availability

Complete mtDNA sequences assembled from this study are available at GenBank under the accession numbers MK188935 to MK188947, MN082144 and MN082145. Sequencing data are available under NCBI BioProject PRJNA379597. Reads have been deposited at NCBI Sequence Read Archive with accession number SRP102236. Partial mtDNA contigs and other important contigs mentioned in the text are available from Figshare at https://doi.org/10.6084/m9.figshare.7314728. Nuclear SAG assemblies are available from Figshare at https://doi.org/10.6084/m9.figshare.7352966. A protocol is available from protocols.io at: https://doi.org/10.17504/protocols.io.ywpfxdn.

Code availability

The bioinformatic workflow is available at https://doi.org/10.5281/zenodo.192677; additional statistical analysis code is available at https://doi.org/10.6084/m9.figshare.9884309.


References

  1. Martijn, J., Vosseberg, J., Guy, L., Offre, P. & Ettema, T. J. G. Deep mitochondrial origin outside the sampled alphaproteobacteria. Nature 557, 101–105 (2018).
  2. Roger, A. J., Muñoz-Gómez, S. A. & Kamikawa, R. The origin and diversification of mitochondria. Curr. Biol. 27, R1177–R1192 (2017).
  3. Martin, W. & Herrmann, R. G. Gene transfer from organelles to the nucleus: how much, what happens, and why? Plant Physiol. 118, 9–17 (1998).
  4. Gray, M. W. et al. Genome structure and gene content in protist mitochondrial DNAs. Nucleic Acids Res. 26, 865–878 (1998).
  5. Worden, A. Z. et al. Rethinking the marine carbon cycle: factoring in the multifarious lifestyles of microbes. Science 347, 1257594 (2015).
  6. Cuvelier, M. L. et al. Targeted metagenomics and ecology of globally important uncultured eukaryotic phytoplankton. Proc. Natl Acad. Sci. USA 107, 14679–14684 (2010).
  7. Worden, A. Z. et al. Global distribution of a wild alga revealed by targeted metagenomics. Curr. Biol. 22, R675–R677 (2012).
  8. Keeling, P. J. et al. The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing. PLoS Biol. 12, e1001889 (2014).
  9. Gawryluk, R. M. R. et al. Morphological identification and single-cell genomics of marine diplonemids. Curr. Biol. 26, 3053–3059 (2016).
  10. Strassert, J. F. H. et al. Single cell genomics of uncultured marine alveolates shows paraphyly of basal dinoflagellates. ISME J. 12, 304–308 (2018).
  11. Yoon, H. S. et al. Single-cell genomics reveals organismal interactions in uncultivated marine protists. Science 332, 714–717 (2011).
  12. Bhattacharya, D. et al. Single cell genome analysis supports a link between phagotrophy and primary plastid endosymbiosis. Sci. Rep. 2, 356 (2012).
  13. Roy, R. S. et al. Single cell genome analysis of an uncultured heterotrophic stramenopile. Sci. Rep. 4, 4780 (2014).
  14. Mangot, J.-F. et al. Accessing the genomic information of unculturable oceanic picoeukaryotes by combining multiple single cells. Sci. Rep. 7, 41498 (2017).
  15. Seeleuthner, Y. et al. Single-cell genomics of multiple uncultured stramenopiles reveals underestimated functional diversity across oceans. Nat. Commun. 9, 310 (2018).
  16. Martinez-Garcia, M. et al. Unveiling in situ interactions between marine protists and bacteria through single cell sequencing. ISME J. 6, 703–707 (2012).
  17. Sieracki, M. E. et al. Single cell genomics yields a wide diversity of small planktonic protists across major ocean ecosystems. Sci. Rep. 9, 6025 (2019).
  18. Rose, J., Caron, D., Sieracki, M. & Poulton, N. Counting heterotrophic nanoplanktonic protists in cultures and aquatic communities by flow cytometry. Aquat. Microb. Ecol. 34, 263–277 (2004).
  19. Richards, T. A. & Talbot, N. J. Horizontal gene transfer in osmotrophs: playing with public goods. Nat. Rev. Microbiol. 11, 720–727 (2013).
  20. Vrieling, E. G., Gieskes, W. W. C. & Beelen, T. P. M. Silicon deposition in diatoms: control by the pH inside the silicon deposition vesicle. J. Phycol. 35, 548–559 (1999).
  21. Kawai, A., Uchiyama, H., Takano, S., Nakamura, N. & Ohkuma, S. Autophagosome–lysosome fusion depends on the pH in acidic compartments in CHO cells. Autophagy 3, 154–157 (2007).
  22. Wilken, S. et al. The need to account for cell biology in characterizing predatory mixotrophs in aquatic environments. Philos. T. R. Soc. B 374, 20190090 (2019).
  23. Dean, F. B. et al. Comprehensive human genome amplification using multiple displacement amplification. Proc. Natl Acad. Sci. USA 99, 5261–5266 (2002).
  24. Richards, T. A., Jones, M. D. M., Leonard, G. & Bass, D. Marine fungi: their ecology and molecular diversity. Annu. Rev. Mar. Sci. 4, 495–522 (2012).
  25. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
  26. Hotelling, H. The generalization of Student’s ratio. Ann. Math. Stat. 2, 360–378 (1931).
  27. Derelle, R., López-García, P., Timpano, H. & Moreira, D. A phylogenomic framework to study the diversity and evolution of stramenopiles (=heterokonts). Mol. Biol. Evol. 33, 2890–2898 (2016).
  28. Flegontov, P. et al. Divergent mitochondrial respiratory chains in phototrophic relatives of apicomplexan parasites. Mol. Biol. Evol. 32, 1115–1131 (2015).
  29. Janouškovec, J. et al. A new lineage of eukaryotes illuminates early mitochondrial genome reduction. Curr. Biol. 27, 3717–3724 (2017).
  30. Gray, M. W., Lang, B. F. & Burger, G. Mitochondria of protists. Annu. Rev. Genet. 38, 477–524 (2004).
  31. Wang, Z. et al. Complete mitochondrial genome of a DHA-rich protist Schizochytrium sp. TIO1101. Mitochondrial DNA B 1, 126–127 (2016).
  32. Saldanha, R., Mohr, G., Belfort, M. & Lambowitz, A. M. Group I and group II introns. FASEB J. 7, 15–24 (1993).
  33. Goddard, M. R. & Burt, A. Recurrent invasion and extinction of a selfish gene. Proc. Natl Acad. Sci. USA 96, 13880–13885 (1999).
  34. Hauth, A. M., Maier, U. G., Lang, B. F. & Burger, G. The Rhodomonas salina mitochondrial genome: bacteria-like operons, compact gene arrangement and complex repeat region. Nucleic Acids Res. 33, 4433–4442 (2005).
  35. Kim, E. et al. Complete sequence and analysis of the mitochondrial genome of Hemiselmis andersenii CCMP644 (cryptophyceae). BMC Genomics 9, 215 (2008).
  36. Nishimura, Y. et al. Mitochondrial genome of Palpitomonas bilix: derived genome structure and ancestral system for cytochrome c maturation. Genome Biol. Evol. 8, 3090–3098 (2016).
  37. Riisberg, I. et al. Seven gene phylogeny of heterokonts. Protist 160, 191–204 (2009).
  38. Oudot-Le Secq, M.-P., Loiseaux-de Goër, S., Stam, W. T. & Olsen, J. L. Complete mitochondrial genomes of the three brown algae (heterokonta: Phaeophyceae) Dictyota dichotoma, Fucus vesiculosus and Desmarestia viridis. Curr. Genet. 49, 47–58 (2006).
  39. Leonard, G. et al. Comparative genomic analysis of the ‘pseudofungus’ Hyphochytrium catenoides. Open Biol. 8, 170184 (2018).
  40. Massana, R., del Campo, J., Sieracki, M. E., Audic, S. & Logares, R. Exploring the uncultured microeukaryote majority in the oceans: reevaluation of ribogroups within stramenopiles. ISME J. 8, 854–866 (2014).
  41. Kannan, S., Rogozin, I. B. & Koonin, E. V. MitoCOGs: clusters of orthologous genes from mitochondria and implications for the evolution of eukaryotes. BMC Evol. Biol. 14, 237 (2014).
  42. Ševčíková, T. et al. A comparative analysis of mitochondrial genomes in eustigmatophyte algae. Genome Biol. Evol. 8, 705–722 (2016).
  43. Johnston, I. G. & Williams, B. P. Evolutionary inference across eukaryotes identifies specific pressures favoring mitochondrial gene retention. Cell Syst. 2, 101–111 (2016).
  44. Keeling, P. J. Genomics: evolution of the genetic code. Curr. Biol. 26, R851–R853 (2016).
  45. Demir-Hilton, E. et al. Global distribution patterns of distinct clades of the photosynthetic picoeukaryote Ostreococcus. ISME J. 5, 1095–1107 (2011).
  46. Logares, R. et al. Patterns of rare and abundant marine microbial eukaryotes. Curr. Biol. 24, 813–821 (2014).
  47. Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2014).
  48. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).
  49. Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
  50. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).
  51. Kumar, S., Jones, M., Koutsovoulos, G., Clarke, M. & Blaxter, M. Blobology: exploring raw genome data for contaminants, symbionts and parasites using taxon-annotated GC-coverage plots. Front. Genet. 4, 237 (2013).
  52. Okonechnikov, K., Conesa, A. & García-Alcalde, F. Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics 32, btv566 (2015).
  53. Amaral-Zettler, L. A., McCliment, E. A., Ducklow, H. W. & Huse, S. M. A method for studying protistan diversity using massively parallel sequencing of V9 hypervariable regions of small-subunit ribosomal RNA genes. PLoS ONE 4, e6372 (2009).
  54. Guillou, L. et al. The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote small sub-unit rRNA sequences with curated taxonomy. Nucleic Acids Res. 41, D597–D604 (2013).
  55. Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
  56. Caporaso, J. G. et al. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26, 266–267 (2010).
  57. Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596 (2012).
  58. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
  59. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
  60. Berger, S. A., Krompass, D. & Stamatakis, A. Performance, accuracy, and web server for evolutionary placement of short sequence reads under maximum likelihood. Syst. Biol. 60, 291–302 (2011).
  61. Shimodaira, H. & Hasegawa, M. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16, 1114–1116 (1999).
  62. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
  63. Junier, T. & Zdobnov, E. M. The Newick utilities: high-throughput phylogenetic tree processing in the UNIX shell. Bioinformatics 26, 1669–1670 (2010).
  64. Yu, G., Smith, D. K., Zhu, H., Guan, Y. & Lam, T. T.-Y. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol. Evol. 8, 28–36 (2017).
  65. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10 (2011).
  66. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
  67. Callahan, B. J. et al. DADA2: high-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581–583 (2016).
  68. Wang, Q., Garrity, G. M., Tiedje, J. M. & Cole, J. R. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microb. 73, 5261–5267 (2007).
  69. Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).
  70. McMurdie, P. J. & Holmes, S. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 8, e61217 (2013).
  71. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
  72. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
  73. Barnett, D. W., Garrison, E. K., Quinlan, A. R., Stromberg, M. P. & Marth, G. T. BamTools: a C++ API and toolkit for analyzing and managing BAM files. Bioinformatics 27, 1691–1692 (2011).
  74. Wheeler, T. J. & Eddy, S. R. nhmmer: DNA homology search with profile HMMs. Bioinformatics 29, 2487–2489 (2013).
  75. Jackson, C. J. et al. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria. BMC Biol. 5, 41 (2007).
  76. Grant, J. R. & Stothard, P. The CGView server: a comparative genomics tool for circular genomes. Nucleic Acids Res. 36, W181–W184 (2008).
  77. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer–Verlag, 2009).
  78. Signorell, A. DescTools: tools for descriptive statistics R package v.0.99.23 (2017).
  79. R Core Team. R: a Language and Environment for Statistical Computing http://www.r-project.org/ (R Foundation for Statistical Computing, 2013).
  80. Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).
  81. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
  82. Ronquist, F. & Huelsenbeck, J. P. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574 (2003).
  83. Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006).
  84. Le, S. Q. & Gascuel, O. An improved general amino acid replacement matrix. Mol. Biol. Evol. 25, 1307–1320 (2008).


Acknowledgements

We thank F. Lang and N. Beck for annotation assistance and access to an unreleased version of mfannot, D. Price for assistance with picozoan SAG data, and C. Dunn for discussions and encouragement. This project was supported by a Gordon and Betty Moore Foundation grant (GBMF3307) to T.A.R., A.E.S., A.Z.W. and P.J.K. and a Philip Leverhulme Award (PLP-2014–147) to T.A.R. Field sampling was supported by the David and Lucile Packard Foundation and GBMF3788 to A.Z.W. T.A.R. and A.M. are supported by Royal Society University Research Fellowships. J.G.W. was supported by a European Molecular Biology Organization Long-term Fellowship (ALTF 761–2014) co-funded by the European Commission (EMBOCOFUND2012, GA-2012–600394) with support from Marie Curie Actions, and by a College for Life Sciences Fellowship at the Wissenschaftskolleg zu Berlin. R.R.-M. is supported by CONICYT FONDECYT 11170748. F.M. is supported by Genome Canada.

Author information




J.G.W. performed bioinformatic and phylogenetic analyses and wrote the manuscript. R.R.-M. performed molecular biological analyses. A.M. performed bioinformatic and phylogenetic analyses and G.L. performed bioinformatic analyses. E.C. and C.P. collected the samples and performed flow cytometry. F.M. performed statistical and bioinformatic analyses. D.M. performed molecular biological experiments and generated biochemical reagents. K.M. performed genome sequencing. N.A.T.I. analysed genomic data. T.A.R. devised the project. J.G.W., A.E.S., P.J.K., A.Z.W. and T.A.R. supervised the project and wrote the manuscript. All authors contributed to the editing of the final manuscript.

Corresponding authors

Correspondence to Jeremy G. Wideman or Thomas A. Richards.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Rank abundance curve of amplicon sequence variants (ASVs) from the Monterey Bay nSSU-V9 environmental census.

Relative abundances correspond to the mean relative abundance of each ASV in samples from two depths (20 m and 30 m) of eastern North Pacific station M2 (SAGs were recovered from 30 m depth). ASV sequences identical to V9 sequences from SAGs with recovered mitochondrial genomic information are represented by red circles; ASVs with no identical sequence match to V9 SAGs with mitochondrial data are represented by grey circles. For each ASV identical to a SAG V9, the corresponding SAG codenames are provided (in some cases there are multiple of each type). Samples are coloured according to taxonomic affiliation in V9 sorting. Blue, stramenopile; teal, hacrobian; purple, rhizarian; brown, opisthokont. See Supplementary Table 7 for details on ASV relative abundance.
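As a minimal sketch of how a rank abundance ordering of this kind can be derived, the Python snippet below converts per-sample ASV counts to relative abundances, averages them across samples and sorts in descending order. It is not the authors' code; the function name, the sample labels and the toy counts are all hypothetical, and every sample is assumed to have a non-zero total count.

```python
from typing import Dict, List, Tuple


def rank_abundance(counts_by_sample: Dict[str, Dict[str, int]]) -> List[Tuple[str, float]]:
    """Return (ASV, mean relative abundance) pairs sorted from most to least abundant.

    Hypothetical helper for illustration; assumes each sample has a non-zero total.
    """
    # Collect every ASV seen in any sample.
    asvs = sorted({asv for counts in counts_by_sample.values() for asv in counts})
    mean_rel = {}
    for asv in asvs:
        # Relative abundance of this ASV within each sample (0 if absent).
        rel = [
            counts.get(asv, 0) / sum(counts.values())
            for counts in counts_by_sample.values()
        ]
        mean_rel[asv] = sum(rel) / len(rel)
    # Rank by mean relative abundance, highest first.
    return sorted(mean_rel.items(), key=lambda kv: kv[1], reverse=True)


# Toy counts for two depths (invented numbers, not data from the study).
toy_counts = {
    "20m": {"ASV_a": 60, "ASV_b": 30, "ASV_c": 10},
    "30m": {"ASV_a": 20, "ASV_b": 70, "ASV_c": 10},
}

for rank, (asv, abundance) in enumerate(rank_abundance(toy_counts), start=1):
    print(rank, asv, round(abundance, 3))
```

Averaging relative rather than raw counts keeps a deeply sequenced sample from dominating the ranking, which is one plausible reason to report the mean of per-sample relative abundances.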

Extended Data Fig. 2 Cox1 protein phylogeny.

Cox1 proteins were collected from representative eukaryote groups using BLAST (ref. 25), aligned using MUSCLE (ref. 80) and manually trimmed to a resulting 402 sites. We reconstructed the phylogeny of Cox1 using RAxML v8.2.10 (ref. 83; 100 bootstrap pseudoreplicates) under the LG model (ref. 84). Maximum likelihood support values are indicated above each branch. The Cox1 from As1 grouped within the myzozoan alveolates within a fully supported clade comprising dinoflagellates, apicomplexans and ‘chromerid’ algae. Picozoan M5584–11 Cox1 does not branch strongly with any eukaryotic group. Numbers in brackets indicate the number of sequences collapsed.

Extended Data Fig. 3 Telonemid mtDNAs encode a putative rpl18 and retain partial synteny with the bacterial-like genomes of jakobids.

In all telonemid mitochondrial DNAs examined, rps8, rpl6 and rpl18 were found in synteny, as in the mtDNAs of jakobids. Malawimonas jakobiformis is somewhat similar, in that rpl6 and rpl18 are adjacent to one another. GenBank accessions: Andalucia godoyi NC_021124.1, Histiona aroides NC_021125.1, Jakoba bahamiensis NC_021126.1, Jakoba libera NC_021127.1, Reclinomonas americana NC_001823.1, Seculamonas ecuadoriensis NC_021128.1, Malawimonas jakobiformis NC_002553.1. Small subunit ribosomal genes are coloured in pink, large subunit ribosomal genes in red, SecY in purple and electron transport chain components in grey.

Extended Data Fig. 4 Thraustochytrid mtDNAs harbour a unique genetic code.

Alignment of mitochondria-encoded Cob proteins from Thraustochytrium aureum, Schizochytrium sp. and four putative thraustochytrid SAGs. Cob genes with internal stop codons were identified in mitochondrial contigs from each SAG and translated using the standard genetic code. These proteins were aligned using MUSCLE (ref. 80) with proteins from publicly available thraustochytrid mtDNAs (KU183024.1 and AF288091.2). Positions occupied by TAG or TAA codons are marked with yellow asterisks and aligned most often with tyrosine or other hydrophobic residues (marked in orange). Relatively few TAA and TAG codons were conserved between genome sequences, suggesting that these changes occurred during the recent radiation of this lineage.
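The first step described above, locating in-frame TAA and TAG codons inside a coding sequence, can be sketched in a few lines of Python. This is an illustrative helper only, not part of the authors' workflow; the function name and the toy sequence are hypothetical, and the sequence is assumed to start in frame at its first base.

```python
from typing import List, Tuple

# Codons that act as stops in the standard code and are of interest here.
CANDIDATE_CODONS = {"TAA", "TAG"}


def internal_stop_positions(cds: str) -> List[Tuple[int, str]]:
    """Return (codon_index, codon) for in-frame TAA/TAG codons before the final codon.

    Hypothetical helper for illustration; codon_index is 0-based and the
    reading frame is assumed to start at the first base of `cds`.
    """
    seq = cds.upper().replace("U", "T")
    # Split into complete codons, dropping any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip the final codon, which may be the genuine terminator.
    return [(i, c) for i, c in enumerate(codons[:-1]) if c in CANDIDATE_CODONS]


# Toy sequence (invented): ATG TTT TAA GGA TAG CCC TGA
toy_cds = "ATGTTTTAAGGATAGCCCTGA"
print(internal_stop_positions(toy_cds))
```

Positions flagged this way are only candidates for reassignment; as in the caption, the residue they encode has to be inferred by aligning the translated protein against homologues.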

Extended Data Fig. 5 Distribution of mitochondria-encoded tRNAs.

Comparison of mtDNA tRNA coding capacities from new assemblies from this study (bold font), previously sequenced mtDNAs (regular font) and ancestral reconstructions (L-Dia-CA, Last Diaphoretickes Common Ancestor; L-Amo-CA, Last Amorphean Common Ancestor, including malawimonads and collodictyonids; L-Jak-CA, Last Jakobid Common Ancestor; LECA, Last Eukaryote Common Ancestor). # symbols indicate incomplete mtDNA. Asterisks indicate genomes assembled from publicly available datasets. Black filled square, present; empty square, absent. Red filled squares indicate an independent codon reassignment. In some lineages extra tRNAs are also present other than the common tRNAs presented: a, I (uau), one cercozoan lineage (R32) contained a possible suppressor tRNA (gcaa); b, I (uau); c, L (caa); d, I (aau); e, L (gag), N (auu).

Extended Data Fig. 6 Gating strategy for cell sort 35 from which most SAGs originated.

A combination of gates (black polygons) was applied to select: a, cells larger than Synechococcus displaying low red fluorescence, to exclude photosynthetic eukaryotes; and b, cells stained with Oregon Green, as compared to c, an unstained sample. The green rectangles show the position of 0.75 μm yellow-green beads.

Supplementary information

Reporting Summary

Supplementary Tables

Supplementary Tables 1–7.


Cite this article

Wideman, J.G., Monier, A., Rodríguez-Martínez, R. et al. Unexpected mitochondrial genome diversity revealed by targeted single-cell genomics of heterotrophic flagellated protists. Nat Microbiol 5, 154–165 (2020). https://doi.org/10.1038/s41564-019-0605-4
