Phylogenetic analyses with systematic taxon sampling show that mitochondria branch within Alphaproteobacteria


Though it is well accepted that mitochondria originated from an alphaproteobacteria-like ancestor, the phylogenetic relationship of the mitochondrial endosymbiont to extant Alphaproteobacteria remains unresolved. The focus of much debate is whether the affinity between mitochondria and fast-evolving alphaproteobacterial lineages reflects true homology or artefacts. Approaches such as site exclusion have been claimed to mitigate compositional heterogeneity between taxa, but this comes at the cost of information loss, and the reliability of such methods remains unproven. Here we demonstrate that site-exclusion methods produce erratic phylogenetic estimates of mitochondrial origin. Thus, previous phylogenetic hypotheses on the origin of mitochondria based on pretreated datasets should be re-evaluated. We applied an alternative strategy that reduces phylogenetic noise through systematic taxon sampling while keeping site substitution information intact. Cross-validation based on a series of trees placed mitochondria robustly within Alphaproteobacteria, sharing an ancient common ancestor with Rickettsiales and currently unclassified marine lineages.

Fig. 1: Relationships between alignment sites, the phylogenetic position of mitochondria and model fit (mean square heterogeneity across taxa test) based on different datasets, site-exclusion approaches and taxon-selection approaches.
Fig. 2: Schematic phylogenetic relationships of alphaproteobacterial subgroups.
Fig. 3: Schematic phylogenetic relationships of mitochondria and alphaproteobacterial subgroups.
Fig. 4: Phylogenetic relationships of mitochondria and Alpha IIb Alphaproteobacteria.

Data availability

The alignments and tree files generated in this study have been deposited in figshare (ref. 30).

Code availability

The script of the Bowker’s test score-based site-exclusion method is available as Supplementary Software.


Acknowledgements

This work was financially supported by the National Natural Science Foundation of China (91851210, 91951120, 41530105 and 81774152), the European Research Council (ERC 666053), the Shenzhen Key Laboratory of Marine Archaea Geo-Omics, Southern University of Science and Technology (ZDSYS201802081843490), the Shenzhen Science and Technology Innovation Commission (JCYJ20180305123458107), the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (K19313901) and the VW foundation (93 046). Computation in this study was supported by the Centre for Computational Science and Engineering at the Southern University of Science and Technology.

Author contributions

L.F., W.F.M. and R.Z. conceived this study. L.F., D.W., V.G., J.X., Y.X. and S.G. were involved in data analysis. L.F., V.G., C.Z., W.F.M. and R.Z. interpreted the results and drafted the manuscript. All authors participated in the critical revision of the manuscript.

Corresponding authors

Correspondence to Lu Fan, William F. Martin or Ruixin Zhu.

Competing interests

The authors declare no competing interests.

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary methods, references, Notes 1–9 and Figs. 1–60.

Supplementary Table

Supplementary Tables 1–7.

Supplementary Software

The script of the Bowker's test score-based site-exclusion method.

