  • Review Article
  • Published:

Incongruence in the phylogenomics era

Abstract

Genome-scale data and the development of novel statistical phylogenetic approaches have greatly aided the reconstruction of a broad sketch of the tree of life and resolved many of its branches. However, incongruence — the inference of conflicting evolutionary histories — remains pervasive in phylogenomic data, hampering our ability to reconstruct and interpret the tree of life. Biological factors, such as incomplete lineage sorting, horizontal gene transfer, hybridization, introgression, recombination and convergent molecular evolution, can lead to gene phylogenies that differ from the species tree. In addition, analytical factors, including stochastic, systematic and treatment errors, can drive incongruence. Here, we review these factors, discuss methodological advances to identify and handle incongruence, and highlight avenues for future research.

Fig. 1: Incongruence at different levels of genomic organization.
Fig. 2: Major biological factors that contribute to incongruence.
Fig. 3: Analytical factors can contribute to incongruence at every step in a phylogenomic workflow.


    Article  CAS  PubMed  PubMed Central  Google Scholar 

  147. Zhang, C., Zhao, Y., Braun, E. L. & Mirarab, S. TAPER: pinpointing errors in multiple sequence alignments despite varying rates of evolution. Methods Ecol. Evol. 12, 2145–2158 (2021).

    Article  Google Scholar 

  148. Tan, G. et al. Current methods for automated filtering of multiple sequence alignments frequently worsen single-gene phylogenetic inference. Syst. Biol. 64, 778–791 (2015). Upending conventional wisdom, this study convincingly demonstrates that trimming typically reduces the accuracy of phylogenetic inference and contributes to incongruence.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  149. Steenwyk, J. L., Buida, T. J., Li, Y., Shen, X.-X. & Rokas, A. ClipKIT: a multiple sequence alignment trimming software for accurate phylogenomic inference. PLoS Biol. 18, e3001007 (2020). This article describes a novel and more accurate approach to multiple sequence alignment trimming, where phylogenetically informative sites, which are more easily defined than phylogenetically uninformative sites, are retained and other sites are removed.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  150. Susko, E. & Roger, A. J. On reduced amino acid alphabets for phylogenetic inference. Mol. Biol. Evol. 24, 2139–2150 (2007).

    Article  CAS  PubMed  Google Scholar 

  151. Blanquart, S. A Bayesian compound stochastic process for modeling nonstationary and nonhomogeneous sequence evolution. Mol. Biol. Evol. 23, 2058–2071 (2006).

    Article  CAS  PubMed  Google Scholar 

  152. Phillips, M. J., Delsuc, F. & Penny, D. Genome-scale phylogeny and the detection of systematic biases. Mol. Biol. Evol. 21, 1455–1458 (2004).

    Article  CAS  PubMed  Google Scholar 

  153. Laumer, C. E. et al. Support for a clade of Placozoa and Cnidaria in genes with minimal compositional bias. eLife 7, e36278 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  154. Hernandez, A. M. & Ryan, J. F. Six-state amino acid recoding is not an effective strategy to offset compositional heterogeneity and saturation in phylogenetic analyses. Syst. Biol. 70, 1200–1212 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  155. Foster, P. G. et al. Recoding amino acids to a reduced alphabet may increase or decrease phylogenetic accuracy. Syst. Biol. https://doi.org/10.1093/sysbio/syac042 (2022).

    Article  Google Scholar 

  156. Wascher, M. & Kubatko, L. Consistency of SVDQuartets and maximum likelihood for coalescent-based species tree estimation. Syst. Biol. 70, 33–48 (2021).

    Article  PubMed  Google Scholar 

  157. Alda, F. et al. Resolving deep nodes in an ancient radiation of neotropical fishes in the presence of conflicting signals from incomplete lineage sorting. Syst. Biol. 68, 573–593 (2019).

    Article  PubMed  Google Scholar 

  158. Shen, X.-X., Steenwyk, J. L. & Rokas, A. Dissecting incongruence between concatenation- and quartet-based approaches in phylogenomic data. Syst. Biol. 70, 997–1014 (2021).

    Article  PubMed  Google Scholar 

  159. Darriba, D., Flouri, T. & Stamatakis, A. The state of software for evolutionary biology. Mol. Biol. Evol. 35, 1037–1046 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  160. Shen, X.-X., Li, Y., Hittinger, C. T., Chen, X. & Rokas, A. An investigation of irreproducibility in maximum likelihood phylogenetic inference. Nat. Commun. 11, 6096 (2020). This study reports that a considerable fraction of single gene phylogenies inferred from phylogenomic data matrices is irreproducible, leading to a novel source of incongruence in phylogenomic studies.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  161. Shen, X.-X., Salichos, L. & Rokas, A. A genome-scale investigation of how sequence, function, and tree-based gene properties influence phylogenetic inference. Genome Biol. Evol. 8, 2565–2580 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  162. Mongiardino Koch, N. Phylogenomic subsampling and the search for phylogenetically reliable loci. Mol. Biol. Evol. 38, 4025–4038 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  163. Haag, J., Höhler, D., Bettisworth, B. & Stamatakis, A. From easy to hopeless — predicting the difficulty of phylogenetic analyses. Mol. Biol. Evol. 39, msac254 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  164. Hillis, D. M. & Bull, J. J. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42, 182–192 (1993).

    Article  Google Scholar 

  165. Anisimova, M., Gil, M., Dufayard, J.-F., Dessimoz, C. & Gascuel, O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst. Biol. 60, 685–699 (2011).

    Article  PubMed  PubMed Central  Google Scholar 

  166. Lemoine, F. et al. Renewing Felsenstein’s phylogenetic bootstrap in the era of big data. Nature 556, 452–456 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  167. Molloy, E. K. & Warnow, T. To include or not to include: the impact of gene filtering on species tree estimation methods. Syst. Biol. 67, 285–303 (2018).

    Article  PubMed  Google Scholar 

  168. Minh, B. Q., Hahn, M. W. & Lanfear, R. New methods to calculate concordance factors for phylogenomic datasets. Mol. Biol. Evol. 37, 2727–2733 (2020). This article reports the development of methods to calculate the degree to which sites or genes support a particular branch of a phylogeny, also known as concordance factors, and their implementation in the IQ-TREE software. Concordance factors are very useful in identifying the presence of incongruence among a set of trees.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  169. Ane, C., Larget, B., Baum, D. A., Smith, S. D. & Rokas, A. Bayesian estimation of concordance among gene trees. Mol. Biol. Evol. 24, 412–426 (2006).

    Article  PubMed  Google Scholar 

  170. Baum, D. A. Concordance trees, concordance factors, and the exploration of reticulate genealogy. Taxon 56, 417–426 (2007).

    Article  Google Scholar 

  171. Larget, B. R., Kotha, S. K., Dewey, C. N. & Ané, C. BUCKy: gene tree/species tree reconciliation with Bayesian concordance analysis. Bioinformatics 26, 2910–2911 (2010).

    Article  CAS  PubMed  Google Scholar 

  172. Salichos, L. & Rokas, A. Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497, 327–331 (2013).

    Article  CAS  PubMed  Google Scholar 

  173. Kobert, K., Salichos, L., Rokas, A. & Stamatakis, A. Computing the internode certainty and related measures from partial gene trees. Mol. Biol. Evol. 33, 1606–1617 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  174. Zhou, X. et al. Quartet-based computations of internode certainty provide robust measures of phylogenetic incongruence. Syst. Biol. 69, 308–324 (2020). This article reports the development of internode certainty measures for phylogenomic data matrices with partial taxon coverage. By explicitly quantifying the level of incongruence of a given internal branch among a set of phylogenetic trees, internode certainty measures are a key tool for diagnosing the presence of incongruence in phylogenomic studies.

    Article  PubMed  Google Scholar 

  175. Salichos, L., Stamatakis, A. & Rokas, A. Novel information theory-based measures for quantifying incongruence among phylogenetic trees. Mol. Biol. Evol. 31, 1261–1271 (2014).

    Article  CAS  PubMed  Google Scholar 

  176. Huson, D. H. & Bryant, D. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267 (2006).

    Article  CAS  PubMed  Google Scholar 

  177. Huson, D. H. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics 14, 68–73 (1998).

    Article  CAS  PubMed  Google Scholar 

  178. Huson, D. H., Klöpper, T., Lockhart, P. J. & Steel, M. A. Reconstruction of reticulate networks from gene trees. In Proc. 9th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2005 (eds Miyano, S. et al.) 233–249 (Springer, Berlin, 2005).

  179. Wen, D., Yu, Y., Zhu, J. & Nakhleh, L. Inferring phylogenetic networks using PhyloNet. Syst. Biol. 67, 735–740 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  180. Lutteropp, S., Scornavacca, C., Kozlov, A. M., Morel, B. & Stamatakis, A. NetRAX: accurate and fast maximum likelihood phylogenetic network inference. Bioinformatics 38, 3725–3733 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  181. Arcila, D. et al. Genome-wide interrogation advances resolution of recalcitrant groups in the tree of life. Nat. Ecol. Evol. 1, 0020 (2017).

    Article  Google Scholar 

  182. Pease, J. B., Brown, J. W., Walker, J. F., Hinchliff, C. E. & Smith, S. A. Quartet sampling distinguishes lack of support from conflicting support in the green plant tree of life. Am. J. Bot. 105, 385–403 (2018).

    Article  PubMed  Google Scholar 

  183. Sayyari, E. & Mirarab, S. Testing for polytomies in phylogenetic species trees using quartet frequencies. Genes 9, 132 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  184. Ogden, T. H. & Rosenberg, M. S. Multiple sequence alignment accuracy and phylogenetic inference. Syst. Biol. 55, 314–328 (2006).

    Article  PubMed  Google Scholar 

  185. Zhou, X., Shen, X.-X., Hittinger, C. T. & Rokas, A. Evaluating fast maximum likelihood-based phylogenetic programs using empirical phylogenomic data sets. Mol. Biol. Evol. 35, 486–503 (2018).

    Article  CAS  PubMed  Google Scholar 

  186. Suvorov, A., Hochuli, J. & Schrider, D. R. Accurate inference of tree topologies from multiple sequence alignments using deep learning. Syst. Biol. 69, 221–233 (2020).

    Article  PubMed  Google Scholar 

  187. Azouri, D., Abadi, S., Mansour, Y., Mayrose, I. & Pupko, T. Harnessing machine learning to guide phylogenetic-tree search algorithms. Nat. Commun. 12, 1983 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  188. Rosenzweig, B. K., Hahn, M. W. & Kern, A. Accurate detection of incomplete lineage sorting via supervised machine learning. Preprint at bioRxiv https://doi.org/10.1101/2022.11.09.515828 (2022).

    Article  Google Scholar 

  189. Grealey, J. et al. The carbon footprint of bioinformatics. Mol. Biol. Evol. 39, msac034 (2022). This article examines the environmental impact and carbon footprint of bioinformatic analyses, including phylogenetics, offering numerous suggestions for greener computing.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  190. Darriba, D. et al. ModelTest-NG: a new and scalable tool for the selection of DNA and protein evolutionary models. Mol. Biol. Evol. 37, 291–294 (2020).

    Article  CAS  PubMed  Google Scholar 

  191. Posada, D. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25, 1253–1256 (2008).

    Article  CAS  PubMed  Google Scholar 

  192. Kumar, S. Embracing green computing in molecular phylogenetics. Mol. Biol. Evol. 39, msac043 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  193. Höhler, D., Haag, J., Kozlov, A. M. & Stamatakis, A. A representative performance assessment of maximum likelihood based phylogenetic inference tools. Preprint at bioRxiv https://doi.org/10.1101/2022.10.31.514545 (2022).

    Article  Google Scholar 

  194. Scornavacca, C. & Galtier, N. Incomplete lineage sorting in mammalian phylogenomics. Syst. Biol. 66, 112–120 (2016).

    Google Scholar 

  195. Galtier, N. A model of horizontal gene transfer and the bacterial phylogeny problem. Syst. Biol. 56, 633–642 (2007).

    Article  PubMed  Google Scholar 

  196. Stolzer, M. et al. Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees. Bioinformatics 28, i409–i415 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  197. Nabhan, A. R. & Sarkar, I. N. The impact of taxon sampling on phylogenetic inference: a review of two decades of controversy. Brief. Bioinform. 13, 122–134 (2012).

    Article  PubMed  Google Scholar 

  198. Li, Y., Shen, X.-X., Evans, B., Dunn, C. W. & Rokas, A. Rooting the animal tree of life. Mol. Biol. Evol. 38, 4322–4333 (2021). A systematic and in-depth examination of the evidence in favour of the sponge-sister and ctenophore-sister hypotheses concerning the rooting of the animal tree of life.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  199. Cheon, S., Zhang, J. & Park, C. Is phylotranscriptomics as reliable as phylogenomics? Mol. Biol. Evol. 37, 3672–3683 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  200. Minh, B. Q., Dang, C. C., Vinh, L. S. & Lanfear, R. QMaker: fast and accurate method to estimate empirical models of protein evolution. Syst. Biol. 70, 1046–1060 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  201. Sharma, S. & Kumar, S. Fast and accurate bootstrap confidence limits on genome-scale phylogenies using little bootstraps. Nat. Comput. Sci. 1, 573–577 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  202. Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).

    Article  CAS  PubMed  Google Scholar 

  203. Kowalczyk, A. et al. RERconverge: an R package for associating evolutionary rates with convergent traits. Bioinformatics 35, 4815–4817 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  204. Leigh, J. W., Susko, E., Baumgartner, M. & Roger, A. J. Testing congruence in phylogenomic analysis. Syst. Biol. 57, 104–115 (2008).

    Article  PubMed  Google Scholar 

  205. Al Jewari, C. & Baldauf, S. L. Conflict over the Eukaryote root resides in strong outliers, mosaics and missing data sensitivity of site-specific (CAT) mixture models. Syst. Biol. 72, 1–16 (2023).

    Article  PubMed  Google Scholar 

  206. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform. 10, 421 (2009).

    Article  Google Scholar 

  207. Zhang, C., Scornavacca, C., Molloy, E. K. & Mirarab, S. ASTRAL-Pro: quartet-based species-tree inference despite paralogy. Mol. Biol. Evol. 37, 3292–3307 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  208. Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).

    Article  CAS  PubMed  Google Scholar 

  209. Kozlov, A. M., Darriba, D., Flouri, T., Morel, B. & Stamatakis, A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35, 4453–4455 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  210. Liu, L., Yu, L., Pearl, D. K. & Edwards, S. V. Estimating species phylogenies using coalescence times among sequences. Syst. Biol. 58, 468–477 (2009).

    Article  CAS  PubMed  Google Scholar 

  211. Chifman, J. & Kubatko, L. Quartet inference from SNP data under the coalescent model. Bioinformatics 30, 3317–3324 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  212. Redmond, A. K. & McLysaght, A. Evidence for sponges as sister to all other animals from partitioned phylogenomics with mixture models and recoding. Nat. Commun. 12, 1783 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  213. Pisani, D. et al. Genomic data do not support comb jellies as the sister group to all other animals. Proc. Natl Acad. Sci. USA 112, 15402–15407 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  214. Feuda, R. et al. Improved modeling of compositional heterogeneity supports sponges as sister to all other animals. Curr. Biol. 27, 3864–3870.e4 (2017).

    Article  CAS  PubMed  Google Scholar 

  215. Ryan, J. F. et al. The genome of the Ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342, 1242592 (2013).

    Article  PubMed  PubMed Central  Google Scholar 

  216. Moroz, L. L. et al. The ctenophore genome and the evolutionary origins of neural systems. Nature 510, 109–114 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  217. King, N. & Rokas, A. Embracing uncertainty in reconstructing early animal evolution. Curr. Biol. 27, R1081–R1088 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  218. Dunn, C. W., Leys, S. P. & Haddock, S. H. D. The hidden biology of sponges and ctenophores. Trends Ecol. Evol. 30, 282–291 (2015).

    Article  PubMed  Google Scholar 

  219. Nielsen, C. Early animal evolution: a morphologist’s view. R. Soc. Open Sci. 6, 190638 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  220. Burkhardt, P. et al. Syncytial nerve net in a ctenophore adds insights on the evolution of nervous systems. Science 380, 293–297 (2023).

    Article  CAS  PubMed  Google Scholar 

  221. Liebeskind, B. J., Hillis, D. M., Zakon, H. H. & Hofmann, H. A. Complex homology and the evolution of nervous systems. Trends Ecol. Evol. 31, 127–135 (2016).

    Article  PubMed  Google Scholar 

  222. Sachkova, M. Y. et al. Neuropeptide repertoire and 3D anatomy of the ctenophore nervous system. Curr. Biol. 31, 5274–5285.e6 (2021).

    Article  CAS  PubMed  Google Scholar 

  223. Burkhardt, P. Ctenophores and the evolutionary origin(s) of neurons. Trends Neurosci. 45, 878–880 (2022).

    Article  CAS  PubMed  Google Scholar 

  224. Baños, H., Susko, E. & Roger, A. J. Is over-parameterization a problem for profile mixture models? Preprint at bioRxiv https://doi.org/10.1101/2022.02.18.481053 (2022).

    Article  Google Scholar 

  225. Kapli, P. & Telford, M. J. Topology-dependent asymmetry in systematic errors affects phylogenetic placement of Ctenophora and Xenacoelomorpha. Sci. Adv. 6, eabc5162 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  226. Whelan, N. V. & Halanych, K. M. Who let the CAT out of the Bag? Accurately dealing with substitutional heterogeneity in phylogenomic analyses. Syst. Biol. 66, 232–255 (2017).

    CAS  PubMed  Google Scholar 

  227. Whelan, N. V. & Halanych, K. M. Available data do not rule out Ctenophora as the sister group to all other Metazoa. Nat. Commun. 14, 711 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  228. Parey, E. et al. Genome structures resolve the early diversification of teleost fishes. Science 379, 572–575 (2023). This study uses conservation of genome structure or synteny as an independent source of phylogenomic data. In combination with phylogenomic sequence data, these rare genomic changes resolve controversial relationships in early fish evolution.

    Article  CAS  PubMed  Google Scholar 

  229. Schultz, D. T. et al. Ancient gene linkages support ctenophores as sister to other animals. Nature 618, 110–117 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

J.L.S. and A.R. were funded by the Howard Hughes Medical Institute through the James H. Gilliam Fellowships for Advanced Study Program. Research in A.R.’s lab is supported by grants from the National Science Foundation (DEB-2110404), the National Institutes of Health/National Institute of Allergy and Infectious Diseases (R01 AI153356), and the Burroughs Wellcome Fund. A.R. acknowledges support from a Klaus Tschira Guest Professorship from the Heidelberg Institute for Theoretical Studies and from a Visiting Research Fellowship from Merton College of the University of Oxford. X.X.S. was supported by the National Key R&D Program of China (2022YFD1401600). Y.L. was supported by Shandong University Outstanding Youth Fund (62420082260514). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Contributions

All authors researched the literature, contributed substantially to discussions of the content, and reviewed and/or edited the manuscript before submission. J.L.S. and A.R. wrote the article.

Corresponding author

Correspondence to Antonis Rokas.

Ethics declarations

Competing interests

J.L.S. is a scientific adviser for WittGen Biotechnologies and an adviser for ForensisGroup. A.R. is a scientific consultant for LifeMine Therapeutics. The other authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Genetics thanks Thijs J.G. Ettema, who co-reviewed with Daniel Tamarit, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Convergent molecular evolution

Independent evolution of similar or identical molecular changes (for example, gene deletions, nucleotide substitutions, gene order rearrangements) in organisms from different lineages that exhibit similar adaptations.

Evolutionary radiation

The occurrence of an elevated rate of speciation events in a narrow window of evolutionary time.

Heterotachy

The phenomenon of changes in the evolutionary rate of a nucleotide or amino acid sequence through time.

Hidden orthology

Undetected orthologous relationships of genes.

Hidden paralogy

Orthologous groups of genes that contain orthologues and paralogues (inparalogues and outparalogues) stemming from asymmetric patterns of duplication and loss.

Horizontal gene transfer

Also known as lateral gene transfer. The transfer of genetic material between organisms of the same or different species through non-reproductive means.

Hybridization

The interbreeding of two distinct species or lineages.

Inparalogues

Lineage-specific or species-specific paralogues wherein the duplication event occurred after divergence from a reference common ancestor.

Introgression

The interbreeding of two distinct species or lineages, followed by backcrossing with one of the parental species.

Long-branch attraction

The inaccurate inference of taxa with high evolutionary rates (giving rise to long branches in their phylogenetic trees) as closely related.

Model of sequence evolution

Also known as the substitution model. Markov models that describe rates of nucleotide or amino acid substitutions in a locus during evolution.

Partial or incomplete taxon coverage

The lack of sequences (either because they are genuinely absent or because they were not collected) from particular taxa in a group of orthologous genes.

Phylogenetic irreproducibility

Lack of reproducibility of a tree topology between two replicate tree inferences using the same software parameters (for example, same model of sequence evolution or starting seed).

Phylogenetic networks

Graphs of evolutionary relationships that, in addition to depicting the splitting of lineages, also depict the merging of lineages (due to events such as hybridization and convergent molecular evolution or due to different gene tree topologies).

Taxon sampling

Which and how many taxa are selected for a phylogenetic analysis.
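
As a hedged illustration of the glossary entry "Model of sequence evolution", the sketch below computes substitution probabilities under the Jukes-Cantor (JC69) model, the simplest Markov model of nucleotide substitution. The article does not name this model; it is chosen here only as an example, and the function name `jc69_transition_probs` is hypothetical. The branch length is assumed to be measured in expected substitutions per site.

```python
import math


def jc69_transition_probs(branch_length):
    """Return (p_same, p_diff) under the Jukes-Cantor (JC69) model.

    p_same: probability that a site shows the same nucleotide at both
            ends of a branch.
    p_diff: probability that it shows one specific different nucleotide.
    branch_length: expected substitutions per site (assumed unit).
    """
    if branch_length < 0:
        raise ValueError("branch_length must be non-negative")
    # Under JC69, all substitutions occur at the same rate, so the
    # probabilities depend on a single exponential decay term.
    decay = math.exp(-4.0 * branch_length / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return p_same, p_diff


if __name__ == "__main__":
    for d in (0.0, 0.1, 1.0, 10.0):
        p_same, p_diff = jc69_transition_probs(d)
        print(f"d={d:5.1f}  P(same)={p_same:.4f}  P(specific change)={p_diff:.4f}")
```

As the branch length grows, both probabilities approach 0.25, so the sequence at one end of the branch carries less and less information about the other end. This loss of signal on long branches is one reason fast-evolving taxa are prone to long-branch attraction.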

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Steenwyk, J.L., Li, Y., Zhou, X. et al. Incongruence in the phylogenomics era. Nat Rev Genet 24, 834–850 (2023). https://doi.org/10.1038/s41576-023-00620-x

  • DOI: https://doi.org/10.1038/s41576-023-00620-x
