Abstract
All inferences in comparative biology depend on accurate estimates of evolutionary relationships. Recent phylogenetic analyses have turned away from maximum parsimony towards the probabilistic techniques of maximum likelihood and Bayesian Markov chain Monte Carlo (BMCMC). These probabilistic techniques represent a parametric approach to statistical phylogenetics, because their criterion for evaluating a topology (the probability of the data, given the tree) is calculated with reference to an explicit evolutionary model from which the data are assumed to be identically distributed. Maximum parsimony can be considered nonparametric, because trees are evaluated on the basis of a general metric (the minimum number of character state changes required to generate the data on a given tree) without assuming a specific distribution [1]. The shift to parametric methods was spurred, in large part, by studies showing that although both approaches perform well most of the time [2], maximum parsimony is strongly biased towards recovering an incorrect tree under certain combinations of branch lengths, whereas maximum likelihood is not [3–6]. All of these evaluations simulated sequences under a largely homogeneous evolutionary process in which data are identically distributed. There is ample evidence, however, that real-world gene sequences evolve heterogeneously and are not identically distributed [7–16]. Here we show that maximum likelihood and BMCMC can become strongly biased and statistically inconsistent when the rates at which sequence sites evolve change non-identically over time. Maximum parsimony performs substantially better than current parametric methods over a wide range of conditions tested, including moderate heterogeneity and phylogenetic problems not normally considered difficult.
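The parsimony criterion described above, the minimum number of character state changes needed to explain the data on a given tree, can be illustrated with a minimal sketch of Fitch's small-parsimony algorithm. This is not the authors' code or the implementation in PAUP*; the function names (`fitch_site`, `parsimony_score`), the nested-tuple tree encoding, and the five-site toy alignment are all assumptions invented for illustration.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding: a tree is a leaf name or a pair of subtrees.
# The root position is arbitrary; the Fitch score does not depend on it.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change is needed.
        return common, left_cost + right_cost
    # The children cannot agree: keep the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the per-site minimum change counts over the whole alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Toy four-taxon alignment (invented data, not from the paper).
alignment = {"A": "AACGT", "B": "AACGA", "C": "GGCTT", "D": "GGCTA"}

# The three possible unrooted topologies for four taxa.
tree_ab_cd = (("A", "B"), ("C", "D"))
tree_ac_bd = (("A", "C"), ("B", "D"))
tree_ad_bc = (("A", "D"), ("B", "C"))

scores = {
    "AB|CD": parsimony_score(tree_ab_cd, alignment),
    "AC|BD": parsimony_score(tree_ac_bd, alignment),
    "AD|BC": parsimony_score(tree_ad_bc, alignment),
}
best = min(scores, key=scores.get)
print(scores, "-> most parsimonious:", best)
```

On this toy alignment the topology grouping A with B requires the fewest changes, so parsimony prefers it. Note that no substitution model or branch lengths enter the calculation, which is the sense in which the criterion can be regarded as nonparametric.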
References
Sanderson, M. J. & Kim, J. Parametric phylogenetics? Syst. Biol. 49, 817–829 (2000)
Hillis, D. M., Huelsenbeck, J. P. & Cunningham, C. W. Application and accuracy of molecular phylogenies. Science 264, 671–677 (1994)
Felsenstein, J. Cases in which parsimony and compatibility methods will be positively misleading. Syst. Zool. 27, 401–410 (1978)
Kuhner, M. K. & Felsenstein, J. A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Mol. Biol. Evol. 11, 459–468 (1994)
Huelsenbeck, J. P. Systematic bias in phylogenetic analysis: is the Strepsiptera problem solved? Syst. Biol. 47, 519–537 (1998)
Gaut, B. S. & Lewis, P. O. Success of maximum likelihood phylogeny inference in the four-taxon case. Mol. Biol. Evol. 12, 152–162 (1995)
Huelsenbeck, J. P. Testing a covariotide model of DNA substitution. Mol. Biol. Evol. 19, 698–707 (2002)
Miyamoto, M. M. & Fitch, W. M. Testing the covarion hypothesis of molecular evolution. Mol. Biol. Evol. 12, 503–513 (1995)
Lopez, P., Casane, D. & Philippe, H. Heterotachy, an important process of protein evolution. Mol. Biol. Evol. 19, 1–7 (2002)
Fitch, W. M. The molecular evolution of cytochrome c in eukaryotes. J. Mol. Evol. 8, 13–40 (1976)
Pollock, D. D., Taylor, W. R. & Goldman, N. Coevolving protein residues: maximum likelihood identification and relationship to structure. J. Mol. Biol. 287, 187–198 (1999)
Pupko, T. & Galtier, N. A covarion-based method for detecting molecular adaptation: application to the evolution of primate mitochondrial genomes. Proc. R. Soc. Lond. B 269, 1313–1316 (2002)
Inagaki, Y., Susko, E., Fast, N. M. & Roger, A. J. Covarion shifts cause a long-branch attraction artifact that unites Microsporidia and Archaebacteria in EF-1α phylogenies. Mol. Biol. Evol. 21, 1340–1349 (2004)
Lockhart, P. J. et al. A covariotide model explains apparent phylogenetic structure of oxygenic photosynthetic lineages. Mol. Biol. Evol. 15, 1183–1188 (1998)
Misof, B. et al. An empirical analysis of mt 16S rRNA covarion-like evolution in insects: site-specific rate variation is clustered and frequently detected. J. Mol. Evol. 55, 460–469 (2002)
Philippe, H. & Lopez, P. On the conservation of protein sequences in evolution. Trends Biochem. Sci. 26, 414–416 (2001)
Donaldson, T. S. Robustness of the F-test to errors of both kinds and the correlation between the numerator and denominator of the F-ratio. J. Am. Stat. Assoc. 63, 660–676 (1968)
Sullivan, J. & Swofford, D. L. Should we use model-based methods for phylogenetic inference when we know that assumptions about among-site rate variation and nucleotide substitution pattern are violated? Syst. Biol. 50, 723–729 (2001)
Hillis, D. M. Taxonomic sampling, phylogenetic accuracy, and investigator bias. Syst. Biol. 47, 3–8 (1998)
Chang, J. T. Inconsistency of evolutionary tree topology reconstruction methods when substitution rates vary across characters. Math. Biosci. 134, 189–215 (1996)
Rokas, A., Williams, B. L., King, N. & Carroll, S. B. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 425, 798–804 (2003)
Russo, C. A., Takezaki, N. & Nei, M. Efficiencies of different genes and different tree-building methods in recovering a known vertebrate phylogeny. Mol. Biol. Evol. 13, 525–536 (1996)
Delarbre, C., Gallut, C., Barriel, V., Janvier, P. & Gachelin, G. Complete mitochondrial DNA of the hagfish, Eptatretus burgeri: the comparative analysis of mitochondrial DNA sequences strongly supports the cyclostome monophyly. Mol. Phylogenet. Evol. 22, 184–192 (2002)
Naylor, G. J. & Brown, W. M. Amphioxus mitochondrial DNA, chordate phylogeny, and the limits of inference based on comparisons of sequences. Syst. Biol. 47, 61–76 (1998)
Tuffley, C. & Steel, M. Links between maximum likelihood and maximum parsimony under a simple model of site substitution. Bull. Math. Biol. 59, 581–607 (1997)
Swofford, D. L. PAUP*: Phylogenetic Analysis Using Parsimony and Other Methods, v.4.0b10 (Sinauer Associates, Sunderland, Massachusetts, 1998)
Yang, Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556 (1997)
Huelsenbeck, J. P. & Ronquist, F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755 (2001)
Posada, D. & Crandall, K. A. MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818 (1998)
Swofford, D. L. et al. Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Syst. Biol. 50, 525–539 (2001)
Acknowledgements
We thank J. Conery for advice, support and programming advice. P. Phillips, R. DeSalle and S. Proulx provided comments and discussion. We benefited from discussions of mixed model methods with D. Zwickl. B.K. was supported by an NSF IGERT training grant in Evolution, Development and Genomics to the University of Oregon.
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
Cite this article
Kolaczkowski, B., Thornton, J. Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous. Nature 431, 980–984 (2004). https://doi.org/10.1038/nature02917