Article

Genome-scale approaches to resolving incongruence in molecular phylogenies

Nature volume 425, pages 798–804 (23 October 2003)

Abstract

One of the most pervasive challenges in molecular phylogenetics is the incongruence between phylogenies obtained using different data sets, such as individual genes. To systematically investigate the degree of incongruence, and potential methods for resolving it, we screened the genome sequences of eight yeast species and selected 106 widely distributed orthologous genes for phylogenetic analyses, singly and by concatenation. Our results suggest that data sets consisting of single or a small number of concatenated genes have a significant probability of supporting conflicting topologies. By contrast, analyses of the entire data set of concatenated genes yielded a single, fully resolved species tree with maximum support. Comparable results were obtained with a concatenation of a minimum of 20 genes, which is substantially more genes than commonly used but a small fraction of any genome. These results have important implications for resolving branches of the tree of life.
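As an illustration only, the concatenation step mentioned in the abstract can be sketched in a few lines of Python. This is not the authors' pipeline: the function names `concatenate` and `sample_genes`, the toy species labels and the toy sequences are all assumptions made for this example, and the tree inference that would follow is left out. The sketch assumes each gene has already been aligned and covers the same set of species.

```python
import random


def concatenate(alignments, genes):
    """Join the aligned sequences of the chosen genes, species by species.

    `alignments` maps gene name -> {species name -> aligned sequence}.
    All chosen genes must cover exactly the same species (an assumption
    of this sketch).
    """
    species = sorted(alignments[genes[0]])
    for gene in genes:
        if sorted(alignments[gene]) != species:
            raise ValueError(f"gene {gene!r} does not cover the same species")
    return {sp: "".join(alignments[g][sp] for g in genes) for sp in species}


def sample_genes(alignments, k, seed=None):
    """Draw k distinct genes at random, e.g. to build one k-gene data set."""
    rng = random.Random(seed)
    return rng.sample(sorted(alignments), k)


# Hypothetical toy data: three short "genes" for three species.
toy = {
    "geneA": {"sp1": "ATGC", "sp2": "ATGA", "sp3": "ACGA"},
    "geneB": {"sp1": "GG-T", "sp2": "GGCT", "sp3": "GACT"},
    "geneC": {"sp1": "TTA", "sp2": "TTA", "sp3": "CTA"},
}

# Concatenate every gene into one supermatrix.
supermatrix = concatenate(toy, ["geneA", "geneB", "geneC"])

# Build one random two-gene data set from the same pool.
subset = sample_genes(toy, 2, seed=0)
small_matrix = concatenate(toy, subset)
```

Repeating the sampling step for many random subsets of a given size, and inferring a tree from each, is one way to ask how often small concatenations disagree with the full data set.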

Acknowledgements

We are grateful to P. Cliften, M. Johnston and the Washington University Genome Sequencing Center for access to genome sequence data for S. kudriavzevii, S. castellii and S. kluyveri; the staff of the Saccharomyces Genome Database (http://www.yeastgenome.org/) for access to genome sequence data for S. cerevisiae, S. paradoxus, S. mikatae and S. bayanus; and the Stanford Genome Technology Center website (http://www-sequence.stanford.edu/group/candida) for access to sequence data for C. albicans. We thank D. Baum, B. Hersh, C. Hittinger and K. Johnson for useful comments on the manuscript, D. Baum and members of the Carroll laboratory for useful discussions on phylogenetics, and D. Lautenschleger for computer support. A.R. is a Human Frontier Science Program long-term fellow, B.L.W. and N.K. are NIH post-doctoral fellows, and S.B.C. is an investigator of the Howard Hughes Medical Institute. This work was funded by the Howard Hughes Medical Institute.

Author information

Author notes

    • Antonis Rokas & Barry L. Williams

    These authors contributed equally to this work

Affiliations

  1. Howard Hughes Medical Institute, Laboratory of Molecular Biology, R. M. Bock Laboratories, University of Wisconsin-Madison, 1525 Linden Drive, Madison, Wisconsin 53706, USA

    • Antonis Rokas, Barry L. Williams, Nicole King & Sean B. Carroll

Authors

  1. Antonis Rokas

  2. Barry L. Williams

  3. Nicole King

  4. Sean B. Carroll

Competing interests

The authors declare that they have no competing financial interests.

Corresponding author

Correspondence to Sean B. Carroll.

About this article

DOI

https://doi.org/10.1038/nature02053
