Tunicates or urochordates (appendicularians, salps and sea squirts), cephalochordates (lancelets) and vertebrates (including lamprey and hagfish) constitute the three extant groups of chordate animals. Traditionally, cephalochordates are considered the closest living relatives of vertebrates, with tunicates representing the earliest chordate lineage [1,2]. This view is mainly justified by overall morphological similarities and an apparently increased complexity in cephalochordates and vertebrates relative to tunicates [2]. Despite the critical importance of chordate relationships for understanding the origins of vertebrates [3], phylogenetic studies of these relationships have provided equivocal results [4,5,6,7]. Taking advantage of the genome sequencing of the appendicularian Oikopleura dioica, we assembled a phylogenomic data set of 146 nuclear genes (33,800 unambiguously aligned amino acids) from 14 deuterostomes and 24 other slowly evolving species as an outgroup. Here we show that phylogenetic analyses of this data set provide compelling evidence that tunicates, and not cephalochordates, represent the closest living relatives of vertebrates. Chordate monophyly remains uncertain because cephalochordates, albeit with non-significant statistical support, surprisingly grouped with echinoderms, a hypothesis that needs to be tested with additional data. This new phylogenetic scheme prompts a reappraisal of both morphological and palaeontological data and has important implications for the interpretation of developmental and genomic studies in which tunicates and cephalochordates are used as model animals.
We thank S. Conway Morris, R. P. S. Jefferies, W. R. Jeffery and J. Mallatt for suggestions, and N. Lartillot and N. Rodrigue for critical readings of early versions of the manuscript. Oikopleura genome data were generated at Génoscope Evry (France) with material and co-funding from the Sars International Centre. We are grateful to P. Wincker and the Génoscope team. We gratefully acknowledge the financial support provided by Génome Québec, the Canadian Research Chair and the Université de Montréal, and the Réseau Québecois de Calcul de Haute Performance for computational resources.
Author Contributions: H.P. conceived the study. D.C. contributed sequence data from the Oikopleura genome project. F.D., H.B. and H.P. assembled the data set and performed phylogenetic analyses. F.D. wrote the first draft of the manuscript and all authors contributed to the writing of its final version.
This file describes the protocol used to assemble the genomic data. It also provides more details on the phylogenetic analyses, including investigations of the effects of taxon sampling, compositional bias and heterotachy on tree reconstruction.
This table provides the list of all 146 gene names and the number of amino acid positions conserved for each gene alignment.
This table summarizes the amount and occurrence of missing data per taxon in the complete dataset.
This figure presents the most parsimonious tree obtained with Oikopleura dioica used as the single representative of tunicates.
This figure shows the maximum likelihood tree obtained with a reduced dataset using Oikopleura as the single representative of tunicates.
This figure presents a principal component analysis (PCA) of amino acid frequencies on the complete dataset.
This figure shows the maximum likelihood tree obtained with a reduced dataset in which the sea urchin (Strongylocentrotus) is removed from the complete dataset.
This figure shows the most parsimonious tree obtained from the complete dataset recoded into six Dayhoff categories.
This figure presents the maximum likelihood topology identified by the partitioned-likelihood analysis on the complete dataset.
This figure shows the majority rule consensus tree obtained from Bayesian analysis of the complete dataset under a covarion model.
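One of the analyses listed above uses the complete dataset recoded into six Dayhoff categories. As a rough illustration of what such recoding involves, the Python sketch below maps each amino acid to one of the six commonly used Dayhoff groups (AGPST, DENQ, HKR, ILMV, FWY, C). It is not the authors' pipeline: the function name `dayhoff_recode`, the digit labels, and the handling of gaps and ambiguous residues are assumptions made for this example.

```python
# Minimal sketch of six-state Dayhoff recoding of a protein sequence.
# The groups follow the commonly used six Dayhoff categories; the digit
# labels and the treatment of gaps/unknowns are assumptions of this sketch.

DAYHOFF_GROUPS = {
    "AGPST": "1",  # small residues
    "DENQ": "2",   # acidic residues and their amides
    "HKR": "3",    # basic residues
    "ILMV": "4",   # aliphatic hydrophobic residues
    "FWY": "5",    # aromatic residues
    "C": "6",      # cysteine
}

# Flatten the groups into a per-residue lookup table.
RECODE = {aa: label for group, label in DAYHOFF_GROUPS.items() for aa in group}


def dayhoff_recode(sequence, unknown="?"):
    """Return the sequence with each amino acid replaced by its group label.

    Alignment gaps ('-') are kept as they are; any other unrecognised
    character (for example 'X') is replaced by `unknown`.
    """
    recoded = []
    for residue in sequence.upper():
        if residue == "-":
            recoded.append("-")
        else:
            recoded.append(RECODE.get(residue, unknown))
    return "".join(recoded)


if __name__ == "__main__":
    print(dayhoff_recode("MK-X"))  # prints 43-?
```

Collapsing twenty states into six discards substitutions within a group, which is why recoding of this kind is typically used to reduce the influence of compositional bias and saturation on tree reconstruction.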
About this article
Cellular and Molecular Life Sciences (2018)