
Arthropod fossil data increase congruence of morphological and molecular phylogenies

Nature Communications volume 4, Article number: 2485 (2013)


The relationships of major arthropod clades have long been contentious, but refinements in molecular phylogenetics underpin an emerging consensus. Nevertheless, molecular phylogenies have recovered topologies that morphological phylogenies have not, including the placement of hexapods within a paraphyletic Crustacea, and an alliance between myriapods and chelicerates. Here we show enhanced congruence between molecular and morphological phylogenies based on 753 morphological characters for 309 fossil and Recent panarthropods. We resolve hexapods within Crustacea, with remipedes as their closest extant relatives, and show that the traditionally close relationship between myriapods and hexapods is an artefact of convergent character acquisition during terrestrialisation. The inclusion of fossil morphology mitigates long-branch artefacts as exemplified by pycnogonids: when fossils are included, they resolve with euchelicerates rather than as a sister taxon to all other euarthropods.


Arthropods are diverse, disparate, abundant and ubiquitous; they outnumber all other animal phyla combined. Five major extant groups can be distinguished (Fig. 1): pycnogonids (sea spiders), euchelicerates (horseshoe crabs and arachnids), myriapods (centipedes and millipedes), hexapods (insects and their flightless relatives) and crustaceans (crabs, lobsters, barnacles and so on). Each group is characterized by a distinct set of morphological features and their monophyly is little disputed, except for the crustaceans1,2,3,4. Molecular clock estimates calibrated by new fossil discoveries indicate that these groups originated and had begun to diversify by at least the mid-Cambrian5. Hence, they have had more than 500 million years to specialize and overprint ancestral characteristics, and thus few unequivocal features are informative with regard to their interrelationships. Molecular characters provide an alternative source of data that has partly alleviated this problem, although some resultant trees have recovered groupings with little morphological support. An example is a clade comprising chelicerates (pycnogonids and euchelicerates) and myriapods as sister taxa6,7,8,9 (Fig. 1a), considered so surprising it was named Paradoxopoda8 (alternatively Myriochelata9). Although a few neuroanatomical and developmental characters were later proposed as putative novelties of Paradoxopoda10,11, subsequent exploration of molecular data sets suggested that this grouping is a long-branch artefact3,12.

Figure 1: Current hypotheses of arthropod interrelationships.

(a) Paradoxopoda (Myriochelata) hypothesis with myriapods as sister taxon to Chelicerata. (b) Chelicerata/Mandibulata hypothesis with a clade composed of euchelicerates and pycnogonids (Chelicerata) as sister taxon to Mandibulata. (c) Cormogonida hypothesis with pycnogonids as sister taxon to all other euarthropods (Cormogonida).

Diverse molecular data sources support a close relationship between hexapods and crustaceans (collectively known as Tetraconata or Pancrustacea), either as sister taxa6, a result also favoured by some morphological evidence13,14, or more typically with hexapods nested within a paraphyletic Crustacea. The latter is supported by a number of independent lines of evidence, including nuclear ribosomal genes8,15, mitochondrial genomes and gene order, nuclear protein-coding genes1 and transcriptomics2,3,4, but has so far remained elusive in morphological phylogenies, apart from those based solely on neural characters16, which resolve malacostracan crustaceans closer to hexapods than to branchiopods. The position of the pycnogonids is equally controversial, being variously resolved as the closest relatives of euchelicerates1,3,15 (Fig. 1a,b) or as sister taxon to all other euarthropods17 (Fig. 1c; the ‘Cormogonida hypothesis’). The relative paucity of Recent morphological characters that unite pycnogonids with other arthropods or unite hexapods with any particular crustacean group to the exclusion of other groups has hampered attempts to remove long-branch artefacts and decide between alternative hypotheses. Inclusion of fossil taxa, however, provides a possible mechanism for sampling ancestral morphologies and extinct character combinations; fossil morphology has been shown in other phylogenies to mitigate long-branch biases18,19. For this reason, we undertook a large-scale phylogenetic analysis that incorporates data from a total of 309 panarthropods (plus two non-panarthropod ecdysozoans), including all major extinct and extant panarthropod groups. The 753 characters primarily describe morphology (703 characters), but are supplemented with additional data from development (29 characters), behaviour (6 characters) and gene order and gene expression (15 characters). 
The latter were included because they are analysed like morphology (amenable to absence/presence coding) rather than like sequence data. These characters were optimized using both equal character weighting and implied character weighting20 with a range of concavity constants (k=2, 3 and 10). Compared with previous morphology-based analyses24, this study more than doubles the number of fossil terminals (n=215). For the first time, the sample of fossil taxa includes most of the best-known arthropods from all major Cambrian to Devonian Konservat Lagerstätten, including Chengjiang, Sirius Passet, the Emu Bay Shale, the Burgess Shale, Swedish Orsten, Herefordshire and Hunsrück.

This analysis demonstrates the importance of including fossil data in large-scale phylogenetic analyses and helps to resolve long-standing conflicts regarding the relationships of crown-group arthropods.


The plesiomorphic condition of Euarthropoda

Each analysis recovered a fundamental split in the arthropod crown group (Euarthropoda) between Chelicerata and Mandibulata (myriapods, hexapods and crustaceans); both of these clades have a diverse fossil stem group (Fig. 2). The mandibulate stem group is composed of marrellomorphs, Agnostus, and a variety of other Cambrian Orsten taxa, including phosphatocopines. Successive outgroups of Chelicerata include vicissicaudates (aglaspidids, cheloniellids, xenopods and Sanctacaris) and a paraphyletic assemblage of trilobitomorphs. Most of these taxa (including trilobites) have traditionally been regarded as stem chelicerates under the Arachnomorpha hypothesis21,22, but more recent hypotheses regarding the organization of the arthropod head prompted their assignment to total-group Mandibulata23. These and some subsequent studies considered deutocerebral antennae to be an autapomorphy of total-group Mandibulata and the raptorial first pair of appendages of pycnogonids and euchelicerates to be a symplesiomorphy of Euarthropoda19,24. Under this scheme, the raptorial appendages of stem-group euarthropods such as megacheirans (‘great-appendage’ arthropods), fuxianhuiids and bivalved stem-group arthropods (for example, Odaraia, Canadaspis and Perspicaris) were considered homologous to the chelicerae of chelicerates, and any antenniform appendages anterior to this were considered segmentally homologous to the antennae of onychophorans, that is, protocerebral rather than deutocerebral. Recent studies of Fuxianhuia and other closely related taxa25,26, however, indicate that the antennae are in fact deutocerebral. This finding implies that their post-antennal appendages are not homologous to chelicerae, and that mandibulate antennae are homologous to the antennae of many or most members of the euarthropod stem group.
Our analyses resolve deutocerebral antennae as the symplesiomorphic condition for Euarthropoda, with chelicerae being a transformation of them and an unequivocal autapomorphy of Chelicerata (Fig. 2).

Figure 2: Summary of relationships amongst major arthropod taxa based on 311 taxa and 753 characters.

A summary of all results produced using implied weighting (k=2, 3 and 10). The equally weighted tree differs mainly in the position of hexapods, which resolve as sister taxon to all other pancrustaceans (monophyletic Crustacea). A full tree showing the position of all terminals is presented in Fig. 4. Paraphyletic taxa are indicated with a double line; numbers in parentheses indicate the number of terminals analysed within each group. Colours on branches show the transformation between antennae and chelicerae. Lineages possessing an anterior (presumably deutocerebral) pair of antennae are indicated in green, chelicerae-bearing lineages in red, taxa that have secondarily lost their deutocerebral appendages in blue, and taxa lacking specialized arthropodized cephalic appendages in black. Numbers associated with nodes are selected Group present/Contradicted support values.

The phylogenetic position of Pycnogonida

Our phylogeny accordingly resolves pycnogonids and euchelicerates as sister taxa, although few characters beyond their shared chelicerae/chelifores support this placement. When fossils are removed from the data set (DS-II and DS-V; Table 1), pycnogonids are instead recovered as sister group of all other arthropods (Cormogonida) (Fig. 3). Many characters supporting the monophyly of Cormogonida in this latter tree, such as the presence of a telson, occur in the pycnogonid stem lineage and, hence, do not support Cormogonida in the full analysis (DS-I). Our full analysis recovers a long stem lineage for the Euarthropoda (Figs 2 and 4a,b), comprising lobopodians, dinocaridids, bivalved arthropods, fuxianhuiids and megacheirans, consistent with a few previous analyses24,27. Many of the ‘typical’ euarthropod features, such as compound eyes, an arthrodized trunk, arthropodized limbs and specialised head appendages, were gradually acquired in the euarthropod stem lineage24; when these stem-group exemplars were removed from the data set (DS-III), Cormogonida was again recovered (Fig. 3). We hence interpret Cormogonida to result from an attraction between Euchelicerata and Mandibulata caused by the secondary reduction of typical euarthropod characters in pycnogonids.

Table 1: Composition of data sets analysed in this study.
Figure 3: Relationships amongst major arthropod taxa when fossils were removed from the data set.

Numbers associated with nodes are Group present/Contradicted support values based on analyses using implied weighting (k=3). Numbers in parentheses indicate the number of taxa analysed within each group.

Figure 4: Phylogeny of Panarthropoda.

Strict consensus of 45 MPTs (Most Parsimonious Trees) of 13,987,601 steps (adjusted homoplasy=592; consistency index (CI)=0.520, Retention index (RI)=0.873) produced using implied character weighting (k=3). Numbers associated with nodes are Group present/Contradicted support values. (a) Non-arthropod panarthropods and cycloneuralians. (b) Radiodontans and upper stem-group euarthropods. (c) Total-group Mandibulata and Bradoriida. (d) Stem-group chelicerates (Artiopoda). (e) Trilobitomorpha sensu stricto. (f) Chelicerata (including Xiphosura and stem-group Arachnida). (g) Arachnida. (h) Pancrustacea. (i) Miracrustacea.

The phylogenetic position of Hexapoda

A sister-group relationship between Hexapoda and Crustacea is recovered in the extant-only (Fig. 3) and equally weighted trees, but not in the full analysis with implied weights, where Crustacea is paraphyletic with respect to Hexapoda (Figs 2 and 4h). The latter result (crustacean paraphyly) arises from the use of methodologies that are more philosophically sound, that is, implied character weights rather than equal weights (see Methods for justification of implied weights), and more comprehensive taxon sampling, that is, including extinct and extant taxa rather than extant alone.

When myriapods were removed from these data sets, the remipedes (plus the Silurian fossil Tanazios28 in the total data set) resolved as the closest extant sister taxon to hexapods (and euthycarcinoids), thus retrieving crustacean paraphyly and mirroring the molecular support for a remipede sister group to hexapods1,2,29 (Fig. 3). Apomorphies for a clade of remipedes and hexapods (Miracrustacea in partim) are in part influenced by character states in their fossil sister groups (Tanazios and euthycarcinoids), such as the apparent presence of an intercalary segment in Tanazios30 versus a second antenna in extant remipedes, whereas others are sourced from internal anatomy of extant taxa31. Apomorphies of a remipede–hexapod clade are instead resolved as symplesiomorphic for Tetraconata in the extant-only data set (DS-II) because of their shared presence in myriapods.


The total-evidence topology (Figs 2 and 4) indicates that characters supporting a close relationship between myriapods and hexapods were convergently acquired. Among these are the presence of a limbless intercalary segment in the head, uniramous appendages, tracheae, Malpighian tubules as ectodermal extensions of the hindgut and a tentorial endoskeleton. Some of these are variously present in a number of other arthropods, such as uniramy of the cephalic appendages in extant arachnids. Some homoplasy may stem from common adaptation to a terrestrial environment; this was tested using selective deactivation of characters (DS-IV and DS-V; details in Methods). Those characters linked to a terrestrial ecology, such as uniramy, were found to have the greatest effect on topology (DS-IV) and their deactivation recovered topologies in which hexapods grouped with remipedes rather than with myriapods. We therefore conclude that crustacean monophyly is influenced by the convergent acquisition of terrestrial adaptations in myriapods and hexapods, with the result that crustaceans attract to each other when hexapods group with myriapods. The continued attraction between myriapods and hexapods in the equally weighted analyses may be due to the paucity of stem-group representatives of these lineages—although euthycarcinoids resolved as sister taxon to hexapods in the current study, they share few characters to place them unambiguously. These results are not biased by the inclusion of characters that have little or no fossilization potential, such as embryonic development, behaviour, gene order and gene expression; higher-level relationships remained stable when only morphological characters were included (DS-VI, which deactivated those characters).

Although our results show increased congruence with molecular phylogenies for the deep divergences within Euarthropoda, they are less congruent with regard to some of the internal relationships of these clades, and in many cases resolve ‘traditional’ morphological groupings, such as a basal position for scorpions and opilionids within Arachnida (Fig. 4g) and the grouping of Entomostraca as a clade within the Pancrustacea (Fig. 4h). The arachnid example may reflect a paucity of fossils near the timing of cladogenesis. For example, many arachnid orders are well established in the Carboniferous and are assignable to extant clades; terrestrialization likely occurred in the late Cambrian–Ordovician5, but the fossil record before the Devonian is poor, and hence can contribute few informative character combinations to analyses. In the morphology-only data set (DS-VI), some extant internal nodes collapsed but the general relationships amongst larger clades, for example, Euchelicerata, remain consistent.

Increased congruence with molecular results in our full data set analysis when compared with our extant-only analysis provides clear evidence that the addition of fossils improves the results of morphological parsimony analysis, as intermediate morphologies allow breaking up long branches and provide a root for character polarization. We hence advocate the inclusion of fossils in any large-scale phylogenetic analysis of morphological data.


Taxon and character sampling

The current data set is based on ref. 24. The 173 taxa and 580 characters used therein (Supplementary Note 1) were supplemented with a further 138 taxa and 173 characters (Supplementary Note 2); the total data set consists of 311 taxa and 753 characters. Of these 311 taxa, two represent non-panarthropod ecdysozoans (Caenorhabditis and Priapulus) and were used as universal outgroups, and 25 represent non-arthropod Panarthropoda, including two extant tardigrades and two extant onychophorans. The remaining 284 arthropods included 194 fossils and 90 extant exemplars (the latter consisting of 3 pycnogonids, 21 euchelicerates, 13 myriapods, 13 hexapods and 40 crustaceans).
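The sampling figures quoted above and in the introduction can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Counts quoted in the text; no new data are introduced here.
base_taxa, base_chars = 173, 580      # data set of ref. 24
added_taxa, added_chars = 138, 173    # added in this study

total_taxa = base_taxa + added_taxa
total_chars = base_chars + added_chars

outgroups = 2                          # Caenorhabditis and Priapulus
non_arthropod_panarthropods = 25       # includes 4 extant terminals
arthropods = total_taxa - outgroups - non_arthropod_panarthropods

fossil_arthropods = 194
extant_by_group = {
    "pycnogonids": 3,
    "euchelicerates": 21,
    "myriapods": 13,
    "hexapods": 13,
    "crustaceans": 40,
}
extant_arthropods = sum(extant_by_group.values())

# Fossil terminals: fossil arthropods plus the fossil non-arthropod
# panarthropods (25 minus two tardigrades and two onychophorans).
fossil_terminals = fossil_arthropods + (non_arthropod_panarthropods - 4)

# Character classes listed in the introduction.
char_classes = {
    "morphology": 703,
    "development": 29,
    "behaviour": 6,
    "gene order and gene expression": 15,
}

print(total_taxa, total_chars, arthropods, extant_arthropods, fossil_terminals)
```

The totals agree with the 311 taxa, 753 characters and 215 fossil terminals reported elsewhere in the article.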

Cladistic analysis

All versions of this data set (see below) were converted into NEXUS file format32 (Supplementary Data 1) and analysed using TNT v.1.1 (Tree analysis using New Technology)33. The large size of the data set makes the probability of finding local optima very high and therefore necessitates the use of New Technology Search options34. These included 100 Random addition sequences with Parsimony Ratcheting35, Sectorial Searches, Tree Drifting and Tree Fusing34. Experimentation with these settings revealed that default options were sufficient for finding optimal trees. All characters were treated as non-additive (unordered) and weighted using both equal and implied character weighting options (see below). Nodal support was measured using Symmetric Resampling36. This measure is most appropriate to implied weights, because (unlike bootstrapping or jackknifing) it is not affected by character weighting and transformation costs36. Symmetric resampling used 1,000 replicates, each involving a New Technology search with a change probability of 33%. Nodal support values are expressed as Group present/Contradicted frequency differences (Fig. 4). To determine the impact of both character and taxon inclusion, a set of experiments was undertaken in which either particular classes of characters (for example, characters associated with terrestrialization) or taxa were selectively deactivated (see main text). These subsets of the data were then rerun using the methodology outlined above.
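A Group present/Contradicted (GC) value can be illustrated with a small sketch: the frequency of a group across resampled trees minus the frequency of its most frequent contradicting group. The code below is not TNT's implementation and does not reproduce the resampling step itself. The representation of a tree as a set of clades, the function names and the four toy replicates are all assumptions made for illustration.

```python
from collections import Counter


def conflicts(a, b):
    """Two groups conflict if they share taxa but neither contains the other."""
    return bool(a & b) and not (a <= b or b <= a)


def gc_value(group, replicate_trees):
    """Percentage of replicates containing `group` minus the percentage
    containing its most frequent contradicting group."""
    present = sum(group in tree for tree in replicate_trees)
    contradicting = Counter(
        clade
        for tree in replicate_trees
        for clade in tree
        if conflicts(group, clade)
    )
    most_frequent_rival = max(contradicting.values(), default=0)
    return 100.0 * (present - most_frequent_rival) / len(replicate_trees)


# Toy replicates (hypothetical): each tree is a set of clades,
# and each clade is a frozenset of terminal names.
chelicerata = frozenset({"Pycnogonida", "Euchelicerata"})
cormogonida = frozenset({"Euchelicerata", "Mandibulata"})
replicates = [{chelicerata}, {chelicerata}, {chelicerata}, {cormogonida}]

gc_chelicerata = gc_value(chelicerata, replicates)  # (3 - 1) / 4 of replicates
gc_cormogonida = gc_value(cormogonida, replicates)  # (1 - 3) / 4 of replicates
print(gc_chelicerata, gc_cormogonida)
```

In this toy case the group found in three of four replicates scores 50, and the rival group scores -50, which shows why GC values can be negative for groups that are contradicted more often than they are recovered.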

Given the large size of the data sets, it is computationally unfeasible to undertake selective higher-order taxon jackknifing. For this reason, individual taxa or selections of taxa were selected using a random number generator. Twenty-five random replicates were undertaken (DS-V).
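The random selection of taxa for deactivation could be scripted along the following lines. This is a minimal sketch only: the text does not state how many taxa were deactivated per replicate, so `subset_size`, the seed and the placeholder taxon names are hypothetical.

```python
import random


def random_deactivation_sets(taxa, n_replicates=25, subset_size=5, seed=0):
    """Draw one random set of taxa to deactivate for each replicate.

    subset_size and seed are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(taxa, subset_size)) for _ in range(n_replicates)]


# Placeholder names standing in for the 311 terminals of the data set.
taxa = [f"taxon_{i:03d}" for i in range(1, 312)]
replicate_sets = random_deactivation_sets(taxa)

for number, deactivated in enumerate(replicate_sets[:3], start=1):
    print(number, deactivated)
```

Fixing the seed makes the 25 replicate sets reproducible, so each reduced data set can be regenerated and rerun with the same search settings.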

Implied weighting

Justification for differential character weighting, particularly implied weighting, has been given elsewhere20,24,37. In summary, equal character weighting is only appropriate in analyses with no potential homoplasy, although this is rarely, if ever, the case. Most methods of differential character weighting require either a priori weighting or a posteriori weighting. Both of these require either an ad hoc assumption of character importance or reference to a current topology and thus can lead to circular reasoning, that is, weighting is based on a topology, which in turn was based on weighting. Implied weighting has been proposed as a method to overcome this logical impasse37. During implied weighting, characters are weighted during tree searches and the resultant Most Parsimonious Trees are compared to determine the maximum total character fit. The character fit is determined as a function of homoplasy such that those characters with most homoplasy will have a lower character fit. The most parsimonious tree is therefore the one with the greatest character fit. Unlike other character-weighting methods, which may produce a tree longer than those implied if characters were equally weighted, implied weighting is self-consistent, that is, it will only produce trees shorter under the weights they imply. Character fit can be adjusted using a concavity constant (k), where k determines how much a character is downweighted based on its level of homoplasy. The default option for TNT is k=3, a near-linear decreasing function, and is the constant preferred here, as it resolves relationships in favour of those with less homoplasy34, whereas a concavity constant <3 would resolve relationships in favour of more homoplasy, but increase the overall character usage. All analyses were undertaken using a variety of concavity constants (2, 3, 5 and 10) to determine the effect of character weighting on hypotheses of relationship.
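The way total character fit ranks trees can be sketched numerically. The text does not spell out the fit function, so the code assumes the concave form usually given for implied weighting, fit = k / (k + h), where h is the number of extra (homoplastic) steps a character takes on a tree. The two three-character "trees" are hypothetical and serve only to show that the tree with the greatest total fit need not be the shortest under equal weights.

```python
def character_fit(extra_steps, k):
    """Fit of one character with `extra_steps` homoplastic steps.

    Assumes the concave function k / (k + h); the formula itself is
    not stated in the text.
    """
    return k / (k + extra_steps)


def total_fit(extra_steps_per_character, k):
    """Sum of character fits for one tree; higher is preferred."""
    return sum(character_fit(h, k) for h in extra_steps_per_character)


# Hypothetical extra steps for three characters on two candidate trees.
tree_a = [0, 0, 4]   # one highly homoplastic character
tree_b = [1, 1, 1]   # homoplasy spread evenly

for k in (2, 3, 5, 10):
    print(k, round(total_fit(tree_a, k), 3), round(total_fit(tree_b, k), 3))

# Under equal weights the comparison is simply the sum of extra steps.
equal_weights_prefers_b = sum(tree_b) < sum(tree_a)
implied_weights_prefers_a = total_fit(tree_a, 3) > total_fit(tree_b, 3)
```

With k=3, tree_a has the greater total fit although it requires one more step than tree_b, so the two weighting schemes prefer different trees in this toy case.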

Additional information

How to cite this article: Legg, D. A. et al. Arthropod fossil data increase congruence of morphological and molecular phylogenies. Nat. Commun. 4:2485 doi: 10.1038/ncomms3485 (2013).


  1. et al. Arthropod relationships revealed by phylogenomic analysis of nuclear protein-coding sequences. Nature 463, 1079–1083 (2010).

  2. et al. Pancrustacean phylogeny in the light of new phylogenomic data: support for Remipedia as the possible sister group of Hexapoda. Mol. Biol. Evol. 29, 1031–1045 (2012).

  3. et al. A congruent solution to arthropod phylogeny: phylogenomics, microRNAs and morphology support monophyletic Mandibulata. Proc. R. Soc. B 278, 298–306 (2011).

  4. Phylotranscriptomics to bring the understudied into the fold: monophyletic Ostracoda, fossil placement and pancrustacean phylogeny. Mol. Biol. Evol. 30, 215–233 (2013).

  5. Molecular timetrees reveal a Cambrian colonisation of land and a new scenario for ecdysozoan evolution. Curr. Biol. 23, 1–7 (2013).

  6. Ribosomal DNA phylogeny of the major extant arthropod classes and the evolution of myriapods. Nature 376, 165–167 (1995).

  7. Mitochondrial protein phylogeny joins myriapods and chelicerates. Nature 413, 154–157 (2001).

  8. Ecdysozoan phylogeny and Bayesian inference: first use of nearly complete 28S and 18S rRNA gene sequences to classify arthropods and their kin. Mol. Phylogenet. Evol. 31, 178–191 (2004).

  9. The colonization of land by animals: molecular phylogeny and divergence times among arthropods. BMC Biol. 2, 1–10 (2004).

  10. Velvet worm development links myriapods with chelicerates. Proc. R. Soc. B 276, 3571–3579 (2009).

  11. Neurogenesis in myriapods and chelicerates and its importance for understanding arthropod relationships. Integr. Comp. Biol. 46, 195–206 (2006).

  12. A review of long-branch attraction. Cladistics 21, 163–193 (2005).

  13. The position of crustaceans within Arthropoda—evidence from nine molecular loci and morphology. Crust. Issues 16, 307–352 (2004).

  14. Phylogenetic relationships of basal hexapods among mandibulate arthropods: a cladistic analysis based on comparative morphological characters. Zool. Scr. 33, 511–550 (2004).

  15. et al. Can comprehensive background knowledge be incorporated into substitution models to improve phylogenetic analyses? BMC Evol. Biol. 9, 119 (2009).

  16. A new view of insect-crustacean relationships I. Inferences from neural cladistics and comparative neuroanatomy. Arthropod Struct. Dev. 40, 276–288 (2011).

  17. Arthropod phylogeny based on eight molecular loci and morphology. Nature 413, 157–161 (2001).

  18. Can incomplete taxa rescue phylogenetic analyses from long-branch attraction? Syst. Biol. 54, 731–742 (2005).

  19. Arthropod phylogeny: an overview from the perspectives of morphology, molecular data and the fossil record. Arthropod Struct. Dev. 39, 74–87 (2010).

  20. Weighting against homoplasy improves phylogenetic analysis of morphological datasets. Cladistics 24, 758–773 (2008).

  21. in Arthropod Fossils and Phylogeny (ed. Edgecombe, G. D.) 33–105 (Columbia University Press, Columbia, 1998).

  22. The phylogeny of arachnomorph arthropods and the origin of Chelicerata. Trans. R. Soc. Edinburgh Earth Sci. 94, 169–193 (2004).

  23. The evolution of arthropod heads: reconciling morphological, developmental and palaeontological evidence. Dev. Genes Evol. 216, 395–415 (2006).

  24. Cambrian bivalved arthropod reveals origin of arthrodization. Proc. R. Soc. B 279, 4699–4704 (2012).

  25. Complex brain and optic lobes in an early Cambrian arthropod. Nature 490, 258–261 (2012).

  26. Specialized appendages in fuxianhuiids and the head organization of early arthropods. Nature 494, 468–471 (2013).

  27. A palaeontological solution to the arthropod head problem. Nature 417, 271–275 (2002).

  28. A new probable stem lineage crustacean with three-dimensionally preserved soft parts from the Herefordshire (Silurian) Lagerstätte, UK. Proc. R. Soc. B 274, 2099–2107 (2007).

  29. Hemocyanin suggests a close relationship of Remipedia and Hexapoda. Mol. Biol. Evol. 26, 2711–2718 (2009).

  30. Crustacean classification: on-going controversies and unresolved problems. Zootaxa 1668, 313–325 (2007).

  31. Ovary structure and early oogenesis in the remipede Godzilliognomus frondosus (Crustacea, Remipedia): phylogenetic implications. Zoology 115, 261–269 (2012).

  32. NEXUS: an extensible file format for systematic information. Syst. Biol. 46, 590–621 (1997).

  33. TNT, a free program for phylogenetic analysis. Cladistics 24, 774–786 (2008).

  34. Analyzing large datasets in reasonable times: solutions for composite optima. Cladistics 15, 415–428 (1999).

  35. The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics 15, 407–414 (1999).

  36. et al. Improvements to resampling measures of group support. Cladistics 19, 324–332 (2003).

  37. Estimating character weights during tree search. Cladistics 9, 83–91 (1993).



D.A.L. thanks APSOMA, particularly Javier Ortega-Hernández, Allison Daley, Xiaoya Ma and Jo Wolfe for discussion. D.A.L. is funded by a Janet Watson Scholarship (Imperial College London).

Author information


  1. Department of Earth Sciences and Engineering, Royal School of Mines, Imperial College London, London SW7 2AZ, UK: David A. Legg & Mark D. Sutton

  2. Department of Earth Sciences, The Natural History Museum, London SW7 5BD, UK: David A. Legg & Gregory D. Edgecombe

  3. Oxford University Museum of Natural History, Oxford OX1 3PW, UK: David A. Legg



The data set was constructed by D.A.L. and G.D.E. with input by M.D.S., and was analysed by D.A.L. All authors contributed equally to writing this work.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to David A. Legg.

Supplementary information

PDF files

  1. Supplementary Notes and References: Supplementary Notes 1-2 and Supplementary References

Text files

  1. Supplementary Data 1: NEXUS file for 311 taxa and 753 characters
