Molecular evidence from retroposons that whales form a clade within even-toed ungulates


The origin of whales and their transition from terrestrial life to a fully aquatic existence has been studied in depth. Palaeontological1,2, morphological3 and molecular studies4,5,6,7 suggest that the order Cetacea (whales, dolphins and porpoises) is more closely related to the order Artiodactyla (even-toed ungulates, including cows, camels and pigs) than to other ungulate orders. The traditional view that the order Artiodactyla is monophyletic has been challenged by molecular analyses of variations in mitochondrial and nuclear DNA5,6,7. We have characterized two families of short interspersed elements (SINEs) that were present exclusively in the genomes of whales, ruminants and hippopotamuses, but not in those of camels and pigs. We made an extensive survey of retropositional events that might have occurred during the divergence of whales and even-toed ungulates. We have characterized nine retropositional events of a SINE unit, each of which provides phylogenetic resolution of the relationships among whales, ruminants, hippopotamuses and pigs. Our data provide evidence that whales, ruminants and hippopotamuses form a monophyletic group.


We attempted to resolve the issue of whether the order Artiodactyla is monophyletic or paraphyletic by basing our analysis on the presence or absence of SINEs at particular orthologous loci of certain groups of species. SINEs are retroposons that have been amplified and integrated into genomes by retroposition8,9,10,11, that is, by the integration of a reverse-transcribed copy of RNA. As a consequence of the nature of retroposons, SINEs can be found specifically within members of a particular clade10,11,12,13. It is generally believed that SINEs are not excised precisely and, moreover, that SINEs have not been inserted independently at orthologous loci within different evolutionary lineages. These features mean that SINEs are very useful for the reconstruction of phylogenetic relationships among closely related species12,13.

We have characterized two new and different families of SINEs, designated the CHR-1 (for Cetacea, hippopotamus and Ruminantia) and CHR-2 family of repeats, from the genomes of several species of whales. The consensus sequences of these two families of SINEs are shown in Fig. 1a. The order Artiodactyla is traditionally divided into three suborders: Ruminantia (chevrotains, deer, cows, sheep), Tylopoda (camels) and Suiformes (pigs, peccaries and hippopotamuses). Dot-hybridization studies showed that these two families of SINEs are distributed extensively in the genomes of Cetacea, Ruminantia and hippopotamus, but were not detected in those of Tylopoda or of Suiformes other than the hippopotamus (Fig. 1b). These results suggest that whales, ruminants and hippopotamuses form a monophyletic group. This possibility prompted us to isolate specific genomic loci at which SINEs had been inserted.

Figure 1: The two newly isolated families of SINEs, CHR-1 and CHR-2, were present exclusively in the genomes of whales, ruminants and hippopotamuses.

a, The consensus sequences of CHR-1 and CHR-2. The tRNA-derived region is underlined in each case. These sequences will appear in the DDBJ, EMBL and GenBank databases under the accession numbers: AB005033 and AB005034. b, Dot hybridization experiment.

The first approach to this involved random screening, to identify loci that contained a CHR-1 or CHR-2 SINE unit, followed by cloning and sequencing. Polymerase chain reactions (PCRs) were performed with genomic DNA from various cetacean and artiodactyl species to determine whether or not the locus might be informative from a phylogenetic perspective. Second, we performed a comprehensive survey of the protein-coding genes, in standard databases, in which an intron contained one unit of CHR-1 or CHR-2. When the length of the intron was short enough for generation of a PCR product from the entire intron, we designed one set of primers by reference to the sequences of exons. We used these two approaches to characterize seven different loci with a CHR-1 or CHR-2 SINE unit, as described below.

Our analysis indicates that a CHR-2 SINE was integrated at the locus Pm52 in a common ancestor of cetaceans (Fig. 2A). The patterns of PCR products are shown in Fig. 2A, a. We performed hybridization experiments with the SINE sequence to confirm that the SINE unit had been integrated in a common ancestor of all cetaceans (Fig. 2A, b), and with the flanking sequence to confirm that the orthologous locus of each species had been amplified accurately (Fig. 2A, c). The presence of the SINE unit in longer fragments (about 620 base pairs (bp) in length) in cetaceans (lanes 1–7) and the absence of the SINE unit in shorter fragments (about 230 bp in length) in artiodactyls (lanes 8–15) were confirmed by sequencing. The small fluctuations in fragment length were due to insertions and deletions of several nucleotides (data not shown). The Pm72 locus yielded similar results (Fig. 2B). The SINE insertions at these two loci indicate that the order Cetacea forms a monophyletic group.
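The size comparison that underlies this first-pass scoring can be sketched as follows. The fragment lengths of about 620 and 230 base pairs are taken from the text; the function name and the tolerance for small insertions and deletions are illustrative assumptions, not part of the reported method.

```python
# Approximate PCR fragment lengths reported for the Pm52 locus (base pairs).
WITH_SINE_BP = 620     # longer fragment, seen in cetaceans
WITHOUT_SINE_BP = 230  # shorter fragment, seen in artiodactyls

# Size difference attributable to the insertion at this locus.
insert_size_bp = WITH_SINE_BP - WITHOUT_SINE_BP


def score_fragment(length_bp, tolerance_bp=60):
    """Classify an observed fragment length as 'present', 'absent' or 'ambiguous'.

    tolerance_bp is a hypothetical allowance for the small length
    fluctuations caused by indels of a few nucleotides.
    """
    if abs(length_bp - WITH_SINE_BP) <= tolerance_bp:
        return "present"
    if abs(length_bp - WITHOUT_SINE_BP) <= tolerance_bp:
        return "absent"
    return "ambiguous"
```

A length-based call of this kind is only provisional. In the work described here, each call was checked by hybridization with SINE and flanking-sequence probes and by sequencing.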

Figure 2: Analysis of the seven loci at which a SINE unit(s) was inserted during the evolution of cetaceans, ruminants and hippopotamuses: A, Pm52; B, Pm72; C, pgha3; D, c21-352; E, Gm5; F, aaa228; G, aaa792.

a, Products of PCR; b, c, results of hybridization experiments with different kinds of probe, namely, a unit sequence of the SINE (b) and the flanking sequence (c), respectively. In G, d and e show results of hybridization experiments with two different SINE probes, the CHR-2 SINE and the Bov-tA SINE, respectively.

The locus pgha3, at which a CHR-1 SINE was integrated in intron C of the gene for the α-subunit of a pituitary glycoprotein hormone (Fig. 2C), and locus c21-352, at which a CHR-1 SINE was integrated in intron C of the gene for steroid 21-hydroxylase (Fig. 2D), demonstrate the monophyly of ruminants.

Locus Gm5 was isolated by random screening using a CHR-1 SINE as a probe. The SINE unit seems to have been integrated in a common ancestor of cetaceans, ruminants and hippopotamuses, suggesting that these three evolutionary lineages are monophyletic (Fig. 2E). The sequences of the fragments from the short-finned pilot whale, cow, hippopotamus and Bactrian camel confirmed the presence of the SINE unit in the longer fragments and its absence in the shorter fragment. In lane 15 for the pig, a longer band was detected, but sequencing showed that this was due to insertion of another SINE unit, PRE-1 (ref. 14), at another site of this locus (data not shown).

The loci aaa228 and aaa792 are both derived from the gene for the α-subunit of the F(0)F(1) ATP synthase (designated the atpA1 gene) in the bovine genome. At locus aaa228, a CHR-1 SINE is present in the intron between exons 2 and 3, suggesting monophyly of cetaceans, ruminants and hippopotamuses (Fig. 2F, a). Hybridization experiments using two different kinds of probe (Fig. 2F, b and c) confirmed this conclusion.

Locus aaa792, between exons 10 and 11, is more complex. Three different families of SINEs (CHR-1, CHR-2 and Bov-tA15) became associated independently with this locus during the evolution of cetaceans and even-toed ungulates. The first integration event, involving a CHR-1 SINE, occurred in a common ancestor of cetaceans, ruminants and hippopotamuses (Fig. 2G, a–c). The pattern of PCR products is shown in Fig. 2G, a. Hybridization experiments with the CHR-1 sequence (Fig. 2G, b) and the flanking sequence (Fig. 2G, c) as probes showed that the SINE unit was integrated at the orthologous loci of the species designated above lanes 1–13, but not at those of the camel (lane 14) or the pig (lane 15). These results confirm the monophyly of cetaceans, ruminants and hippopotamuses, excluding camels and pigs. The lengths of the fragments generated by PCR (Fig. 2G, a) varied among species (lanes 1–13), and we deduced from the sequences of the main fragments that two other kinds of SINE were involved at this locus. Experiments using new probes confirmed that a CHR-2 SINE was integrated in the lineage of the minke and humpback whales (Fig. 2G, d, lanes 1 and 2), and that a Bov-tA SINE was integrated in the lineage of the pecora (cows, sheep, deer and giraffes), indicating that this lineage forms a monophyletic group (Fig. 2G, e, lanes 8–11). The sequences of major fragments are shown in Fig. 3.

Figure 3: An alignment of sequences of the aaa792 locus in the cow (BT, Bos taurus), minke whale (BA, Balaenoptera acutorostrata), hippopotamus (HA, Hippopotamus amphibius) and Bactrian camel (CB, Camelus bactrianus).

Boxed sequences indicate direct repeats arising from duplication upon retroinsertion. The underlined sequences show the sequences used for primers. Bars indicate deletions. Nucleotides identical to those in the cow are indicated by dots.

All results for the seven loci are congruent (Fig. 4), and provide conclusive evidence for the paraphyly of the order Artiodactyla, which should include the order Cetacea, and for the paraphyly of the suborder Suiformes, from which hippopotamuses should be excluded. Hippopotamuses form a monophyletic group with cetaceans and ruminants.
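The congruence claim can be illustrated with a small presence/absence check, sketched here under stated assumptions. Each insertion event is recorded as the set of lineages that carry the SINE, as described in the text. The seven terminal labels (for example "other_cetaceans" and "other_ruminants") are a simplification introduced for this sketch, and the function names are hypothetical. Two insertions are compatible with a single tree if their carrier sets are nested or disjoint.

```python
from itertools import combinations

# Carrier sets for the three-lineage insertions and their subgroups.
CETACEANS = {"minke_humpback_whales", "other_cetaceans"}
RUMINANTS = {"pecora", "other_ruminants"}
CHR_GROUP = CETACEANS | RUMINANTS | {"hippopotamus"}

# One entry per insertion event; camel and pig carry none of them.
INSERTIONS = {
    "Pm52": CETACEANS,
    "Pm72": CETACEANS,
    "pgha3": RUMINANTS,
    "c21-352": RUMINANTS,
    "Gm5": CHR_GROUP,
    "aaa228": CHR_GROUP,
    "aaa792 (CHR-1)": CHR_GROUP,
    "aaa792 (CHR-2)": {"minke_humpback_whales"},
    "aaa792 (Bov-tA)": {"pecora"},
}


def compatible(a, b):
    """Two carrier sets fit one tree if they are nested or disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def all_compatible(insertions):
    """Check every pair of insertion events for compatibility."""
    return all(compatible(a, b) for a, b in combinations(insertions.values(), 2))


def supported_clades(insertions):
    """Group insertion events by the clade (carrier set) they support."""
    clades = {}
    for event, carriers in insertions.items():
        clades.setdefault(frozenset(carriers), []).append(event)
    return clades
```

With these inputs every pair of events is nested or disjoint, so the nine events fit a single tree. A hypothetical insertion shared only by pigs and pecorans would fail the check, because its carrier set would overlap the ruminant set without being nested in it.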

Figure 4: Phylogenetic relationships among cetaceans and artiodactyls, as deduced from the sites of insertion of SINEs.

Arrows indicate the timing of insertion of SINEs. Types of SINE are shown in parentheses.

The inclusion of cetaceans within the order Artiodactyla has been proposed previously5, as the Ruminantia/Cetacea clade with the Suiformes (pigs and peccaries) as an outgroup. The possibility of clustering the hippopotamus with the Cetacea has also been suggested6,7, even though hippopotamuses have traditionally been grouped with pigs and peccaries on morphological grounds16. However, careful reanalyses of available molecular data17,18,19 indicate that the hypothesis of artiodactyl paraphyly was not supported convincingly from a statistical point of view. More recently, analyses of genes for milk casein7 provided new, convincing support for the ((cetaceans, hippopotamuses), ruminants) tree, supporting a previous hypothesis5. Our analysis of SINE retropositions seems to provide unambiguous support for that hypothesis.

The conclusions from our retropositional analysis are inconsistent with earlier morphologically based hypotheses16,20,21. Palaeontological and morphological data suggest that modern whales originated from the Archaeocetes (primitive aquatic cetaceans), which first appeared in the early Eocene epoch22. The Archaeocetes are believed to have originated from mesonychians, which appeared before the Eocene20. However, the most primitive artiodactyls (Dichobunids) first appeared in the early Eocene, and the origin of nearly all the families of artiodactyls can only be traced back to the middle or the late Eocene23,24. Such a sequence of appearance of these animals is inconsistent with our molecular data. However, a recent calibration of molecular clocks suggests that divergences among orders of eutherian mammals can be traced back more than 100 Myr. Hence, diversification of avian and mammalian orders might not have been an adaptive radiation after the Cretaceous/Tertiary extinction event (65 Myr ago), but might have been correlated with the fragmentation of emergent land areas during the Cretaceous25. We believe that recent molecular data will lead to the reinterpretation by palaeontologists of many fossil records of Artiodactyla to match our conclusions. Extensive morphological reversals and convergences, as well as large gaps in the fossil record, will then have to be acknowledged.


Polymerase chain reaction. PCR was performed in a 50-μl reaction mixture containing 0.2 mM dNTP, 200 ng of primer, Tth buffer (final Mg2+ concentration, 1.5 mM) and 1 unit of Tth DNA polymerase (Toyobo, Osaka). The annealing temperature was chosen from the range 49 °C to 58 °C. A portion of the PCR products was analysed by electrophoresis in an agarose gel containing 2% (w/v) Nusieve GTG and 1% (w/v) Seakem GTG (FMC BioProducts, Rockland, ME). Hybridization and washing were performed as described13.

Sequences of primers for PCR. Pm52 locus, 5′ primer, 5′-TCCTGATTCC(C/T)CTGAACAAA-3′, and 3′ primer, 5′-GGG(G/A)AAGACT(C/T)CCA(G/A)(C/T)TTTGAAAT-3′; Pm72 locus, 5′ primer, 5′-TTTAAAGCATGGCAGTTGGATTT(G/A)T-3′, and 3′ primer, 5′-GGATCTGTTTTTACTTTGACC-3′; pgha3 locus, 5′ primer, 5′-TCGGTGTGGTTCTC(G/C)AC(C/T)CT-3′, and 3′ primer, 5′-TGC(C/T)CCAATCTATCA(G/A)TG(C/T)ATG-3′; c21-352 locus, 5′ primer, 5′-GAGAATTCCTTCTG(G/A)AT(G/A)GT(G/C)AC-3′, and 3′ primer, 5′-(G/A)(C/T)CCGCAGCTCCATGGA(G/A)CC-3′; Gm5 locus, 5′ primer, 5′-GTAATGTGATTTGGCTTAGTGC-3′, and 3′ primer, 5′-TCAGCTCCTGGTGGCAGTCT-3′; aaa228 locus, 5′ primer, 5′-GCTTGATACCTACCACTATGAA-3′, and 3′ primer, 5′-CCTGG(A/C)(A/T)GTCT(G/C)AATTTGCAC-3′; and aaa792 locus, 5′ primer, 5′-TGTGGA(A/T)(G/T)NTG(G/C)CAGATTT(A/T)AAAG-3′, and 3′ primer, 5′-CAGCCACTTGCTCTTCAATAGC-3′.
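Several of these primers are degenerate: a position written as (C/T) allows either base, and N allows any of the four. The following minimal sketch shows how such notation can be read to count or list the concrete oligonucleotides in a primer mix. The function names are hypothetical, and the input is assumed to be the bare sequence without the 5′- and -3′ end markers.

```python
import re
from itertools import product

# Matches either a parenthesised choice such as (C/T) or a single base letter.
_TOKEN = re.compile(r"\(([A-Z/]+)\)|([A-Z])")


def parse_primer(primer):
    """Split a primer into positions, each a tuple of the bases allowed there."""
    positions = []
    for group, single in _TOKEN.findall(primer):
        if group:
            positions.append(tuple(group.split("/")))
        elif single == "N":
            # N stands for any of the four bases.
            positions.append(("A", "C", "G", "T"))
        else:
            positions.append((single,))
    return positions


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for allowed in parse_primer(primer):
        count *= len(allowed)
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*parse_primer(primer))]
```

For example, `expand("TCCTGATTCC(C/T)CTGAACAAA")` lists the two variants of the Pm52 5′ primer, and `degeneracy` can be used to compare how heavily each primer is mixed before deciding whether listing all variants is practical.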

Loci. Of the seven loci described, Pm52 and Pm72 were newly isolated by cloning and sequencing from a genomic library of the sperm whale (Physeter macrocephalus), and Gm5 was isolated from that of the short-finned pilot whale (Globicephala macrorhynchus). The other four loci were found in bovine genomic sequences in the EMBL database, as follows (accession numbers; locus name; SINE family): (X00004; pgha3; CHR-1); (M11267 and M13545; c21-352; CHR-1); (X64565 and S48112; aaa228; CHR-1); (X64565 and S48112; aaa792; CHR-1, CHR-2 and Bov-tA).


  1. Gingerich, P. D., Smith, B. H. & Simons, E. L. Hind limbs of Eocene Basilosaurus: evidence of feet in whales. Science 249, 154–157 (1990).

  2. Thewissen, J. G. M. & Hussain, S. T. Origin of underwater hearing in whales. Nature 361, 444–445 (1993).

  3. Novacek, M. J. Mammalian phylogeny: shaking the tree. Nature 356, 121–125 (1992).

  4. Milinkovitch, M. C., Ortí, G. & Meyer, A. Revised phylogeny of whales suggested by mitochondrial ribosomal DNA sequences. Nature 361, 346–348 (1993).

  5. Graur, D. & Higgins, D. G. Molecular evidence for the inclusion of cetaceans within the order Artiodactyla. Mol. Biol. Evol. 11, 357–364 (1994).

  6. Irwin, D. M. & Arnason, U. Cytochrome b gene of marine mammals: Phylogeny and evolution. J. Mamm. Evol. 2, 37–55 (1994).

  7. Gatesy, J., Hayashi, C., Cronin, M. A. & Arctander, P. Evidence from milk casein genes that cetaceans are close relatives of hippopotamid artiodactyls. Mol. Biol. Evol. 13, 954–963 (1996).

  8. Weiner, A. M., Deininger, P. L. & Efstratiadis, A. Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information. Annu. Rev. Biochem. 55, 631–661 (1986).

  9. Schmid, C. & Maraia, R. Transcriptional regulation and transpositional selection of active SINE sequences. Curr. Opin. Genet. Dev. 2, 874–882 (1992).

  10. Okada, N. SINEs: Short interspersed repeated elements of the eukaryotic genome. Trends Ecol. Evol. 6, 358–361 (1991).

  11. Okada, N. & Ohshima, K. in The Impact of Short Interspersed Elements (SINEs) on the Host Genome (ed. Maraia, R. J.) 61–79 (Landes, Austin, TX, 1995).

  12. Murata, S., Takasaki, N., Saitoh, M. & Okada, N. Determination of the phylogenetic relationships among Pacific salmonids by using short interspersed elements (SINEs) as temporal landmarks of evolution. Proc. Natl Acad. Sci. USA 90, 6995–6999 (1993).

  13. Murata, S., Takasaki, N., Saitoh, M., Tachida, H. & Okada, N. Details of retropositional genome dynamics that provide a rationale for a generic division: The distinct branching of all the Pacific salmon and trout (Oncorhynchus) from the Atlantic salmon and trout (Salmo). Genetics 142, 915–926 (1996).

  14. Takahashi, H., Awata, T. & Yasue, H. Characterization of swine short interspersed repetitive sequences. Anim. Genet. 23, 443–448 (1992).

  15. Lenstra, J. A., van Boxtel, J. A. F., Zwaagstra, K. A. & Schwerin, M. Short interspersed nuclear element (SINE) sequences of the Bovidae. Anim. Genet. 24, 33–39 (1993).

  16. Gentry, A. W. & Hooker, J. J. in The Phylogeny and Classification of the Tetrapods Vol. 2 Mammals (ed. Benton, M. J.) 235–272 (Clarendon, Oxford, 1988).

  17. Adachi, J. & Hasegawa, M. Instability of quartet analyses of molecular sequence data by the maximum likelihood method: The Cetacea/Artiodactyla relationships. Mol. Phyl. Evol. 6, 72–76 (1996).

  18. Hasegawa, M. & Adachi, J. Phylogenetic position of cetaceans relative to artiodactyls: Reanalysis of mitochondrial and nuclear sequences. Mol. Biol. Evol. 13, 710–717 (1996).

  19. Philippe, H. & Douzery, E. The pitfalls of molecular phylogeny based on four species, as illustrated by the Cetacea/Artiodactyla relationships. J. Mamm. Evol. 2, 133–152 (1994).

  20. Thewissen, J. G. M. Phylogenetic aspects of cetacean origins: a morphological perspective. J. Mamm. Evol. 2, 157–184 (1994).

  21. Prothero, D. R., Manning, E. M. & Fischer, M. in The Phylogeny and Classification of the Tetrapods Vol. 2 Mammals (ed. Benton, M. J.) 201–234 (Clarendon, Oxford, 1988).

  22. Fordyce, R. E. & Barnes, L. G. The evolutionary history of whales and dolphins. Annu. Rev. Earth Planet. Sci. 22, 419–455 (1994).

  23. Rose, K. D. Skeleton of Diacodexis, oldest known artiodactyl. Science 216, 621–623 (1982).

  24. Golz, D. J. Eocene Artiodactyla of southern California. Nat. Hist. Mus. Los Angeles County, Sci. Bull. 26, 1–85 (1976).

  25. Hedges, S. B., Parker, P. H., Sibley, C. G. & Kumar, S. Continental breakup and the ordinal diversification of birds and mammals. Nature 381, 226–229 (1996).



We thank the Zoological Society of San Diego's Center and Y. Mukai in the Meat Hygienic Inspection Office in Ueda, Nagano prefecture for providing a sample of DNA from the lesser Malayan chevrotain and samples of DNA from calf, pig and sheep, respectively. This work was supported by a Grant-in-Aid for Specially Promoted Research from the Ministry of Education, Science, Sports and Culture of Japan.

Author information



Corresponding author

Correspondence to Norihiro Okada.


Shimamura, M., Yasue, H., Ohshima, K. et al. Molecular evidence from retroposons that whales form a clade within even-toed ungulates. Nature 388, 666–670 (1997).
