The African coelacanth genome provides insights into tetrapod evolution

Journal name: Nature
Volume: 496
Pages: 311–316
DOI: 10.1038/nature12027

Abstract

The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.
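
A statement that one lineage evolves more slowly than another is typically checked with a relative-rate test, and the reference list cites Tajima's test among others. The following is a minimal sketch of that test in its simplest (chi-square, one degree of freedom) form, offered as an illustration rather than as the authors' pipeline. The function name `tajima_relative_rate` and the toy sequences are hypothetical; real analyses would run on curated alignments of orthologous genes.

```python
def tajima_relative_rate(ingroup_a, ingroup_b, outgroup):
    """Tajima-style relative-rate test on three aligned sequences.

    Counts sites where exactly one ingroup sequence differs from the
    other two, then compares the two counts with a chi-square statistic.
    Returns (m_a, m_b, chi2).
    """
    if not (len(ingroup_a) == len(ingroup_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    m_a = 0  # changes unique to ingroup_a
    m_b = 0  # changes unique to ingroup_b
    for a, b, o in zip(ingroup_a, ingroup_b, outgroup):
        if "-" in (a, b, o) or "?" in (a, b, o):
            continue  # skip gapped or missing sites
        if a != b and b == o:
            m_a += 1
        elif a != b and a == o:
            m_b += 1

    total = m_a + m_b
    chi2 = 0.0 if total == 0 else (m_a - m_b) ** 2 / total
    return m_a, m_b, chi2


# Toy data (hypothetical): sequence a changed once, sequence b six times.
outgroup = "AAAAAAAAAA"
seq_a = "CAAAAAAAAA"
seq_b = "AGGGGGGAAA"
m_a, m_b, chi2 = tajima_relative_rate(seq_a, seq_b, outgroup)
print(m_a, m_b, round(chi2, 3))
```

Under equal rates the statistic is approximately chi-square distributed with one degree of freedom, so values above about 3.84 would reject rate equality at the 5% level.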

Figures

  1. A phylogenetic tree of a broad selection of jawed vertebrates shows that lungfish, not coelacanth, is the closest relative of tetrapods.

    Multiple sequence alignments of 251 genes with a 1:1 ratio of orthologues in 22 vertebrates and with a full sequence coverage for both lungfish and coelacanth were used to generate a concatenated matrix of 100,583 unambiguously aligned amino acid positions. The Bayesian tree was inferred using PhyloBayes under the CAT+GTR+Γ4 model with confidence estimates derived from 100 gene jack-knife replicates (support is 100% for all clades but armadillo + elephant with 45%) (ref. 48). The tree was rooted on cartilaginous fish, and shows that the lungfish is more closely related to tetrapods than the coelacanth, and that the protein sequence of coelacanth is evolving slowly. Pink lines (tetrapods) are slightly offset from purple lines (lobe-finned fish), to indicate that these species are both tetrapods and lobe-finned fish.

  2. Alignment of the HOX-D locus and an upstream gene desert identifies conserved limb enhancers.

    a, Organization of the mouse HOX-D locus and centromeric gene desert, flanked by the Atf2 and Mtx2 genes. Limb regulatory sequences (I1, I2, I3, I4, CsB and CsC) are noted. Using the mouse locus as a reference (NCBI and mouse genome sequencing consortium NCBI37/mm9 assembly), corresponding sequences from human, chicken, frog, coelacanth, pufferfish, medaka, stickleback, zebrafish and elephant shark were aligned. Alignment shows regions of homology between tetrapod, coelacanth and ray-finned fishes. b, Alignment of vertebrate cis-regulatory elements I1, I2, I3, I4, CsB and CsC. c, Expression patterns of coelacanth island I in a transgenic mouse. Limb buds are indicated by arrowheads in the first two panels. The third panel shows a close-up of a limb bud.

  3. Phylogeny of Cps1 coding sequences is used to determine positive selection within the urea cycle.

    Branch lengths are scaled to the expected number of substitutions per nucleotide, and branch colours indicate the strength of selection (dN/dS or ω). Red, positive or diversifying selection (ω > 5); blue, purifying selection (ω = 0); yellow, neutral evolution (ω = 1). Thick branches indicate statistical support for evolution under episodic diversifying selection. The proportion of each colour represents the fraction of the sequence undergoing the corresponding class of selection.

  4. Transgenic analysis implicates involvement of Hox CNE HA14E1 in extraembryonic activities in the chick and mouse.

    a, Chicken HA14E1 drives reporter expression in blood islands in chick embryos. A construct containing chicken HA14E1 upstream of a minimal (thymidine kinase) promoter driving enhanced green fluorescent protein (eGFP) was electroporated in HH4-stage chick embryos together with a nuclear mCherry construct. GFP expression was analysed at approximately stage HH11. The green aggregations and punctate staining are observed in the blood islands and developing vasculature. b, Expression of Latimeria Hoxa14-reporter transgene in the developing placental labyrinth of a mouse embryo. A field of cells is shown from the labyrinth region of an embryo at embryonic day 8.5 from a BAC transgenic line containing coelacanth Hoxa9–Hoxa14 (ref. 49) in which the Hoxa14 gene had been supplanted with the gene for red fluorescent protein (RFP). Immunohistochemistry was used to detect RFP (brown staining in a small number of cells).
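
The Figure 1 caption describes support values derived from gene jack-knife replicates of a concatenated alignment. The resampling step can be sketched as below, under stated assumptions: the helper names (`jackknife_replicates`, `concatenate`, `clade_support`), the input layout and the `keep_fraction` of 0.5 are all hypothetical, since the caption does not state how many genes each replicate retained. Tree inference itself (PhyloBayes in the caption) is outside this sketch; `clade_support` only tallies clades from trees inferred elsewhere.

```python
import random


def concatenate(gene_alignments, genes):
    """Join the per-gene aligned sequences of each species, in gene order."""
    species = sorted(gene_alignments[genes[0]])
    return {sp: "".join(gene_alignments[g][sp] for g in genes) for sp in species}


def jackknife_replicates(gene_alignments, n_replicates=100, keep_fraction=0.5, seed=0):
    """Yield (gene_subset, concatenated_alignment) for each replicate.

    gene_alignments maps gene name -> {species name -> aligned sequence}.
    Genes are sampled WITHOUT replacement, which is what distinguishes a
    jack-knife from a bootstrap. keep_fraction is an assumed value.
    """
    rng = random.Random(seed)
    genes = sorted(gene_alignments)
    n_keep = max(1, round(keep_fraction * len(genes)))
    for _ in range(n_replicates):
        subset = rng.sample(genes, n_keep)
        yield subset, concatenate(gene_alignments, subset)


def clade_support(replicate_clades, clade):
    """Percentage of replicate trees (each a set of frozensets) containing clade."""
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)


# Toy input (hypothetical): four genes, three species.
toy = {
    "g1": {"coelacanth": "MK", "human": "MR", "lungfish": "MR"},
    "g2": {"coelacanth": "LLA", "human": "LIA", "lungfish": "LIA"},
    "g3": {"coelacanth": "T", "human": "S", "lungfish": "S"},
    "g4": {"coelacanth": "GG", "human": "GA", "lungfish": "GA"},
}
for subset, matrix in jackknife_replicates(toy, n_replicates=3):
    print(subset, matrix["lungfish"])
```

Each concatenated matrix would then be passed to a tree-inference program, and the resulting clade sets collected into the list that `clade_support` expects.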

References

  1. Smith, J. L. B. A living fish of mesozoic type. Nature 143, 455–456 (1939)
  2. Nulens, R., Scott, L. & Herbin, M. An Updated Inventory of All Known Specimens of the Coelacanth, Latimeria Spp. Smithiana Vol. 3 (South African Institute for Aquatic Biodiversity, 2010)
  3. Erdmann, M. V., Caldwell, R. L. & Kasim Moosa, M. Indonesian 'king of the sea' discovered. Nature 395, 335 (1998)
  4. Smith, J. L. B. Old Fourlegs: the Story of the Coelacanth (Longmans, Green, 1956)
  5. Zhu, M. et al. Earliest known coelacanth skull extends the range of anatomically modern coelacanths to the Early Devonian. Nature Commun. 3, 772 (2012)
  6. Zimmer, C. At the Water's Edge: Fish with Fingers, Whales with Legs, and How Life Came Ashore but then Went Back to Sea (Free Press, 1999)
  7. Zardoya, R. & Meyer, A. The complete DNA sequence of the mitochondrial genome of a “living fossil,” the coelacanth (Latimeria chalumnae). Genetics 146, 995–1010 (1997)
  8. Amemiya, C. T. et al. Complete HOX cluster characterization of the coelacanth provides further evidence for slow evolution of its genome. Proc. Natl Acad. Sci. USA 107, 3622–3627 (2010)
  9. Larsson, T. A., Larson, E. T. & Larhammar, D. Cloning and sequence analysis of the neuropeptide Y receptors Y5 and Y6 in the coelacanth Latimeria chalumnae. Gen. Comp. Endocrinol. 150, 337–342 (2007)
  10. Noonan, J. P. et al. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes. Genome Res. 14, 2397–2405 (2004)
  11. Gnerre, S. et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc. Natl Acad. Sci. USA 108, 1513–1518 (2011)
  12. Bogart, J. P., Balon, E. K. & Bruton, M. N. The chromosomes of the living coelacanth and their remarkable similarity to those of one of the most ancient frogs. J. Hered. 85, 322–325 (1994)
  13. Cantarel, B. L. et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 18, 188–196 (2008)
  14. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nature Biotech. 29, 644–652 (2011)
  15. Pallavicini, A. et al. Analysis of the transcriptome of the Indonesian coelacanth Latimeria menadoensis. BMC Genomics (in the press)
  16. Schultze, H. P. & Trueb, L. Origins of the Higher Groups of Tetrapods: Controversy and Consensus. (Comstock Publishing Associates, 1991)
  17. Meyer, A. & Dolven, S. I. Molecules, fossils, and the origin of tetrapods. J. Mol. Evol. 35, 102–113 (1992)
  18. Brinkmann, H., Venkatesh, B., Brenner, S. & Meyer, A. Nuclear protein-coding genes support lungfish and not the coelacanth as the closest living relatives of land vertebrates. Proc. Natl Acad. Sci. USA 101, 4900–4905 (2004)
  19. Lartillot, N. & Philippe, H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109 (2004)
  20. Takezaki, N., Rzhetsky, A. & Nei, M. Phylogenetic test of the molecular clock and linearized trees. Mol. Biol. Evol. 12, 823–833 (1995)
  21. Tajima, F. Simple methods for testing the molecular evolutionary clock hypothesis. Genetics 135, 599–607 (1993)
  22. Bejerano, G. et al. A distal enhancer and an ultraconserved exon are derived from a novel retroposon. Nature 441, 87–90 (2006)
  23. Voss, S. R. et al. Origin of amphibian and avian chromosomes by fission, fusion, and retention of ancestral chromosomes. Genome Res. 21, 1306–1312 (2011)
  24. Smith, J. J. & Voss, S. R. Gene order data from a model amphibian (Ambystoma): new perspectives on vertebrate genome structure and evolution. BMC Genomics 7, 219 (2006)
  25. Inoue, J. G., Miya, M., Venkatesh, B. & Nishida, M. The mitochondrial genome of Indonesian coelacanth Latimeria menadoensis (Sarcopterygii: Coelacanthiformes) and divergence time estimation between the two coelacanths. Gene 349, 227–235 (2005)
  26. Holder, M. T., Erdmann, M. V., Wilcox, T. P., Caldwell, R. L. & Hillis, D. M. Two living species of coelacanths? Proc. Natl Acad. Sci. USA 96, 12616–12620 (1999)
  27. Canapa, A. et al. Composition and phylogenetic analysis of vitellogenin coding sequences in the Indonesian coelacanth Latimeria menadoensis. J. Exp. Zool. B 318, 404–416 (2012)
  28. The Chimpanzee Sequencing and Analysis Consortium Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, 69–87 (2005)
  29. Zhang, J. et al. Loss of fish actinotrichia proteins and the fin-to-limb transition. Nature 466, 234–237 (2010)
  30. Jovelin, R. et al. Evolution of developmental regulation in the vertebrate FgfD subfamily. J. Exp. Zool. B 314, 33–56 (2010)
  31. Braasch, I. & Postlethwait, J. H. The teleost agouti-related protein 2 gene is an ohnolog gone missing from the tetrapod genome. Proc. Natl Acad. Sci. USA 108, E47–E48 (2011)
  32. Navratilova, P. et al. Systematic human/zebrafish comparative identification of cis-regulatory activity around vertebrate developmental transcription factor genes. Dev. Biol. 327, 526–540 (2009)
  33. Xie, X. et al. Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites. Proc. Natl Acad. Sci. USA 104, 7145–7150 (2007)
  34. Jones, F. C. et al. The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484, 55–61 (2012)
  35. Shubin, N., Tabin, C. & Carroll, S. Deep homology and the origins of evolutionary novelty. Nature 457, 818–823 (2009)
  36. Montavon, T. et al. A regulatory archipelago controls Hox genes transcription in digits. Cell 147, 1132–1145 (2011)
  37. Wright, P. A. Nitrogen excretion: three end products, many physiological roles. J. Exp. Biol. 198, 273–281 (1995)
  38. Kosakovsky Pond, S. L. et al. A random effects branch-site model for detecting episodic diversifying selection. Mol. Biol. Evol. 28, 3033–3043 (2011)
  39. Häberle, J. et al. Molecular defects in human carbamoy phosphate synthetase I: mutational spectrum, diagnostic and protein structure considerations. Hum. Mutat. 32, 579–589 (2011)
  40. Carroll, R. L. Vertebrate Paleontology and Evolution (W.H. Freeman and Company, 1988)
  41. Gekas, C. et al. Hematopoietic stem cell development in the placenta. Int. J. Dev. Biol. 54, 1089–1098 (2010)
  42. Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004)
  43. Wellik, D. M. Hox patterning of the vertebrate axial skeleton. Dev. Dyn. 236, 2454–2463 (2007)
  44. Scotti, M. & Kmita, M. Recruitment of 5′ Hoxa genes in the allantois is essential for proper extra-embryonic function in placental mammals. Development 139, 731–739 (2012)
  45. Bengtén, E. et al. Immunoglobulin isotypes: structure, function, and genetics. Curr. Top. Microbiol. Immunol. 248, 189–219 (2000)
  46. Ota, T., Rast, J. P., Litman, G. W. & Amemiya, C. T. Lineage-restricted retention of a primitive immunoglobulin heavy chain isotype within the Dipnoi reveals an evolutionary paradox. Proc. Natl Acad. Sci. USA 100, 2501–2506 (2003)
  47. Gregory, T. R. The Evolution of the Genome 171 (Elsevier Academic, 2004)
  48. Stamatakis, A., Ludwig, T. & Meier, H. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21, 456–463 (2005)
  49. Smith, J. J., Sumiyama, K. & Amemiya, C. T. A living fossil in the genome of a living fossil: Harbinger transposons in the coelacanth genome. Mol. Biol. Evol. 29, 985–993 (2012)

Author information

  1. These authors contributed equally to this work.

    • Chris T. Amemiya &
    • Jessica Alföldi

Affiliations

  1. Molecular Genetics Program, Benaroya Research Institute, Seattle, Washington 98101, USA

    • Chris T. Amemiya,
    • Mark Robinson &
    • Nil Ratan Saha
  2. Department of Biology, University of Washington, Seattle, Washington 98105, USA

    • Chris T. Amemiya
  3. Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA

    • Jessica Alföldi,
    • Iain MacCallum,
    • Aaron M. Berlin,
    • Lin Fan,
    • Sante Gnerre,
    • Andreas Gnirke,
    • Jeremy Johnson,
    • Marcia Lara,
    • Joshua Z. Levin,
    • Evan Mauceli,
    • Dariusz Przybylski,
    • Filipe J. Ribeiro,
    • Ted Sharpe,
    • Diana Tabbaa,
    • Jason Turner-Maier,
    • Louise Williams,
    • David B. Jaffe,
    • Federica Di Palma,
    • Eric S. Lander &
    • Kerstin Lindblad-Toh
  4. Comparative Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR, Biopolis, Singapore 138673, Singapore

    • Alison P. Lee,
    • Vydianathan Ravi &
    • Byrappa Venkatesh
  5. Department of Biology, University of Konstanz, Konstanz 78464, Germany

    • Shaohua Fan,
    • Tereza Manousaki,
    • Nathalie Feiner,
    • Shigehiro Kuraku,
    • Oleg Simakov &
    • Axel Meyer
  6. Département de Biochimie, Université de Montréal, Centre Robert Cedergren, Montréal H3T 1J4, Canada

    • Hervé Philippe &
    • Henner Brinkmann
  7. Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403, USA

    • Ingo Braasch &
    • John H. Postlethwait
  8. Konstanz Research School of Chemical Biology, University of Konstanz, Konstanz 78464, Germany

    • Tereza Manousaki &
    • Axel Meyer
  9. Instituto de Ciencias Biologicas, Universidade Federal do Para, Belem 66075-110, Brazil

    • Igor Schneider
  10. Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Nicolas Rohner &
    • Clifford J. Tabin
  11. Department of Anthropology, University of Utah, Salt Lake City, Utah 84112, USA

    • Chris Organ
  12. Institut de Genomique Fonctionnelle de Lyon, Ecole Normale Superieure de Lyon, Lyon 69007, France

    • Domitille Chalopin &
    • Jean-Nicolas Volff
  13. Department of Biology, University of Kentucky, Lexington, Kentucky 40506, USA

    • Jeramiah J. Smith
  14. Biomedical Biotechnology Research Unit (BioBRU), Department of Biochemistry, Microbiology & Biotechnology, Rhodes University, Grahamstown 6139, South Africa

    • Rosemary A. Dorrington,
    • Gregory L. Blatch &
    • Adrienne L. Edkins
  15. Department of Life Sciences, University of Trieste, Trieste 34128, Italy

    • Marco Gerdol,
    • Gianluca De Moro &
    • Alberto Pallavicini
  16. Department of Informatics, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK

    • Bronwen Aken,
    • Stephen M. J. Searle &
    • Simon White
  17. Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona 60131, Italy

    • Maria Assunta Biscotti,
    • Marco Barucca,
    • Adriana Canapa,
    • Mariko Forconi &
    • Ettore Olmo
  18. Department of Life Sciences, University of Liege, Liege 4000, Belgium

    • Denis Baurain
  19. College of Health and Biomedicine, Victoria University, Melbourne VIC 8001, Australia

    • Gregory L. Blatch
  20. Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Viterbo 01100, Italy

    • Francesco Buonocore,
    • Anna Maria Fausto &
    • Giuseppe Scapigliati
  21. Department of Biology, University of Hamburg, Hamburg 20146, Germany

    • Thorsten Burmester
  22. Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA

    • Michael S. Campbell &
    • Mark Yandell
  23. Department of Pediatrics, University of South Florida Morsani College of Medicine, Children’s Research Institute, St. Petersburg, Florida 33701, USA

    • John P. Cannon &
    • Gary W. Litman
  24. South African National Bioinformatics Institute, University of the Western Cape, Bellville 7535, South Africa

    • Alan Christoffels,
    • Junaid Gamieldien,
    • Uljana Hesse,
    • Sumir Panji,
    • Barbara Picone &
    • Peter van Heusden
  25. International Max-Planck Research School for Organismal Biology, University of Konstanz, Konstanz 78464, Germany

    • Nathalie Feiner &
    • Axel Meyer
  26. Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, USA

    • Jared V. Goldstone,
    • Mark E. Hahn,
    • Sibel I. Karchner &
    • John J. Stegeman
  27. MRC Functional Genomics Unit, Oxford University, Oxford OX1 3PT, UK

    • Wilfried Haerty &
    • Chris P. Ponting
  28. Transcriptome Bioinformatics Group, LIFE Research Center for Civilization Diseases, Universität Leipzig, Leipzig 04109, Germany

    • Steve Hoffmann
  29. Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan

    • Tsutomu Miyake
  30. Department of Molecular Genetics, All Children’s Hospital, St. Petersburg, Florida 33701, USA

    • M. Gail Mueller
  31. Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, Tennessee 38163, USA

    • David R. Nelson
  32. Bioinformatics Group, Department of Computer Science, Universität Leipzig, Leipzig 04109, Germany

    • Anne Nitsche,
    • Peter F. Stadler &
    • Hakim Tafer
  33. Department of Evolutionary Studies of Biosystems, The Graduate University for Advanced Studies, Hayama 240-0193, Japan

    • Tatsuya Ota
  34. Computational EvoDevo Group, Department of Computer Science, Universität Leipzig, Leipzig 04109, Germany

    • Sonja J. Prohaska
  35. Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX1 2JD, UK

    • Tatjana Sauka-Spengler
  36. European Molecular Biology Laboratory, Heidelberg 69117, Germany

    • Oleg Simakov
  37. Division of Population Genetics, National Institute of Genetics, Mishima 411-8540, Japan

    • Kenta Sumiyama
  38. University of Chicago, Chicago, Illinois 60637, USA

    • Neil Shubin
  39. Department Physiological Chemistry, Biocenter, University of Wuerzburg, Wuerzburg 97070, Germany

    • Manfred Schartl
  40. Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala 751 23, Sweden

    • Kerstin Lindblad-Toh
  41. Present addresses: Genome Resource and Analysis Unit, Center for Developmental Biology, RIKEN, Kobe, Japan (S.K.); Boston Children’s Hospital, Boston, Massachusetts, USA (E.M.); Computational Biology Unit, Institute of Infectious Disease and Molecular Medicine, University of Cape Town Health Sciences Campus, Anzio Road, Observatory 7925, South Africa (S.P.); New York Genome Center, New York, New York, USA (F.J.R.).

    • Shigehiro Kuraku,
    • Evan Mauceli,
    • Sumir Panji &
    • Filipe J. Ribeiro

Contributions

Author Contributions J.A., C.T.A., A.M. and K.L.T. planned and oversaw the project. R.A.D. and C.T.A. provided blood and tissues for sequencing. C.T.A. and M.L. prepared the DNA for sequencing. I.M., S.G., D.P., F.J.R., T.S. and D.B.J. assembled the genome. N.R.S. and C.T.A. prepared RNA from L. chalumnae and P. annectens, and L.F. and J.Z.L. made the L. chalumnae RNA-seq library. A.C., M.B., M.A.B., M.F., F.B., G.S., A.M.F., A.P., M.G., G.D.M., J.T.-M. and E.O. sequenced and analysed the L. menadoensis RNA-seq library. B.A., S.M.J.S., S.W., M.S.C. and M.Y. annotated the genome. W.H. and C.P.P. carried out the annotation and analysis of long non-coding RNAs. P.F.S., S.H., A.N., H.T. and S.J.P. annotated non-coding RNAs. M.G., G.D.M., A.P., M.R. and C.T.A. compared L. chalumnae and L. menadoensis sequences. H.B., D.B. and H.P. carried out the phylogenomic analysis. T.Ma. and A.M. performed the gene relative-rate analysis. A.C., J.G., S.P., B.P., P.v.H. and U.H. carried out the analysis, annotation and statistical enrichment of L. chalumnae specific gene duplications. N.F. and A.M. analysed the homeobox gene repertoires. G.L.B. and A.L.E. analysed the chaperone genes. D.C., S.F., O.S., J.-N.V., M.S. and A.M. analysed transposable elements. J.J.S. analysed large-scale rearrangements in vertebrate genomes. I.B., J.H.P., N.F. and S.K. analysed genes lost in tetrapods. T.Mi. analysed actinodin and pectoral-fin musculature. C.O. and M.S. analysed selection in urea cycle genes. A.P.L. and B.V. carried out the conserved non-coding element analysis. I.S., N.R., V.R., N.S. and C.J.T. carried out the analysis of autopodial CNEs. K.S., T.S.-S. and C.T.A. examined the evolution of a placenta-related CNE. N.R.S., G.W.L., M.G.M., T.O. and C.T.A. performed the IgM analysis. J.A., C.T.A., A.M. and K.L.T. wrote the paper with input from other authors.

Competing financial interests

The authors declare no competing financial interests.

Accession codes

Genome assemblies, transcriptomes and mitochondrial DNA sequences have been deposited in GenBank/EMBL/DDBJ. The L. chalumnae genome assembly has been deposited under the accession number AFYH00000000. The L. chalumnae transcriptome has been deposited under the accession number SRX117503 and the P. annectens transcriptomes have been deposited under the accession numbers SRX152529, SRX152530 and SRX152531. The P. annectens mitochondrial DNA sequence was deposited under the accession number JX568887. All animal experiments were approved by the MIT Committee for Animal Care.

Supplementary information

PDF files

  1. Supplementary Information (5.1 MB)

    This file contains Supplementary Methods, Supplementary Notes 1-13, Supplementary Figures 1-22, Supplementary Tables 1-24 and Supplementary references.

Zip files

  1. Supplementary Data (2.6 MB)

    This zipped file contains Supplementary Data files 1-7.
