Insights into hominid evolution from the gorilla genome sequence



Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human–chimpanzee and human–chimpanzee–gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.

    Figure 1: Speciation of the great apes.

    a, Phylogeny of the great ape family, showing the speciation of human (H), chimpanzee (C), gorilla (G) and orang-utan (O). Horizontal lines indicate speciation times within the hominine subfamily and the sequence divergence time between human and orang-utan. Interior grey lines illustrate an example of incomplete lineage sorting at a particular genetic locus, in this case (((C, G), H), O) rather than (((H, C), G), O). Below are mean nucleotide divergences between human and the other great apes from the EPO alignment. b, Great ape speciation and divergence times. Upper panel, solid lines show how times for the HC and HCG speciation events estimated by CoalHMM vary with average mutation rate; dashed lines show the corresponding average sequence divergence times, as well as the HO sequence divergence. Blue blocks represent hominid fossil species (key at top right): each has a vertical extent spanning the range of dates estimated for it in the literature (refs 9, 50), and a horizontal position at the maximum mutation rate consistent both with its proposed phylogenetic position and the CoalHMM estimates (including some allowance for ancestral polymorphism in the case of Sivapithecus). The grey shaded region shows that an increase in mutation rate going back in time can accommodate present-day estimates, fossil hypotheses, and a middle Miocene speciation for orang-utan. Lower panel, estimates of the average mutation rate in present-day humans (refs 11–13); grey bars show 95% confidence intervals, with black lines at the means. Estimates were made by the 1000 Genomes Project for trios of European (CEU) and Yoruban African (YRI) ancestry.
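The dependence of divergence time on mutation rate shown in panel b can be illustrated with a simple molecular-clock calculation. The following is a minimal sketch, not the CoalHMM method used for the speciation estimates: it assumes a constant per-year mutation rate and converts a pairwise nucleotide divergence into an average sequence divergence time. The function name and the input values are hypothetical and chosen only for illustration.

```python
def sequence_divergence_time(divergence, mutation_rate_per_year):
    """Average sequence divergence time in years under a strict molecular clock.

    divergence: fraction of sites that differ between two species.
    mutation_rate_per_year: mutations per site per year (assumed constant).
    Differences accumulate along both lineages, hence the factor of 2.
    """
    if mutation_rate_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return divergence / (2.0 * mutation_rate_per_year)


if __name__ == "__main__":
    # Illustrative inputs only: a 1% divergence evaluated at several rates.
    example_divergence = 0.01
    for rate in (0.5e-9, 0.75e-9, 1.0e-9):
        years = sequence_divergence_time(example_divergence, rate)
        print(f"rate={rate:.2e} per site per year -> {years / 1e6:.2f} Myr")
```

Halving the assumed mutation rate doubles the inferred time, which is why the curves in panel b fall as the mutation rate increases. Sequence divergence times computed this way predate the corresponding speciation times, because they include coalescence time in the ancestral population.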

    Figure 2: Genome-wide incomplete lineage sorting (ILS) and selection.

    a, Variation in ILS. Each vertical blue line represents the fraction of ILS between human, chimpanzee and gorilla estimated in a 1-Mbp region. Dashed black lines show the average ILS across the autosomes and on X; the red line shows the expected ILS on X, given the autosomal average and assuming neutral evolution. b, Reduction in ILS around protein coding genes. The blue line shows the mean rate of ILS sites normalized by mutation rate as a function of distance upstream or downstream of the nearest gene (see Supplementary Information). The horizontal dashed line indicates the average value outside 300 kbp from the nearest gene; error bars are s.e.m.
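One way to arrive at a neutral expectation for ILS on X from the autosomal average is sketched below. It relies on the standard coalescent result that the probability of a discordant gene tree is (2/3)·exp(−τ), where τ is the time between the two speciation events in units of 2N generations, and on the assumption that the effective population size of X is three-quarters that of the autosomes. Both the function name and the 3/4 ratio are assumptions made here for illustration; the authors' exact calculation may differ.

```python
import math


def expected_ils_on_x(autosomal_ils, ne_ratio_x_to_autosome=0.75):
    """Neutral expectation for the ILS fraction on X given the autosomal fraction.

    Uses P(ILS) = (2/3) * exp(-tau), with tau scaled by 1 / Ne, so that a
    smaller Ne on X lengthens the internal branch in coalescent units.
    """
    if not 0.0 < autosomal_ils <= 2.0 / 3.0:
        raise ValueError("autosomal ILS must lie in (0, 2/3]")
    # Recover the internal branch length in autosomal coalescent units.
    tau_autosome = -math.log(1.5 * autosomal_ils)
    # Rescale to X: the same number of generations spans more coalescent time.
    tau_x = tau_autosome / ne_ratio_x_to_autosome
    return (2.0 / 3.0) * math.exp(-tau_x)


if __name__ == "__main__":
    # 30% is the genome-wide figure quoted in the abstract.
    print(expected_ils_on_x(0.30))
```

An observed value on X below this expectation would be consistent with additional forces, such as selection, reducing the effective population size of X.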

    Figure 3: Differences in expression and regulation.

    a, Mean gene expression distance between human and chimpanzee as a function of the proportion of ILS sites per gene. Each point represents a sliding window of 900 genes (over genes ordered by ILS fraction); s.d. error limits are shown in grey. b, Top row, classification of CTCF sites in the gorilla (EB(JC)) and human (GM12878) LCLs on the basis of species-uniqueness; numbers of alignable CTCF binding sites are shown for each category. Bottom three rows, sequence changes of CTCF motifs embedded in human-specific, shared and gorilla-specific CTCF binding sites located within shared CpG islands, species-specific CpG islands or outside CpG islands. Numbers of CTCF binding sites are shown for each CpG island category. Gorilla and human motif sequences are compared and represented as indels, disruptions (>4-bp gaps) and substitutions.
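The sliding-window summary described for panel a can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the authors' code: genes are ordered by ILS fraction, a window of fixed size is moved one gene at a time, and the mean and sample standard deviation of expression distance are recorded per window. The function name, the step size and the choice of sample rather than population standard deviation are all assumptions.

```python
from statistics import mean, stdev


def sliding_window_summary(ils_fraction, expression_distance, window=900, step=1):
    """Summarize expression distance in windows of genes ordered by ILS fraction.

    Returns a list of (mean ILS, mean distance, s.d. of distance) tuples,
    one per window position.
    """
    if len(ils_fraction) != len(expression_distance):
        raise ValueError("inputs must have the same length")
    if window < 2:
        raise ValueError("window must contain at least two genes")
    # Order genes by their ILS fraction, keeping each gene's distance paired.
    pairs = sorted(zip(ils_fraction, expression_distance), key=lambda p: p[0])
    summary = []
    for start in range(0, len(pairs) - window + 1, step):
        chunk = pairs[start:start + window]
        ils_values = [p[0] for p in chunk]
        distances = [p[1] for p in chunk]
        summary.append((mean(ils_values), mean(distances), stdev(distances)))
    return summary


if __name__ == "__main__":
    # Toy data with a window of 2 genes instead of 900.
    ils = [0.3, 0.1, 0.2, 0.4]
    dist = [3.0, 1.0, 2.0, 4.0]
    for row in sliding_window_summary(ils, dist, window=2):
        print(row)
```

Because adjacent windows share all but one gene, neighbouring points in such a plot are strongly correlated and should not be read as independent observations.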

    Figure 4: Gorilla species distribution and divergence.

    a, Distribution of gorilla species in Africa. The western species (Gorilla gorilla) comprises two subspecies: western lowland gorillas (G. gorilla gorilla) and Cross River gorillas (G. gorilla diehli). Similarly, the eastern species (Gorilla beringei) is subclassified into eastern lowland gorillas (G. beringei graueri) and mountain gorillas (G. beringei beringei). (Based on data in ref. 43.) Areas of water are shown pale blue. Inset, area of main map. b, Western lowland gorilla Kamilah, source of the reference assembly (photograph by J.R.). c, Eastern lowland gorilla Mukisi (photograph by M. Seres). d, Isolation–migration model of the western and eastern species. NA, NW and NE are ancestral, western and eastern effective population sizes; m is the migration rate. e, Likelihood surface for migration and split time parameters in the isolation–migration model; colours from blue (minimum) to red (maximum) indicate the magnitude of likelihood.


  1. Huxley, T. H. Evidence as to Man’s Place in Nature (Williams & Norgate, 1863)
  2. King, M. C. & Wilson, A. C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975)
  3. Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, 69–87 (2005)
  4. Locke, D. P. et al. Comparative and demographic analysis of orang-utan genomes. Nature 469, 529–533 (2011)
  5. Hubbard, T. J. et al. Ensembl 2009. Nucleic Acids Res. 37, D690–D697 (2009)
  6. Paten, B., Herrero, J., Beal, K., Fitzgerald, S. & Birney, E. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 18, 1814–1828 (2008)
  7. Bradley, B. J. Reconstructing phylogenies and phenotypes: a molecular view of human evolution. J. Anat. 212, 337–353 (2008)
  8. Burgess, R. & Yang, Z. Estimation of hominoid ancestral population sizes under bayesian coalescent models incorporating mutation rate variation and sequencing errors. Mol. Biol. Evol. 25, 1979–1994 (2008)
  9. Wood, B. & Harrison, T. The evolutionary context of the first hominins. Nature 470, 347–352 (2011)
  10. Steiper, M. E. & Young, N. M. Timing primate evolution: lessons from the discordance between molecular and paleontological estimates. Evol. Anthropol. 17, 179–188 (2008)
  11. Lynch, M. Rate, molecular spectrum, and consequences of human mutation. Proc. Natl Acad. Sci. USA 107, 961–968 (2010)
  12. The 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010); correction 473, 544 (2011)
  13. Roach, J. C. et al. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328, 636–639 (2010)
  14. Hartwig, W. C. et al. The Primate Fossil Record (Cambridge Univ. Press, 2002)
  15. Kim, S. H., Elango, N., Warden, C., Vigoda, E. & Yi, S. V. Heterogeneous genomic molecular clocks in primates. PLoS Genet. 2, e163 (2006)
  16. Fleagle, J. G. Primate Adaptation and Evolution 2nd edn (Academic Press, 1998)
  17. Charlesworth, D., Morgan, M. T. & Charlesworth, B. Mutation accumulation in finite populations. J. Hered. 84, 321–325 (1993)
  18. McVicker, G., Gordon, D., Davis, C. & Green, P. Widespread genomic signatures of natural selection in hominid evolution. PLoS Genet. 5, e1000471 (2009)
  19. Myers, S., Bottolo, L., Freeman, C., McVean, G. & Donnelly, P. A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321–324 (2005)
  20. Vicoso, B. & Charlesworth, B. Evolution on the X chromosome: unusual patterns and processes. Nature Rev. Genet. 7, 645–653 (2006)
  21. Ellegren, H. Characteristics, causes and evolutionary consequences of male-biased mutation. Proc. R. Soc. Lond. B 274, 1–10 (2007)
  22. Goetting-Minesky, M. P. & Makova, K. D. Mammalian male mutation bias: impacts of generation time and regional variation in substitution rates. J. Mol. Evol. 63, 537–544 (2006)
  23. Presgraves, D. C. & Yi, S. V. Doubts about complex speciation between humans and chimpanzees. Trends Ecol. Evol. 24, 533–540 (2009)
  24. Patterson, N., Richter, D. J., Gnerre, S., Lander, E. S. & Reich, D. Genetic evidence for complex speciation of humans and chimpanzees. Nature 441, 1103–1108 (2006)
  25. Hughes, J. F. et al. Chimpanzee and human Y chromosomes are remarkably divergent in structure and gene content. Nature 463, 536–539 (2010)
  26. Kamada, F. et al. A genome-wide association study identifies RNF213 as the first Moyamoya disease gene. J. Hum. Genet. 56, 34–40 (2011)
  27. Herculano-Houzel, S. Scaling of brain metabolism with a fixed energy budget per neuron: implications for neuronal activity, plasticity and evolution. PLoS ONE 6, e17514 (2011)
  28. Clark, A. G. et al. Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302, 1960–1963 (2003)
  29. Ellis, R. A. & Montagna, W. The skin of primates. VI. The skin of the gorilla (Gorilla gorilla). Am. J. Phys. Anthropol. 20, 79–93 (1962)
  30. Streeter, G. L. Some uniform characteristics of the primate auricle. Anat. Rec. A 23, 335–341 (1922)
  31. Wallis, O. C., Zhang, Y. P. & Wallis, M. Molecular evolution of GH in primates: characterisation of the GH genes from slow loris and marmoset defines an episode of rapid evolutionary change. J. Mol. Endocrinol. 26, 249–258 (2001)
  32. Stenson, P. D. et al. The Human Gene Mutation Database: 2008 update. Genome Med. 1, 13 (2009)
  33. Gibbs, R. A. et al. Evolutionary and biomedical insights from the rhesus macaque genome. Science 316, 222–234 (2007)
  34. Montgomery, S. B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464, 773–777 (2010)
  35. Blekhman, R., Marioni, J. C., Zumbo, P., Stephens, M. & Gilad, Y. Sex-specific and lineage-specific alternative splicing in primates. Genome Res. 20, 180–189 (2010)
  36. Phillips, J. E. & Corces, V. G. CTCF: master weaver of the genome. Cell 137, 1194–1211 (2009)
  37. McDaniell, R. et al. Heritable individual-specific and allele-specific chromatin signatures in humans. Science 328, 235–239 (2010)
  38. Kunarso, G. et al. Transposable elements have rewired the core regulatory network of human embryonic stem cells. Nature Genet. 42, 631–634 (2010)
  39. Schmidt, D. et al. Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 1036–1040 (2010)
  40. Groves, C. Primate Taxonomy (Smithsonian Institution Press, 2001)
  41. Thalmann, O., Fischer, A., Lankester, F., Pääbo, S. & Vigilant, L. The complex evolutionary history of gorillas: insights from genomic data. Mol. Biol. Evol. 24, 146–158 (2007)
  42. Stokes, E., Malonga, R., Rainey, H. & Strindberg, S. Western Lowland Gorilla Surveys in Northern Republic of Congo 2006–2007. Summary Scientific Report (WCS Global Conservation, 2008)
  43. IUCN. The IUCN Red List of Threatened Species. Version 2010.1 <http://www.iucnredlist.org> (2010)
  44. Stacey, M., Lin, H. H., Hilyard, K. L., Gordon, S. & McKnight, A. J. Human epidermal growth factor (EGF) module-containing mucin-like hormone receptor 3 is a new member of the EGF-TM7 family that recognizes a ligand on human macrophages and activated neutrophils. J. Biol. Chem. 276, 18863–18870 (2001)
  45. Jensen-Seaman, M. I. & Li, W. H. Evolution of the hominoid semenogelin genes, the major proteins of ejaculated semen. J. Mol. Evol. 57, 261–270 (2003)
  46. Alkan, C. et al. Personalized copy number and segmental duplication maps using next-generation sequencing. Nature Genet. 41, 1061–1067 (2009)
  47. Gazave, E. et al. Copy number variation analysis in the great apes reveals species-specific patterns of structural variation. Genome Res. 21, 1626–1639 (2011)
  48. Begun, D. R. in Handbook of Palaeoanthropology Vol. 2, Primate Evolution and Human Origins (eds Henke, W. & Tattersall, I.) 921–977 (Springer, 2007)
  49. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010)
  50. Lebatard, A. E. et al. Cosmogenic nuclide dating of Sahelanthropus tchadensis and Australopithecus bahrelghazali: Mio-Pliocene hominids from Chad. Proc. Natl Acad. Sci. USA 105, 3226–3231 (2008)


Author information


  1. Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK

    • Aylwyn Scally,
    • Ian Goodhead,
    • Shane McCarthy,
    • Y. Amy Tang,
    • Yali Xue,
    • Bryndis Yngvadottir,
    • Qasim Ayub,
    • Yuan Chen,
    • Chris M. Clee,
    • Yong Gu,
    • Paul Heath,
    • Anja Kolb-Kokocinski,
    • Gavin K. Laird,
    • Anthony S. Rogers,
    • Jared T. Simpson,
    • Daniel J. Turner,
    • Weldon Whitener,
    • Zemin Ning,
    • Duncan T. Odom,
    • Michael A. Quail,
    • Stephen M. Searle,
    • Jane Rogers,
    • Chris Tyler-Smith &
    • Richard Durbin
  2. Bioinformatics Research Center, Aarhus University, C.F. Møllers Allé 8, 8000 Aarhus C, Denmark

    • Julien Y. Dutheil,
    • Asger Hobolth,
    • Thomas Mailund,
    • Lars N. Andersen,
    • Kasper Munch &
    • Mikkel H. Schierup
  3. Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA

    • LaDeana W. Hillier,
    • Tomas Marques-Bonet,
    • Can Alkan,
    • Emre Karakoc,
    • Saba Sajjadian &
    • Evan E. Eichler
  4. European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK

    • Gregory E. Jordan,
    • Javier Herrero,
    • Petra C. Schwalie,
    • Kathryn Beal,
    • Stephen Fitzgerald,
    • Albert J. Vilella,
    • Paul Flicek &
    • Nick Goldman
  5. Department of Genetic Medicine and Development, University of Geneva Medical School, Rue Michel-Servet 1, 1211 Geneva 4, Switzerland

    • Tuuli Lappalainen &
    • Emmanouil T. Dermitzakis
  6. Institut de Biologia Evolutiva (UPF-CSIC), 08003 Barcelona, Catalonia, Spain

    • Tomas Marques-Bonet &
    • Javier Prado-Martinez
  7. Institucio Catalana de Recerca i Estudis Avançats, ICREA, 08010 Barcelona, Spain

    • Tomas Marques-Bonet
  8. Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK

    • Stephen H. Montgomery,
    • Brenda J. Bradley,
    • Timothy D. O’Connor &
    • Nicholas I. Mundy
  9. University of Cambridge, Department of Oncology, Hutchison/MRC Research Centre, Hills Road, Cambridge CB2 0XZ, UK

    • Michelle C. Ward,
    • Dominic Schmidt &
    • Duncan T. Odom
  10. Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK

    • Michelle C. Ward,
    • Dominic Schmidt &
    • Duncan T. Odom
  11. Howard Hughes Medical Institute, University of Washington, Seattle, Washington 20815-6789, USA

    • Can Alkan &
    • Evan E. Eichler
  12. Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK

    • Edward V. Ball,
    • Matthew Mort,
    • Andrew D. Phillips,
    • Katy Shaw,
    • Peter D. Stenson &
    • David N. Cooper
  13. Department of Anthropology, Yale University, 10 Sachem Street, New Haven, Connecticut 06511, USA

    • Brenda J. Bradley
  14. The Genome Institute at Washington University, Washington University School of Medicine, Saint Louis, Missouri 63108, USA

    • Tina A. Graves,
    • Wesley C. Warren &
    • Richard K. Wilson
  15. MRC Functional Genomics Unit, University of Oxford, Department of Physiology, Anatomy and Genetics, South Parks Road, Oxford OX1 3QX, UK

    • Andreas Heger,
    • Stephen Meader &
    • Chris P. Ponting
  16. Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK

    • Gerton Lunter
  17. Comparative Genomics Unit, Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892-2152, USA

    • James C. Mullikin
  18. Max Planck Institute for Evolutionary Anthropology, Primatology Department, Deutscher Platz 6, Leipzig 04103, Germany

    • Linda Vigilant
  19. Children’s Hospital Oakland Research Institute, Oakland, California 94609, USA

    • Baoli Zhu &
    • Pieter de Jong
  20. San Diego Zoo’s Institute for Conservation Research, Escondido, California 92027, USA

    • Oliver A. Ryder
  21. Present addresses: Institut des Sciences de l'Évolution – Montpellier (I.S.E.-M.), Université de Montpellier II – CC 064, 34095 Montpellier Cedex 05, France (J.Y.D.); Centre for Genomic Research, Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK (I.G.); Division of Biological Anthropology, University of Cambridge, Fitzwilliam Street, Cambridge CB2 1QH, UK (B.Y.); EASIH, University of Cambridge, Addenbrooke’s Hospital, Cambridge CB2 0QQ, UK (A.S.R.); Oxford Nanopore Technologies, Edmund Cartwright House, 4 Robert Robinson Avenue, Oxford OX4 4GA, UK (D.J.T.); Institute of Microbiology, Chinese Academy of Sciences, Datun Road, Chaoyang District, Beijing 100101, China (B.Z.); The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK (J.R.).

    • Julien Y. Dutheil,
    • Ian Goodhead,
    • Bryndis Yngvadottir,
    • Anthony S. Rogers,
    • Daniel J. Turner,
    • Baoli Zhu &
    • Jane Rogers


Manuscript main text: A.S., R.D., C.T.-S., N.I.M., G.E.J., P.C.S., A.K.-K. Project coordination: A.S., A.S.R., A.K.-K., R.D. Project initiation: J.R., R.D., R.K.W. Library preparation and sequencing: I.G., D.J.T., M.A.Q., C.M.C., B.Z., P.d.J., O.A.R., Q.A., B.Y., Y.X., T.A.G., W.C.W. Assembly: A.S., L.W.H., Y.G., J.T.S., J.C.M., W.W., Z.N. Fosmid finishing: P.H. Assembly quality: A.S., S. Meader, G.L., C.P.P. Annotation: Y.A.T., G.K.L., A.J.V., A. Heger, S.M.S. Primate multiple alignments: J.H., K.B., S.F. Great ape speciation and ILS: J.Y.D., A.S., T.M., M.H.S., K.M., G.E.J. Sequence loss and gain: A.S., S.M., C.T.-S., Y.A.T., A.J.V. Protein evolution: G.E.J., S.H.M., N.I.M., B.J.B., T.D.O’C., Y.X., Y.C., N.G. Human disease allele analysis: Y.X., Y.C., C.T.-S., P.D.S., E.V.B., A.D.P., M.M., K.S., D.N.C. Transcriptome analysis: T.L., E.T.D. ChIP-seq experiment and analysis: P.C.S., M.C.W., D.S., P.F., D.T.O. Additional gorilla samples: B.Y., Y.X., L.V., C.T.-S. Gorilla species diversity and divergence: A.S., A.H., T.M., L.N.A., B.Y., L.V. Gorilla species functional differences: Y.X., Y.C., C.T.-S. Segmental duplication analysis: T.M.-B., C.A., S.S., E.K., J.P.-M., E.E.E.

Competing financial interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to:

Accession numbers for all primary sequencing data are given in Supplementary Information. The assembly has been submitted to EMBL with accession numbers FR853080 to FR853106, and annotation is available at Ensembl (


Supplementary information

PDF files

  1. Supplementary Information (12.2M)

    This file contains Supplementary Information, Supplementary Methods, Supplementary Figures and Supplementary Tables. Please note some of the tables are in separate files - see contents list for details.

Excel files

  1. Supplementary Tables (380K)

    This file contains tables ST3.1 (lincRNA annotation), ST8.2-ST8.4, ST8.7 (Protein evolution), ST8.8, ST8.9 (Stop-SNP and disease allele genes) and ST11.1 (Expression-CTCF-changes).

  2. Supplementary Tables (852K)

    This file contains table ST12.2 (Gorilla species amino-acid differences).

  3. Supplementary Tables (6M)

    This file contains table ST8.1: Ensembl protein primate orthology status.

  4. Supplementary Tables (8.8M)

    This file contains table ST8.5: Complete codon model LRT results.

  5. Supplementary Tables (698K)

    This file contains table ST4.3: CoalHMM results (genome wide).

  6. Supplementary Tables (40K)

    This file contains table ST4.5: CoalHMM results on simulated data.
