Gibbon genome and the fast karyotype evolution of small apes

Carbone, Lucia; Alan Harris, R.; Gnerre, Sante; Veeramah, Krishna R.; Lorente-Galdos, Belen; Huddleston, John; Meyer, Thomas J.; Herrero, Javier; Roos, Christian; Aken, Bronwen; Anaclerio, Fabio; Archidiacono, Nicoletta; Baker, Carl; Barrell, Daniel; Batzer, Mark A.; Beal, Kathryn; Blancher, Antoine; Bohrson, Craig L.; Brameier, Markus; Campbell, Michael S.; Capozzi, Oronzo; Casola, Claudio; Chiatante, Giorgia; Cree, Andrew; Damert, Annette; de Jong, Pieter J.; Dumas, Laura; Fernandez-Callejo, Marcos; Flicek, Paul; Fuchs, Nina V.; Gut, Ivo; Gut, Marta; Hahn, Matthew W.; Hernandez-Rodriguez, Jessica; Hillier, LaDeana W.; Hubley, Robert; Ianc, Bianca; Izsvák, Zsuzsanna; Jablonski, Nina G.; Johnstone, Laurel M.; Karimpour-Fard, Anis; Konkel, Miriam K.; Kostka, Dennis; Lazar, Nathan H.; Lee, Sandra L.; Lewis, Lora R.; Liu, Yue; Locke, Devin P.; Mallick, Swapan; Mendez, Fernando L.; Muffato, Matthieu; Nazareth, Lynne V.; Nevonen, Kimberly A.; O’Bleness, Majesta; Ochis, Cornelia; Odom, Duncan T.; Pollard, Katherine S.; Quilez, Javier; Reich, David; Rocchi, Mariano; Schumann, Gerald G.; Searle, Stephen; Sikela, James M.; Skollar, Gabriella; Smit, Arian; Sonmez, Kemal; Hallers, Boudewijn ten; Terhune, Elizabeth; Thomas, Gregg W. C.; Ullmer, Brygg; Ventura, Mario; Walker, Jerilyn A.; Wall, Jeffrey D.; Walter, Lutz; Ward, Michelle C.; Wheelan, Sarah J.; Whelan, Christopher W.; White, Simon; Wilhelm, Larry J.; Woerner, August E.; Yandell, Mark; Zhu, Baoli; Hammer, Michael F.; Marques-Bonet, Tomas; Eichler, Evan E.; Fulton, Lucinda; Fronick, Catrina; Muzny, Donna M.; Warren, Wesley C.; Worley, Kim C.; Rogers, Jeffrey; Wilson, Richard K.; Gibbs, Richard A.

doi:10.1038/nature13679

Article
Open access
Published: 10 September 2014

Lucia Carbone^1,2,3,4,
R. Alan Harris⁵,
Sante Gnerre⁶,
Krishna R. Veeramah^7,8,
Belen Lorente-Galdos⁹,
John Huddleston^10,11,
Thomas J. Meyer¹,
Javier Herrero^12,13^nAff43,
Christian Roos¹⁴,
Bronwen Aken^12,15,
Fabio Anaclerio¹⁶,
Nicoletta Archidiacono¹⁶,
Carl Baker¹⁰,
Daniel Barrell^12,15,
Mark A. Batzer¹⁷,
Kathryn Beal¹²,
Antoine Blancher¹⁸,
Craig L. Bohrson¹⁹,
Markus Brameier¹⁴,
Michael S. Campbell²⁰,
Oronzo Capozzi¹⁶,
Claudio Casola²¹,
Giorgia Chiatante¹⁶,
Andrew Cree²²,
Annette Damert²³,
Pieter J. de Jong²⁴,
Laura Dumas²⁵,
Marcos Fernandez-Callejo⁹,
Paul Flicek¹²,
Nina V. Fuchs²⁶,
Ivo Gut²⁷,
Marta Gut²⁷,
Matthew W. Hahn²⁸,
Jessica Hernandez-Rodriguez⁹,
LaDeana W. Hillier²⁹,
Robert Hubley³⁰,
Bianca Ianc²³,
Zsuzsanna Izsvák²⁶,
Nina G. Jablonski³¹,
Laurel M. Johnstone⁷,
Anis Karimpour-Fard²⁵,
Miriam K. Konkel¹⁷,
Dennis Kostka³²,
Nathan H. Lazar⁴,
Sandra L. Lee²²,
Lora R. Lewis²²,
Yue Liu²²,
Devin P. Locke²⁹^nAff43,
Swapan Mallick³³,
Fernando L. Mendez⁷^nAff43,
Matthieu Muffato¹²,
Lynne V. Nazareth²²,
Kimberly A. Nevonen²,
Majesta O’Bleness²⁵,
Cornelia Ochis²³,
Duncan T. Odom^15,34,
Katherine S. Pollard^35,36,37,
Javier Quilez⁹,
David Reich³³,
Mariano Rocchi¹⁶,
Gerald G. Schumann³⁸,
Stephen Searle¹⁵,
James M. Sikela²⁵,
Gabriella Skollar³⁹,
Arian Smit²⁹,
Kemal Sonmez^4,40,
Boudewijn ten Hallers²⁴^nAff43,
Elizabeth Terhune²,
Gregg W. C. Thomas²⁸,
Brygg Ullmer⁴¹,
Mario Ventura¹⁶,
Jerilyn A. Walker¹⁷,
Jeffrey D. Wall^36,37,
Lutz Walter¹⁴,
Michelle C. Ward³⁴^nAff43,
Sarah J. Wheelan¹⁹,
Christopher W. Whelan⁴⁰^nAff43,
Simon White¹⁵,
Larry J. Wilhelm²,
August E. Woerner⁷,
Mark Yandell²⁰,
Baoli Zhu²⁴^nAff43,
Michael F. Hammer⁷,
Tomas Marques-Bonet^9,27,
Evan E. Eichler^10,11,
Lucinda Fulton²⁹,
Catrina Fronick²⁹,
Donna M. Muzny²²,
Wesley C. Warren²⁹,
Kim C. Worley²²,
Jeffrey Rogers²²,
Richard K. Wilson²⁹ &
Richard A. Gibbs²²

Nature volume 513, pages 195–201 (2014)

Abstract

Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ∼5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.

Main

Gibbons (Hylobatidae) are critically endangered¹ small apes that inhabit the tropical forests of southeast Asia (Fig. 1) and belong to the superfamily Hominoidea along with great apes and humans. In the primate phylogeny, gibbons diverged between Old World monkeys and great apes, providing a unique perspective from which to study the origins of hominoid characteristics.

**Figure 1: Geographic distribution of gibbon species used in the study.**

Gibbons have several distinctive traits, the most striking of which is the unusually high number of large-scale chromosomal rearrangements in comparison to the inferred ancestral ape karyotype². The four gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) occupy different regions of southeast Asia and bear distinctive karyotypes, with diploid chromosome numbers ranging from 38 to 52 (Fig. 1). Given the relatively recent differentiation of these genera (4–6 million years ago (Myr ago)), this constitutes an extraordinarily fast rate of karyotype change.

In order to investigate the mechanisms behind the plasticity of the gibbon genome, understand the evolutionary relationships among the four extant gibbon genera and study the evolution of putatively functional sequences related to gibbon-specific adaptations, we sequenced and assembled the genome of a female northern white-cheeked gibbon (Nomascus leucogenys) named ‘Asia’. The reference assembly (Nleu1.0) provides on average 5.7-fold Sanger read coverage over 2.9 gigabase pairs (Gb) (Table 1 and Supplementary Table ST1.1). Our quality assessment (Extended Data Fig. 1) confirmed its equivalence to other Sanger sequence-based non-human primate draft assemblies (such as the orangutan or rhesus macaque^3,4) (Supplementary Information section S1, Supplementary Data Files 1 and 2). We also obtained ∼15× whole-genome shotgun (WGS) short-read data (Illumina) for two individuals of each gibbon genus and high-coverage exome data (>60×) for two of the same individuals in order to derive error models for single nucleotide polymorphism (SNP) calls (Supplementary Information section S2; Supplementary Tables ST2.1–2.3).

Table 1 Gibbon assembly statistics

Gibbon–human synteny breakpoints

Nleu1.0 scaffolds were aligned against the human reference (GRCh37) to be ordered and oriented into 26 chromosomes (Nleu3.0) under extensive guidance by cytogenetic data. The reshuffled nature of the gibbon genome was especially evident when human–gibbon chromosome alignments were compared with those between human and great apes, rhesus macaque (Old World monkey) and marmoset (New World monkey) (Fig. 2a). This higher rate of reshuffling applied only to large-scale chromosomal rearrangements (>10 megabases (Mb)), whereas smaller-scale rearrangements (10–100 kilobases (kb)) were comparable with other species (Fig. 2b) (Supplementary Information section S1).

**Figure 2: Analysis of gibbon–human synteny and breakpoints.**

We identified 96 gibbon–human synteny breakpoints in Nleu1.0 and classified them as to whether they could be defined at the base-pair level (class I, n = 42) or only narrowed to an interval due to greater complexity (class II, n = 54). As previously reported⁵, breakpoints were significantly depleted of genes (Supplementary Fig. SF5.2 and Supplementary Data File 3) and breakpoint intervals contained a mixture of repetitive sequences that inserted exclusively into the gibbon genome^2,5,6 (Fig. 2c). To assess breakpoint segmental duplication content, we identified gibbon-specific segmental duplication using in silico methods followed by experimental validation (Extended Data Fig. 2, Supplementary Fig. SF3.1, Supplementary Information section S3 and Supplementary Data File 4). Of note, both gibbon-specific segmental duplication and gene family expansion analyses suggested the gibbon genome has not undergone a greater rate of duplication than other hominoids, further supporting a model in which accelerated evolution has been limited to gross chromosomal rearrangements (Supplementary Information section S6, Supplementary Fig. SF6.1).

Segmental duplication enrichment was the best predictor of gibbon–human synteny breakpoints, as shown through permutation analyses (P value < 0.0001); however, breakpoints were also enriched for Alu elements (Supplementary Table ST5.1; Supplementary Information section S5; Supplementary Fig. SF5.2). Although non-allelic homologous recombination between highly similar sequences can mediate large-scale rearrangements⁷, the majority of gibbon chromosomal breakpoints bore signatures of non-homology based mechanisms (Fig. 2c). These included the insertion of non-templated sequences (2–51 nucleotides (nt)) and/or the absence of identity, suggesting non-homologous end joining. The presence of micro-homologies (2–26 nt) in a small portion of the breakpoints (13/42) pointed to additional alternative mechanisms such as microhomology-mediated end joining⁸ or microhomology-mediated break-induced replication⁹. The origin of the complex structure of breakpoint intervals (class II) was less obvious and reinforced the observation that repeats have the tendency to accumulate at the breakpoints.
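The permutation logic behind an enrichment P value of this kind can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the single linear genome, the uniform re-drawing of breakpoints, the toy coordinates and the function names `count_hits` and `permutation_p_value` are all assumptions made for the example.

```python
import random

def count_hits(points, intervals):
    """Count points that fall inside at least one half-open interval [start, end)."""
    return sum(any(s <= p < e for s, e in intervals) for p in points)

def permutation_p_value(breakpoints, features, genome_length, n_perm=10000, seed=0):
    """One-sided empirical P value for enrichment of breakpoints in features.

    Null model (an assumption of this sketch): breakpoints are re-drawn
    uniformly at random along a single linear genome.
    """
    rng = random.Random(seed)
    observed = count_hits(breakpoints, features)
    at_least = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(genome_length) for _ in breakpoints]
        if count_hits(shuffled, features) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical toy data: features cover 10% of a 1,000,000 bp genome.
features = [(i * 100_000, i * 100_000 + 10_000) for i in range(10)]
breakpoints = [500, 100_200, 205_000, 300_900, 409_999, 750_000]
observed, p = permutation_p_value(breakpoints, features, 1_000_000, n_perm=2000)
print(observed, p)
```

With the add-one correction, the smallest P value this sketch can report is 1/(n_perm + 1), so a threshold such as P < 0.0001 needs at least 10,000 permutations. A more realistic null would also preserve chromosome boundaries and assembly gaps.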

To explore the possibility that chromatin conformation, rather than sequence, might predispose regions to breakage, we investigated the relationship between gibbon breakpoints and CCCTC-binding factor (CTCF), an evolutionarily conserved protein with multiple functions, including mediating intra- and interchromosomal interactions¹⁰. We performed chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) of CTCF-bound DNA using lymphoblast cell lines established from eight gibbon individuals (Supplementary Information section S5). We observed an enrichment of gibbon–human breakpoints in CTCF-binding events (P value = 0.0028), which increased when we considered a ∼20 kb window centred around each breakpoint (P value < 0.0001). Notably, this enrichment was maintained only for CTCF-binding events shared with other primates (human, orangutan and rhesus macaque)¹¹ but not those specific to gibbon (P value = 0.0019) (Supplementary Fig. SF5.4).

Thus, gibbon–human breakpoints co-localized with distinct genomic features and epigenetic marks; however, as many of these features were shared with other primates, other factors unique to the gibbon lineage must have been present to trigger the increased frequency of chromosomal rearrangements.

LAVA insertions in the gibbon genome

The gibbon genome contains all previously described classes of transposable elements that are mostly also present in other primates. One exceptional addition is the LAVA element, a novel retrotransposon that emerged exclusively in gibbons¹² and has a composite structure composed of portions of other repeats (3′-L1-AluS-VNTR-Alu-like-5′) (Fig. 3a). Searches of Nleu1.0 retrieved 1,797 LAVA insertions, 1,256 of which were 3′ intact elements, many carrying signs of target-primed reverse transcription (TPRT)¹³. The distribution of 3′ intact LAVA elements uncovered a significant overlap with genes (Pearson chi-squared, P = 0.017) and Gene Ontology (GO) analyses using the database for annotation, visualization, and integrated discovery (DAVID)¹⁴ showed a significant functional enrichment exclusive to the ‘microtubule cytoskeleton’ category (false discovery rate = 0.031, P value = 0.001) (Supplementary Information section S7 and Supplementary Data File 6) (Extended Data Fig. 3). Additional analyses with meta-pathway database tools^15,16 refined this enrichment to pathways related to chromosome segregation, including ‘establishment of sister chromatid cohesion’ and ‘mitotic metaphase and anaphase’ (Supplementary Table ST7.3). Genes with LAVA insertions include proteins that function as checkpoints for cell division and for spindle integrity/architecture (such as MAP4, CEP164 and BUB1B)^17,18,19, participate in kinetochore assembly and attachment to the spindle (for example, MAD1L1 and CLASP2)^20,21, and have a role in chromosome segregation during cell division (for example, KIFAP3 and KIF27)²² (Extended Data Table 1).

**Figure 3: The LAVA element and evidence for LAVA-mediated early transcription termination.**

Intragenic LAVA insertions were skewed toward introns (Pearson chi-squared, P = 0.0001) and were less frequent than expected within 1 kb of the nearest exon junction (Extended Data Fig. 3). The majority (74%) of intronic LAVA elements were found in the antisense orientation. We speculated that intronic antisense LAVA insertions may cause early transcription termination by providing a polyadenylation site in the antisense orientation, as previously described for L1 elements^23,24 (Extended Data Fig. 3). Indeed, we found 84.1% of the 3′-intact LAVA elements encoded a perfect polyadenylation signal at their 3′ end in antisense orientation.
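A polyadenylation signal "in antisense orientation" is one that reads correctly only on the strand opposite to the element. A minimal sketch of such a check follows. It assumes the canonical hexamer AATAAA as the signal, since the text does not say which motif definition was used, and the fragment and function names are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def antisense_polya_positions(sense_seq, signal="AATAAA"):
    """Return 0-based start positions of the signal on the opposite strand.

    Positions are given in the coordinates of the reverse-complemented sequence.
    The default signal is the canonical hexamer (an assumption of this sketch).
    """
    antisense = reverse_complement(sense_seq).upper()
    positions = []
    start = antisense.find(signal)
    while start != -1:
        positions.append(start)
        start = antisense.find(signal, start + 1)
    return positions

# Hypothetical fragment: TTTATT on this strand reads AATAAA on the other strand.
fragment = "GGCCTTTATTGGCA"
print(antisense_polya_positions(fragment))  # [4]
```

On the sense strand, the same signal appears as TTTATT, so an equivalent check is to search the element sequence directly for that reverse-complemented hexamer.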

To obtain experimental evidence that LAVA elements disrupt transcription, we performed a reporter assay in which the 3′ end of a luciferase gene construct lacking a transcriptional termination site was fused to the 3′-terminal fragments of LAVA_E and LAVA_F elements, mimicking the arrangement observed in gibbon genes (Fig. 3b, left). Luciferase activity exceeding background level by ∼50% was observed from the LAVA_F reporter construct (Fig. 3b, right), indicating faithful termination of luciferase transcription. Furthermore, 3′ rapid amplification of cDNA ends (RACE) experiments confirmed that the transcription termination site had been supplied from the LAVA element (Extended Data Fig. 3). Thus antisense intronic LAVA insertions can cause early transcription termination, with some variability possibly due to the genomic context of the polyadenylation site, which may explain the difference between the two reporter constructs.

We also investigated LAVA-induced early transcription termination in vivo by analysing RNA-seq data generated for the gibbon named Asia (Supplementary Table ST2.4). Specifically, we looked for paired-end reads only partially aligning to an antisense LAVA element due to untemplated residues and then identified cases for which the presence of a poly(A) tail was preventing full-length alignment. This analysis revealed that elements from a variety of subfamilies have the potential to cause early transcription termination, including those identified for LAVA elements inserted in the microtubule cytoskeleton genes (for example, LAVA_B2R2, LAVA_C4B, LAVA_B1R2) (Extended Data Table 1). We observed that early transcription termination occurred at relatively low levels, as we identified a significant number of read pairs indicative of normal transcription and splicing for LAVA-terminated genes (Supplementary Table ST7.5). This is to be expected, as full inactivation of many of these genes would be lethal. On the other hand, as alternative splicing and RNA pol II transcript termination/polyadenylation are tightly coupled processes, LAVA-mediated early transcription termination could also act by differently affecting distinct isoforms and/or influencing the ratio between isoforms. Finally, LAVA insertions may also affect gene expression by functioning as exon traps, as shown for SVA elements²⁵. One putative example of an exon trapping event was identified for HORMAD2, a gene that monitors the formation of synapsis during crossover²⁶ (Supplementary Information section S7, Supplementary Table ST7.6, Supplementary Fig. SF7.1–7.2).

As genome reshuffling began in the common ancestor of all extant gibbon species, LAVA insertions must have occurred in key genes before the four genera diverged. We experimentally confirmed the mode and tempo of all 23 LAVA insertions in genes from the microtubule cytoskeleton category using both site-specific PCR and in silico methods (Extended Data Fig. 4) and found that most of the insertions (15/23) were shared by the four gibbon genera (Supplementary Data File 6). Eleven of the genes match the structural requirements for early transcription termination and five of them are also shared. These genes include MAP4, involved in spindle architecture, and CEP164, a G2/M checkpoint gene whose inactivation results in an aberrant spindle during cell division^18,19 (Extended Data Table 1).

The complex evolutionary history of gibbons

We explored the relationship between LAVA family expansion and evolution of the gibbon lineage and, through analyses of diagnostic mutations, identified 22 LAVA subfamilies (Fig. 3c). In addition, we tested for the presence or absence of 200 LAVA loci from among the evolutionarily youngest elements in each subfamily (Extended Data Fig. 4) across 17 unrelated gibbon individuals and found that 52% of loci were shared among all four genera, whereas 27% were Nomascus specific. The remaining LAVA insertions showed a variety of confounding phylogenetic relationships consistent with incomplete lineage sorting (ILS) of ancestral polymorphisms, perhaps as a result of a rapid radiation of gibbon genera (Supplementary Information section S7; Supplementary Table ST7.1–7.2). We used a maximum likelihood method²⁷ to obtain age estimates for the 22 LAVA subfamilies. In the case of the two oldest subfamilies, LAVA_A1 and LAVA_A2, we obtained estimates of ∼18 Myr ago and ∼17 Myr ago, respectively (Supplementary Table ST7.3). A coalescent-based methodology implemented in the software G-PhoCS²⁸ using Nleu1.0 estimated a gibbon–great ape population divergence time of ∼16.8 Myr ago (95% confidence intervals (CI): 15.9–17.6 Myr ago) assuming a split time with macaque of 29 Myr ago (Supplementary Information section S4). Hence, the LAVA element probably originated around the time of the divergence of gibbons from the ancestral great ape/human lineage.

The evolutionary history of the gibbon lineage and, in particular, the timing and order of splitting among the four genera, is still a subject of debate²⁹. To address this issue, we generated medium coverage (mean ∼15×) WGS short read data for two individuals from each of the four genera, including two different Hylobates species (H. moloch and H. pileatus) (Supplementary Table ST2.1–2.2). Although phylogenetic analysis of assembled whole mitochondrial DNA genomes using BEAST³⁰ strongly supported monophyletic groupings for each gibbon genus, the branching order of the four genera remained unresolved (Supplementary Fig. SF9.1–9.2; Supplementary Information section S9).

Neighbour-joining trees constructed from pairwise sequence divergence, k, across ∼11,000 genic (200 base pairs (bp)) and ∼12,000 non-genic (1 kilobase (kb)) autosomal loci supported a supermatrix sequence topology of (((Siamang (SSY), Hoolock (HLE)), Nomascus (NLE)), (H. pileatus (HPL), H. moloch (HMO))) (Fig. 4a); nevertheless, bootstrap confidence for the node separating NLE and Hylobates was low (∼52%). This topology was also the most frequently observed when constructing k-based unweighted pair group method with arithmetic mean (UPGMA) trees along the genome using non-overlapping 100-kb sliding windows. However, all 15 possible rooted topologies for the four genera were observed at considerable frequencies (Extended Data Fig. 5), consistent with the extensive ILS observed in the LAVA element analysis.
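The figure of 15 rooted topologies for four genera follows from the standard count of rooted, bifurcating, leaf-labelled trees, (2n − 3)!! for n taxa. A short sketch of that count is given below; the function name is illustrative only.

```python
def rooted_topologies(n_taxa):
    """Number of rooted, bifurcating, leaf-labelled trees: (2n - 3)!! for n >= 2."""
    if n_taxa < 1:
        raise ValueError("n_taxa must be a positive integer")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3.
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count

for n in (2, 3, 4, 5):
    print(n, rooted_topologies(n))
# 2 1
# 3 3
# 4 15
# 5 105
```

Treating the two Hylobates species as separate tips would raise the count to 105, which is why the analysis is framed at the level of the four genera.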

**Figure 4: Gibbon phylogeny and demography.**

In order to infer the most likely bifurcating species topology amongst the four genera while taking into account ILS, we used a novel coalescent-based ABC methodology using the autosomal non-genic and genic loci (Veeramah et al., in the press) (Supplementary Information section S8). The topology described above had the highest combined posterior probability, though support was relatively low (P (model) = 17%) and other topologies, including one with NLE and Hylobates interchanged as the most external taxa, had comparable probabilities (Fig. 4a).

The estimated internal branch lengths under the best species topology using our ABC framework and G-PhoCS were very short, supporting a rapid speciation process for the four gibbon genera (Fig. 4b, right). Given this observation and uncertainty in the best topology, we also estimated parameters under an instantaneous speciation model (Fig. 4b, left). Assuming an overall autosomal mutation rate of 1 × 10⁻⁹ per site per year, we placed the beginning of the speciation process at ∼5 Myr ago under both models, with the two Hylobates species diverging ∼1.5 Myr ago.
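The way a mutation rate turns sequence divergence into a date can be illustrated with the simplest molecular-clock relation, d = 2μT, where divergence accumulates along both lineages after the split. This is only a back-of-the-envelope sketch: it ignores ancestral polymorphism, which the coalescent-based analyses account for, and the function names are assumptions made for the example.

```python
def expected_divergence(split_time_years, mu_per_site_per_year):
    """Per-site divergence accumulated along both lineages since the split."""
    return 2.0 * mu_per_site_per_year * split_time_years

def split_time_years(divergence_per_site, mu_per_site_per_year):
    """Invert d = 2 * mu * T to recover the split time T in years."""
    return divergence_per_site / (2.0 * mu_per_site_per_year)

MU = 1e-9  # autosomal mutation rate per site per year, as assumed in the text

# Divergence implied by a ~5 Myr split between genera and a ~1.5 Myr split
# between the two Hylobates species, under the simple clock above.
print(expected_divergence(5.0e6, MU))
print(expected_divergence(1.5e6, MU))

# Converting a hypothetical per-site divergence of 1% back into years.
print(split_time_years(0.01, MU))
```

Under these assumptions a split ∼5 Myr ago corresponds to roughly 1% sequence divergence. Because part of the observed divergence predates the split, naive clock estimates of this kind tend to overestimate split times when the ancestral population was large.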

Consistent with the ABC analysis, SSY and HLE share the largest number of alleles across the whole genome (Supplementary Table ST8.5). However, NLE and the two Hylobates samples are both significantly closer to SSY than HLE as assessed by the D-statistic³¹. This result could be explained by two independent gene flow events between SSY and both NLE and Hylobates. However, fertile intergeneric hybrids have yet to be observed either in the wild or captivity³²; an alternative explanation would be long-term population structure in the gibbon ancestral population. Both the ABC and G-PhoCS analyses suggest that the ancestral gibbon effective population size (N_e) was large (80,000–130,000), but neither of these frameworks can distinguish this from a structured ancestral population.
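The D-statistic compares two discordant site patterns across four aligned sequences. A minimal sketch follows, assuming the common ABBA/BABA formulation with one sequence per taxon. The toy alignment, the sign convention and the function name are assumptions of the example, and significance testing (typically by block jackknife) is omitted.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (ABBA count, BABA count, D) for four aligned sequences.

    ABBA: P2 and P3 share the derived allele; BABA: P1 and P3 share it.
    The outgroup allele is taken as ancestral. D = (ABBA - BABA) / (ABBA + BABA).
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if a == o and b == c and b != o:
            abba += 1
        elif b == o and a == c and a != o:
            baba += 1
    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return abba, baba, d

# Hypothetical toy alignment: three ABBA sites, one BABA site, two uninformative.
p1       = "AAAGGA"
p2       = "GGGAGA"
p3       = "GGGGGA"
outgroup = "AAAAAA"
print(d_statistic(p1, p2, p3, outgroup))  # (3, 1, 0.5)
```

Under this convention, a positive D indicates that P3 shares more derived alleles with P2 than with P1. For the comparison described above, one could take P1 = HLE, P2 = SSY and P3 = NLE or a Hylobates sample, although the arrangement actually used in the study is not stated here.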

The coalescent-based analysis (Fig. 4a), along with estimates of genome-wide heterozygosity (Supplementary Fig. ST8.2), suggests a larger long-term N_e for both N. leucogenys and H. moloch compared to the other species. Analysis using the pairwise sequentially Markovian coalescent (PSMC) model³³ indicates that these two species underwent an increase in N_e during the Late Pleistocene era (500–100 thousand years ago (kyr ago)) followed by a subsequent decrease in N_e 100–50 kyr ago (Fig. 4c) (Supplementary Information section S8). Fluctuation in N_e could result from changes in the actual number of individuals in the population, changes in population structure, and/or variable gene flow.

Functional sequence evolution

Accelerated substitution rates are a hallmark of adaptive evolution, and genomic regions with excess lineage-specific substitutions have been found to have functional roles³⁴. We identified 240 short (153 bp median length) regions with accelerated substitution rates in the gibbon lineage (gibARs). We observed that gibARs were primarily intergenic (66%) and tended to co-localize near the same genes as LAVA elements (P value = 81 × 10⁻⁶; odds ratio of 2.74 (95% CI: 1.79–4.07)). Consistent with this finding, a GO enrichment test for genes within ± 100 kb of each gibAR (in comparison with background genes) revealed enrichment for the ‘chromosome organization’ category (Benjamini–Hochberg false discovery rate <5%) (Extended Data Fig. 6). Given evidence of functional roles gathered for human accelerated regions³⁵, we speculate that the gibARs may create functional elements (for example, enhancers or protein-binding domains) to modulate the transcriptional effect of local LAVA insertions (Supplementary Information section S12 and Supplementary Data File 9).
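An odds ratio with a 95% confidence interval, as quoted for the co-localization of gibARs and LAVA elements, can be computed from a 2 × 2 table of gene counts. The sketch below uses the log (Woolf) approximation for the interval. The method used in the study is not stated here, and the counts are hypothetical and are not the values behind the reported 2.74.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table with non-zero cells.

    a: genes near both a gibAR and a LAVA element
    b: genes near a gibAR only
    c: genes near a LAVA element only
    d: genes near neither
    The interval uses the Woolf (log) method, an assumption of this sketch.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts chosen only to illustrate the calculation.
odds_ratio, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

An interval that excludes 1, as the reported 1.79–4.07 does, indicates that the association is unlikely to be due to chance at the 5% level. An exact test would be preferable when any cell count is small.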

We assessed the potential presence of positive selection in 13,638 human genes with one-to-one orthologues in gibbon using a branch-site likelihood ratio test³⁶ (Supplementary Information section S10). One of the most striking features of gibbons is their use of brachiation (arboreal locomotion using only the arms). We uncovered evidence related to traits possibly associated with this adaptation such as the gibbon’s longer arms, more powerful shoulder flexors, rotator muscles and elbow flexors³⁷. First, some genes whose functions relate to these anatomical specializations appear to have undergone positive selection in gibbons. They include TBX5 (P value = 0.00015), required for the development of all forelimb elements³⁸; COL1A1 (pro-alpha1 chains of type I collagen) (P value = 3.39 × 10⁻¹¹), the fibril-forming collagen that is the main protein of bones, tendons and teeth³⁹; and CHRNA1 (acetylcholine receptor subunit alpha precursor) (P value = 0.00039), involved in skeletal muscle contraction⁴⁰. These genes have not been identified as positively selected in other primates to date. We also observed that some genes involved in chondrogenesis (SNX19, ID2 and EXT1) were associated with gibARs. Finally, the chondroadherin gene (CHAD)⁴¹ coding for a cartilage matrix protein is specifically duplicated in all gibbon genera (Extended Data Fig. 2).

Discussion

Our sequencing, assembling and analysis of the gibbon genome has provided numerous insights into the accelerated evolution of the gibbon karyotype and identified genetic signatures related to gibbon biology. First, segmental duplications and repetitive sequences were the best predictors of gibbon–human breakpoints, although we excluded a causal role given the predominance of non-homology-based repair signatures. Furthermore, accelerated rearrangement was confined to large-scale chromosomal events, pointing to a mechanism responsible for causing gross chromosomal changes, rather than global genomic instability. This is in line with our hypothesis that the high rate of chromosomal rearrangements may have been due to LAVA-induced premature transcription termination of chromosome segregation genes. This effect may have occurred at a low enough level to be compatible with life but sufficient to increase the frequency of chromosome segregation errors. The link between erroneous chromosome segregation and increased chromosomal rearrangement has been recently demonstrated by others through in vitro experiments^25,26.

The question remains how such a high number of chromosomal rearrangements could become fixed in such a relatively short time. One possibility is that a combination of geographic isolation and post-mating reproductive barriers accelerated the radiation of the four gibbon genera. Our estimates dated the lineage-splitting event to the Miocene–Pliocene transition, when major changes in the distribution of tropical and subtropical forests were caused by the elevation of the Yunnan plateau and rise in sea levels^42,43. Furthermore, fluctuation in sea levels beginning in the Early Pliocene appears to have brought about cycles of forest fragmentation and amalgamation, leading to alternating range compression and expansion for many mammalian groups⁴⁴.

Together, these results advance our knowledge of the unique traits of the small apes and highlight the complex evolutionary history of these species. Moreover, our analyses of the rearranged gibbon genome help to provide insight into the mechanisms of chromosome evolution as well as uncovering a new source of genome plasticity.

Methods Summary

Sanger-based whole-genome sequencing was performed as described for other species. The genome assembly was generated using the ARACHNE genome assembler assisted with alignment data from the human genome (Supplementary Information section S1). The source DNA for the sequencing was derived from a single female (Asia; studbook no. 0098, ISIS no. NLL605) housed at the Virginia Zoo in Norfolk, Virginia. Short-read libraries were constructed at the Oregon Health & Science University (OHSU) following standard Illumina protocols and sequenced on an Illumina HiSeq 2000. Analyses were performed with custom analysis pipelines. See Supplementary Information for additional information about the methods.

Accession codes

Primary accessions

GenBank/EMBL/DDBJ

ADFV00000000.1

Sequence Read Archive

SRP043117

Data deposits

The N. leucogenys WGS project has been deposited in GenBank under the project accession ADFV00000000.1. All short-read data have been deposited into the Short Read Archive (http://www.ncbi.nlm.nih.gov/sra) under the accession number SRP043117. Resources for exploring the gibbon genome are available at UCSC (http://genome.ucsc.edu), Ensembl (http://ensembl.org), NCBI (http://ncbi.nlm.nih.gov), and the Baylor College of Medicine Human Genome Sequencing Center (https://www.hgsc.bcm.edu/non-human-primates/gibbon-genome-project). This paper is dedicated to the memory of Alan R. Mootnick (1951–2011).

References

Mittermeier, R. A., Rylands, A. B. & Wilson, D. E. Handbook of the Mammals of the World Vol. 3 (Lynx Edicions, 2013)
Carbone, L. et al. A high-resolution map of synteny disruptions in gibbon and human genomes. PLoS Genet. 2, e223 (2006)
Locke, D. P. et al. Comparative and demographic analysis of orang-utan genomes. Nature 469, 529–533 (2011)
Gibbs, R. A. et al. Evolutionary and biomedical insights from the rhesus macaque genome. Science 316, 222–234 (2007)
Girirajan, S. et al. Sequencing human–gibbon breakpoints of synteny reveals mosaic new insertions at rearrangement sites. Genome Res. 19, 178–190 (2009)
Carbone, L. et al. Evolutionary breakpoints in the gibbon suggest association between cytosine methylation and karyotype evolution. PLoS Genet. 5, e1000538 (2009)
Bailey, J. A. & Eichler, E. E. Primate segmental duplications: crucibles of evolution, diversity and disease. Nature Rev. Genet. 7, 552–564 (2006)
Yan, C. T. et al. IgH class switching and translocations use a robust non-classical end-joining pathway. Nature 449, 478–482 (2007)
Hastings, P. J., Ira, G. & Lupski, J. R. A microhomology-mediated break-induced replication model for the origin of human copy number variation. PLoS Genet. 5, e1000327 (2009)
Merkenschlager, M. & Odom, D. T. CTCF and cohesin: linking gene regulatory elements with their targets. Cell 152, 1285–1297 (2013)
Schwalie, P. C. et al. Co-binding by YY1 identifies the transcriptionally active, highly conserved set of CTCF-bound regions in primate genomes. Genome Biol. 14, R148 (2013)
Carbone, L. et al. Centromere remodeling in Hoolock leuconedys (Hylobatidae) by a new transposable element unique to the gibbons. Genome Biol. Evol. 4, 648–658 (2012)
Luan, D. D., Korman, M. H., Jakubczak, J. L. & Eickbush, T. H. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell 72, 595–605 (1993)
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 4, 44–57 (2009)
Mostafavi, S., Ray, D., Warde-Farley, D., Grouios, C. & Morris, Q. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 9 (Suppl. 1), S4 (2008)
Kamburov, A., Wierling, C., Lehrach, H. & Herwig, R. ConsensusPathDB—a database for integrating human functional interaction networks. Nucleic Acids Res. 37, D623–D628 (2009)
Baker, D. J., Jin, F., Jeganathan, K. B. & van Deursen, J. M. Whole chromosome instability caused by Bub1 insufficiency drives tumorigenesis through tumor suppressor gene loss of heterozygosity. Cancer Cell 16, 475–486 (2009)
Samora, C. P. et al. MAP4 and CLASP1 operate as a safety mechanism to maintain a stable spindle position in mitosis. Nature Cell Biol. 13, 1040–1050 (2011)
Leber, B. et al. Proteins required for centrosome clustering in cancer cells. Sci. Transl. Med. 2, 33ra38 (2010)
Schuyler, S. C., Wu, Y. F. & Kuan, V. J. The Mad1–Mad2 balancing act—a damaged spindle checkpoint in chromosome instability and cancer. J. Cell Sci. 125, 4197–4206 (2012)
Maia, A. R. et al. Cdk1 and Plk1 mediate a CLASP2 phospho-switch that stabilizes kinetochore-microtubule attachments. J. Cell Biol. 199, 285–301 (2012)
Haraguchi, K., Hayashi, T., Jimbo, T., Yamamoto, T. & Akiyama, T. Role of the kinesin-2 family protein, KIF3, during mitosis. J. Biol. Chem. 281, 4094–4099 (2006)
Han, J. S., Szak, S. T. & Boeke, J. D. Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature 429, 268–274 (2004)
Wheelan, S. J., Aizawa, Y., Han, J. S. & Boeke, J. D. Gene-breaking: a new paradigm for human retrotransposon-mediated gene evolution. Genome Res. 15, 1073–1078 (2005)
Damert, A. et al. 5′-Transducing SVA retrotransposon groups spread efficiently throughout the human genome. Genome Res. 19, 1992–2008 (2009)
Article CAS PubMed PubMed Central Google Scholar
Wojtasz, L. et al. Meiotic DNA double-strand breaks and chromosome asynapsis in mice are monitored by distinct HORMAD2-independent and -dependent mechanisms. Genes Dev. 26, 958–973 (2012)
Article CAS PubMed PubMed Central Google Scholar
Marchani, E. E., Xing, J., Witherspoon, D. J., Jorde, L. B. & Rogers, A. R. Estimating the age of retrotransposon subfamilies using maximum likelihood. Genomics 94, 78–82 (2009)
Article CAS PubMed Google Scholar
Gronau, I., Hubisz, M. J., Gulko, B., Danko, C. G. & Siepel, A. Bayesian inference of ancient human demography from individual genome sequences. Nature Genet. 43, 1031–1034 (2011)
Article CAS PubMed Google Scholar
Wall, J. D. et al. Incomplete lineage sorting is common in extant gibbon genera. PLoS ONE 8, e53682 (2013)
Article ADS CAS PubMed PubMed Central Google Scholar
Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007)
Article PubMed PubMed Central CAS Google Scholar
Durand, E. Y., Patterson, N., Reich, D. & Slatkin, M. Testing for ancient admixture between closely related populations. Mol. Biol. Evol. 28, 2239–2252 (2011)
Article CAS PubMed PubMed Central Google Scholar
Hirai, H., Hirai, Y., Domae, H. & Kirihara, Y. A most distant intergeneric hybrid offspring (Larcon) of lesser apes, Nomascus leucogenys and Hylobates lar. Hum. Genet. 122, 477–483 (2007)
Article PubMed Google Scholar
Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011)
CAS PubMed PubMed Central Google Scholar
Prabhakar, S. et al. Human-specific gain of function in a developmental enhancer. Science 321, 1346–1350 (2008)
Article ADS CAS PubMed PubMed Central Google Scholar
Pollard, K. S. et al. An RNA gene expressed during cortical development evolved rapidly in humans. Nature 443, 167–172 (2006)
Article ADS CAS PubMed Google Scholar
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007)
Article CAS PubMed Google Scholar
Michilsens, F., Vereecke, E. E., D’Août, K. & Aerts, P. Functional anatomy of the gibbon forelimb: adaptations to a brachiating lifestyle. J. Anat. 215, 335–354 (2009)
Article PubMed PubMed Central Google Scholar
Browne, M. L. et al. Evaluation of genes involved in limb development, angiogenesis, and coagulation as risk factors for congenital limb deficiencies. Am. J. Med. Genet. A. 158A, 2463–2472 (2012)
Article PubMed PubMed Central CAS Google Scholar
Marini, J. C. et al. Consortium for osteogenesis imperfecta mutations in the helical domain of type I collagen: regions rich in lethal mutations align with collagen binding sites for integrins and proteoglycans. Hum. Mutat. 28, 209–221 (2007)
Article CAS PubMed PubMed Central Google Scholar
Masuda, A. et al. hnRNP H enhances skipping of a nonfunctional exon P3A in CHRNA1 and a mutation disrupting its binding causes congenital myasthenic syndrome. Hum. Mol. Genet. 17, 4022–4035 (2008)
Article CAS PubMed PubMed Central Google Scholar
Hessle, L. et al. The skeletal phenotype of chondroadherin deficient mice. PLoS ONE 8, e63080 (2013)
Article ADS CAS PubMed PubMed Central Google Scholar
Cane, M. A. & Molnar, P. Closing of the Indonesian seaway as a precursor to east African aridification around 3–4 million years ago. Nature 411, 157–162 (2001)
Article ADS CAS PubMed Google Scholar
Xu J.-X, Ferguson D. K, Li, C.-S. & Wang Y.-F Late Miocene vegetation and climate of the Lühe region in Yunnan, southwestern China. Rev. Palaeobot. Palynol. 148, 36–59 (2008)
Article Google Scholar
Woodruff D. S & Turner L. M The Indochinese–Sundaic zoogeographic transition: a description and analysis of terrestrial mammal species distributions. J. Biogeogr. 36, 803–821 (2009)
Article Google Scholar
Harvey, P. H., Martin, R. D. & Clutton-Brock, T. H. in Primate Societies (eds Smuts B. B., et al.) Life histories in comparative perspective. 181–196 (Chicago Univ. Press, 1987)
Google Scholar
Kim, S. K. et al. Patterns of genetic variation within and between Gibbon species. Mol. Biol. Evol. 28, 2211–2218 (2011)
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

The gibbon genome project was funded by the National Human Genome Research Institute (NHGRI), including grants U54 HG003273 (R.A.G.) and U54 HG003079 (R.K.W.), with further support from the National Institutes of Health NIH/NIAAA P30 AA019355 and NIH/NCRR P51 RR000163 (L.C.), R01_HG005226 (J.D.W., M.F.H.), NIH P30CA006973 (S.J.W.), a fellowship from the National Library of Medicine Biomedical Informatics Research Training Program (N.H.L.), R01 GM59290 (M.A.B.) and U41 HG007497-01 (M.A.B., M.K.K.), R01 MH081203 (J.M.S.), HG002385 (E.E.E.), National Science Foundation (NSF) CNS-1126739 (B.U., M.A.B., M.K.K.) and DBI-0845494 (M.W.H.), PRIN 2012 (M.R.), Futuro in ricerca 2010 RBFR103CE3 (M.V.), ERC Starting Grant (260372) and MICINN (Spain) BFU2011-28549 (T.M.-B.), grant of the Ministry of National Education, CNCS – UEFISCDI, project number PN-II-ID-PCE-2012-4-0090 (A.D.), grant of the Deutsche Forschungsgemeinschaft SCHU1014/8-1 (G.G.S.), ERC Starting and Advanced Grant and EMBO Young Investigator Award (Z.I., N.V.F.), ERC Starting Grant and EMBO Young Investigator Award (D.T.O.), and the Commonwealth Scholarship Commission (M.C.W.). E.E.E. is an investigator of the Howard Hughes Medical Institute. We acknowledge the contributions of the staff of the HGSC, including the operations team: H. Dinh, S. Jhangiani, V. Korchina, C. Kovar; the library team: K. Blankenburg, L. Pu, S. Vattathil; the assembly team: D. Rio-Deiros, H. Jiang; the submissions team: M. Batterton, D. Kalra, K. Wilczek-Boney, W. Hale, G. Fowler, J. Zhang; the quality control team: P. Aqrawi, S. Gross, V. Joshi, J. Santibanez; and the sequence production team: U. Anosike, C. Babu, D. Bandaranaike, B. Beltran, D. Berhane-Mersha, C. Bickham, T. Bolden, M. Dao, M. Davila, L. Davy-Carroll, S. Denson, P. Fernando, C. Francis, R. Garcia III, B. Hollins, B. Johnson, J. Jones, J. Kalu, N. Khan, B. Leal, F. Legall III, Y. Liu, J. Lopez, R. Mata, M. Obregon, C. Onwere, A. Parra, Y. Perez, A. Perez, C. Pham, J. Quiroz, S. Ruiz, M. Scheel, D. Simmons, I. Sisson, J. Tisius, G. Toledanes, R. Varghese, V. Vee, D. Walker, C. White, A. Williams, R. Wright, T. Attaway, T. Garrett, C. Mercado, N. Ngyen, H. Paul and Z. Trejos. We thank Z. Ivics for providing some of the reagents. We additionally acknowledge the Production Sequencing Group at The Genome Institute, and support from the Wellcome Trust (grant numbers WT095908 and WT098051), NHGRI (U41HG007234) and the European Molecular Biology Laboratory. For the production of next-generation sequences, we acknowledge the Massively Parallel Sequencing Shared Resources (MPSSR) at OHSU, the National Center of Genomic Analyses (CNAG) (Barcelona, Spain), the University of Arizona Genetics Core (UAGC), and the UCSF sequencing core. We also acknowledge the Louisiana Optical Network Institute (LONI). We thank the Gibbon Conservation Center and the Fort Wayne Children’s Zoo for providing the gibbon samples. The MAKER annotation pipeline is supported by NSF IOS-1126998. We thank T. Brown for proofreading and editing the manuscript.

Author information

Javier Herrero, Devin P. Locke, Fernando L. Mendez, Boudewijn ten Hallers, Michelle C. Ward, Christopher W. Whelan & Baoli Zhu
Present addresses: Bill Lyons Informatics Center, UCL Cancer Institute, University College London, London WC1E 6DD, UK (J.He.); Seven Bridges Genomics, Cambridge, Massachusetts 02138, USA (D.P.L.); Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA (F.L.M.); BioNano Genomics, San Diego, California 92121, USA (B.t.H.); University of Chicago, Department of Human Genetics, Chicago, Illinois 60637, USA (M.C.W.); Stanley Center for Psychiatric Research, Broad Institute, Cambridge, Massachusetts 02138, USA (C.W.W.); The CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China (B.Z.).
USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, Utah 84112, USA.

Authors and Affiliations

Department of Behavioral Neuroscience, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, Oregon 97239, USA.
Lucia Carbone & Thomas J. Meyer
Division of Neuroscience, Oregon National Primate Research Center, 505 NW 185th Avenue, Beaverton, Oregon 97006, USA.
Lucia Carbone, Kimberly A. Nevonen, Elizabeth Terhune & Larry J. Wilhelm
Department of Molecular & Medical Genetics, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, Oregon 97239, USA.
Lucia Carbone
Bioinformatics and Computational Biology Division, Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, Oregon 97239, USA.
Lucia Carbone, Nathan H. Lazar & Kemal Sonmez
Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA.
R. Alan Harris
Nabsys, 60 Clifford Street, Providence, Rhode Island 02903, USA.
Sante Gnerre
ARL Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA.
Krishna R. Veeramah, Laurel M. Johnstone, Fernando L. Mendez, August E. Woerner & Michael F. Hammer
Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11790, USA.
Krishna R. Veeramah
IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, 08003 Barcelona, Spain.
Belen Lorente-Galdos, Marcos Fernandez-Callejo, Jessica Hernandez-Rodriguez, Javier Quilez & Tomas Marques-Bonet
Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA.
John Huddleston, Carl Baker & Evan E. Eichler
Howard Hughes Medical Institute, 1705 NE Pacific Street, Seattle, Washington 98195, USA.
John Huddleston & Evan E. Eichler
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Javier Herrero, Bronwen Aken, Daniel Barrell, Kathryn Beal, Paul Flicek & Matthieu Muffato
The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK.
Javier Herrero
Leibniz Institute for Primate Research, Gene Bank of Primates, German Primate Center, Göttingen 37077, Germany.
Christian Roos, Markus Brameier & Lutz Walter
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Bronwen Aken, Daniel Barrell, Duncan T. Odom, Stephen Searle & Simon White
Department of Biology, University of Bari, Via Orabona 4, 70125, Bari, Italy.
Fabio Anaclerio, Nicoletta Archidiacono, Oronzo Capozzi, Giorgia Chiatante, Mariano Rocchi & Mario Ventura
Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803, USA.
Mark A. Batzer, Miriam K. Konkel & Jerilyn A. Walker
University of Paul Sabatier, Toulouse 31062, France.
Antoine Blancher
Department of Oncology, Division of Biostatistics and Bioinformatics, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA.
Craig L. Bohrson & Sarah J. Wheelan
University of Utah, Salt Lake City, Utah 84112, USA.
Michael S. Campbell & Mark Yandell
Department of Ecosystem Science and Management, Texas A&M University, College Station, Texas 77843, USA.
Claudio Casola
Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA.
Andrew Cree, Sandra L. Lee, Lora R. Lewis, Yue Liu, Lynne V. Nazareth, Donna M. Muzny, Kim C. Worley, Jeffrey Rogers & Richard A. Gibbs
Babes-Bolyai-University, Institute for Interdisciplinary Research in Bio-Nano-Sciences, Molecular Biology Center, Cluj-Napoca 400084, Romania.
Annette Damert, Bianca Ianc & Cornelia Ochis
Children’s Hospital Oakland Research Institute, BACPAC Resources, Oakland, California 94609, USA.
Pieter J. de Jong, Boudewijn ten Hallers & Baoli Zhu
Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA.
Laura Dumas, Anis Karimpour-Fard, Majesta O’Bleness & James M. Sikela
Max Delbrück Center for Molecular Medicine, Berlin 13125, Germany.
Nina V. Fuchs & Zsuzsanna Izsvák
Centro Nacional de Análisis Genómico (CNAG), Parc Científic de Barcelona, Barcelona 08028, Spain.
Ivo Gut, Marta Gut & Tomas Marques-Bonet
Indiana University, School of Informatics and Computing, Bloomington, Indiana 47408, USA.
Matthew W. Hahn & Gregg W. C. Thomas
The Genome Center at Washington University, Washington University School of Medicine, 4444 Forest Park Avenue, Saint Louis, Missouri 63108, USA.
LaDeana W. Hillier, Devin P. Locke, Arian Smit, Lucinda Fulton, Catrina Fronick, Wesley C. Warren & Richard K. Wilson
Institute for Systems Biology, Seattle, Washington 98109-5234, USA.
Robert Hubley
Department of Anthropology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
Nina G. Jablonski
Department of Developmental Biology, Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15261, USA.
Dennis Kostka
Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA.
Swapan Mallick & David Reich
University of Cambridge, Cancer Research UK-Cambridge Institute, Cambridge CB2 0RE, UK.
Duncan T. Odom & Michelle C. Ward
University of California, Gladstone Institutes, San Francisco, California 94158-226, USA.
Katherine S. Pollard
Institute for Human Genetics, University of California, San Francisco, California 94143-0794, USA.
Katherine S. Pollard & Jeffrey D. Wall
Division of Biostatistics, University of California, San Francisco, California 94143-0794, USA.
Katherine S. Pollard & Jeffrey D. Wall
Division of Medical Biotechnology, Paul Ehrlich Institute, 63225 Langen, Germany.
Gerald G. Schumann
Gibbon Conservation Center, 19100 Esguerra Rd, Santa Clarita, California 91350, USA.
Gabriella Skollar
Oregon Health & Science University, Center for Spoken Language Understanding, Institute on Development and Disability, Portland, Oregon 97239, USA.
Kemal Sonmez & Christopher W. Whelan
Louisiana State University, School of Electrical Engineering and Computer Science, Baton Rouge, Louisiana 70803, USA.
Brygg Ullmer

Contributions

L.C. led the project and the manuscript preparation. L.C., W.C.W., K.C.W., J.R., E.E.E., T.M.-B., R.A.H., K.R.V. and M.F.H. supervised the project and contributed to overall organization of the manuscript. L.C. and T.J.M. prepared the figures. Sanger data production, assembly construction and testing was carried out by: L.F., C.F., D.M.M., L.V.N., A.C., S.L.L., L.R.L., D.P.L., W.C.W., K.C.W., J.R., S.G., L.D.W.H., D.R. and S.M. Mitochondrial genome assembly was done by Y.L. Illumina sequencing production and submission: L.C., T.M.-B., J.D.W., M.F.H., E.T., L.J.W., M.G., I.G., A.B. and J.H.-R. Samples were provided by G.S. Gene set and validation of gene models: D.B., S.W., S.S., B.A., M.M., J.He., P.F., M.S.C. and M.Y. Assembly validation: B.L.-G., J.He. and T.M.-B. BAC library generation: P.J.dJ., B.tH. and B.Z. Cytogenetic analyses: M.R., N.A. and O.C. Segmental duplications and structural variations: J.Hu., C.B., B.L.-G., J.Q., M.F.-C., G.C., F.A., M.V., T.M.-B. and E.E.E. cDNA Array CGH: L.D., M.O’B., A.K.-F. and J.M.S. Comparative analysis of gibbon chromosomal rearrangements was carried out by J.He. Breakpoint analysis: L.C., C.W.W. and L.J.W. LAVA analysis: L.C., R.A.H., T.J.M., N.H.L., L.J.W., K.A.N., K.S., A.D., M.A.B., M.K.K., J.A.W., B.U., A.S. and R.H. Luciferase assay and 3′ RACE: A.D., B.I., C.O., G.G.S., N.V.F. and Z.I. RNA-seq analysis for early transcription termination: S.J.W. and C.L.B. Short-read alignments, SNP calling and population genetics analysis (autosomal DNA): L.M.J., F.L.M., A.E.W., L.J.W., K.R.V., M.F.H. and J.D.W. Population genetics analyses (mtDNA): C.R., L.W., M.B. and T.M.-B. Positive selection analyses: G.W.C.T. and M.W.H. Gene family evolution analyses: M.W.H. and C.C. Gibbon accelerated region analyses: K.S.P. and D.K. CTCF-binding analyses: M.C.W., D.T.O., P.F., E.T., C.W.W., L.J.W., J.He. and K.B. Biogeography analysis: N.G.J. and C.R. Principal investigators: R.K.W. and R.A.G.

Corresponding author

Correspondence to Lucia Carbone.

Ethics declarations

Competing interests

E.E.E. is on the scientific advisory board (SAB) of DNAnexus and was an SAB member of Pacific Biosciences (2009–2013) and SynapDx (2011–2013).

Extended data figures and tables

Extended Data Figure 1 The gibbon assembly statistics and quality control.

a, The table compares the gibbon assembly statistics to those of other primates sequenced with a similar strategy. b, The plot represents the percentage of the 10,734 single-copy gene HMMs (hidden Markov models) for which just one gene (blue) is found in the different mammalian genomes in Ensembl 70. Other HMMs match more than one gene (red). The missing HMMs (cyan) either do not match any protein or the score is within the range of what can be expected for unrelated proteins. The remaining category (green) represents HMMs for which the best matching gene scores better than unrelated proteins but not as well as expected. See Supplementary Information section 1.4 for more details.

Extended Data Figure 2 Analysis of gibbon–human synteny blocks and identification and validation of gibbon segmental duplications.

a, The image shows a representative gibbon-only whole-genome shotgun sequence detection (WSSD) call by Sanger read depth. The duplication identified in this case overlaps with the gene CHAD, which codes for a cartilage matrix protein. b, Examples of fluorescence in situ hybridizations on gibbon metaphases using duplicated human fosmid clones that were identified by the whole-genome shotgun (WGS) detection strategy (red signals). Left, interchromosomal duplication. Middle, interspersed intrachromosomal duplication. Right, intrachromosomal tandem duplication confirmed using co-hybridization with a single control probe (blue signals). c, Megabases of lineage-specific and shared duplications for primates based on GRCh37 read depth analysis. Copy-number corrected values by species are shown below.

Extended Data Figure 3 Analysis of LAVA element insertion in genes and early termination of transcription.

a, The histogram shows the results of permutation analyses. We find a significant association between LAVA elements and genes. Moreover, insertions are significantly enriched in introns and depleted in exons, most probably as a result of selection against insertions in exons. b, Schematic representation of the mechanism through which LAVA intronic insertions in antisense orientation might cause early termination of transcription. The truncated transcript is indicated on the diagram as A and normal transcript indicated on the diagram as B (pA = polyadenylation site). c, We calculated the distance to the nearest exon for each intronic LAVA and compared this to what would be expected for random insertions (that is, background). We found fewer insertions than expected by chance within 1 kb of the nearest exon. d, Identification of pmiRGlo_LA_F polyadenylation sites by 3′ RACE. Alignment of thirteen 3′ RACE PCR clone sequences and the pmiRGlo_LA_F sequence. LAVA_F 3′ TSD is highlighted by dark green background; the major antisense LAVA_F polyadenylation signal (MAPS) is highlighted by red background. The termination sites are marked with arrows on the LAVA_F sequence. Poly(A) tails of the identified transcripts are in red text.
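
The comparison in panel c (observed intronic insertions against a random-insertion background) can be illustrated with a small Monte Carlo randomization sketch. This is not the pipeline used for the figure: the function names, the toy coordinates, the length-weighted uniform placement of random insertions and the number of randomizations are all assumptions made for illustration.

```python
import random
from bisect import bisect_left


def distance_to_nearest(pos, sorted_boundaries):
    """Distance from pos to the closest coordinate in a sorted, non-empty list."""
    i = bisect_left(sorted_boundaries, pos)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_boundaries[max(0, i - 1): i + 1]
    return min(abs(pos - b) for b in candidates)


def depletion_p_value(observed_positions, introns, exon_boundaries,
                      window=1000, n_perm=1000, seed=0):
    """Empirical one-sided P value for depletion of insertions near exons.

    introns: list of (start, end) half-open intervals (hypothetical input format).
    exon_boundaries: exon edge coordinates on the same axis.
    """
    rng = random.Random(seed)
    bounds = sorted(exon_boundaries)

    def count_near(positions):
        return sum(distance_to_nearest(p, bounds) <= window for p in positions)

    observed = count_near(observed_positions)
    lengths = [end - start for start, end in introns]
    at_most_observed = 0
    for _ in range(n_perm):
        # Place the same number of insertions uniformly at random in introns,
        # choosing each intron with probability proportional to its length.
        chosen = rng.choices(introns, weights=lengths, k=len(observed_positions))
        sample = [rng.randrange(start, end) for start, end in chosen]
        if count_near(sample) <= observed:
            at_most_observed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_most_observed + 1) / (n_perm + 1)


# Toy example: one 10 kb intron flanked by exon edges at 0 and 10,000,
# with 20 hypothetical insertions all sitting in the middle of the intron.
toy_observed, toy_p = depletion_p_value([5000] * 20, [(0, 10000)], [0, 10000])
```

A small P value here means that random placements rarely yield as few insertions within the window as were observed, which is the sense in which the legend reports fewer insertions than expected within 1 kb of the nearest exon.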

Extended Data Figure 4 Evolution of the LAVA element.

a, Screenshots from the Integrative Genomics Viewer (IGV) browser for loci MAP4, RABGAP1 and BBS9. Each column shows portions of the IGV visualization of a LAVA insertion locus identified in Nleu1.0 and its flanking sequence. Red rectangles indicate the margins of each LAVA insertion. Read pairs are coloured red when their insert size is larger than expected, indicating the presence of an unshared LAVA insertion. MAP4 is a shared LAVA insertion, whereas RABGAP1 and BBS9 are Nomascus specific. b, LAVA elements containing at least 300 bp of the LA section of LAVA were selected and reanalysed using RepeatMasker to determine subfamily affiliation and divergence from the consensus sequence. LAVA elements are grouped based upon their subfamily affiliations (see legend top right for colour scheme). The x axis shows the per cent divergence from the respective consensus sequence and the y axis shows the number of elements with a certain per cent divergence from the consensus sequence.

Extended Data Figure 5 Analysis of the phylogenetic relationships between gibbon genera.

a, Neighbour-joining trees for gibbons using non-genic loci. b, UPGMA trees for 100 kb non-overlapping sliding windows moving along the gibbon genome reporting the top 15 topologies (see also Supplementary Table ST8.3). The percentage of total support for each topology is given within each subpanel.
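The window-based analysis in panel b, one UPGMA tree per window followed by a tally of topologies, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the distance values are invented, the four genus names are used only as labels, and `upgma` and `topology_support` are hypothetical helper names.

```python
from collections import Counter
from itertools import combinations

def upgma(names, dist):
    """Build a UPGMA topology from pairwise distances.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance for every leaf pair.
    Returns the topology as nested frozensets, which ignores branch order
    so that identical topologies compare equal.
    """
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: d[frozenset(pair)])
        merged = frozenset([a, b])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to the merged cluster is the size-weighted average.
        for other in clusters:
            d[frozenset([merged, other])] = (
                size_a * d[frozenset([a, other])]
                + size_b * d[frozenset([b, other])]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))

def topology_support(topologies):
    """Percentage of windows supporting each topology, most common first."""
    counts = Counter(topologies)
    total = sum(counts.values())
    return {topo: 100.0 * n / total for topo, n in counts.most_common()}

# Hypothetical distances for a single window (values are made up).
genera = ["Nomascus", "Hylobates", "Hoolock", "Symphalangus"]
window_dist = {frozenset(pair): 6.0 for pair in combinations(genera, 2)}
window_dist[frozenset(["Nomascus", "Hylobates"])] = 2.0
window_dist[frozenset(["Hoolock", "Symphalangus"])] = 3.0
example_tree = upgma(genera, window_dist)
```

Running `upgma` on each window and passing the resulting list to `topology_support` gives per-topology percentages of the kind reported in the subpanels.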

Extended Data Figure 6 Analysis of the relationship between gibbon accelerated regions (gibARs) and genes.

a, Intergenic regions are enriched in gibARs. Different sequence types are shown on the x axis and the y axis displays the fraction of gibARs and candidate regions annotated to the respective class. gibARs are significantly enriched in intergenic regions (P = 4.7 × 10⁻⁶) and significantly depleted in exons (P = 7.3 × 10⁻⁶). P values for each class were calculated with Fisher’s exact test. Introns are comparably prevalent in candidates and gibARs, whereas counts in the UTR and flanking regions are too low to draw meaningful conclusions (data not shown). b, TreeMap from REVIGO for GOslim Biological Process terms with a Benjamini–Hochberg false discovery rate of 5%. Each rectangle is a cluster representative; larger rectangles represent ‘superclusters’ including loosely related terms. The size of the rectangles reflects the P value.
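The per-class test in panel a can be illustrated with a 2 × 2 Fisher's exact test, written here from the hypergeometric distribution using only the Python standard library. The sketch assumes a two-sided test and a table layout of gibARs versus other candidate regions, inside versus outside a sequence class; the caption specifies neither, and `fisher_exact_two_sided` is a hypothetical name.

```python
from math import comb

def hypergeom_pmf(k, total, successes, draws):
    """P(exactly k successes) when drawing `draws` items without replacement."""
    return (comb(successes, k) * comb(total - successes, draws - k)
            / comb(total, draws))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (illustrative only):
        rows    = gibARs, other candidate regions
        columns = in the sequence class, not in the class
    The P value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    total = a + b + c + d
    row1 = a + b
    col1 = a + c
    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    p_observed = hypergeom_pmf(a, total, col1, row1)
    p_value = 0.0
    for k in range(lowest, highest + 1):
        p_k = hypergeom_pmf(k, total, col1, row1)
        # Small tolerance guards against floating-point ties.
        if p_k <= p_observed * (1 + 1e-9):
            p_value += p_k
    return min(p_value, 1.0)
```

For real data, an established statistics library would normally be preferable; the explicit version is shown only to make the calculation visible.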

Extended Data Table 1 Genes from the ‘microtubule cytoskeleton’ GO category with LAVA insertions

Related audio

Researcher Lucia Carbone on what the gibbon genome reveals about the acrobatic animals

Supplementary information

- Supplementary Information: Supplementary Sections 1-6 (see Supplementary Contents for details). (PDF 19003 kb)
- Supplementary Data 1 (XLSX 43 kb)
- Supplementary Data 2 (XLSX 164 kb)
- Supplementary Data 3 (PDF 1775 kb)
- Supplementary Data 4 (XLSX 175 kb)
- Supplementary Data 5 (XLSX 1604 kb)
- Supplementary Data 6 (PPTX 5073 kb)
- Supplementary Data 7 (XLSX 6602 kb)
- Supplementary Data 8 (XLSX 44 kb)
- Supplementary Data 9 (PDF 2103 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported licence. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons licence, users will need to obtain permission from the licence holder to reproduce the material. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.

About this article

Cite this article

Carbone, L., Alan Harris, R., Gnerre, S. et al. Gibbon genome and the fast karyotype evolution of small apes. Nature 513, 195–201 (2014). https://doi.org/10.1038/nature13679

Received: 23 March 2014
Accepted: 14 July 2014
Published: 10 September 2014
Issue Date: 11 September 2014
DOI: https://doi.org/10.1038/nature13679

This article is cited by

Reconstruction of hundreds of reference ancestral genomes across the eukaryotic kingdom
- Matthieu Muffato
- Alexandra Louis
- Hugues Roest Crollius
Nature Ecology & Evolution (2023)
Spaces of Phylogenetic Diversity Indices: Combinatorial and Geometric Properties
- Kerry Manson
- Mike Steel
Bulletin of Mathematical Biology (2023)
High-density linkage maps and chromosome level genome assemblies unveil direction and frequency of extensive structural rearrangements in wood white butterflies (Leptidea spp.)
- L. Höök
- K. Näsvall
- N. Backström
Chromosome Research (2023)
A map of white matter tracts in a lesser ape, the lar gibbon
- Katherine L. Bryant
- Paul R. Manger
- Rogier B. Mars
Brain Structure and Function (2023)
TAD evolutionary and functional characterization reveals diversity in mammalian TAD boundary properties and function
- Mariam Okhovat
- Jake VanCampen
- Lucia Carbone
Nature Communications (2023)
