Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ~5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.
Extended data figures and tables
Extended Data Figures
- Extended Data Figure 1: The gibbon assembly statistics and quality control. (453 KB)
a, The table compares the gibbon assembly statistics to those of other primates sequenced with a similar strategy. b, The plot represents the percentage of the 10,734 single-copy gene HMMs (hidden Markov models) for which just one gene (blue) is found in the different mammalian genomes in Ensembl 70. Other HMMs match more than one gene (red). The missing HMMs (cyan) either do not match any protein or the score is within the range of what can be expected for unrelated proteins. The remaining category (green) represents HMMs for which the best matching gene scores better than unrelated proteins but not as well as expected. See Supplementary Information section 1.4 for more details.
- Extended Data Figure 2: Analysis of gibbon–human synteny blocks and identification and validation of gibbon segmental duplications. (352 KB)
a, A representative gibbon-only whole-genome shotgun sequence detection (WSSD) call by Sanger read depth. The duplication identified in this case overlaps the gene CHAD, which encodes a cartilage matrix protein. b, Examples of fluorescence in situ hybridizations on gibbon metaphases using duplicated human fosmid clones that were identified by the WSSD strategy (red signals). Left, interchromosomal duplication. Middle, interspersed intrachromosomal duplication. Right, intrachromosomal tandem duplication confirmed using co-hybridization with a single control probe (blue signals). c, Megabases of lineage-specific and shared duplications for primates based on GRCh37 read depth analysis. Copy-number corrected values by species are shown below.
- Extended Data Figure 3: Analysis of LAVA element insertion in genes and early termination of transcription. (316 KB)
a, The histogram shows the results of permutation analyses. We find a significant association between LAVA elements and genes. Moreover, insertions are significantly enriched in introns and depleted in exons, most probably as a result of selection against insertions in exons. b, Schematic representation of the mechanism through which intronic LAVA insertions in antisense orientation might cause early termination of transcription. The truncated transcript is labelled A and the normal transcript is labelled B in the diagram (pA, polyadenylation site). c, We calculated the distance to the nearest exon for each intronic LAVA and compared this to what would be expected for random insertions (that is, background). We found fewer insertions than expected by chance within 1 kb of the nearest exon. d, Identification of pmiRGlo_LA_F polyadenylation sites by 3′ RACE. Alignment of thirteen 3′ RACE PCR clone sequences and the pmiRGlo_LA_F sequence. The LAVA_F 3′ TSD is highlighted by a dark green background; the major antisense LAVA_F polyadenylation signal (MAPS) is highlighted by a red background. The termination sites are marked with arrows on the LAVA_F sequence. Poly(A) tails of the identified transcripts are in red text.
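The permutation analysis described in panel a can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the coordinates are invented, shuffled insertions are placed uniformly on a single linear genome, and the names `count_in_intervals` and `permutation_p_value` are hypothetical. The sketch compares the observed number of insertions inside genes with the counts obtained after random placement.

```python
import random
from bisect import bisect_right

def count_in_intervals(positions, intervals):
    """Count positions inside any of the sorted, non-overlapping [start, end) intervals."""
    starts = [start for start, _ in intervals]
    hits = 0
    for p in positions:
        i = bisect_right(starts, p) - 1  # last interval starting at or before p
        if i >= 0 and p < intervals[i][1]:
            hits += 1
    return hits

def permutation_p_value(insertions, genes, genome_length, n_perm=1000, seed=0):
    """One-sided empirical P value for enrichment of insertions within genes."""
    rng = random.Random(seed)
    observed = count_in_intervals(insertions, genes)
    at_least = 0
    for _ in range(n_perm):
        # Assumption: random background = uniform positions along the genome.
        shuffled = [rng.randrange(genome_length) for _ in insertions]
        if count_in_intervals(shuffled, genes) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)

# Toy data (hypothetical): genes cover 20% of a 100 kb "genome".
genome_length = 100_000
genes = [(10_000, 20_000), (50_000, 60_000)]
insertions = (list(range(10_500, 19_500, 1_000))
              + list(range(50_500, 59_500, 1_000))
              + [5_000, 80_000])

observed, p_value = permutation_p_value(insertions, genes, genome_length)
print(observed, p_value)
```

The same counting function could be pointed at intron or exon intervals to test enrichment or depletion in those classes. A real analysis would probably need to respect chromosome boundaries and assembly gaps when shuffling.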
- Extended Data Figure 4: Evolution of the LAVA element. (582 KB)
a, Screenshots from the Integrative Genomics Viewer (IGV) browser for loci MAP4, RABGAP1 and BBS9. Each column shows portions of the IGV visualization of a LAVA insertion locus identified in Nleu1.0 and its flanking sequence. Red rectangles indicate the margins of each LAVA insertion. Read pairs are coloured red when their insert size is larger than expected, indicating the presence of an unshared LAVA insertion. MAP4 is a shared LAVA insertion, whereas RABGAP1 and BBS9 are Nomascus specific. b, LAVA elements containing at least 300 bp of the LA section of LAVA were selected and reanalysed using RepeatMasker to determine subfamily affiliation and divergence from the consensus sequence. LAVA elements are grouped based upon their subfamily affiliations (see legend top right for colour scheme). The x axis shows the per cent divergence from the respective consensus sequence and the y axis shows the number of elements with a certain per cent divergence from the consensus sequence.
- Extended Data Figure 5: Analysis of the phylogenetic relationships between gibbon genera. (340 KB)
a, Neighbour-joining trees for gibbons using non-genic loci. b, UPGMA trees for non-overlapping 100 kb windows along the gibbon genome; the top 15 topologies are reported (see also Supplementary Table ST8.3). The percentage of total support for each topology is given within each subpanel.
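The windowed analysis in panel b can be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure used for the figure: the pairwise distances are invented, and the helpers `windows`, `upgma` and `toy_distances` are hypothetical names. It builds a UPGMA topology for each window from a distance table and tallies how often each topology occurs.

```python
from collections import Counter
from itertools import combinations

TAXA = ["Nomascus", "Hylobates", "Hoolock", "Symphalangus"]

def windows(length, size=100_000):
    """Yield non-overlapping (start, end) windows; the last one may be shorter."""
    for start in range(0, length, size):
        yield start, min(start + size, length)

def upgma(names, dist):
    """Return a nested-tuple topology from pairwise distances keyed by name pairs."""
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(sorted(sizes, key=str), 2),
                   key=lambda pair: d[frozenset(pair)])
        merged = tuple(sorted((a, b), key=str))  # sorted so equal topologies compare equal
        for c in sizes:
            if c not in (a, b):
                # Size-weighted average distance to the new cluster.
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))

def toy_distances(close_pairs):
    """Hypothetical distances: listed pairs are close, every other pair is far."""
    d = {pair: 0.020 for pair in combinations(TAXA, 2)}
    d.update(close_pairs)
    return d

# Three toy windows (invented values); two support the same topology.
per_window = [
    toy_distances({("Nomascus", "Hylobates"): 0.010, ("Hoolock", "Symphalangus"): 0.012}),
    toy_distances({("Nomascus", "Hylobates"): 0.010, ("Hoolock", "Symphalangus"): 0.012}),
    toy_distances({("Nomascus", "Hoolock"): 0.010, ("Hylobates", "Symphalangus"): 0.012}),
]
topologies = Counter(upgma(TAXA, d) for d in per_window)
for topology, n in topologies.most_common(15):
    print(topology, f"{100 * n / len(per_window):.1f}%")
```

Sorting the two children at every merge gives each unrooted-equivalent clustering one canonical nested tuple, which is what makes the `Counter` tally meaningful. In practice the distances would come from the aligned sequence in each window returned by `windows`.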
- Extended Data Figure 6: Analysis of the relationship between gibbon accelerated regions (gibARs) and genes. (247 KB)
a, Intergenic regions are enriched in gibARs. Different sequence types are shown on the x axis and the y axis displays the fraction of gibARs and candidate regions annotated to the respective class. gibARs are significantly enriched in intergenic regions (P = 4.7 × 10⁻⁶) and significantly depleted in exons (P = 7.3 × 10⁻⁶). P values for each class were calculated with Fisher’s exact test. Introns are comparably prevalent in candidates and gibARs, whereas in the UTR and flanking region, counts are too low to draw meaningful conclusions (data not shown). b, TreeMap from REVIGO for GOslim Biological Process terms with a Benjamini–Hochberg false discovery rate of 5%. Each rectangle is a cluster representative; larger rectangles represent ‘superclusters’ including loosely related terms. The size of the rectangles reflects the P value.
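The two statistical procedures named in this caption can be sketched in a few lines of Python. This is a minimal illustration, assuming a two-sided Fisher's exact test on a 2 × 2 table and the standard Benjamini–Hochberg adjustment; the function names and all counts and P values below are hypothetical and are not taken from the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy 2x2 table (hypothetical counts): rows = accelerated / candidate regions,
# columns = inside / outside a given sequence class.
p = fisher_exact_two_sided(8, 2, 1, 5)
print(p)

# Toy P values for four terms; keep those with adjusted P <= 0.05 (5% FDR).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in adjusted]
print(adjusted, significant)
```

The small tolerance factor in the Fisher sum guards against floating-point ties between equally probable tables. For large counts a library implementation would be the more practical choice.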
Extended Data Tables
- Supplementary Information (18.5 MB)
This file contains Supplementary Sections 1-6 – see Supplementary Contents for details.
- Supplementary Data (1.7 MB)
This file contains Supplementary Data 3.
- Supplementary Data (2 MB)
This file contains Supplementary Data 9.
- Supplementary Data (43 KB)
This file contains Supplementary Data 1.
- Supplementary Data (164 KB)
This file contains Supplementary Data 2.
- Supplementary Data (175 KB)
This file contains Supplementary Data 4.
- Supplementary Data (1.5 MB)
This file contains Supplementary Data 5.
- Supplementary Data (6.4 MB)
This file contains Supplementary Data 7.
- Supplementary Data (44 KB)
This file contains Supplementary Data 8.
- Supplementary Data (4.9 MB)
This file contains Supplementary Data 6.