Reference genomes for the great apes have suffered from limitations — including varying quality and completeness — that hamper the identification of genetic variation specific to humans. The publication in Science of new, improved great ape genome assemblies aims to address this issue.


Kronenberg et al. sequenced and assembled one chimpanzee and one orangutan genome, as well as two new human genomes, using high-coverage (>65×) single-molecule real-time (SMRT) long-read sequencing. By scaffolding the chimpanzee and orangutan genomes without guidance from the human reference genome, the authors avoided introducing a ‘humanizing’ bias, for example, during gene annotation. The resulting assemblies improve contiguity for the chimpanzee and orangutan genomes by 32-fold and 533-fold, respectively.

De novo transcript models for the chimpanzee, orangutan and gorilla genomes, generated from full-length cDNA samples of induced pluripotent stem cells and from short-read RNA sequencing data, improved the mapping of human protein-coding transcripts to the chimpanzee and orangutan genomes, mainly as a result of closing existing gaps.

A five-way, genome-wide multiple sequence alignment captured 83% of the ape genome. By mapping each assembly back to the human reference genome (GRCh38), using the two newly assembled human genomes as controls, the team could undertake comprehensive genome-wide analyses of structural variation (SV). They identified 614,186 deletions, insertions and inversions across the great apes, including 17,789 fixed human-specific SVs, 90 of which were predicted to disrupt genes, with an additional 643 potentially affecting regulatory regions.
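To put these counts in proportion, the short Python sketch below works through the arithmetic. The four numbers are taken from the text above; the variable names and the `percent` helper are illustrative only, and the combined total assumes that the 90 gene-disrupting and 643 regulatory SVs do not overlap, as the word "additional" suggests.

```python
# Counts quoted in the text above.
TOTAL_SVS = 614_186            # deletions, insertions and inversions across the great apes
FIXED_HUMAN_SPECIFIC = 17_789  # fixed human-specific SVs
GENE_DISRUPTING = 90           # fixed human-specific SVs predicted to disrupt genes
REGULATORY = 643               # additional SVs potentially affecting regulatory regions


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole` (illustrative helper)."""
    return 100.0 * part / whole


# Share of all great ape SVs that are fixed and human-specific.
fixed_share = percent(FIXED_HUMAN_SPECIFIC, TOTAL_SVS)

# Shares of the fixed human-specific SVs with a predicted functional effect.
gene_share = percent(GENE_DISRUPTING, FIXED_HUMAN_SPECIFIC)
regulatory_share = percent(REGULATORY, FIXED_HUMAN_SPECIFIC)

# Assumption: the two functional categories are disjoint.
functional_candidates = GENE_DISRUPTING + REGULATORY

print(f"Fixed human-specific share of all SVs: {fixed_share:.1f}%")
print(f"Gene-disrupting share of fixed SVs:    {gene_share:.2f}%")
print(f"Regulatory share of fixed SVs:         {regulatory_share:.1f}%")
print(f"Functional candidates in total:        {functional_candidates}")
```

On these figures, roughly 3% of the SVs identified are fixed and human-specific, and only a small fraction of those (about 4%) are predicted to disrupt genes or to affect regulatory regions.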


Using single-cell gene expression data from human and chimpanzee cerebral organoid models and from primary human cortex, the authors compared SV locations with genes differentially expressed during brain development and found that ~40% of genes downregulated in human radial glial neuroprogenitors were enriched for fixed SVs, mainly deletions or insertions. By contrast, genes associated with human-specific segmental duplications showed a pattern of upregulated expression. As the differential expression of genes during brain development may underlie the increase in brain volume from chimpanzees to humans over the course of evolution, functional analysis of the candidate genes highlighted by these associations may help identify the genetic differences that make us human.