Inferred gene expression differences between modern humans and our extinct archaic relatives suggest potential mechanistic bases for the evolution of hominin phenotypes.
What does it mean to be human? Addressing this question from a genetic perspective involves comparing our genomes to those of our closest evolutionary relatives — including those now extinct. Sequencing of the Neanderthal genome, approximately one decade ago1, raised expectations about uncovering the key genetic changes contributing to modern human-specific traits. While the field of functional genomics has since advanced in leaps and bounds, the functional genomic basis of hominin phenotypic divergence has remained elusive. Rapid degradation of RNA represents one contributing barrier, as it prevents most direct functional assays of archaic hominin remains. Writing in Nature Ecology & Evolution, Colbran et al.2 report a creative strategy for inferring patterns of regulatory divergence from archaic hominin genome sequences alone. Bypassing the need for intact RNA, the authors predicted divergent regulation of hundreds of genes that may in turn mediate divergence of organismal phenotypes.
Colbran et al. adapted the method PrediXcan3 to infer regulatory divergence among hominin lineages. PrediXcan is a supervised machine learning method that aggregates information from multiple variants associated with gene expression in a training sample to ‘impute’ (that is, predict) regulatory effects in novel test samples. Training data comprised genotype and multi-tissue RNA sequencing from the Genotype-Tissue Expression Project, while test data comprised 2,504 diverse modern human genomes, as well as the Altai Neanderthal genome. Regulatory divergence was inferred when imputed expression values for the Neanderthal fell outside the range of modern humans (Fig. 1). Notably, this approach is equally applicable to loci with and without evidence of archaic introgression. The latter, termed ‘genes without archaic regulatory regions’ (GWARRs), is the focus of their study. Functional divergence of GWARRs is salient given the hypothesis that negative selection shaped the landscape of introgressed sequence following hybridization4. Neanderthal alleles of divergent GWARRs may be absent from contemporary populations because their regulatory effects were associated with reduced fitness in our hybrid ancestors.
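The imputation and range comparison described above can be illustrated with a minimal sketch. It assumes that a gene's trained model reduces to per-variant weights applied to allele dosages, and it simply flags an archaic value that lies outside the modern human range. The variant identifiers, weights, genotypes and function names are hypothetical, and the code is not the PrediXcan software or the authors' pipeline.

```python
from typing import Dict, List


def impute_expression(weights: Dict[str, float], dosages: Dict[str, float]) -> float:
    """Predict expression as a weighted sum of effect-allele dosages (0, 1 or 2)."""
    # Variants absent from the genotype are skipped (a simplifying assumption).
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


def is_divergent(archaic_value: float, modern_values: List[float]) -> bool:
    """Flag the archaic prediction if it lies outside the modern human range."""
    return archaic_value < min(modern_values) or archaic_value > max(modern_values)


# Hypothetical per-variant weights for one gene in one tissue.
weights = {"var_a": 0.5, "var_b": -0.3, "var_c": 0.2}

# Hypothetical dosages for three modern individuals and one archaic genome.
modern_panel = [
    {"var_a": 0, "var_b": 1, "var_c": 2},
    {"var_a": 1, "var_b": 1, "var_c": 0},
    {"var_a": 1, "var_b": 0, "var_c": 1},
]
archaic = {"var_a": 2, "var_b": 0, "var_c": 2}

modern_values = [impute_expression(weights, g) for g in modern_panel]
archaic_value = impute_expression(weights, archaic)
print(is_divergent(archaic_value, modern_values))
```

In this toy example the archaic genome is homozygous for two expression-increasing alleles that no modern individual carries together, so its predicted value exceeds every modern prediction.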
Strikingly, Colbran et al. detected significant regulatory divergence at more than 700 GWARRs. Divergent GWARRs were modestly enriched for genes that are intolerant to loss-of-function mutations, consistent with the notion that they are functionally constrained and sensitive to regulatory perturbation. Divergent GWARRs were also enriched for various phenotypic associations identified using biobank data, including potential fitness-associated reproductive phenotypes such as spontaneous abortion and polycystic ovary syndrome. However, GWARRs were not enriched within ‘deserts’ of introgression: large (>8 Mb) regions of the genome where no modern human carries archaic introgressed sequence, whose existence has previously been attributed to negative selection5. The extent to which purifying selection has targeted functionally divergent Neanderthal-introgressed alleles thus remains equivocal. Finally, the authors extended their analysis to an additional Neanderthal genome, as well as the genome of a Denisovan, a divergent hominin sister group to the Neanderthals. Grouping these individuals by patterns of gene regulation recapitulated known phylogenetic relationships among the hominin lineages. Their analysis also revealed that certain classes of genes, including those involved in immune response, have diverged more extensively than others. Such differences in rates of regulatory divergence reflect underlying differences in the combined forces of selection targeting different gene sets.
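One common way to ask whether a gene set is enriched for an annotation is a one-sided hypergeometric test, sketched below. This is offered only as an illustration of the general idea. The statistical procedure that the authors actually used is not described here, and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    total = comb(N, n)
    upper = min(K, n)
    # comb returns 0 for impossible draws, so no extra bounds checks are needed.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts: 20 testable genes, 5 annotated as constrained,
# 6 called divergent, 4 of which are annotated.
p_value = hypergeom_upper_tail(k=4, K=5, n=6, N=20)
print(p_value)
```

A small tail probability indicates that the overlap is larger than expected if divergent genes were drawn at random from the testable set.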
Have Colbran et al. effectively resurrected Neanderthal and Denisovan transcriptomes? We join the authors in discouraging such a literal interpretation. Below we outline several biological and statistical caveats that complicate genomic approaches to phenotype prediction across evolutionary timescales.
First, because the PrediXcan model was necessarily trained on human genotype and expression data, all truly divergent loci (that is, fixed differences or substitutions) were inaccessible to the model. There are tens of thousands of such substitutions, which have occurred on both the archaic and modern human lineages. Based on studies of introgression, we know that many have significant regulatory effects6,7. The method employed by Colbran et al. is instead restricted to a combination of (1) persisting polymorphisms that arose in the common ancestors of archaic and modern humans and (2) modern human-specific polymorphisms where archaic hominins are homozygous for ancestral alleles. While the authors argue that their method is conservative and underestimates the true extent of regulatory divergence, it remains unclear how linked regulatory changes may combine to amplify or buffer expression divergence.
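The reason fixed differences are inaccessible can be made concrete with a small sketch: a site that does not vary among training individuals cannot be assigned a weight, so an archaic allele at that site contributes nothing to the prediction. The site names, dosages and function name below are hypothetical.

```python
from typing import Dict, List


def hidden_fixed_differences(
    training: Dict[str, List[int]], archaic: Dict[str, int]
) -> List[str]:
    """List sites that are monomorphic in training but differ in the archaic genome."""
    hidden = []
    for site, archaic_dose in archaic.items():
        observed = set(training.get(site, []))
        # A single observed dosage means no variation, so no weight can be fit.
        if len(observed) == 1 and archaic_dose not in observed:
            hidden.append(site)
    return sorted(hidden)


# Hypothetical dosages across four training individuals.
training = {
    "site1": [0, 1, 2, 1],  # polymorphic: visible to the model
    "site2": [0, 0, 0, 0],  # fixed in modern humans
    "site3": [2, 2, 2, 2],  # fixed, and shared with the archaic genome
}
archaic = {"site1": 2, "site2": 2, "site3": 2}

hidden = hidden_fixed_differences(training, archaic)
print(hidden)
```

Only the second site is reported: it differs between the lineages, yet it offers no variation from which a regulatory effect could be learned.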
Second, the prediction method used by the authors was limited to genetic variants located near the genes that they regulate (that is, cis-expression quantitative trait loci (eQTL)). While this improves model tractability, the polygenic architecture of gene expression is thought to be dominated by abundant distant regulatory effects (that is, trans-eQTL) that are individually small but together explain ~70% of gene expression heritability8. Moreover, cis and trans effects on expression may compensate or interact in complex ways, especially over evolutionary timescales.
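The cis restriction amounts to a distance filter around each gene, as in the sketch below. The 1 Mb window is an assumption chosen for illustration (a common convention in cis-eQTL analyses), and the coordinates and function name are hypothetical.

```python
CIS_WINDOW_BP = 1_000_000  # assumed window size, for illustration only


def is_cis(
    variant_chrom: str,
    variant_pos: int,
    gene_chrom: str,
    gene_start: int,
    gene_end: int,
    window: int = CIS_WINDOW_BP,
) -> bool:
    """Return True if the variant lies on the gene's chromosome within the window."""
    if variant_chrom != gene_chrom:
        return False
    return gene_start - window <= variant_pos <= gene_end + window


# Hypothetical gene spanning chr1:5,000,000-5,050,000.
print(is_cis("chr1", 5_900_000, "chr1", 5_000_000, 5_050_000))  # nearby variant
print(is_cis("chr1", 9_000_000, "chr1", 5_000_000, 5_050_000))  # distant, same chromosome
print(is_cis("chr2", 5_010_000, "chr1", 5_000_000, 5_050_000))  # different chromosome
```

Any variant that fails this filter, however strong its effect, is treated as trans and excluded from the prediction.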
Third, and perhaps most concerning, is the fact that polygenic scores such as PrediXcan have recently been demonstrated to exhibit low portability across populations9. This limitation arises as a consequence of differences in the structure of linkage disequilibrium, as well as complex gene-by-gene (G × G or ‘epistasis’) and gene-by-environment (G × E) interactions. In light of these challenges, the regulatory divergence inferred by Colbran et al. is best interpreted as a rough proxy for true gene expression divergence.
We emphasize that a rough proxy for regulatory divergence nevertheless represents a substantial advance in the field’s modest knowledge of archaic hominin biology, gleaned from decaying bone fragments. Innovative statistical and computational approaches offer encouraging headway towards the profound goal of dissecting the functional genomic basis of hominin phenotypic divergence. Such approaches will be advantageous in the coming decades, as routine whole-genome sequencing coupled with multiplexed assays of variant effects10 generates a wealth of data ripe for evolutionary analysis.
1. Green, R. E. et al. Science 328, 710–722 (2010).
2. Colbran, L. L. et al. Nat. Ecol. Evol. https://doi.org/10.1038/s41559-019-0996-x (2019).
3. Gamazon, E. R. et al. Nat. Genet. 47, 1091–1098 (2015).
4. Sankararaman, S., Mallick, S., Patterson, N. & Reich, D. Curr. Biol. 26, 1241–1247 (2016).
5. Vernot, B. et al. Science 352, 235–239 (2016).
6. McCoy, R. C., Wakefield, J. & Akey, J. M. Cell 168, 916–927.e12 (2017).
7. Dannemann, M., Prüfer, K. & Kelso, J. Genome Biol. 18, 61 (2017).
8. Liu, X., Li, Y. I. & Pritchard, J. K. Cell 177, 1022–1034.e6 (2019).
9. Martin, A. R. et al. Am. J. Hum. Genet. 100, 635–649 (2017).
10. Gasperini, M., Starita, L. & Shendure, J. Nat. Protoc. 11, 1782–1787 (2016).
Cite this article
Yan, S.M., McCoy, R.C. Functional divergence among hominins. Nat Ecol Evol 3, 1507–1508 (2019). https://doi.org/10.1038/s41559-019-0995-y