Genomic studies have shown that Neanderthals interbred with modern humans, and that non-Africans today are the products of this mixture [1, 2]. The antiquity of Neanderthal gene flow into modern humans means that genomic regions that derive from Neanderthals in any one human today are usually less than a hundred kilobases in size. However, Neanderthal haplotypes are also distinctive enough that several studies have been able to detect Neanderthal ancestry at specific loci [1, 3–8]. We systematically infer Neanderthal haplotypes in the genomes of 1,004 present-day humans [9]. Regions that harbour a high frequency of Neanderthal alleles are enriched for genes affecting keratin filaments, suggesting that Neanderthal alleles may have helped modern humans to adapt to non-African environments. We identify multiple Neanderthal-derived alleles that confer risk for disease, suggesting that Neanderthal alleles continue to shape human biology. An unexpected finding is that regions with reduced Neanderthal ancestry are enriched in genes, implying selection to remove genetic material derived from Neanderthals. Genes that are more highly expressed in testes than in any other tissue are especially reduced in Neanderthal ancestry, and there is an approximately fivefold reduction of Neanderthal ancestry on the X chromosome, which is known from studies of diverse species to be especially dense in male hybrid sterility genes [10–12]. These results suggest that part of the explanation for genomic regions of reduced Neanderthal ancestry is Neanderthal alleles that caused decreased fertility in males when moved to a modern human genetic background.
- A draft sequence of the Neanderthal genome. Science 328, 710–722 (2010)
- The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014)
- The shaping of modern human immune systems by multiregional admixture with archaic humans. Science 334, 89–94 (2011)
- A haplotype at STAT2 Introgressed from neanderthals and serves as a candidate of positive selection in Papua New Guinea. Am. J. Hum. Genet. 91, 265–274 (2012)
- Neanderthal origin of genetic variation at the cluster of OAS immunity genes. Mol. Biol. Evol. 30, 798–801 (2013)
- An X-linked haplotype of Neanderthal origin is present among all non-African populations. Mol. Biol. Evol. 28, 1957–1962 (2011)
- Higher levels of neanderthal ancestry in East Asians than in Europeans. Genetics 194, 199–209 (2013)
- Evolutionary history and adaptation from high-coverage whole-genome sequences of diverse African hunter-gatherers. Cell 150, 457–469 (2012)
- An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012)
- Abrupt cline for sex chromosomes in a hybrid zone between two species of mice. Evolution 46, 1146–1163 (1992)
- A complex genetic basis to X-linked hybrid male sterility between two species of house mice. Genetics 179, 2213–2228 (2008)
- Sex chromosomes and speciation in Drosophila. Trends Genet. 24, 336–343 (2008)
- Conditional random fields: probabilistic models for segmenting and labeling sequence data. Proc. 18th Int. Conf. Machine Learn. 282–289 (2001)
- The date of interbreeding between Neanderthals and modern humans. PLoS Genet. 8, e1002947 (2012)
- msHOT: modifying Hudson's ms simulator to incorporate crossover and gene conversion hotspots. Bioinformatics 23, 520–521 (2007)
- Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 18, 1814–1828 (2008)
- A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012)
- Neanderthal Ancestry Estimator. White paper 23-05 (23andMe, 2011) http://23andme.https.internapcdn.net/res/pdf/hXitekfSJe1lcIy7-Q72XA_23-05_Neanderthal_Ancestry.pdf
- Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA 106, 9362–9367 (2009)
- Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico. Nature http://dx.doi.org/10.1038/nature12828 (25 December 2013)
- Widespread genomic signatures of natural selection in hominid evolution. PLoS Genet. 5, e1000471 (2009)
- Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468, 1053–1060 (2010)
- In Speciation and Its Consequences 180–207 (Sinauer Associates, 1989)
- Evolution of postmating reproductive isolation: the composite nature of Haldane's rule and its genetic basis. Am. Nat. 142, 187–212 (1993)
- The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 22, 1775–1789 (2012)
- Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nature Genet. 39, 1251–1255 (2007)
- Genome-wide scan of 29,141 African Americans finds no evidence of selection since admixture. Preprint at http://arxiv.org/pdf/1312.2675.pdf (2013)
- The evolution of postzygotic isolation: accumulating Dobzhansky-Muller incompatibilities. Evolution 55, 1085–1094 (2001)
- A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321–324 (2005)
- In Introduction to Statistical Relational Learning Ch. 4, 93–128 (MIT Press, 2007)
- Representations of quasi-Newton matrices and their use in limited memory methods. Mathematical Programming 63, 129–156 (1994)
- Population genetics models of local ancestry. Genetics 191, 607–619 (2012)
- The consensus coding sequence (CCDS) project: identifying a common protein-coding gene set for the human and mouse genomes. Genome Res. 19, 1316–1323 (2009)
- Gene ontology: tool for the unification of biology. Nature Genet. 25, 25–29 (2000)
- FUNC: a package for detecting significant associations between gene sets and ontological annotations. BMC Bioinformatics 8, 41 (2007)
- Wavelet Methods for Time Series Analysis (Cambridge Univ. Press, 2005)
- The jackknife and the bootstrap for general stationary observations. Ann. Statist. 17, 1217–1241 (1989)
- Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002)
- Differential expression analysis for sequence count data. Genome Biol. 11, R106 (2010)
Extended data figures and tables
Extended Data Figures
- Extended Data Figure 1: Three features used in the Conditional Random Field for predicting Neanderthal ancestry. (252 KB)
Top (feature 1), patterns of variation at a single SNP. Sites at which a panel of sub-Saharan-African individuals carry the ancestral allele and in which the sequenced Neanderthal and the test haplotype carry the derived allele are likely to be derived from Neanderthal gene flow. Middle (feature 2), haplotype divergence patterns. Genomic segments in which the divergence of the test haplotype to the sequenced Neanderthal is low, whereas the divergence to a panel of sub-Saharan-African individuals is high, are likely to be introgressed. Bottom (feature 3), we searched for segments that have a length consistent with what is expected from Neanderthal-to-modern-human gene flow approximately 2,000 generations ago, corresponding to a size of about 0.05 cM = (100 cM per Morgan)/(2,000 generations).
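The three features in this caption can be illustrated with a minimal Python sketch. It is not the authors' Conditional Random Field: the function names, the 0/1 coding of ancestral and derived alleles, and the requirement that every panel member carries the ancestral allele are all assumptions made for illustration.

```python
def is_feature1_site(african_panel, neanderthal, test):
    """Feature 1 pattern at one SNP (alleles coded 0 = ancestral, 1 = derived).

    Assumption: we require every African panel member to be ancestral.
    """
    return all(a == 0 for a in african_panel) and neanderthal == 1 and test == 1


def divergence(hap_a, hap_b):
    """Feature 2 building block: fraction of sites at which two haplotypes differ."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotypes must be non-empty and of equal length")
    return sum(x != y for x, y in zip(hap_a, hap_b)) / len(hap_a)


def expected_segment_length_cm(generations):
    """Feature 3 arithmetic: (100 cM per Morgan) / (generations since gene flow)."""
    return 100.0 / generations


# Hypothetical toy data, not taken from the paper.
print(is_feature1_site([0, 0, 0], neanderthal=1, test=1))
print(divergence([1, 0, 1, 1], [1, 0, 0, 1]))
print(expected_segment_length_cm(2000))
```

With 2,000 generations the last call reproduces the 0.05 cM scale quoted in the caption.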
- Extended Data Figure 2: Map of Neanderthal ancestry in 1000 Genomes European and east-Asian populations. (466 KB)
For each chromosome, we plot the fraction of alleles confidently inferred to be of Neanderthal origin (probability >90%) in non-overlapping 1-Mb windows in Europeans (red) and in east Asians (green). Black bars denote the coordinates of the centromeres. We plot traces in non-overlapping 10-Mb windows that pass filters. We label 10-Mb-scale windows that are deficient in Neanderthal ancestry (e1–e9 (e, European), a1–a17 (a, Asian)) (see Supplementary Information section 8 for details).
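The windowed summary described in this caption can be sketched as follows. This is an illustrative Python sketch only: the function name, the input layout (one physical position and one marginal probability per allele call on a single chromosome) and the zero-based window boundaries are assumptions, and the filtering of windows mentioned in the caption is not modelled.

```python
from collections import defaultdict


def windowed_neanderthal_fraction(positions, probs, window=1_000_000, threshold=0.9):
    """Fraction of allele calls with probability > threshold per non-overlapping window.

    positions: base-pair coordinates on one chromosome (hypothetical input layout).
    probs: marginal probability of Neanderthal origin for each allele call.
    Returns {window_start: fraction}, with windows sorted by start coordinate.
    """
    totals = defaultdict(int)
    confident = defaultdict(int)
    for pos, p in zip(positions, probs):
        start = (pos // window) * window  # left edge of the window containing pos
        totals[start] += 1
        if p > threshold:
            confident[start] += 1
    return {start: confident[start] / totals[start] for start in sorted(totals)}


# Hypothetical toy data, not taken from the paper.
fractions = windowed_neanderthal_fraction(
    [100, 500_000, 1_200_000, 1_900_000],
    [0.95, 0.50, 0.99, 0.91],
)
print(fractions)
```

Changing `window` to 10,000,000 would give the coarser 10-Mb scale at which the caption labels deficient regions.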
- Extended Data Figure 3: Tiling path from confidently inferred Neanderthal haplotypes. (333 KB)
a, Example tiling path at the BNC2 locus on chromosome 9 in European individuals. Red, confidently inferred Neanderthal haplotypes in a subset of these individuals; blue, resulting tiling path. We identified Neanderthal haplotypes by scanning for runs of consecutive SNPs along a haplotype with a marginal probability >90% and requiring the haplotypes to be at least 0.02 cM long. b, Distribution of contig lengths obtained by constructing a tiling path across confidently inferred Neanderthal haplotypes. On merging Neanderthal haplotypes in each of the 1000 Genomes European and east-Asian populations, we reconstructed 4,437 Neanderthal contigs with median length 129 kb.
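The two steps in panel a (calling confident haplotypes, then building a tiling path) can be sketched in Python. This is a rough illustration under stated assumptions, not the authors' pipeline: the function names and input layout are hypothetical, and the tiling path is approximated here as the union of overlapping intervals, which the caption does not specify.

```python
def confident_haplotypes(gen_pos_cm, probs, threshold=0.9, min_len_cm=0.02):
    """Return (start_cM, end_cM) runs of consecutive SNPs with prob > threshold.

    Runs shorter than min_len_cm (genetic distance) are discarded.
    gen_pos_cm must be sorted along the haplotype.
    """
    segments = []
    run_start = run_end = None
    for pos, p in zip(gen_pos_cm, probs):
        if p > threshold:
            if run_start is None:
                run_start = pos
            run_end = pos
        else:
            if run_start is not None and run_end - run_start >= min_len_cm:
                segments.append((run_start, run_end))
            run_start = None
    # Close a run that extends to the last SNP.
    if run_start is not None and run_end - run_start >= min_len_cm:
        segments.append((run_start, run_end))
    return segments


def tiling_path(segments):
    """Merge overlapping (start, end) intervals pooled across individuals into contigs."""
    contigs = []
    for start, end in sorted(segments):
        if contigs and start <= contigs[-1][1]:
            contigs[-1] = (contigs[-1][0], max(contigs[-1][1], end))
        else:
            contigs.append((start, end))
    return contigs


# Hypothetical toy data, not taken from the paper.
segs = confident_haplotypes(
    [0.00, 0.01, 0.03, 0.04, 0.05, 0.10],
    [0.95, 0.97, 0.99, 0.20, 0.95, 0.96],
)
print(segs)
print(tiling_path([(10, 50), (40, 90), (200, 260)]))
```

Applying `tiling_path` to intervals pooled over all individuals in a population yields one contig per connected stretch, which is the quantity whose length distribution panel b summarises.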
Extended Data Tables
- Supplementary Information (5.5 MB)
This file contains Supplementary Figures, Supplementary Tables and Supplementary Text and Data - see Contents for more information.