Ancient DNA makes it possible to observe natural selection directly by analysing samples from populations before, during and after adaptation events. Here we report a genome-wide scan for selection using ancient DNA, capitalizing on the largest ancient DNA data set yet assembled: 230 West Eurasians who lived between 6500 and 300 bc, including 163 with newly reported data. The new samples include, to our knowledge, the first genome-wide ancient DNA from Anatolian Neolithic farmers, whose genetic material we obtained by extracting from petrous bones, and who we show were members of the population that was the source of Europe’s first farmers. We also report a transect of the steppe region in Samara between 5600 and 300 bc, which allows us to identify admixture into the steppe from at least two external sources. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height.
European Nucleotide Archive
The aligned sequences are available through the European Nucleotide Archive under accession number PRJEB11450. The Human Origins genotype datasets including ancient individuals can be found at http://genetics.med.harvard.edu/reich/Reich_Lab/Datasets.html.
Grossman, S. R. et al. Identifying recent adaptations in large-scale genomic data. Cell 152, 703–713 (2013)
Wilde, S. et al. Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y. Proc. Natl Acad. Sci. USA 111, 4832–4837 (2014)
Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nature Commun. 5, 5257 (2014)
Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014)
Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015)
Keller, A. et al. New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. Nature Commun. 3, 698 (2012)
Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015)
Olalde, I. et al. Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European. Nature 507, 225–228 (2014)
Pinhasi, R. et al. Optimal ancient DNA yields from the inner ear part of the human petrous bone. PLoS ONE 10, e0129102 (2015)
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009)
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012)
Underhill, P. A. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur. J. Hum. Genet. 23, 124–131 (2015)
The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015)
Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999)
Enattah, N. S. et al. Identification of a variant associated with adult-type hypolactasia. Nature Genet. 30, 233–237 (2002)
Bersaglieri, T. et al. Genetic signatures of strong recent positive selection at the lactase gene. Am. J. Hum. Genet. 74, 1111–1120 (2004)
Burger, J., Kirchner, M., Bramanti, B., Haak, W. & Thomas, M. G. Absence of the lactase-persistence-associated allele in early Neolithic Europeans. Proc. Natl Acad. Sci. USA 104, 3736–3741 (2007)
Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010)
Fumagalli, M. et al. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science 349, 1343–1347 (2015)
Mathias, R. A. et al. Adaptive evolution of the FADS gene cluster within Africa. PLoS ONE 7, e44926 (2012)
Wang, T. J. et al. Common genetic determinants of vitamin D insufficiency: a genome-wide association study. Lancet 376, 180–188 (2010)
Price, A. L. et al. The impact of divergence time on the nature of population structure: an example from Iceland. PLoS Genet. 5, e1000505 (2009)
Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007)
Huff, C. D. et al. Crohn’s disease and genetic hitchhiking at IBD5. Mol. Biol. Evol. 29, 101–111 (2012)
Hunt, K. A. et al. Newly identified genetic risk variants for celiac disease related to the immune response. Nature Genet. 40, 395–402 (2008)
Jostins, L. et al. Host–microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119–124 (2012)
Beleza, S. et al. Genetic architecture of skin and eye color in an African–European admixed population. PLoS Genet. 9, e1003372 (2013)
Sturm, R. A. et al. A single SNP in an evolutionary conserved region within intron 86 of the HERC2 gene determines human blue-brown eye color. Am. J. Hum. Genet. 82, 424–431 (2008)
Eiberg, H. et al. Blue eye color in humans may be caused by a perfectly associated founder mutation in a regulatory element located within the HERC2 gene inhibiting OCA2 expression. Hum. Genet. 123, 177–187 (2008)
Barreiro, L. B. et al. Evolutionary dynamics of human Toll-like receptors and their different contributions to host defense. PLoS Genet. 5, e1000562 (2009)
Uciechowski, P. et al. Susceptibility to tuberculosis is associated with TLR1 polymorphisms resulting in a lack of TLR1 cell surface expression. J. Leukoc. Biol. 90, 377–388 (2011)
Wong, S. H. et al. Leprosy and the adaptation of human toll-like receptor 1. PLoS Pathog. 6, e1000979 (2010)
Fujimoto, A. et al. A scan for genetic determinants of human hair morphology: EDAR is associated with Asian hair thickness. Hum. Mol. Genet. 17, 835–843 (2008)
Kimura, R. et al. A common variation in EDAR is a genetic determinant of shovel-shaped incisors. Am. J. Hum. Genet. 85, 528–535 (2009)
Kamberov, Y. G. et al. Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell 152, 691–702 (2013)
Turchin, M. C. et al. Evidence of widespread selection on standing variation in Europe at height-associated SNPs. Nature Genet. 44, 1015–1019 (2012)
Berg, J. J. & Coop, G. A population genetic signal of polygenic adaptation. PLoS Genet. 10, e1004412 (2014)
Lango Allen, H. et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–838 (2010)
Speliotes, E. K. et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nature Genet. 42, 937–948 (2010)
Heid, I. M. et al. Meta-analysis identifies 13 new loci associated with waist–hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nature Genet. 42, 949–960 (2010)
Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nature Genet. 44, 981–990 (2012)
Briggs, A. W. et al. Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Res. 38, e87 (2010)
Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA 110, 2223–2227 (2013)
Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature 524, 216–219 (2015)
Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014)
International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007)
Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013)
Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104 (2008)
Loh, P. R. et al. Inferring admixture histories of human populations using linkage disequilibrium. Genetics 193, 1233–1254 (2013)
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7 (2015)
Skoglund, P., Storå, J., Götherström, A. & Jakobsson, M. Accurate sex identification of ancient human remains using DNA shotgun sequencing. J. Archaeol. Sci. 40, 4477–4482 (2013)
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009)
Norton, H. L. et al. Genetic evidence for the convergent evolution of light skin in Europeans and East Asians. Mol. Biol. Evol. 24, 710–722 (2007)
Bokor, S. et al. Single nucleotide polymorphisms in the FADS gene cluster are associated with delta-5 and delta-6 desaturase activities estimated by serum fatty acid ratios. J. Lipid Res. 51, 2325–2333 (2010)
Tanaka, T. et al. Genome-wide association study of plasma polyunsaturated fatty acids in the InCHIANTI Study. PLoS Genet. 5, e1000338 (2009)
Ahn, J. et al. Genome-wide association study of circulating vitamin D levels. Hum. Mol. Genet. 19, 2739–2745 (2010)
Gründemann, D. et al. Discovery of the ergothioneine transporter. Proc. Natl Acad. Sci. USA 102, 5256–5261 (2005)
Chauhan, S. et al. ZKSCAN3 is a master transcriptional repressor of autophagy. Mol. Cell 50, 16–28 (2013)
Soler Artigas, M. et al. Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nature Genet. 43, 1082–1090 (2011)
Pruim, R. J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010)
We thank P. de Bakker, J. Burger, C. Economou, E. Fornander, Q. Fu, F. Hallgren, K. Kirsanow, A. Mittnik, I. Olalde, A. Powell, P. Skoglund, S. Tabrizi and A. Tandon for discussions, suggestions about SNPs to include, or contribution to sample preparation or data curation. We thank S. Pääbo, M. Meyer, Q. Fu and B. Nickel for collaboration in developing the 1240k capture reagent. We thank J. M. V. Encinas and M. E. Prada for allowing us to resample La Braña 1. I.M. was supported by the Human Frontier Science Program LT001095/2014-L. C.G. was supported by the Irish Research Council for Humanities and Social Sciences (IRCHSS). F.G. was supported by a grant of the Netherlands Organization for Scientific Research, no. 380-62-005. A.K., P.K. and O.M. were supported by RFBR no. 15-06-01916 and RFH no. 15-11-63008 and O.M. by a state grant of the Ministry of Education and Science of the Russian Federation no. 33.1195.2014/k. J.K. was supported by ERC starting grant APGREID and DFG grant KR 4015/1-1. K.W.A. was supported by DFG grant AL 287/14-1. C.L.-F. was supported by a BFU2015-64699-P grant from the Spanish government. W.H. and B.L. were supported by Australian Research Council DP130102158. R.P. was supported by ERC starting grant ADNABIOARC (263441), and an Irish Research Council ERC support grant. D.R. was supported by US National Science Foundation HOMINID grant BCS-1032255, US National Institutes of Health grant GM100233, and the Howard Hughes Medical Institute.
The authors declare no competing financial interests.
Extended data figures and tables
We plot the number of raw sequences against the mean coverage of analysed SNPs after removal of duplicates, comparing the 163 samples for which capture data are newly reported in this study, against the 102 samples analysed by shotgun sequencing in ref. 5. We caution that the true cost is more than that of sequencing alone.
a, Mainland European populations later than 3000 bc are better modelled with steppe ancestry as a third ancestral population (closer correspondence between empirical and estimated f4-statistics as estimated by resnorm; Methods). b, Later (post-Poltavka) steppe populations are better modelled with Anatolian Neolithic as a third ancestral population. c, Estimated mixture proportions of mainland European populations without steppe ancestry. d, Estimated mixture proportions of Eurasian steppe populations without Anatolian Neolithic ancestry. e, Estimated mixture proportions of later populations with both steppe and Anatolian Neolithic ancestry. f, Admixture plot at k = 17 showing population differences over time and space. EN, Early Neolithic; MN, Middle Neolithic; LN, Late Neolithic; BA, Bronze Age; LNBA, Late Neolithic and Bronze Age.
LocusZoom (ref. 60) plots for genome-wide significant signals. Points show the −log10 P value for each SNP, coloured according to their linkage disequilibrium (LD; units of r²) with the most associated SNP. The blue line shows the recombination rate, with scale on the right-hand axis in centimorgans per megabase (cM/Mb). Genes are shown in the lower panel of each subplot.
Extended Data Figure 4 PCA of selection populations and derived allele frequencies for genome-wide significant signals.
a, Ancient samples projected onto principal components of modern samples, as in Fig. 1, but labelled according to selection populations defined in Extended Data Table 1. b, Allele frequency plots as in Fig. 3, for six signals not included in Fig. 3. For SLC22A4 we show both rs272872, which is our strongest signal, and rs1050152, which was previously hypothesized to be under selection; we also show SLC24A5, which is not genome-wide significant but is discussed in the main text.
This figure compares the genotypes at all sites within 150 kb of rs3827760 (in blue) for the 6 Motala samples and 20 randomly chosen CHB (Chinese from Beijing) and CEU (Utah residents with northern and western European ancestry) samples. Each row is a sample and each column is a SNP. Grey means homozygous for the major (in CEU) allele. Pink denotes heterozygous and red indicates homozygous for the other allele. For the Motala samples, an open circle means that there is only a single sequence, otherwise the circle is coloured according to the number of sequences observed. Three of the Motala samples are heterozygous for rs3827760 and the derived allele lies on the same haplotype background as in present-day East Asians. The only other ancient samples with evidence of the derived EDAR allele in this data set are two Afanasievo samples dating to 3300–3000 bc, and one Scythian dating to 400–200 bc (not shown).
a, Estimated power for different selection coefficients (s) for a SNP that is selected in all populations for either 50, 100 or 200 generations. b, Effect of increasing sample size, showing estimated power for a SNP selected for 100 generations, with different amounts of data, relative to the main text. c, Effect of admixture from Yoruba (YRI) into one of the modern populations, showing the effect on the genomic inflation factor (blue, left axis) and the power to detect selection on a SNP selected for 100 generations with a selection coefficient of 0.02. d, Effect of mis-specification of the mixture proportions. Here 0 on the x axis corresponds to the proportions we used, and 1 corresponds to a random mixture matrix.
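The genomic inflation factor shown in panel c can be illustrated with a minimal sketch of genomic control in the sense of Devlin & Roeder (1999). The sketch assumes per-SNP test statistics that follow a chi-square distribution with one degree of freedom under the null; the function names and that one-degree-of-freedom assumption are illustrative and are not taken from the scan reported here, whose statistic may differ.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
# ASSUMPTION: the test statistics are 1-d.f. chi-square under the null.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic inflation factor: observed median statistic / expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Rescale statistics by the inflation factor and return corrected P values.

    Hypothetical helper for illustration only. By convention the factor is
    clamped at 1 so that statistics are never inflated by the correction.
    """
    lam = max(inflation_factor(chi2_stats), 1.0)
    corrected = [x / lam for x in chi2_stats]
    # Upper tail of a 1-d.f. chi-square: P(X > x) = erfc(sqrt(x / 2)).
    pvals = [math.erfc(math.sqrt(x / 2.0)) for x in corrected]
    return lam, corrected, pvals
```

Under these assumptions, a factor near 1 indicates well-calibrated statistics, while a larger value (for example after unmodelled admixture, as simulated in panel c) shrinks every statistic by the same ratio before P values are computed.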
This file contains Supplementary Text comprising: Archaeological context for 83 newly reported ancient samples (Section 1) and Population interactions between Anatolia, mainland Europe, and the Eurasian steppe (Section 2) with additional references. (PDF 1045 kb)
This file contains information about 230 ancient samples used in this study. (XLSX 101 kb)
This file shows FST between ancient and modern populations. (XLSX 26 kb)
This file contains Genome-wide selection scan results and allele frequencies. (TXT 71292 kb)
About this article
Cite this article
Mathieson, I., Lazaridis, I., Rohland, N. et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503 (2015). https://doi.org/10.1038/nature16152