Genome-wide patterns of selection in 230 ancient Eurasians

Abstract

Ancient DNA makes it possible to observe natural selection directly by analysing samples from populations before, during and after adaptation events. Here we report a genome-wide scan for selection using ancient DNA, capitalizing on the largest ancient DNA data set yet assembled: 230 West Eurasians who lived between 6500 and 300 BC, including 163 with newly reported data. The new samples include, to our knowledge, the first genome-wide ancient DNA from Anatolian Neolithic farmers, whose genetic material we obtained by extracting from petrous bones, and who we show were members of the population that was the source of Europe’s first farmers. We also report a transect of the steppe region in Samara between 5600 and 300 BC, which allows us to identify admixture into the steppe from at least two external sources. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height.

Figure 1: Population relationships of samples.
Figure 2: Genome-wide scan for selection.
Figure 3: Allele frequencies for five genome-wide significant signals of selection.
Figure 4: Polygenic selection on height.

Data deposits

The aligned sequences are available through the European Nucleotide Archive under accession number PRJEB11450. The Human Origins genotype data sets, including ancient individuals, can be found at http://genetics.med.harvard.edu/reich/Reich_Lab/Datasets.html.

References

  1. Grossman, S. R. et al. Identifying recent adaptations in large-scale genomic data. Cell 152, 703–713 (2013)

  2. Wilde, S. et al. Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y. Proc. Natl Acad. Sci. USA 111, 4832–4837 (2014)

  3. Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nature Commun. 5, 5257 (2014)

  4. Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014)

  5. Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015)

  6. Keller, A. et al. New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. Nature Commun. 3, 698 (2012)

  7. Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015)

  8. Olalde, I. et al. Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European. Nature 507, 225–228 (2014)

  9. Pinhasi, R. et al. Optimal ancient DNA yields from the inner ear part of the human petrous bone. PLoS ONE 10, e0129102 (2015)

  10. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009)

  11. Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012)

  12. Underhill, P. A. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur. J. Hum. Genet. 23, 124–131 (2015)

  13. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015)

  14. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999)

  15. Enattah, N. S. et al. Identification of a variant associated with adult-type hypolactasia. Nature Genet. 30, 233–237 (2002)

  16. Bersaglieri, T. et al. Genetic signatures of strong recent positive selection at the lactase gene. Am. J. Hum. Genet. 74, 1111–1120 (2004)

  17. Burger, J., Kirchner, M., Bramanti, B., Haak, W. & Thomas, M. G. Absence of the lactase-persistence-associated allele in early Neolithic Europeans. Proc. Natl Acad. Sci. USA 104, 3736–3741 (2007)

  18. Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010)

  19. Fumagalli, M. et al. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science 349, 1343–1347 (2015)

  20. Mathias, R. A. et al. Adaptive evolution of the FADS gene cluster within Africa. PLoS ONE 7, e44926 (2012)

  21. Wang, T. J. et al. Common genetic determinants of vitamin D insufficiency: a genome-wide association study. Lancet 376, 180–188 (2010)

  22. Price, A. L. et al. The impact of divergence time on the nature of population structure: an example from Iceland. PLoS Genet. 5, e1000505 (2009)

  23. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007)

  24. Huff, C. D. et al. Crohn’s disease and genetic hitchhiking at IBD5. Mol. Biol. Evol. 29, 101–111 (2012)

  25. Hunt, K. A. et al. Newly identified genetic risk variants for celiac disease related to the immune response. Nature Genet. 40, 395–402 (2008)

  26. Jostins, L. et al. Host–microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119–124 (2012)

  27. Beleza, S. et al. Genetic architecture of skin and eye color in an African–European admixed population. PLoS Genet. 9, e1003372 (2013)

  28. Sturm, R. A. et al. A single SNP in an evolutionary conserved region within intron 86 of the HERC2 gene determines human blue-brown eye color. Am. J. Hum. Genet. 82, 424–431 (2008)

  29. Eiberg, H. et al. Blue eye color in humans may be caused by a perfectly associated founder mutation in a regulatory element located within the HERC2 gene inhibiting OCA2 expression. Hum. Genet. 123, 177–187 (2008)

  30. Barreiro, L. B. et al. Evolutionary dynamics of human Toll-like receptors and their different contributions to host defense. PLoS Genet. 5, e1000562 (2009)

  31. Uciechowski, P. et al. Susceptibility to tuberculosis is associated with TLR1 polymorphisms resulting in a lack of TLR1 cell surface expression. J. Leukoc. Biol. 90, 377–388 (2011)

  32. Wong, S. H. et al. Leprosy and the adaptation of human toll-like receptor 1. PLoS Pathog. 6, e1000979 (2010)

  33. Fujimoto, A. et al. A scan for genetic determinants of human hair morphology: EDAR is associated with Asian hair thickness. Hum. Mol. Genet. 17, 835–843 (2008)

  34. Kimura, R. et al. A common variation in EDAR is a genetic determinant of shovel-shaped incisors. Am. J. Hum. Genet. 85, 528–535 (2009)

  35. Kamberov, Y. G. et al. Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell 152, 691–702 (2013)

  36. Turchin, M. C. et al. Evidence of widespread selection on standing variation in Europe at height-associated SNPs. Nature Genet. 44, 1015–1019 (2012)

  37. Berg, J. J. & Coop, G. A population genetic signal of polygenic adaptation. PLoS Genet. 10, e1004412 (2014)

  38. Lango Allen, H. et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–838 (2010)

  39. Speliotes, E. K. et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nature Genet. 42, 937–948 (2010)

  40. Heid, I. M. et al. Meta-analysis identifies 13 new loci associated with waist–hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nature Genet. 42, 949–960 (2010)

  41. Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nature Genet. 44, 981–990 (2012)

  42. Briggs, A. W. et al. Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Res. 38, e87 (2010)

  43. Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA 110, 2223–2227 (2013)

  44. Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature 524, 216–219 (2015)

  45. Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014)

  46. International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007)

  47. Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013)

  48. Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104 (2008)

  49. Loh, P. R. et al. Inferring admixture histories of human populations using linkage disequilibrium. Genetics 193, 1233–1254 (2013)

  50. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4 (2015)

  51. Skoglund, P., Storå, J., Götherström, A. & Jakobsson, M. Accurate sex identification of ancient human remains using DNA shotgun sequencing. J. Archaeol. Sci. 40, 4477–4482 (2013)

  52. Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009)

  53. Norton, H. L. et al. Genetic evidence for the convergent evolution of light skin in Europeans and East Asians. Mol. Biol. Evol. 24, 710–722 (2007)

  54. Bokor, S. et al. Single nucleotide polymorphisms in the FADS gene cluster are associated with delta-5 and delta-6 desaturase activities estimated by serum fatty acid ratios. J. Lipid Res. 51, 2325–2333 (2010)

  55. Tanaka, T. et al. Genome-wide association study of plasma polyunsaturated fatty acids in the InCHIANTI Study. PLoS Genet. 5, e1000338 (2009)

  56. Ahn, J. et al. Genome-wide association study of circulating vitamin D levels. Hum. Mol. Genet. 19, 2739–2745 (2010)

  57. Gründemann, D. et al. Discovery of the ergothioneine transporter. Proc. Natl Acad. Sci. USA 102, 5256–5261 (2005)

  58. Chauhan, S. et al. ZKSCAN3 is a master transcriptional repressor of autophagy. Mol. Cell 50, 16–28 (2013)

  59. Soler Artigas, M. et al. Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nature Genet. 43, 1082–1090 (2011)

  60. Pruim, R. J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010)

Acknowledgements

We thank P. de Bakker, J. Burger, C. Economou, E. Fornander, Q. Fu, F. Hallgren, K. Kirsanow, A. Mittnik, I. Olalde, A. Powell, P. Skoglund, S. Tabrizi and A. Tandon for discussions, suggestions about SNPs to include, or contribution to sample preparation or data curation. We thank S. Pääbo, M. Meyer, Q. Fu and B. Nickel for collaboration in developing the 1240k capture reagent. We thank J. M. V. Encinas and M. E. Prada for allowing us to resample La Braña 1. I.M. was supported by the Human Frontier Science Program LT001095/2014-L. C.G. was supported by the Irish Research Council for Humanities and Social Sciences (IRCHSS). F.G. was supported by a grant of the Netherlands Organization for Scientific Research, no. 380-62-005. A.K., P.K. and O.M. were supported by RFBR no. 15-06-01916 and RFH no. 15-11-63008 and O.M. by a state grant of the Ministry of Education and Science of the Russian Federation no. 33.1195.2014/k. J.K. was supported by ERC starting grant APGREID and DFG grant KR 4015/1-1. K.W.A. was supported by DFG grant AL 287/14-1. C.L.-F. was supported by a BFU2015-64699-P grant from the Spanish government. W.H. and B.L. were supported by Australian Research Council DP130102158. R.P. was supported by ERC starting grant ADNABIOARC (263441), and an Irish Research Council ERC support grant. D.R. was supported by US National Science Foundation HOMINID grant BCS-1032255, US National Institutes of Health grant GM100233, and the Howard Hughes Medical Institute.

Author information

Contributions

W.H., R.P. and D.R. supervised the study. S.A.R., J.L.A., J.M.B., E.C., F.G., A.K., P.K., M.L., H.M., O.M., V.M., M.A.R., J.R., J.M.V., J.K., A.C., K.W.A., D.B., D.A., C.L., W.H., R.P. and D.R. assembled archaeological material. I.M., I.L., N.R., S.M., N.P., S.D., J.P., W.H. and D.R. analysed genetic data. N.R., E.H., K.St., D.F., M.N., K.Si., C.G., E.R.J., B.L., C.L. and W.H. performed wet laboratory ancient DNA work. I.M., I.L. and D.R. wrote the manuscript with input from all co-authors.

Corresponding authors

Correspondence to Iain Mathieson, Wolfgang Haak, Ron Pinhasi or David Reich.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Efficiency and cost-effectiveness of 1240k capture.

We plot the number of raw sequences against the mean coverage of analysed SNPs after removal of duplicates, comparing the 163 samples for which capture data are newly reported in this study, against the 102 samples analysed by shotgun sequencing in ref. 5. We caution that the true cost is more than that of sequencing alone.

Extended Data Figure 2 Early isolation and later admixture between farmers and steppe populations.

a, Mainland European populations later than 3000 BC are better modelled with steppe ancestry as a third ancestral population (closer correspondence between empirical and estimated f4-statistics as estimated by resnorm; Methods). b, Later (post-Poltavka) steppe populations are better modelled with Anatolian Neolithic as a third ancestral population. c, Estimated mixture proportions of mainland European populations without steppe ancestry. d, Estimated mixture proportions of Eurasian steppe populations without Anatolian Neolithic ancestry. e, Estimated mixture proportions of later populations with both steppe and Anatolian Neolithic ancestry. f, Admixture plot at k = 17 showing population differences over time and space. EN, Early Neolithic; MN, Middle Neolithic; LN, Late Neolithic; BA, Bronze Age; LNBA, Late Neolithic and Bronze Age.
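
The comparison in panels a and b can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that a target population's f4-statistics are modelled as a weighted sum of the source populations' f4-statistics, with weights summing to one, and that "resnorm" is the Euclidean norm of the residual (other conventions use its square). The function names and the toy vectors are hypothetical, and non-negativity of the proportions is not enforced here.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mixture(target, sources):
    """Return (proportions, resnorm) for target ~ sum_i alpha_i * sources[i].

    The constraint sum(alpha) = 1 is enforced by substituting
    alpha_k = 1 - (alpha_1 + ... + alpha_{k-1}) and solving the
    remaining unconstrained least-squares problem.
    """
    last = sources[-1]
    size = len(target)
    cols = [[s[j] - last[j] for j in range(size)] for s in sources[:-1]]
    y = [target[j] - last[j] for j in range(size)]
    ata = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    aty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    head = solve_linear(ata, aty)
    alphas = head + [1.0 - sum(head)]
    fitted = [sum(a * s[j] for a, s in zip(alphas, sources)) for j in range(size)]
    resnorm = math.sqrt(sum((t - f) ** 2 for t, f in zip(target, fitted)))
    return alphas, resnorm


# Toy f4 vectors (made up for illustration, not values from the study).
source_a = [1.0, 0.0, 0.0, 0.0]
source_b = [0.0, 1.0, 0.0, 0.0]
source_c = [0.0, 0.0, 1.0, 0.0]
target = [0.5, 0.3, 0.2, 0.0]

two_way = fit_mixture(target, [source_a, source_b])
three_way = fit_mixture(target, [source_a, source_b, source_c])
print("two sources:  ", two_way)
print("three sources:", three_way)
```

In this toy case the three-source model reproduces the target almost exactly, so its residual norm is much smaller than that of the two-source model. That is the sense in which a population is "better modelled" with a third ancestral source.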

Extended Data Figure 3 Regional association plots.

LocusZoom (ref. 60) plots for genome-wide significant signals. Points show the −log10 P value for each SNP, coloured according to their linkage disequilibrium (LD; units of r²) with the most associated SNP. The blue line shows the recombination rate, with the scale on the right-hand axis in centimorgans per megabase (cM/Mb). Genes are shown in the lower panel of each subplot.
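
The two quantities plotted for each SNP can be sketched as follows. This is an illustration only: it assumes genotypes coded as 0/1/2 allele counts and approximates LD r² by the squared Pearson correlation between genotype vectors, which is a common shortcut and not necessarily how the plotting tool computes it. The function names and example genotypes are hypothetical.

```python
import math


def neg_log10_p(p_value):
    """Transform a P value to the -log10 scale used on the y axis."""
    return -math.log10(p_value)


def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    n = len(geno_a)
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # r2 is undefined for a monomorphic SNP
    return cov * cov / (var_a * var_b)


# Hypothetical genotypes at a lead SNP and two neighbouring SNPs.
lead = [0, 1, 2, 1, 0, 2]
linked = [0, 1, 2, 1, 0, 2]
unlinked = [1, 1, 1, 0, 2, 1]

print(neg_log10_p(5e-8))
print(ld_r2(lead, linked), ld_r2(lead, unlinked))
```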

Extended Data Figure 4 PCA of selection populations and derived allele frequencies for genome-wide significant signals.

a, Ancient samples projected onto principal components of modern samples, as in Fig. 1, but labelled according to selection populations defined in Extended Data Table 1. b, Allele frequency plots as in Fig. 3, for six signals not included in Fig. 3. For SLC22A4 we show both rs272872, which is our strongest signal, and rs1050152, which was previously hypothesized to be under selection; we also show SLC24A5, which is not genome-wide significant but is discussed in the main text.

Extended Data Figure 5 Motala haplotypes carrying the derived, selected EDAR allele.

This figure compares the genotypes at all sites within 150 kb of rs3827760 (in blue) for the 6 Motala samples and 20 randomly chosen CHB (Chinese from Beijing) and CEU (Utah residents with northern and western European ancestry) samples. Each row is a sample and each column is a SNP. Grey means homozygous for the major (in CEU) allele. Pink denotes heterozygous and red indicates homozygous for the other allele. For the Motala samples, an open circle means that there is only a single sequence, otherwise the circle is coloured according to the number of sequences observed. Three of the Motala samples are heterozygous for rs3827760 and the derived allele lies on the same haplotype background as in present-day East Asians. The only other ancient samples with evidence of the derived EDAR allele in this data set are two Afanasievo samples dating to 3300–3000 BC, and one Scythian dating to 400–200 BC (not shown).

Extended Data Figure 6 Estimated power of the selection scan.

a, Estimated power for different selection coefficients (s) for a SNP that is selected in all populations for either 50, 100 or 200 generations. b, Effect of increasing sample size, showing estimated power for a SNP selected for 100 generations, with different amounts of data, relative to the main text. c, Effect of admixture from Yoruba (YRI) into one of the modern populations, showing the effect on the genomic inflation factor (blue, left axis) and the power to detect selection on a SNP selected for 100 generations with a selection coefficient of 0.02. d, Effect of mis-specification of the mixture proportions. Here 0 on the x axis corresponds to the proportions we used, and 1 corresponds to a random mixture matrix.
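
As a rough intuition for panels a and c, the sketch below shows two standard calculations. It is not the simulation used for the figure. The first function assumes a deterministic haploid (genic) selection model in which the odds of the selected allele grow by a factor of 1 + s per generation, ignoring drift and sampling noise. The second computes a genomic inflation factor in the usual genomic control style, as the median test statistic divided by the median of a chi-square distribution with one degree of freedom. Function names and example inputs are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def frequency_after_selection(p0, s, generations):
    """Allele frequency after deterministic genic selection (no drift)."""
    p = p0
    for _ in range(generations):
        p = p * (1.0 + s) / (1.0 + p * s)
    return p


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: observed median over the null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


# Frequency reached from a starting frequency of 0.1 with s = 0.02.
for generations in (50, 100, 200):
    freq = frequency_after_selection(0.1, 0.02, generations)
    print(generations, round(freq, 3))
```

A longer period of selection or a larger s produces a larger frequency shift, which is why power rises with both in panel a. An inflation factor well above 1 indicates that test statistics are systematically too large, for example because of unmodelled admixture as in panel c.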

Extended Data Table 1 230 ancient individuals analysed in this study
Extended Data Table 2 Key f-statistics used to support claims about population history
Extended Data Table 3 Twelve genome-wide significant signals of selection

Supplementary information

Supplementary Information

This file contains Supplementary Text comprising: Archaeological context for 83 newly reported ancient samples (Section 1) and Population interactions between Anatolia, mainland Europe, and the Eurasian steppe (Section 2) with additional references. (PDF 1045 kb)

Supplementary Data 1

This file contains information about 230 ancient samples used in this study. (XLSX 101 kb)

Supplementary Data 2

This file shows FST between ancient and modern populations. (XLSX 26 kb)

Supplementary Data 3

This file contains genome-wide selection scan results and allele frequencies. (TXT 71292 kb)

About this article

Cite this article

Mathieson, I., Lazaridis, I., Rohland, N. et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503 (2015). https://doi.org/10.1038/nature16152

  • DOI: https://doi.org/10.1038/nature16152
