Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA

This article has been updated

Abstract

As modern humans migrated out of Africa, they encountered many new environmental conditions, including greater temperature extremes, different pathogens and higher altitudes. These diverse environments are likely to have acted as agents of natural selection and to have led to local adaptations. One of the most celebrated examples in humans is the adaptation of Tibetans to the hypoxic environment of the high-altitude Tibetan plateau (refs 1–3). A hypoxia pathway gene, EPAS1, was previously identified as having the most extreme signature of positive selection in Tibetans (refs 4–10), and was shown to be associated with differences in haemoglobin concentration at high altitude. Re-sequencing the region around EPAS1 in 40 Tibetan and 40 Han individuals, we find that this gene has a highly unusual haplotype structure that can only be convincingly explained by introgression of DNA from Denisovan or Denisovan-related individuals into humans. Scanning a larger set of worldwide populations, we find that the selected haplotype is only found in Denisovans and in Tibetans, and at very low frequency among Han Chinese. Furthermore, the length of the haplotype, and the fact that it is not found in any other populations, make it unlikely that the haplotype sharing between Tibetans and Denisovans was caused by incomplete ancestral lineage sorting rather than introgression. Our findings illustrate that admixture with other hominin species has provided genetic variation that helped humans to adapt to new environments.

Figure 1: Genome-wide FST versus maximal allele frequency difference.
Figure 2: Haplotype pattern in a region defined by SNPs that are at high frequency in Tibetans and at low frequency in Han Chinese.
Figure 3: A haplotype network based on the number of pairwise differences between the 40 most common haplotypes.

Accession codes

Data deposits

Sequence data have been deposited in the Sequence Read Archive under accession number SRP041218.

Change history

  • 13 August 2014

    The affiliations list has been updated to correct the address of author Kui Li.

References

  1. Moore, L. G., Young, D., McCullough, R. E., Droma, T. & Zamudio, S. Tibetan protection from intrauterine growth restriction (IUGR) and reproductive loss at high altitude. Am. J. Hum. Biol. 13, 635–644 (2001)
  2. Niermeyer, S. et al. Child health and living at high altitude. Arch. Dis. Child. 94, 806–811 (2009)
  3. Wu, T. et al. Hemoglobin levels in Quinghai-Tibet: different effects of gender for Tibetans vs. Han. J. Appl. Physiol. 98, 598–604 (2005)
  4. Yi, X. et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75–78 (2010)
  5. Bigham, A. et al. Identifying signature of natural selection in Tibetan and Andean populations using dense genome scan data. PLoS Genet. 6, e1001116 (2010)
  6. Simonson, T. S. et al. Genetic evidence for high-altitude adaptation in Tibet. Science 329, 72–75 (2010)
  7. Beall, C. M. et al. Natural selection on EPAS1 (HIF2a) associated with low hemoglobin concentration in Tibetan highlanders. Proc. Natl Acad. Sci. USA 107, 11459–11464 (2010)
  8. Peng, Y. et al. Genetic variations in Tibetan populations and high-altitude adaptation at the Himalayas. Mol. Biol. Evol. 28, 1075–1081 (2011)
  9. Xu, S. et al. A genome-wide search for signals of high-altitude adaptation in Tibetans. Mol. Biol. Evol. 28, 1003–1011 (2011)
  10. Wang, B. et al. On the origin of Tibetans and their genetic basis in adapting high-altitude environments. PLoS ONE 6, e17002 (2011)
  11. Moore, L. G. et al. Maternal adaptation to high-altitude pregnancy: an experiment of nature—a review. Placenta 25, S60–S71 (2004)
  12. Vargas, E. & Spielvogel, H. Chronic mountain sickness, optimal hemoglobin, and heart disease. High Alt. Med. Biol. 7, 138–149 (2006)
  13. Yip, R. Significance of an abnormally low or high hemoglobin concentration during pregnancy: special consideration of iron nutrition. Am. J. Clin. Nutr. 72, 272S–279S (2000)
  14. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012)
  15. Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104 (2008)
  16. Rosenberg, N. A. Standardized subsets of the HGDP-CEPH Human Genome Diversity Cell Line Panel, accounting for atypical and duplicated samples and pairs of close relatives. Ann. Hum. Genet. 70, 841–847 (2006)
  17. Soejima, M. & Koda, Y. Population differences of two coding SNPs in pigmentation-related genes SLC24A5 and SLC45A2. Int. J. Legal Med. 121, 36–39 (2007)
  18. Sulem, P. et al. Genetic determinants of hair, eye and skin pigmentation in Europeans. Nature Genet. 39, 1443–1452 (2007)
  19. Coop, G. et al. The role of geography in human adaptation. PLoS Genet. 5, e1000500 (2009)
  20. Pickrell, J. K. et al. Signals of recent positive selection in a worldwide sample of human populations. Genome Res. 19, 826–837 (2009)
  21. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012)
  22. Paradis, E. Pegas: an R package for population genetics with an integrated–modular approach. Bioinformatics 26, 419–420 (2010)
  23. Vernot, B. & Akey, J. Resurrecting surviving Neandertal lineages from modern human genomes. Science (2014)
  24. Plagnol, V. & Wall, J. D. Possible ancestral structure in human populations. PLoS Genet. 2, e105 (2006)
  25. Reich, D. et al. Genetic history of an archaic hominin group from Denisova cave in Siberia. Nature 468, 1053–1060 (2010)
  26. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014)
  27. Skoglund, P. & Jakobsson, M. Archaic human ancestry in East Asia. Proc. Natl Acad. Sci. USA 108, 18301–18306 (2011)
  28. Abi-Rached, L. et al. The shaping of modern human immune systems by multiregional admixture with archaic humans. Science 334, 89–94 (2011)
  29. Mendez, F. L., Watkins, J. C. & Hammer, M. F. A haplotype at STAT2 introgressed from Neanderthals and serves as a candidate of positive selection in Papua New Guinea. Am. J. Hum. Genet. 91, 265–274 (2012)
  30. Sankararaman, S. et al. The genomic landscape of Neanderthal ancestry in present-day humans. Nature (2014)
  31. Li, R., Li, Y., Kristiansen, K. & Wang, J. SOAP: short oligonucleotide alignment program. Bioinformatics 24, 713–714 (2008)
  32. Li, R. et al. SNP detection for massively parallel whole-genome resequencing. Genome Res. 19, 1124–1132 (2009)
  33. Browning, B. L. & Browning, S. R. A fast, powerful method for detecting identity by descent. Am. J. Hum. Genet. 88, 173–182 (2011)
  34. Coop, G. et al. The role of geography in human adaptation. PLoS Genet. 5, e1000500 (2009)
  35. Reynolds, J., Weir, B. S. & Cockerham, C. C. Estimation of the coancestry coefficient: basis for a short-term genetic distance. Genetics 105, 767–779 (1983)
  36. R Development Core Team. R: A language and environment for statistical computing. http://www.R-project.org/ (R Foundation for Statistical Computing, 2011)
  37. Ewing, G. & Hermisson, J. MSMS: a coalescent simulation program including recombination, demographic structure, and selection at a single locus. Bioinformatics 26, 2064–2065 (2010)
  38. Myers, S. et al. A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321–324 (2005)
  39. Hinch, A. G. et al. The landscape of recombination in African Americans. Nature 476, 170–175 (2011)
  40. Scally, A. & Durbin, R. Revising the human mutation rate: implications for understanding human evolution. Nature Rev. Genet. 13, 745–753 (2012)
  41. Teshima, K. M. & Innan, H. mbs: modifying Hudson’s ms software to generate samples of DNA sequences with a biallelic site under selection. BMC Bioinformatics 10, 166 (2009)
  42. Hudson, R. R. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002)
  43. Sankararaman, S. et al. The date of interbreeding between Neandertals and modern humans. PLoS Genet. 8, e1002947 (2012)
  44. Durand, E. Y. et al. Testing for ancient admixture between closely related populations. Mol. Biol. Evol. 28, 2239–2252 (2011)
  45. Simonson, T. S. et al. Genetic evidence for high-altitude adaptation in Tibet. Science 329, 72–75 (2010)

Acknowledgements

This research was funded by the State Key Development Program for Basic Research of China, 973 Program (2011CB809203, 2012CB518201, 2011CB809201, 2011CB809202), China National GeneBank-Shenzhen and Shenzhen Key Laboratory of Transomics Biotechnologies (no. CXB201108250096A). This work was also supported by research grants from the US NIH: R01HG003229 to R.N. and R01HG003229-08S2 to E.H.-S. We thank F. Jay, M. Liang and F. Casey for useful discussions.

Author information

Contributions

R.N., Ji.W. and Ju.W. supervised the project. X.J., A., Z.B., Y.L., X.Y., M.H., P.N., B.W., X.O., H., J.L., Z.X.P.C., K.L., G.G., Y.Y., W.W., X.Z., X.X., H.Y., Y.L., Ji.W. and Ju.W. collected and generated the data, and performed the preliminary bioinformatic analyses to call SNPs and indels from the raw data. E.H.-S. and N.V. filtered the data and B.M.P. phased the data. E.H.-S. performed the majority of the population genetic analysis with some contributions from B.M.P. and M.S. E.H.-S. and R.N. wrote the manuscript with critical input from all the authors.

Corresponding authors

Correspondence to Jian Wang or Jun Wang or Rasmus Nielsen.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1: FST calculated for each SNP between Tibetan and Han populations.

Each dot represents the FST value for each SNP in EPAS1. The x axis is the physical position in the gene. Positions are based on the hg18 build of the human genome. The green box defines a 32.7-kb region where we observe the largest genetic differentiation between Han Chinese and Tibetans. The first and last positions of this 32.7-kb region correspond to the first and last position of the SNPs listed in Supplementary Table 3. For comparison, in ref. 4 the genome-wide FST between Han and Tibetans is 0.02. The site with the largest frequency difference (and therefore largest FST) is circled.
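As a rough illustration of the per-SNP quantity plotted in this figure, the sketch below computes FST for a single biallelic SNP from the allele frequencies in two populations. It uses Wright's heterozygosity-based definition with equal population weights and no sample-size correction; this is an assumption made for brevity and is not necessarily the estimator used in the paper. The function name, SNP labels and frequencies are all hypothetical.

```python
def wright_fst(p1: float, p2: float) -> float:
    """FST for one biallelic SNP from allele frequencies in two populations.

    Assumes equal population weights and ignores sample-size corrections.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity of the pooled population
    if h_total == 0.0:
        return 0.0  # site is monomorphic in both populations
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


# Hypothetical (Tibetan, Han) allele frequencies keyed by made-up SNP labels.
frequencies = {"snp_a": (0.90, 0.10), "snp_b": (0.55, 0.45), "snp_c": (0.20, 0.20)}
fst_by_snp = {snp: wright_fst(tib, han) for snp, (tib, han) in frequencies.items()}

# In this toy data the SNP with the largest frequency difference
# is also the one with the largest FST.
most_differentiated = max(fst_by_snp, key=fst_by_snp.get)
print(most_differentiated, round(fst_by_snp[most_differentiated], 2))
```

Under this definition, two populations fixed for different alleles give FST = 1 and identical frequencies give FST = 0.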

Extended Data Figure 2: Distribution of fixed differences.

The left panel is the distribution of fixed differences between two haplotype groups under a scenario of selection on a de novo mutation (see Methods), and the right panel is the distribution under a scenario of selection on standing variation (see Methods) for a region of size 32.7 kb. The initial frequency of the selected allele in the SSV model is 1%. Each row of panels corresponds to different selection strengths (2Ns) from 200 to 1,000. The red lines mark the number of fixed differences observed between the two haplotype classes in the real data for the given window size.

Extended Data Figure 3: Haplotype frequencies for Tibetans, our Han samples and the populations from the 1000 Genomes Project for the five-SNP motif in the EPAS1 region.

The y axis is the haplotype frequency. The legend shows all the possible haplotypes for the region considered among these populations: ASW, African American from the south western United States; CEU, Utah Residents with Northern and Western European ancestry; CHB, Han Chinese from Beijing; CHS, Southern Han Chinese; CLM, Colombian; FIN, Finnish; GBR, British; HAN, Han Chinese from Beijing; IBS, Iberian; JPT, Japanese; MXL, Mexican; PUR, Puerto Rican; LWK, Luhya; TSI, Toscani; TIB, Tibetan; YRI, Yoruban (see Methods).

Extended Data Figure 4: Derived allele frequency of the SNPs with the largest frequency difference between Tibetans and the 1000 Genomes Project populations.

At these SNPs, the frequency difference between Tibetans and the 1000 Genomes Project populations is 0.65 or larger. Positions 46571435, 46579689, 46584859 and 46600358 were not called as SNPs in the 1000 Genomes data, so we assume these positions were fixed for the human reference allele. Note that even though positions 46577251, 46588331, 46594122 and 46598025 appear to have a frequency of 0.0 for the populations in the 1000 Genomes data, the derived alleles at these SNPs are observed at very low frequency in at least one population (for example, CHB).
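The selection rule described in this caption can be sketched as a simple filter. The example below keeps positions whose derived allele frequency differs by at least 0.65 between two data sets, and treats a position missing from the reference set as having frequency 0.0, mirroring the assumption stated above for sites not called as SNPs. Comparing against a single pooled reference frequency is a simplifying assumption, and the function name, positions and frequencies are hypothetical.

```python
def highly_differentiated(tibetan_freq, reference_freq, threshold=0.65):
    """Return sorted positions whose frequency difference meets the threshold."""
    selected = []
    for position, tib in sorted(tibetan_freq.items()):
        # A site not called as a SNP in the reference set is assumed to be
        # fixed for the reference allele (derived allele frequency 0.0).
        ref = reference_freq.get(position, 0.0)
        if abs(tib - ref) >= threshold:
            selected.append(position)
    return selected


# Hypothetical derived allele frequencies keyed by made-up positions.
tibetan = {101: 0.80, 102: 0.75, 103: 0.30, 104: 0.70}
reference = {101: 0.02, 103: 0.25, 104: 0.10}  # position 102 was not called

print(highly_differentiated(tibetan, reference))
```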

Extended Data Figure 5: Differences between haplotypes.

a, The full matrix of pairwise differences between all the unique haplotypes in Fig. 3, for the 40 most common haplotypes identified in the 1000 Genomes and the Tibetan samples in the 32.7-kb region of EPAS1. The Denisovan haplotype (of frequency two) was added afterwards for comparison. The unique haplotypes are labelled with Roman numerals (here and in Fig. 3), and the Denisovan haplotype is the first column, haplotype I. Refer to Fig. 3 in the main text and the supplementary material for the representation of populations for each haplotype. b, Illustration of the genealogical structure in a model with gene flow from Denisovans to Tibet. Letters a–k are the labels for the branch lengths and are adjacent to their corresponding branches. The divergence between modern human haplotypes and the introgressed haplotype in Tibetans would be larger than the haplotypes in other modern human populations and the Denisovan haplotype (see Methods and Supplementary Information). TIB, CEU and YRI denote Tibetan, European and Yoruban populations. Note that the lengths i and k are unknown as we do not know when these populations went extinct.
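The matrix in panel a counts, for each pair of haplotypes, the number of sites at which they differ. A minimal sketch of that calculation follows, assuming haplotypes are given as equal-length strings of alleles over the same ordered sites. The haplotype labels and sequences are toy values and do not come from the paper's data.

```python
from itertools import combinations


def pairwise_differences(hap_a: str, hap_b: str) -> int:
    """Count the sites at which two aligned haplotypes carry different alleles."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same sites")
    return sum(a != b for a, b in zip(hap_a, hap_b))


def difference_matrix(haplotypes: dict) -> dict:
    """Map each unordered pair of haplotype labels to its number of differences."""
    return {
        (x, y): pairwise_differences(haplotypes[x], haplotypes[y])
        for x, y in combinations(haplotypes, 2)
    }


# Toy haplotypes with made-up labels and alleles.
toy = {"I": "ACGTA", "II": "ACGTT", "III": "GTACT"}
matrix = difference_matrix(toy)
for pair, count in matrix.items():
    print(pair, count)
```

The same counts could serve as edge weights in a haplotype network, although building the network itself is outside the scope of this sketch.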

Extended Data Figure 6: Other haplotype networks.

a, A haplotype network based on the number of pairwise differences between 43 unique haplotypes defined from the 20 most differentiated SNPs between Tibetans and the 14 populations from the 1000 Genomes Project. The R software package pegas (ref. 22) was used to generate the figure. The haplotype distances are from pairwise differences. Each pie chart represents one unique haplotype and the size of the pie chart is proportional to log2(number of chromosomes with that haplotype). The sections in the pie provide the breakdown of the haplotypes amongst populations. The width of the edges is proportional to the number of pairwise differences between the joined haplotypes; the thinnest edge width represents a difference of one mutation. The number 57 next to a Tibetan haplotype is the number of Tibetan chromosomes with that haplotype. Similarly, the number 1,912 is the number of chromosomes (across several populations) with that haplotype. b, The number of pairwise differences between the Denisovan haplotype and the 43 unique haplotypes defined from the 20 most differentiated SNPs between Tibetans and the 14 populations from the 1000 Genomes Project (same haplotypes as in a). Each bar is a unique haplotype, and they are sorted in increasing order of pairwise differences. The colours within each bar represent the proportion of chromosomes with that haplotype broken down by populations. The numbers on top of each bar represent the total number of chromosomes within the 1000 Genomes data set and Tibetans that have the haplotype. Note this is the same data set used to create the haplotype network in panel a. Supplementary Tables 5 and 6 contain the 43 haplotypes and the frequencies within each of the populations.

Extended Data Figure 7: Number of pairwise differences.

Red bars are the histograms of the number of pairwise differences between Denisovan and Tibetans. Blue bars are the histograms of the number of pairwise differences between Denisovan and GBR, CHS, FIN, PUR, CLM, IBS, CEU, YRI, CHB, JPT, LWK, ASW, MXL or TSI. All comparisons are within the 32.7-kb region of high differentiation (green box in Extended Data Fig. 1).

Extended Data Figure 8: Divergence distributions.

Modern human–Denisovan divergence (see Methods) for intronic regions of size 32.7 kb is plotted in red. Modern human–modern human divergence for the same intronic regions is plotted in blue. The Tibetan–Han divergence at the EPAS1 32.7-kb region is plotted in green. The black arrow points to the number of nucleotide differences between the Denisovan and the most common Tibetan haplotype (0.0038). This value is significantly lower than the modern human–Denisovan divergence (red curve, P = 0.0028).

Extended Data Figure 9: Null distributions of D for an assumed Tibet–Han divergence of 3,000 years.

Each histogram corresponds to the D values obtained under null models without gene flow, and the red vertical bar corresponds to the D values observed in the real data. The observed D values are significant (P < 0.001) even when we assume Tibet–Han divergence of 5,000 or 10,000 years (see Methods and Supplementary Tables 8–10) (model abbreviations are given in the Supplementary Information; section on D statistics under models of no gene flow).
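For readers unfamiliar with the D statistic, the sketch below shows the basic ABBA-BABA form of the test for one sequence per population, D = (nABBA − nBABA) / (nABBA + nBABA), where the outgroup allele is taken as ancestral. This is a generic, simplified version and an assumption on our part; the paper's analysis may use allele-frequency-based counts and different population configurations (see ref. 44 and Methods). The sequences and the function name are hypothetical.

```python
def d_statistic(p1: str, p2: str, p3: str, outgroup: str) -> float:
    """ABBA-BABA D for four aligned sequences ordered as (P1, P2, P3, outgroup).

    A site is ABBA when P2 and P3 share a derived allele that P1 lacks,
    and BABA when P1 and P3 share a derived allele that P2 lacks.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if a == o and b == c and b != o:
            abba += 1
        elif b == o and a == c and a != o:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Toy alignment: three ABBA sites, one BABA site, one uninformative site.
p1 = "AAAGA"
p2 = "GGGAA"
p3 = "GGGGA"
outgroup = "AAAAA"
print(d_statistic(p1, p2, p3, outgroup))
```

With this ordering, a positive D indicates an excess of derived alleles shared between P2 and P3, which is the pattern expected under gene flow between those two groups.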

Extended Data Figure 10: S* statistics and PCA plot.

a, A measure of introgression, S*, from ref. 23. Distributions are for 1,000 simulations under the four demographic models described in the Supplementary Information (section on D statistics under models of no gene flow). S* for the Tibetan individuals is shown as a vertical grey line. The empirical P values for the four models (top to bottom) are 0.035, 0.028, 0.019 and 0.017. b, The first and second principal components for all the CHS (100 individuals) and CHB (97 individuals) samples from the 1000 Genomes Project and the 77 Tibetan individuals from ref. 45 (see Methods). The black circle and the black triangle represent the single CHB and the CHS individuals carrying the five-SNP Tibetan–Denisovan haplotype (Extended Data Fig. 3). All SNPs on chromosome 2 in the intersection between the 1000 Genomes populations and the 77 Tibetan individuals were used for this analysis.
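An empirical P value of the kind reported in panel a can be obtained by comparing the observed statistic with its distribution across simulations under a null model. The sketch below assumes an upper-tail test and defines the P value as the fraction of simulated values at least as large as the observed one; the exact definition used in the paper is not stated here, so this is an assumption. The null values are toy numbers, not simulation output.

```python
def empirical_p_value(observed: float, null_values: list) -> float:
    """Fraction of null values that are at least as large as the observed value."""
    if not null_values:
        raise ValueError("need at least one simulated value")
    at_least_as_extreme = sum(1 for value in null_values if value >= observed)
    return at_least_as_extreme / len(null_values)


# Toy null distribution of 1,000 values: 0.0, 1.0, ..., 999.0.
null = [float(i) for i in range(1000)]

# 35 of the 1,000 toy values are at least 965.0.
p = empirical_p_value(965.0, null)
print(p)
```

A common, more conservative variant adds one to both the numerator and the denominator so that the estimate can never be exactly zero.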

Supplementary information

Supplementary Information

This file contains Supplementary Text, Supplementary References and Supplementary Tables 1-11. (PDF 342 kb)

About this article

Cite this article

Huerta-Sánchez, E., Jin, X., Asan et al. Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature 512, 194–197 (2014). https://doi.org/10.1038/nature13408
