Abstract
Characterizing genetic diversity within and between populations has broad applications in studies of human disease and evolution. We propose a new approach, spatial ancestry analysis, for the modeling of genotypes in two- or three-dimensional space. In spatial ancestry analysis (SPA), we explicitly model the spatial distribution of each SNP by assigning an allele frequency as a continuous function in geographic space. We show that the explicit modeling of the allele frequency allows individuals to be localized on the map on the basis of their genetic information alone. We apply our SPA method to a European and a worldwide population genetic variation data set and identify SNPs showing large gradients in allele frequency, and we suggest these as candidate regions under selection. These regions include SNPs in the well-characterized LCT region, as well as at loci including FOXP2, OCA2 and LRP1B.
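The idea of treating each SNP's allele frequency as a continuous function of location, and then placing an individual on the map from genotypes alone, can be illustrated with a minimal sketch. The abstract does not state the functional form, so the logistic surface below, the binomial genotype model, and every name (`allele_freq`, `localize`, `steepest_snps`, the slope matrix `A` and intercepts `b`) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def allele_freq(x, A, b):
    """Assumed logistic allele-frequency surface.

    x: location, shape (d,); A: per-SNP slopes, shape (m, d); b: intercepts, shape (m,).
    Returns the allele frequency of each of the m SNPs at location x.
    """
    return 1.0 / (1.0 + np.exp(-(A @ x + b)))

def log_likelihood(x, g, A, b):
    """Log-likelihood (up to a constant) of genotypes g in {0, 1, 2} at location x,
    assuming each genotype is Binomial(2, f_j(x)) and SNPs are independent."""
    f = np.clip(allele_freq(x, A, b), 1e-9, 1 - 1e-9)
    return float(np.sum(g * np.log(f) + (2 - g) * np.log(1 - f)))

def localize(g, A, b, steps=50):
    """Maximum-likelihood location of one individual via Newton's method.

    Under the logistic assumption the log-likelihood is concave in x,
    so plain Newton steps from the origin are adequate for this sketch.
    """
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        f = allele_freq(x, A, b)
        grad = A.T @ (g - 2 * f)                # gradient of the log-likelihood
        hess = -(A.T * (2 * f * (1 - f))) @ A   # Hessian (negative definite)
        x = x - np.linalg.solve(hess, grad)
    return x

def steepest_snps(A, k):
    """Indices of the k SNPs with the largest slope norm, used here as a
    stand-in for 'large gradient in allele frequency'."""
    return np.argsort(-np.linalg.norm(A, axis=1))[:k]

if __name__ == "__main__":
    # Simulated data only; no real genotypes are used.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(2000, 2))
    b = rng.normal(size=2000)
    x_true = np.array([0.5, -0.3])
    g = rng.binomial(2, allele_freq(x_true, A, b))
    print("estimated location:", localize(g, A, b))
    print("steepest SNPs:", steepest_snps(A, 5))
```

In this sketch the slopes and intercepts are taken as known; in practice they would first have to be fitted from individuals with known locations, and ranking SNPs by slope norm is only a rough proxy for the selection scan described in the abstract.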
Acknowledgements
W.-Y.Y. and E.E. are supported by grants from the US National Science Foundation (0513612, 0731455, 0729049, 0916676 and 1065276) and the US National Institutes of Health (K25 HL080079, U01 DA024417, P01 HL30568 and P01 HL28481). J.N. is supported by a National Science Foundation grant (0933731) and by the Searle Scholars Program. E.H. is a faculty fellow of the Edmond J. Safra Program at Tel Aviv University and was supported in part by the Israeli Science Foundation (grant 04514831) and by the IBM Open Collaborative Research award program.
Author information
Author notes
- Eleazar Eskin & Eran Halperin
These authors contributed equally to this work.
Affiliations
Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, USA.
- Wen-Yun Yang, John Novembre & Eleazar Eskin
Department of Computer Science, University of California, Los Angeles, California, USA.
- Wen-Yun Yang & Eleazar Eskin
Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA.
- John Novembre
Department of Human Genetics, University of California, Los Angeles, California, USA.
- Eleazar Eskin
International Computer Science Institute, Berkeley, California, USA.
- Eran Halperin
School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
- Eran Halperin
Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv, Israel.
- Eran Halperin
Contributions
W.-Y.Y., J.N., E.E. and E.H. designed the methods and experiments. W.-Y.Y. implemented the methods. W.-Y.Y., J.N., E.E. and E.H. jointly performed the analysis. All authors discussed the results and contributed to the writing of the manuscript.
Competing interests
The authors declare no competing financial interests.
Corresponding author
Correspondence to Eleazar Eskin.
Supplementary information
PDF files
- 1. Supplementary Text and Figures: Supplementary Figures 1–5, Supplementary Tables 1–4 and Supplementary Note