  • Technical Report

Visualizing spatial population structure with estimated effective migration surfaces

Abstract

Genetic data often exhibit patterns broadly consistent with 'isolation by distance'—a phenomenon where genetic similarity decays with geographic distance. In a heterogeneous habitat, this may occur more quickly in some regions than in others: for example, barriers to gene flow can accelerate differentiation between neighboring groups. We use the concept of 'effective migration' to model the relationship between genetics and geography. In this paradigm, effective migration is low in regions where genetic similarity decays quickly. We present a method to visualize variation in effective migration across a habitat from geographically indexed genetic data. Our approach uses a population genetic model to relate effective migration rates to expected genetic dissimilarities. We illustrate its potential and limitations using simulations and data from elephant, human and Arabidopsis thaliana populations. The resulting visualizations highlight important spatial features of population structure that are difficult to discern using existing methods for summarizing genetic variation.
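The link between migration rates and expected genetic dissimilarity can be illustrated with a toy calculation. The sketch below is not the EEMS implementation. It assumes a small grid of demes whose edges carry symmetric migration rates, and it uses resistance distance on that graph as a stand-in for expected genetic dissimilarity. The grid size, the rates (1.0 and 0.1) and the function names are all hypothetical choices made for illustration.

```python
import numpy as np

def grid_laplacian(rows, cols, rate_fn):
    """Graph Laplacian of a rows x cols grid of demes.

    rate_fn(a, b) returns the symmetric migration rate (edge weight)
    between neighboring demes a and b, each given as (row, col).
    """
    n = rows * cols
    lap = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            a = r * cols + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = rr * cols + cc
                    w = rate_fn((r, c), (rr, cc))
                    lap[a, b] -= w
                    lap[b, a] -= w
                    lap[a, a] += w
                    lap[b, b] += w
    return lap

def resistance_distances(lap):
    """Pairwise resistance distances from the Laplacian pseudo-inverse."""
    pinv = np.linalg.pinv(lap)
    d = np.diag(pinv)
    return d[:, None] + d[None, :] - 2.0 * pinv

ROWS, COLS = 4, 6  # hypothetical habitat: 4 x 6 grid of demes

def uniform(a, b):
    # Same migration rate everywhere (pure isolation by distance).
    return 1.0

def barrier(a, b):
    # Assumed barrier: edges between columns 2 and 3 have low migration.
    crosses = {a[1], b[1]} == {2, 3}
    return 0.1 if crosses else 1.0

r_uniform = resistance_distances(grid_laplacian(ROWS, COLS, uniform))
r_barrier = resistance_distances(grid_laplacian(ROWS, COLS, barrier))

left = 1 * COLS + 2       # deme at row 1, column 2
right = 1 * COLS + 3      # neighboring deme across the barrier
same_side = 1 * COLS + 1  # neighboring deme on the same side as `left`

print("across barrier, uniform rates:", r_uniform[left, right])
print("across barrier, with barrier: ", r_barrier[left, right])
print("same side, uniform rates:     ", r_uniform[same_side, left])
print("same side, with barrier:      ", r_barrier[same_side, left])
```

In this toy setting, neighboring demes separated by the low-migration edges end up much farther apart in resistance distance than equally close neighbors on the same side, which mirrors the idea that effective migration is low where genetic similarity decays quickly.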

Figure 1: A schematic overview of EEMS (Estimated Effective Migration Surfaces), using African elephant data for illustration.
Figure 2: Simulations comparing EEMS and PCA.
Figure 3: Simulations illustrate that EEMS infers effective migration rates rather than actual steady-state migration rates.
Figure 4: EEMS analysis of African elephant data.
Figure 5: EEMS analysis of human population structure in Western Europe and in sub-Saharan Africa.
Figure 6: EEMS analysis of A. thaliana data from the RegMap panel.

Acknowledgements

We thank S. Wasser for access to the African elephant data and I. Moltke for compiling the human data set from sub-Saharan Africa. We also thank B. McRae for helpful discussions on resistance distances. This work was supported in part by US National Institutes of Health (NIH) grant U01 CA198933 to J.N. and grant HG02585 to M.S.

Author information

Contributions

M.S. and J.N. conceived the project. D.P., J.N. and M.S. developed and refined methods. D.P. implemented methods. D.P., J.N. and M.S. wrote the manuscript.

Corresponding authors

Correspondence to John Novembre or Matthew Stephens.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–17 and Supplementary Note. (PDF 10801 kb)

About this article

Cite this article

Petkova, D., Novembre, J. & Stephens, M. Visualizing spatial population structure with estimated effective migration surfaces. Nat Genet 48, 94–100 (2016). https://doi.org/10.1038/ng.3464

  • DOI: https://doi.org/10.1038/ng.3464
