  • Analysis
Differential confounding of rare and common variants in spatially structured populations

Abstract

Well-powered genome-wide association studies, now made possible through advances in technology and large-scale collaborative projects, promise to characterize the contribution of rare variants to complex traits and disease. However, while population structure is a known confounder of association studies, it remains unknown whether methods developed to control stratification are equally effective for rare variants. Here, we demonstrate that rare variants can show a stratification that is systematically different from, and typically stronger than, common variants, and this is not necessarily corrected by existing methods. We show that the same process leads to inflation for load-based tests and can obscure signals at truly associated variants. Furthermore, we show that populations can display spatial structure in rare variants, even when Wright's fixation index FST is low, but that allele frequency–dependent metrics of allele sharing can reveal localized stratification. These results underscore the importance of collecting and integrating spatial information in the genetic analysis of complex traits.
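The differential inflation described above is commonly summarized with a genomic inflation factor, the ratio of the median observed test statistic to its expected median under the null. The following is a minimal sketch of computing that factor separately for rare and common variants, not the authors' implementation. The function names, the 1-df chi-square assumption, and the `rare_maf` threshold of 0.01 are illustrative assumptions that do not come from the article.

```python
from statistics import median

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return median(observed 1-df chi-square statistics) / expected null median."""
    stats = list(chi2_stats)
    if not stats:
        raise ValueError("need at least one test statistic")
    return median(stats) / CHI2_1DF_MEDIAN


def inflation_by_frequency(chi2_stats, mafs, rare_maf=0.01):
    """Compute the inflation factor separately for rare and common variants.

    rare_maf is a hypothetical cutoff on minor allele frequency; variants
    strictly below it are treated as rare, all others as common.
    """
    pairs = list(zip(chi2_stats, mafs))
    rare = [s for s, f in pairs if f < rare_maf]
    common = [s for s, f in pairs if f >= rare_maf]
    return {
        "rare": genomic_inflation(rare),
        "common": genomic_inflation(common),
    }
```

A value near 1 in one frequency class alongside a clearly larger value in the other would be consistent with the kind of frequency-dependent stratification the abstract describes, although the statistic alone does not identify its cause.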

Figure 1: Differential inflation of rare and common variants.
Figure 2: Spatial distribution of rare and common variants.
Figure 3: Comparison of methods for correcting for population structure.
Figure 4: Excess allele sharing.

Acknowledgements

The authors thank M. Pirinen, C. Spencer, Z. Iqbal and C. Lindgren for discussion and M. Pirinen for providing software to fit the linear mixed models used in this analysis. This work was supported by grants from the Wellcome Trust (089250/Z/09/Z to I.M., 086084/Z/08/Z to G.M. and 090532/Z/09/Z to the Wellcome Trust Centre for Human Genetics).

Author information

Contributions

G.M. conceived and designed the study. I.M. ran simulations and collected results. G.M. and I.M. jointly wrote the simulation code and manuscript.

Corresponding author

Correspondence to Iain Mathieson.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–7 (PDF 813 kb)

About this article

Cite this article

Mathieson, I., McVean, G. Differential confounding of rare and common variants in spatially structured populations. Nat Genet 44, 243–246 (2012). https://doi.org/10.1038/ng.1074

  • DOI: https://doi.org/10.1038/ng.1074
