Response to Sul and Eskin

We thank Sul and Eskin (Mixed models can correct for population structure for genomic regions under selection. Nature Reviews Genetics 26 Feb 2013 (doi:10.1038/nrg2813-c1))1 for carefully examining and confirming the limitation of standard mixed model association methods that we identified in our Progress article (New approaches to population stratification in genome-wide association studies. Nature Reviews Genetics 11, 459–463 (2010))2 and for developing an interesting new way to address it.

In our article2, we investigated the limits of mixed model methods by considering an extreme simulation in which most markers had low population differentiation (FST = 0.01), but a small fraction of markers were unusually differentiated (allele frequency difference = 0.6). We found that standard mixed model methods3 did not fully correct for population structure, but mixed models with principal component covariates4 did fully correct for population structure. We stated that “population structure is a fixed effect, and spurious associations might result if it is modelled as a random effect based on overall covariance”.
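A simulation of this kind can be sketched in Python as follows. This is only an illustrative sketch, not the code used in the original analysis: the function name, sample size, marker count, fraction of differentiated markers, ancestral frequency range, the use of the Balding-Nichols model for the background markers and the choice of 0.2 versus 0.8 for the differentiated markers are all assumptions made for the example. Only FST = 0.01 and the allele frequency difference of 0.6 come from the text.

```python
import numpy as np

def simulate_genotypes(n_per_pop=100, n_markers=2000, frac_diff=0.01,
                       fst=0.01, freq_diff=0.6, seed=0):
    """Simulate diploid genotypes (0/1/2) for two populations.

    Hypothetical helper: all defaults except fst and freq_diff are
    illustrative assumptions, not values taken from the article.
    """
    rng = np.random.default_rng(seed)
    n_diff = int(round(frac_diff * n_markers))
    n_bg = n_markers - n_diff

    # Background markers: population frequencies drawn around an ancestral
    # frequency p from Beta(p(1-F)/F, (1-p)(1-F)/F) (Balding-Nichols model).
    p_anc = rng.uniform(0.1, 0.9, size=n_bg)
    a = p_anc * (1 - fst) / fst
    b = (1 - p_anc) * (1 - fst) / fst
    p_bg = rng.beta(a, b, size=(2, n_bg))

    # Unusually differentiated markers: frequencies differ by freq_diff.
    p_diff = np.vstack([np.full(n_diff, 0.5 - freq_diff / 2),
                        np.full(n_diff, 0.5 + freq_diff / 2)])

    freqs = np.hstack([p_bg, p_diff])        # shape (2, n_markers)
    pop = np.repeat([0, 1], n_per_pop)       # population label per sample
    genotypes = rng.binomial(2, freqs[pop])  # shape (2 * n_per_pop, n_markers)
    is_diff = np.arange(n_markers) >= n_bg   # mask of differentiated markers
    return genotypes, pop, is_diff

# Usage: compare observed allele frequency differences between populations.
genotypes, pop, is_diff = simulate_genotypes()
freq0 = genotypes[pop == 0].mean(axis=0) / 2
freq1 = genotypes[pop == 1].mean(axis=0) / 2
abs_diff = np.abs(freq1 - freq0)
print("mean |difference|, differentiated markers:", abs_diff[is_diff].mean())
print("mean |difference|, remaining markers:", abs_diff[~is_diff].mean())
```

The sketch only generates genotypes; it does not simulate phenotypes or fit any association model.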

Sul and Eskin1 have confirmed that, in this extreme simulation, standard mixed model methods do not fully correct for population structure and that mixed models with principal component covariates do fully correct for population structure. They also investigated a new approach: a mixed model with two kinship matrices, one computed from unusually differentiated markers identified by their spatial ancestry analysis (SPA) method5 and one computed from the remaining markers. They reported that this approach also fully corrects for population structure in this simulation. Thus, population stratification (a fixed effect in this simulation) can be addressed using random effects in a way that we had not previously considered: our review considered only mixed models with a single random effect based on overall covariance3,4,6,7,8 but did not consider mixed models with multiple random effects1.
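The construction of two kinship matrices from two marker sets can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Sul and Eskin: the function names are hypothetical, the kinship estimator (standardized genotypes, averaged over markers) is one common choice, and the mask of unusually differentiated markers is taken as given rather than produced by SPA. The variance components are passed in as fixed numbers; estimating them would require a separate fitting step that is not shown.

```python
import numpy as np

def kinship(genotypes):
    """Kinship matrix from a (samples x markers) array of 0/1/2 genotypes.

    Assumed estimator: standardize each marker, then average the outer
    products over markers. Monomorphic markers are dropped.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g[:, g.std(axis=0) > 0]
    z = (g - g.mean(axis=0)) / g.std(axis=0)
    return z @ z.T / z.shape[1]

def two_kinship_matrices(genotypes, is_diff):
    """One kinship matrix per marker set (differentiated, remaining)."""
    genotypes = np.asarray(genotypes)
    is_diff = np.asarray(is_diff, dtype=bool)
    return kinship(genotypes[:, is_diff]), kinship(genotypes[:, ~is_diff])

def phenotypic_covariance(k_diff, k_rest, var_diff, var_rest, var_env):
    """Covariance implied by two random effects plus independent noise."""
    n = k_diff.shape[0]
    return var_diff * k_diff + var_rest * k_rest + var_env * np.eye(n)

# Usage with placeholder data; the mask and variances are arbitrary here.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(50, 500))
is_diff = np.arange(500) < 5
k_diff, k_rest = two_kinship_matrices(genotypes, is_diff)
cov = phenotypic_covariance(k_diff, k_rest, 0.1, 0.4, 0.5)
print(cov.shape)
```

Keeping the two marker sets in separate matrices lets each random effect have its own variance, which is the feature that distinguishes this model from one with a single kinship matrix based on overall covariance.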

Another possibility, very similar to the Sul and Eskin1 approach, is a mixed model with two kinship matrices: one computed from principal component 1 and one computed from the remaining principal components. This approach is based on the natural decomposition of a kinship matrix into its principal components9. It would also fully correct for population structure in this extreme simulation, as Sul and Eskin1 showed that using a single kinship matrix computed from principal component 1 fully corrects for population structure.
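The decomposition referred to here can be sketched with an eigendecomposition. The example below is an assumption-laden illustration rather than a published implementation: the function name is hypothetical, and the top principal component is taken to be the eigenvector of the kinship matrix with the largest eigenvalue, scaled by that eigenvalue.

```python
import numpy as np

def split_kinship_by_top_pc(kinship):
    """Split a symmetric kinship matrix K into K_pc1 + K_rest.

    K = sum_i lambda_i * u_i u_i^T, so K_pc1 = lambda_1 * u_1 u_1^T
    and K_rest is the sum over the remaining principal components.
    """
    k = np.asarray(kinship, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(k)  # eigenvalues in ascending order
    lam1, u1 = eigvals[-1], eigvecs[:, -1]
    k_pc1 = lam1 * np.outer(u1, u1)
    k_rest = k - k_pc1
    return k_pc1, k_rest

# Usage with a placeholder kinship matrix built from random data.
rng = np.random.default_rng(0)
z = rng.normal(size=(40, 300))
k = z @ z.T / 300
k_pc1, k_rest = split_kinship_by_top_pc(k)
print(np.allclose(k_pc1 + k_rest, k))
```

Because the two matrices sum to the original kinship matrix, a model that assigns them equal variances reduces to the standard single-kinship model; allowing the variances to differ is what gives principal component 1 its own weight.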

A broader question is whether the limitation of standard mixed model methods that arises in this extreme simulation is a major concern in empirical studies. In our article2, we stated that standard mixed model methods are an appealing and simple approach and are sufficient to correct for stratification in many settings. Sul and Eskin1 indicated that the limitation we described did not arise in the Finnish and UK data sets that they analysed. We agree that mixed models with a single random effect based on overall covariance will probably be sufficient to correct for population structure fully in most settings.

Finally, we note that recent work has raised additional points about mixed model methods, including inclusion versus exclusion of the candidate marker in the kinship matrix, use of only a small subset of markers in computing the kinship matrix and effects of case–control ascertainment10,11,12,13. We believe that these are important points that merit further investigation, but this is outside the scope of the current Correspondence.

References

1. Sul, J. H. & Eskin, E. Mixed models can correct for population structure for genomic regions under selection. Nature Rev. Genet. 26 Feb 2013 (doi:10.1038/nrg2813-c1).
2. Price, A. L., Zaitlen, N. A., Reich, D. & Patterson, N. New approaches to population stratification in genome-wide association studies. Nature Rev. Genet. 11, 459–463 (2010).
3. Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nature Genet. 42, 348–354 (2010).
4. Zhang, Z. et al. Mixed linear model approach adapted for genome-wide association studies. Nature Genet. 42, 355–360 (2010).
5. Yang, W. Y., Novembre, J., Eskin, E. & Halperin, E. A model-based approach for analysis of spatial structure in genetic data. Nature Genet. 44, 725–731 (2012).
6. Segura, V. et al. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nature Genet. 44, 825–830 (2012).
7. Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nature Genet. 44, 821–824 (2012).
8. Vilhjalmsson, B. J. & Nordborg, M. The nature of confounding in genome-wide association studies. Nature Rev. Genet. 14, 1–2 (2013).
9. Janss, L., de Los Campos, G., Sheehan, N. & Sorensen, D. Inferences from genomic models in stratified populations. Genetics 192, 693–704 (2012).
10. Sawcer, S. et al. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature 476, 214–219 (2011).
11. Lippert, C. et al. FaST linear mixed models for genome-wide association studies. Nature Methods 8, 833–835 (2011).
12. Listgarten, J. et al. Improved linear mixed models for genome-wide association studies. Nature Methods 9, 525–526 (2012).
13. Mefford, J. & Witte, J. S. The covariate's dilemma. PLoS Genet. 8, e1003096 (2012).

Acknowledgements

We are grateful to E. Eskin, P. Visscher, J. Yang and M. Goddard for helpful discussions.

Author information

Corresponding author

Correspondence to Alkes L. Price.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Price, A., Zaitlen, N., Reich, D. et al. Response to Sul and Eskin. Nat Rev Genet 14, 300 (2013). https://doi.org/10.1038/nrg2813-c2
