Abstract
Genome-wide association studies (GWAS) search for associations between genetic variants and disease status, typically via logistic regression. Often there are covariates, such as sex or well-established major genetic factors, that are known to affect disease susceptibility and are independent of tested genotypes at the population level. We show theoretically and with data from recent GWAS on multiple sclerosis, psoriasis and ankylosing spondylitis that inclusion of known covariates can substantially reduce power for the identification of associated variants when the disease prevalence is lower than a few percent. Whether the inclusion of such covariates reduces or increases power to detect genetic effects depends on various factors, including the prevalence of the disease studied. When the disease is common (prevalence of >20%), the inclusion of covariates typically increases power, whereas, for rarer diseases, it can often decrease power to detect new genetic associations.
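The power comparison described above can be explored with a small simulation. The following is a minimal sketch, not the authors' analysis: it assumes a single binary covariate that is independent of genotype in the population, an additive genotype effect on the log-odds scale, and a Wald test at an illustrative significance level of 0.05 rather than a genome-wide threshold. All function names and parameter values (`or_g`, `or_x`, `maf`, `cov_freq`, sample sizes) are hypothetical choices made for illustration.

```python
from statistics import NormalDist
import numpy as np


def case_control_cells(prevalence, maf, cov_freq, beta_g, beta_x):
    """Distribution of the 6 (genotype, covariate) cells among cases and controls."""
    g = np.repeat([0.0, 1.0, 2.0], 2)      # genotype of each cell
    x = np.tile([0.0, 1.0], 3)             # binary covariate of each cell
    p_g = np.array([(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2])
    # Genotype and covariate are independent in the population (assumption).
    p_cell = np.repeat(p_g, 2) * np.tile([1 - cov_freq, cov_freq], 3)
    lo, hi = -30.0, 30.0
    for _ in range(100):                   # bisection: intercept matching prevalence
        mid = 0.5 * (lo + hi)
        risk = 1.0 / (1.0 + np.exp(-(mid + beta_g * g + beta_x * x)))
        if p_cell @ risk > prevalence:
            hi = mid
        else:
            lo = mid
    p_case = p_cell * risk / (p_cell @ risk)
    p_ctrl = p_cell * (1 - risk) / (p_cell @ (1 - risk))
    return g, x, p_case, p_ctrl


def wald_z(X, y, w, n_iter=30):
    """Wald z-score of column 1 from a count-weighted logistic regression (Newton steps)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        info = X.T @ (X * (w * p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(info, X.T @ (w * (y - p)))
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    info = X.T @ (X * (w * p * (1 - p))[:, None])
    return beta[1] / np.sqrt(np.linalg.inv(info)[1, 1])


def power(prevalence, n_cases=1000, n_controls=1000, maf=0.3, cov_freq=0.2,
          or_g=1.2, or_x=10.0, alpha=0.05, n_rep=300, seed=1):
    """Estimated power (unadjusted, adjusted) to detect the genotype effect."""
    rng = np.random.default_rng(seed)
    g, x, p_case, p_ctrl = case_control_cells(
        prevalence, maf, cov_freq, np.log(or_g), np.log(or_x))
    y = np.concatenate([np.ones(6), np.zeros(6)])       # case rows, then control rows
    G, C = np.concatenate([g, g]), np.concatenate([x, x])
    X_unadj = np.column_stack([np.ones(12), G])
    X_adj = np.column_stack([np.ones(12), G, C])
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = np.zeros(2)
    for _ in range(n_rep):
        # Sample cell counts separately for cases and controls (case-control design).
        w = np.concatenate([rng.multinomial(n_cases, p_case),
                            rng.multinomial(n_controls, p_ctrl)]).astype(float)
        hits[0] += abs(wald_z(X_unadj, y, w)) > z_crit
        hits[1] += abs(wald_z(X_adj, y, w)) > z_crit
    return hits / n_rep


for k in (0.001, 0.3):
    unadjusted, adjusted = power(k)
    print(f"prevalence={k}: unadjusted={unadjusted:.2f}, adjusted={adjusted:.2f}")
```

Sampling cell counts directly from the case and control distributions avoids generating a large population for a rare disease. Varying `prevalence`, `or_x` and `cov_freq` shows how the comparison between the two analyses shifts; the numbers depend entirely on these illustrative settings.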
Acknowledgements
We thank G. Nicholson for helpful comments. This work was funded by the Wellcome Trust, as part of the Wellcome Trust Case Control Consortium 2 project (085475/B/08/Z and 085475/Z/08/Z) and through the Wellcome Trust core grant for the Wellcome Trust Centre for Human Genetics (090532/Z/09/Z). P.D. was supported in part by a Wolfson Royal Society Merit Award and a Wellcome Trust Senior Investigator Award (095552/Z/11/Z). C.C.A.S. was supported in part by a Wellcome Trust Career Development Fellowship (097364/Z/11/Z).
Author information
Contributions
M.P., P.D. and C.C.A.S. jointly designed the study and wrote the paper. M.P. derived the mathematical results and carried out the example analyses.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–7 and Supplementary Note (PDF 259 kb)
About this article
Cite this article
Pirinen, M., Donnelly, P. & Spencer, C. Including known covariates can reduce power to detect genetic effects in case-control studies. Nat Genet 44, 848–851 (2012). https://doi.org/10.1038/ng.2346
DOI: https://doi.org/10.1038/ng.2346