Abstract
The main problems in drawing causal inferences from epidemiological case-control studies are confounding by unmeasured extraneous factors, selection bias and differential misclassification of exposure [1]. In genetics the first of these, in the form of population structure, has dominated recent debate [2–4]. Population structure explained part of the significant 11.2% inflation of test statistics we observed in an analysis of 6,322 nonsynonymous SNPs in 816 cases of type 1 diabetes and 877 population-based controls from Great Britain. The remainder of the inflation resulted from differential bias in genotype scoring between case and control DNA samples, which originated from two laboratories, causing false-positive associations. To avoid excluding SNPs and losing valuable information, we extended the genomic control method [2–5] by applying a variable downweighting to each SNP.
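For orientation, the standard genomic control adjustment (ref. 5 in the list below) can be sketched as follows. This is a minimal illustration of the uniform correction only, not the variable per-SNP downweighting introduced in this paper, whose details are not given in the abstract. The function names and the toy statistics are assumptions made for the example; the toy values are chosen so that the estimated inflation factor is 1.112, purely to mirror the scale of inflation mentioned above.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Estimate the genomic control inflation factor (lambda).

    Lambda is the median observed 1-df test statistic divided by the
    median expected under the null. It is floored at 1 so that the
    correction never makes test statistics larger.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)


def genomic_control(chi2_stats):
    """Divide every test statistic by the same estimated lambda.

    This is the uniform correction; it applies one factor to all SNPs
    (hypothetical helper, not the authors' per-SNP weighting).
    """
    lam = inflation_factor(chi2_stats)
    return [x / lam for x in chi2_stats]


# Toy example: five SNPs with identical, mildly inflated statistics.
toy_stats = [CHI2_1DF_MEDIAN * 1.112] * 5
toy_lambda = inflation_factor(toy_stats)
toy_corrected = genomic_control(toy_stats)
```

Because a single factor is applied to every SNP, this uniform version cannot distinguish SNPs that are strongly affected by a bias from those that are not, which is the limitation a per-SNP weighting is meant to address.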
References
Breslow, N.E. & Day, N.E. Statistical Methods in Cancer Research Vol. I. The Analysis of Case-Control Studies (International Agency for Research on Cancer, Lyon, 1980).
Devlin, B., Bacanu, S.A. & Roeder, K. Genomic control to the extreme. Nat. Genet. 36, 1129–1130; author reply 1131 (2004).
Freedman, M.L. et al. Assessing the impact of population stratification on genetic association studies. Nat. Genet. 36, 388–393 (2004).
Marchini, J., Cardon, L.R., Phillips, M.S. & Donnelly, P. The effects of human population structure on large genetic association studies. Nat. Genet. 36, 512–517 (2004).
Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).
Vella, A. et al. Localization of a type 1 diabetes locus in the IL2RA/CD25 region by use of tag single-nucleotide polymorphisms. Am. J. Hum. Genet. 76, 773–779 (2005).
Lowe, C.E. et al. Cost-effective analysis of candidate genes using htSNPs: a staged approach. Genes Immun. 5, 301–305 (2004).
Wang, W.Y., Barratt, B.J., Clayton, D.G. & Todd, J.A. Genome-wide association studies: theoretical and practical concerns. Nat. Rev. Genet. 6, 109–118 (2005).
Hardenbol, P. et al. Multiplexed genotyping with sequence-tagged molecular inversion probes. Nat. Biotechnol. 21, 673–678 (2003).
Hardenbol, P. et al. Highly multiplexed molecular inversion probe genotyping: over 10,000 targeted SNPs genotyped in a single tube assay. Genome Res. 15, 269–275 (2005).
Ueda, H. et al. Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease. Nature 423, 506–511 (2003).
The International HapMap Consortium. The International HapMap Project. Nature 426, 789–796 (2003).
Armitage, P. Test for linear trend in proportions and frequencies. Biometrics 11, 375–386 (1955).
Mantel, N. Chi-square tests with one degree of freedom: extensions of the Mantel-Haenszel procedure. J. Am. Stat. Assoc. 58, 690–700 (1963).
Nelder, J. & Wedderburn, R. Generalised linear models. J. R. Statist. Soc. A 135, 370–384 (1972).
Moorhead, M. et al. Optimal genotype determination in highly multiplexed SNP data. Eur. J. Hum. Genet. (in the press).
Acknowledgements
We thank the individuals with T1D and control individuals for their participation; G. Coleman, S. Field, T. Mistry, K. Bourget, S. Clayton, M. Hardy, P. Lauder, M. Maisuria, W. Meadows and S. Wood for preparing DNA samples; D. Strachan, R. Jones, S. Ring and W. McArdle for providing DNA from the 1958 British Birth Cohort collection; and A. Long, N. Naclerio, T. Cormier, K. Tran, C. Bruckner and S. Picton for genotyping and technical assistance. We acknowledge use of DNA from the 1958 British Birth Cohort collection, funded by the Medical Research Council and the Wellcome Trust. We thank the Juvenile Diabetes Research Foundation, the Wellcome Trust, Diabetes UK and the Medical Research Council for financial support. D.G.C. is a Juvenile Diabetes Research Foundation and Wellcome Trust Principal Research Fellow.
Ethics declarations
Competing interests
M. Faham, M.M., H.B.J., M. Falkowski, P.H. and T.D.W. are currently employed by ParAllele Bioscience.
Cite this article
Clayton, D., Walker, N., Smyth, D. et al. Population structure, differential bias and genomic control in a large-scale, case-control association study. Nat Genet 37, 1243–1246 (2005). https://doi.org/10.1038/ng1653
DOI: https://doi.org/10.1038/ng1653