Article

European Journal of Human Genetics (2004) 12, 1001–1006. doi:10.1038/sj.ejhg.5201273 Published online 15 September 2004

Detect and adjust for population stratification in population-based association study using genomic control markers: an application of Affymetrix Genechip® Human Mapping 10K array

Ke Hao1, Cheng Li1,2, Carsten Rosenow3 and Wing H Wong1,4

  1. Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
  2. Department of Biostatistics, Dana Farber Cancer Institute, Boston, MA, USA
  3. Genomics Collaboration, Affymetrix, Santa Clara, CA, USA
  4. Department of Statistics, Harvard University, Cambridge, MA, USA

Correspondence: Dr WH Wong, Department of Biostatistics, Harvard University, Harvard School of Public Health, 655 Huntington Ave., Building II, Room 441, Boston, MA 02115, USA. Tel: +1 617 432 4912; Fax: +1 617 739 1781; E-mail: wwong@hsph.harvard.edu

Received 18 February 2004; Revised 11 June 2004; Accepted 22 July 2004; Published online 15 September 2004.

Abstract

Population-based association designs are often compromised by false or nonreplicable findings, partly due to population stratification. Genomic control (GC) approaches were proposed to detect and adjust for this confounder. To date, the performance of this strategy has not been extensively evaluated on real data. More than 10 000 single-nucleotide polymorphisms (SNPs) were genotyped on subjects from four populations (including an Asian, an African-American and two Caucasian populations) using the GeneChip® Mapping 10K array. On these data, we tested the performance of two GC approaches in different scenarios, including various numbers of GC markers and different degrees of population stratification. In the scenario of substantial population stratification, both GC approaches are sensitive using only 20–50 random SNPs, and the mixed subjects can be separated into homogeneous subgroups. In the scenario of moderate stratification, both GC approaches have poor sensitivity. However, the bias in the association test can still be corrected even when no statistically significant population stratification is detected. We conducted extensive benchmark analyses of GC approaches using SNPs across the whole human genome. We found that the GC method can cluster subjects into homogeneous subgroups if there is a substantial difference in genetic background. The inflation factor, estimated from GC markers, can effectively adjust for the confounding effect of population stratification regardless of its extent. We also suggest that as few as 50 random SNPs with heterozygosity >40% should be sufficient as genomic controls.
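As an illustration of the inflation-factor adjustment mentioned above, the following is a minimal sketch only. The abstract does not state which estimator was used; the sketch assumes the widely used median-based estimator, in which the median of the 1-df chi-square statistics at the GC markers is divided by the theoretical median of a 1-df chi-square distribution (about 0.4549). The function names `inflation_factor` and `adjust_statistic` are hypothetical and are not taken from the paper.

```python
from statistics import median

# Theoretical median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(gc_marker_stats):
    """Estimate the inflation factor from 1-df chi-square association
    statistics computed at the genomic control (null) markers.

    Assumption: median-based estimator, floored at 1 so that test
    statistics are never inflated by the correction.
    """
    lam = median(gc_marker_stats) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)


def adjust_statistic(candidate_stat, lam):
    """Divide a candidate marker's chi-square statistic by the
    estimated inflation factor."""
    return candidate_stat / lam
```

For example, if the GC markers give an estimated inflation factor of 2, a candidate-marker statistic of 8.0 is adjusted to 4.0 before it is compared with the 1-df chi-square reference distribution.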

Keywords:

population stratification, population-based study, association test, genomic control
