Original Article

The Pharmacogenomics Journal (2010) 10, 347–354; doi:10.1038/tpj.2010.27

Assessment of variability in GWAS with CRLMM genotyping algorithm on WTCCC coronary artery disease

L Zhang1, S Yin1, K Miclaus2, M Chierici3, S Vega4, C Lambert5, H Hong6, R D Wolfinger2, C Furlanello3 and F Goodsaid1

  1. Genomics Group, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, FDA, Silver Spring, MD, USA
  2. SAS Institute, Cary, NC, USA
  3. Fondazione Bruno Kessler, Trento, Italy
  4. Health Solutions Group, Microsoft (formerly Rosetta Biosoftware), Redmond, WA, USA
  5. Golden Helix, Bozeman, MT, USA
  6. National Center for Toxicological Research, FDA, Jefferson, AR, USA

Correspondence: Dr F Goodsaid, Genomics Group, Office of Clinical Pharmacology, Center for Drug Evaluation and Research (CDER), Food and Drug Administration, WO51 Rm2148 HFD-870, 10903 New Hampshire, Silver Spring, MD 20903, USA. E-mails: federico.goodsaid@fda.hhs.gov and li.zhang@fda.hhs.gov

Received 14 December 2009; Revised 28 February 2010; Accepted 9 March 2010.

Abstract

The robustness of genome-wide association study (GWAS) results depends on the genotyping algorithm used to call the genotypes on which the association is based. This paper begins to assess how the genotyping quality of the Corrected Robust Linear Model with Maximum Likelihood Classification (CRLMM) algorithm affects the identification of truly significant genes in a GWAS with a large sample size. Using microarray image data from the Wellcome Trust Case–Control Consortium (WTCCC) for 1991 individuals with coronary artery disease (CAD) and 1500 controls, genetic associations were evaluated under various batch sizes and compositions. The experimental designs covered batch sizes of 250, 350, 500 and 2000 samples, as well as the whole set of 3491 samples, with cases and controls distributed within each batch in one of three ways: randomized, simply combined (4:3 case–control ratio) or kept separate. The separate composition could create 2–3% discordance in the single nucleotide polymorphism (SNP) results for quality control/statistical analysis and might contribute to the lack of reproducibility between GWAS. CRLMM showed high genotyping accuracy and stability to batch effects. According to the genotypic and allelic tests (P < 5.0 × 10^−7), nine significant signals on chromosome 9 were found consistently in all batch sizes with the combined design. Our findings are critical for optimizing the reproducibility of GWAS and confirm the genetic role in the pathophysiology of CAD.
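To make the quantities named in the abstract concrete, the following is a minimal sketch of how a genotypic test, an allelic test and a genotype discordance rate could be computed for a single SNP. It is not the pipeline used in the study. It assumes plain Pearson chi-square tests on case–control count tables, and the function names and the (AA, AB, BB) count layout are hypothetical choices made for illustration. Only the significance threshold of 5.0 × 10^−7 is taken from the text.

```python
import math

# Genome-wide significance threshold quoted in the abstract.
P_THRESHOLD = 5.0e-7


def chi_square_stat(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip empty columns (assumption: ignore them)
                stat += (observed - expected) ** 2 / expected
    return stat


def genotypic_test(case_counts, control_counts):
    """2 x 3 test on (AA, AB, BB) counts, 2 degrees of freedom.

    For 2 df the chi-square survival function is exactly exp(-x / 2).
    """
    stat = chi_square_stat([list(case_counts), list(control_counts)])
    return stat, math.exp(-stat / 2.0)


def allelic_test(case_counts, control_counts):
    """2 x 2 test on allele counts, 1 degree of freedom.

    For 1 df the chi-square survival function is erfc(sqrt(x / 2)).
    """
    def alleles(counts):
        aa, ab, bb = counts
        return [2 * aa + ab, 2 * bb + ab]  # copies of allele A, allele B

    stat = chi_square_stat([alleles(case_counts), alleles(control_counts)])
    return stat, math.erfc(math.sqrt(stat / 2.0))


def discordance(calls_a, calls_b):
    """Fraction of paired genotype calls that differ between two runs."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    mismatches = sum(1 for a, b in zip(calls_a, calls_b) if a != b)
    return mismatches / len(calls_a)
```

Under these assumptions, a SNP would be flagged when the P value returned by either test falls below `P_THRESHOLD`, and `discordance` could be applied to the calls for the same samples obtained under two batch designs (for example combined versus separate) to express their disagreement as a fraction.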

Keywords: CRLMM; GWAS; WTCCC; CAD; robustness; accuracy