Follow-up of 1715 SNPs from the Wellcome Trust Case Control Consortium genome-wide association study in type I diabetes families

Abstract

The advent of genome-wide association (GWA) studies has revolutionized the detection of disease loci and provided abundant evidence for previously undetected disease loci that can be pooled together in meta-analysis studies or used to design follow-up studies. A total of 1715 SNPs from the Wellcome Trust Case Control Consortium GWA study of type I diabetes (T1D) were selected and a follow-up study was conducted in 1410 affected sib-pair families assembled by the Type I Diabetes Genetics Consortium. In addition to the support for previously identified loci (PTPN22/1p13; ERBB3/12q13; SH2B3/12q24; CLEC16A/16p13; UBASH3A/21q22), evidence supporting two new and distinct chromosome locations associated with T1D was observed: FHOD3/18q12 (rs2644261, P=5.9 × 10−4) and Xp22 (rs5979785, P=6.8 × 10−3; http://www.T1DBase.org). There was independent support for both SNPs in a GWA meta-analysis of 7514 cases and 9045 controls (P values=5.0 × 10−3 and 6.7 × 10−6, respectively). The chromosome 18q12 region contains four genes, none of which is an obvious functional candidate gene. In contrast, the Xp22 SNP is located 30 kb centromeric of TLR8 and TLR7. Both are functional candidate genes owing to their key roles as pathogen recognition receptors and, in the case of TLR7, because overexpression has been associated directly with murine autoimmune disease.

Introduction

The Wellcome Trust Case Control Consortium (WTCCC) genome-wide association (GWA) study,1 in addition to the detection of four new and confirmed2 type I diabetes (T1D) loci, provided some evidence for previously undetected (P>5 × 10−7) T1D loci with effect sizes in the range of odds ratios between 1.1 and 1.3. The evidence provided by such GWA studies for these previously undetected loci can be pooled together in meta-analysis studies3 or used to design follow-up studies. Both of these approaches provide the additional samples required to detect novel loci with robust statistical power. Here, the Type I Diabetes Genetics Consortium (T1DGC) reports on a follow-up of 1715 SNPs from the WTCCC GWA study in a maximum of 1410 affected sib-pair families assembled by the T1DGC, providing 2798 parent–child trios. This genotyping project also involved a second candidate gene study, which is described in an accompanying paper,4 and adds to the initial T1DGC candidate gene study as described by Howson et al.5 in this issue. In addition, our study used the recently published T1DGC GWA case/control meta-analysis6 to provide independent evidence to evaluate whether the family associations represent newly identified T1D loci.

Results

The 1715 SNPs selected from the WTCCC GWA study were genotyped at the Wellcome Trust Sanger Institute (WTSI) in Cambridge, UK and at the Broad Institute of Harvard/MIT in Cambridge, MA, USA. The majority of these SNPs (1536 of 1715) were genotyped at the WTSI. As the two genotyping centers used different calling algorithms applied to different family groups, the analysis was initiated by re-calling the data using the same algorithm on the same family groups. To improve SNP quality control (QC) of the genotype data from the Broad Institute, in addition to the 179 WTCCC follow-up SNPs, we included 589 SNPs from a second T1DGC study4 of known T1D and T2D loci and other autoimmune disease loci that had all been genotyped together. The results for the T1D, T2D and autoimmune disease SNPs are reported in Cooper et al.,4 this volume and in T1DBase.7

Sample QC filters were applied that identified 368 (4.9%) samples genotyped at the WTSI and 590 (6.3%) samples genotyped at the Broad Institute (Supplementary Table 1) with insufficient DNA quality and quantity. The proportion of samples with insufficient DNA was high, given the lenient QC metrics used for analysis. After sample exclusions, the genotypes were re-called and these data were used for further analysis. SNP QC filters were applied that resulted in the exclusion of 3.7% of SNPs from the WTSI and 14.2% of SNPs from the Broad Institute (Supplementary Table 2). The marked difference in the proportion of SNPs excluded was partly the result of the inclusion of the 589 non-WTCCC follow-up SNPs (T1D, T2D and autoimmune disease SNPs)4 with the 179 WTCCC follow-up SNPs from the Broad Institute. For these non-WTCCC SNPs, no minor allele frequency filter had been used in their selection. As a result, a total of 1.3% of SNPs from the WTSI and 7.9% of SNPs from the Broad Institute were rejected based on minor allele frequency <5% (Supplementary Table 2).

To further investigate the genotyping success rate in the Broad Institute data, concordance was estimated in Warren 1 Diabetes UK (DUK) families for 33 SNPs genotyped at the Broad Institute and at the University of Cambridge. The median concordance was 96%. This relatively low rate suggested that our re-calling the Broad Institute data using Illuminus8 may have caused the discordance.

Illuminus and BeadStudio9 genotyping calls, the latter provided by the Broad Institute, were compared for 24 SNPs (9 of the previous 33 SNPs did not have BeadStudio calls). The median concordance was 99% for Illuminus–BeadStudio calls and 96% for Illuminus–Cambridge calls, suggesting that the use of Illuminus was not the source of the discordance. As a result, 101 of the most discordant DUK samples, defined as having more than four errors in 24 SNPs, were removed. After this sample removal, the median concordance between Illuminus and Cambridge genotyping calls for 24 SNPs increased from 96 to 99%. Sample QC excluded 39 of these 101 discordant DUK samples.
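The concordance checks described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the data layout (a dictionary from sample ID to a list of genotype calls in a fixed SNP order), the function names and the toy sample IDs are all assumptions. Only the rule of flagging samples with more than four discordant calls comes from the text.

```python
from statistics import median

def per_snp_concordance(calls_a, calls_b):
    """Per-SNP fraction of shared samples whose two calls agree.

    calls_a, calls_b: dict mapping sample ID -> list of genotype calls
    (None = no call), with SNPs in the same order in both (assumed layout).
    """
    samples = sorted(calls_a.keys() & calls_b.keys())
    n_snps = len(calls_a[samples[0]]) if samples else 0
    rates = []
    for j in range(n_snps):
        pairs = [(calls_a[s][j], calls_b[s][j]) for s in samples]
        # Only samples called by both sources contribute to the rate.
        pairs = [(a, b) for a, b in pairs if a is not None and b is not None]
        if pairs:
            rates.append(sum(a == b for a, b in pairs) / len(pairs))
    return rates

def discordant_samples(calls_a, calls_b, max_errors=4):
    """Samples with more than `max_errors` discordant calls across the SNPs."""
    flagged = []
    for s in sorted(calls_a.keys() & calls_b.keys()):
        errors = sum(
            1 for a, b in zip(calls_a[s], calls_b[s])
            if a is not None and b is not None and a != b
        )
        if errors > max_errors:
            flagged.append(s)
    return flagged

# Toy data: three hypothetical samples typed at three SNPs by two sources.
illuminus = {"s1": ["AA", "AG", "GG"], "s2": ["AA", "AA", "AG"], "s3": ["AG", "GG", "GG"]}
cambridge = {"s1": ["AA", "AG", "GG"], "s2": ["AG", "AG", "GG"], "s3": ["AG", "GG", None]}

rates = per_snp_concordance(illuminus, cambridge)
median_rate = median(rates)
flagged = discordant_samples(illuminus, cambridge, max_errors=2)
```

In the toy data the threshold is lowered to two errors so that one sample is flagged. With 24 SNPs per sample, the default of four corresponds to the rule stated in the text.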

In addition to the concordance checks of the Broad Institute data, we visually inspected the 760 autosomal SNP signal intensity plots for each of the family groups (Supplementary Table 3). The initial analysis revealed that 11 of 32 SNPs with P<1 × 10−3 had poor genotype signal cloud clustering, despite passing other SNP QC measures (Supplementary Table 4). None of the 179 WTCCC follow-up SNPs genotyped at the Broad Institute had a P<1 × 10−3. In a similar strategy to that applied to the Broad Institute data, we visually inspected the 200 most associated SNPs genotyped at the WTSI. The analyses were repeated and eight distinct genomic locations with P<1 × 10−3 were identified, three of which had not been reported earlier (Table 1) and had good genotype clusters (Supplementary Figures 1a to 1c; T1DBase7).

Table 1 A summary of the T1D association results for four SNPs from distinct genomic locations that have not been reported previously

To further investigate the possible associations of these three new regions with T1D, the recent T1DGC meta-analysis,6 consisting of 7514 cases and 9045 controls, was used (Table 1). The meta-analysis provided P values for two of the three SNPs. The LOC646282/7q22 SNP had failed QC in the meta-analysis. Additional evidence supporting FHOD3/18q12 was found in the case–control meta-analysis (P=5.02 × 10−3; Table 1; Fisher's combined P=4.08 × 10−5). Despite none of the 32 SNPs on the X chromosome having a P value <1 × 10−3, the second most associated SNP on chromosome Xp22, rs5979785 (P=6.83 × 10−3), had additional evidence in the meta-analysis (P=6.72 × 10−6; Table 1; Fisher's combined P=8.23 × 10−7) and had good genotype clustering in all four family groups (Supplementary Figure 1d; T1DBase7).
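The combined P values quoted above follow Fisher's method, in which −2 Σ ln(p) over k independent P values is referred to a chi-square distribution with 2k degrees of freedom. The sketch below is an illustration using only the Python standard library. The function name is an assumption, and the code is not taken from the authors' analysis. For an even number of degrees of freedom the chi-square tail probability has a closed form, which the function uses.

```python
import math

def fisher_combined_p(p_values):
    """Fisher's combined probability for independent P values.

    X = -2 * sum(ln p) is chi-square with 2k df under the null. For even df,
    P(chi2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one P value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    half_x = -sum(math.log(p) for p in p_values)  # this is x/2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i      # (x/2)^i / i!
        total += term
    return math.exp(-half_x) * total

# Family and meta-analysis P values as rounded in the text.
p_fhod3 = fisher_combined_p([5.9e-4, 5.02e-3])
p_xp22 = fisher_combined_p([6.83e-3, 6.72e-6])
```

With the rounded inputs given in the text, the two results come out at roughly 4.1 × 10−5 and 8.2 × 10−7, close to the reported 4.08 × 10−5 and 8.23 × 10−7. The small differences are presumably due to rounding of the input P values.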

Discussion

In this analysis of data from the T1DGC, we report evidence for two novel T1D loci in the FHOD3/18q12 and the Xp22 regions, both of which, importantly, have additional evidence in the case/control meta-analysis.6 Nevertheless, confirmation of these two newly identified candidate T1D loci will require genotyping of additional samples. The 678-kb associated region on chromosome 18q12 contains four genes (FHOD3, KIAA1328, C18orf10 and BRUNOL4), none of which is an obvious functional candidate gene.7 In contrast, the SNP in chromosome Xp22 (rs5979785) is located 30 kb centromeric of the functional candidate genes TLR7 and TLR8.7 TLR7, in particular, has been associated with autoimmune lupus in mice. A duplication in this gene results in a twofold overexpression and accelerates lupus autoimmunity in mice.10 It is possible that SNP rs5979785, or variants in linkage disequilibrium with it, alters TLR7 and/or TLR8 expression and may be associated with risk of T1D.

Materials and methods

Subjects

The DNA samples were genotyped at the WTSI in Hinxton, UK (http://www.sanger.ac.uk/) and at the Broad Institute of Harvard/MIT in Cambridge, MA, USA (http://www.broad.mit.edu/) using the Illumina GoldenGate technology. The samples were assembled by the T1DGC and consisted of affected sib-pair families of two parents and two affected offspring. The families were obtained from nine collections: DUK, Human Biological Data Interchange (HBDI), T1DGC Asia Pacific (AP) Network, T1DGC European (EUR) Network, T1DGC United Kingdom (UK) Network, T1DGC North America (NA) Network, Joslin (JOS) Diabetes Center, Sardinia (SAR) and Denmark (DAN). The AP, EUR, NA and UK collections were newly recruited by the T1DGC, whereas the remainder were part of established collections. In the WTSI data, 1813 families had at least one member who passed sample QC, and 1069 families provided 2082 parent–child trios. In the Broad Institute data, 2074 families had at least one member who passed sample QC, and 1410 families provided 2798 parent–child trios (Supplementary Tables 5a and 5b).

SNP selection

WTCCC SNPs were selected for follow-up starting from the 4157 SNPs with a 1-df (degrees of freedom) P≤0.01 and a minor allele frequency ≥5%. SNPs were removed if they exhibited an r2≥0.8 with a more significant SNP. This process resulted in a panel of 2534 SNPs. Eleven SNPs were removed as they were located in genes already being examined by the T1DGC, resulting in 2523 SNPs. These SNPs were then validated for the Illumina GoldenGate platform. If a SNP failed Illumina design, a replacement SNP was obtained. A total of 2346 SNPs were validated (assay design score >0.6 and without error codes). The top-ranked 1536 SNPs were genotyped at the WTSI and the next 179 SNPs were genotyped at the Broad Institute.
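The linkage disequilibrium pruning step in this selection can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure actually used: a SNP is dropped if its r2 with any more significant SNP reaches the threshold, whether or not that other SNP was itself retained. The comparison operator, the pairwise r2 lookup table and the SNP names are hypothetical.

```python
def prune_by_ld(p_values, r2, r2_threshold=0.8):
    """Keep SNPs not in strong LD with any more significant SNP.

    p_values: dict SNP -> association P value.
    r2: dict frozenset({snp1, snp2}) -> pairwise r2 (missing pairs count as 0).
    Assumption: comparison is against every more significant SNP, retained or not.
    """
    ranked = sorted(p_values, key=p_values.get)  # most significant first
    kept = []
    for i, snp in enumerate(ranked):
        in_ld = any(
            r2.get(frozenset((snp, other)), 0.0) >= r2_threshold
            for other in ranked[:i]
        )
        if not in_ld:
            kept.append(snp)
    return kept

# Hypothetical SNPs and r2 values, for illustration only.
p_values = {"snp_a": 1e-5, "snp_b": 1e-4, "snp_c": 1e-3, "snp_d": 5e-3}
r2 = {
    frozenset(("snp_a", "snp_b")): 0.90,
    frozenset(("snp_b", "snp_c")): 0.85,
    frozenset(("snp_a", "snp_c")): 0.10,
}
kept = prune_by_ld(p_values, r2)
```

A greedy variant that compares each SNP only against those already retained would keep snp_c in this example. The text does not say which reading was intended.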

Re-calling genotypes

The 1715 WTCCC follow-up SNPs were genotyped at the WTSI and Broad Institute using the Illumina GoldenGate platform. The WTSI called genotypes using Illuminus8 and clustered by study group, whereas the Broad Institute used BeadStudio9 and clustered by genomic or whole genome amplified DNA. As the two genotyping centers used different calling algorithms on different family groups, the data were re-called using Illuminus on the same family groups. The families were divided into four groups based on previous experience:11 DUK; HBDI; New T1DGC family collections (NEW; families from AP, EUR, UK and NA Networks); and whole genome amplified (families from JOS, SAR and DAN). Individual genotyping calls were accepted only if the posterior probability of the best call exceeded a 0.9 threshold. The QC of the Broad Institute data is based on the 179 WTCCC follow-up SNPs and the 589 SNPs from known T1D and T2D loci, and other autoimmune disease genes (see Cooper et al.,4 this volume).
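The acceptance rule for individual calls (posterior probability of the best call above 0.9) can be written as a short function. This sketch is not the Illuminus implementation. The three-genotype layout of the posterior vector and the function name are assumptions made for illustration.

```python
GENOTYPES = ("AA", "AB", "BB")

def accept_call(posteriors, threshold=0.9):
    """Return the best genotype call, or None if its posterior is too low.

    posteriors: probabilities for (AA, AB, BB), an assumed layout.
    """
    best = max(range(len(GENOTYPES)), key=lambda i: posteriors[i])
    # The call is accepted only if the best posterior exceeds the threshold.
    return GENOTYPES[best] if posteriors[best] > threshold else None

confident = accept_call((0.97, 0.02, 0.01))   # clear cluster membership
ambiguous = accept_call((0.60, 0.39, 0.01))   # between two clusters
```

Calls returned as None would count as missing and therefore lower the sample and SNP call rates used in the QC steps.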

Sample QC

Samples known to have non-European ancestry were excluded. Sample call rate and heterozygosity of autosomal SNPs were used to exclude samples with insufficient quality and quantity of DNA. Samples without family information and duplicate samples, identified through genotype and family information, were also excluded (Supplementary Figures 2a and 2b). In addition, sample misinheritance was used to gauge any remaining genotyping failures (Supplementary Figures 3a and 3b). After making sample exclusions (Supplementary Table 1), the genotyping data were re-called.

Genotyping QC

SNPs were excluded if the minor allele frequency fell below 5% in unaffected parents, deviation from Hardy–Weinberg equilibrium exceeded a 1-df χ2 test statistic of 25 in unaffected parents or SNP call rate was less than 95% (Supplementary Table 2). In addition to these standard QC checks, for SNPs genotyped at the Broad Institute, SNP genotype concordance was compared with University of Cambridge genotyping. We visually inspected the genotype signal intensity cluster plots for the 760 autosomal SNPs genotyped at the Broad Institute, the 200 most associated autosomal SNPs genotyped at the WTSI and X chromosome SNPs genotyped at both centers. Genotype signal intensity cluster plots are available in T1DBase.7
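The three standard SNP filters can be sketched from genotype counts. The thresholds are those stated above. The function name, the input layout (counts of the three genotypes plus missing calls, taken here to be from unaffected parents) and the example counts are assumptions, and the Hardy–Weinberg statistic is the ordinary 1-df Pearson chi-square rather than necessarily the exact test the authors ran.

```python
def snp_qc(n_aa, n_ab, n_bb, n_missing,
           maf_min=0.05, hwe_chi2_max=25.0, call_rate_min=0.95):
    """Return the list of QC filters a SNP fails (empty list = passes)."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return ["no calls"]
    reasons = []

    # Call rate: proportion of samples with a genotype call.
    if n_called / (n_called + n_missing) < call_rate_min:
        reasons.append("call rate")

    # Minor allele frequency from allele counts.
    p = (2 * n_aa + n_ab) / (2 * n_called)
    if min(p, 1 - p) < maf_min:
        reasons.append("MAF")

    # Pearson chi-square against Hardy-Weinberg expected genotype counts.
    expected = (n_called * p * p,
                2 * n_called * p * (1 - p),
                n_called * (1 - p) ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    if chi2 > hwe_chi2_max:
        reasons.append("HWE")
    return reasons

# Hypothetical genotype counts, for illustration only.
ok_snp = snp_qc(250, 500, 250, 0)        # in HWE, common, fully called
no_hets = snp_qc(500, 0, 500, 0)         # no heterozygotes at all
rare_snp = snp_qc(960, 39, 1, 0)         # minor allele frequency near 2%
low_call = snp_qc(225, 450, 225, 100)    # 90% call rate
```

A complete absence of heterozygotes, as in the second example, is the kind of pattern that poor cluster separation can produce, which is why the Hardy–Weinberg filter complements visual inspection of the cluster plots.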

Statistics

All analyses were carried out in the R statistical environment using the snpMatrix package from the Bioconductor project.12 The family groups were analyzed using the transmission/disequilibrium test, configured as a score test. The scores and their variances were summed over family groups and genotyping centers to pool information.
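The pooling of scores and variances can be illustrated with a simplified transmission/disequilibrium test. This is a sketch in Python, not the snpMatrix code. It assumes that each transmission from a heterozygous parent is an independent Bernoulli(1/2) trial under the null, which ignores any special handling of affected sib pairs, and the stratum counts are hypothetical. For T transmissions of the test allele out of n informative transmissions, the score is U = T − n/2 with variance V = n/4, and the pooled statistic (ΣU)²/ΣV is referred to a 1-df chi-square distribution.

```python
import math

def tdt_score(transmitted, untransmitted):
    """Score and variance for one stratum (family group or genotyping center).

    transmitted / untransmitted: counts of the test allele passed or not
    passed from heterozygous parents to affected offspring.
    """
    n = transmitted + untransmitted
    return transmitted - n / 2, n / 4

def pooled_tdt(strata):
    """Sum scores and variances over strata; return (chi-square, P value)."""
    u_total = 0.0
    v_total = 0.0
    for transmitted, untransmitted in strata:
        u, v = tdt_score(transmitted, untransmitted)
        u_total += u
        v_total += v
    if v_total == 0:
        raise ValueError("no informative transmissions")
    chi2 = u_total ** 2 / v_total
    # Upper tail of a 1-df chi-square: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Two hypothetical strata: (transmitted, untransmitted) counts.
chi2, p_value = pooled_tdt([(60, 40), (55, 45)])
```

Within a single stratum the statistic reduces to the familiar (b − c)²/(b + c). Summing U and V before squaring lets strata with opposite directions of effect cancel, as they should in a pooled test.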

Conflict of interest

The authors declare no conflict of interest.

References

  1. Wellcome Trust Case Control Consortium. Genome-wide association study of 14 000 cases of seven common diseases and 3000 shared controls. Nature 2007; 447: 661–678.

  2. Todd JA, Walker NM, Cooper JD, Smyth DJ, Downes K, Plagnol V et al. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nat Genet 2007; 39: 857–864.

  3. Cooper JD, Smyth DJ, Smiles AM, Plagnol V, Walker NM, Allen JE et al. Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci. Nat Genet 2008; 40: 1399–1401.

  4. Cooper JD, Walker NM, Healy BC, Smyth DJ, Downes K, Todd JA and the Type I Diabetes Genetics Consortium. Analysis of 55 autoimmune disease and type II diabetes loci: further confirmation of chromosomes 4q27, 12q13.2 and 12q24.13 as type I diabetes loci, and support for a new locus, 12q13.3-q14.1. Genes Immun 2009; 10 (Suppl 1): S95–S120.

  5. Howson JMM, Walker NM, Smyth DJ, Todd JA and the Type I Diabetes Genetics Consortium. Analysis of 19 genes for association with type I diabetes in the Type I Diabetes Genetics Consortium families. Genes Immun 2009; 10 (Suppl 1): S74–S84.

  6. Barrett JC, Clayton DG, Concannon P, Akolkar B, Cooper JD, Erlich HA et al. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet 2009; 41: 703–707.

  7. http://www.T1DBase.org.

  8. Teo YY, Inouye M, Small KS, Gwilliam R, Deloukas P, Kwiatkowski DP et al. A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics 2007; 23: 2741–2746.

  9. http://www.illumina.com.

  10. Deane JA, Pisitkun P, Barrett RS, Feigenbaum L, Town T, Ward JM et al. Control of toll-like receptor 7 expression is essential to restrict autoimmunity and dendritic cell proliferation. Immunity 2007; 27: 801–810.

  11. Howson JM, Walker NM, Clayton D, Todd JA. Confirmation of HLA class II independent type 1 diabetes associations in the major histocompatibility complex including HLA-B and HLA-A. Diabetes Obes Metab 2009; 11 (Suppl 1): 31–45.

  12. Clayton D, Leung HT. An R package for analysis of whole-genome association studies. Hum Hered 2007; 64: 45–51.

Acknowledgements

The Type I Diabetes Genetics Consortium is a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the National Institute of Allergy and Infectious Diseases (NIAID), the National Human Genome Research Institute (NHGRI), the National Institute of Child Health and Human Development (NICHD) and the Juvenile Diabetes Research Foundation International (JDRF), and supported by U01 DK062418. JDC, NMW, DJS, KD, BCH and JAT are funded by the Juvenile Diabetes Research Foundation International, the Wellcome Trust and the National Institute for Health Research Cambridge Biomedical Centre. The Cambridge Institute for Medical Research is in receipt of a Wellcome Trust Strategic Award (079895). Genotyping was performed at the Broad Institute Center for Genotyping and Analysis, which is supported by grant U54 RR020278 from the National Center for Research Resources.

Author information

Corresponding author

Correspondence to J D Cooper.

About this article

Cite this article

Cooper, J., Walker, N., Smyth, D. et al. Follow-up of 1715 SNPs from the Wellcome Trust Case Control Consortium genome-wide association study in type I diabetes families. Genes Immun 10, S85–S94 (2009). https://doi.org/10.1038/gene.2009.97

Keywords

  • genome-wide association
  • type I diabetes
  • follow-up study
  • T1DGC
