Crohn disease and ulcerative colitis are two subphenotypes of inflammatory bowel disease (IBD), a complex disorder resulting from gene-environment interaction. We refined our previously defined linkage region for IBD on chromosome 10q23 and used positional cloning to identify genetic variants in DLG5 associated with IBD. DLG5 encodes a scaffolding protein involved in the maintenance of epithelial integrity. We identified two distinct haplotypes with a replicable distortion in transmission (P = 0.000023 and P = 0.004 for association with IBD, P = 0.00012 and P = 0.04 for association with Crohn disease). One of the risk-associated DLG5 haplotypes is distinguished from the common haplotype by a nonsynonymous single-nucleotide polymorphism 113G→A, resulting in the amino acid substitution R30Q in the DUF622 domain of DLG5. This mutation probably impedes scaffolding of DLG5. We stratified the study sample according to the presence of risk-associated CARD15 variants to study potential gene-gene interaction. We found a significant difference in association of the 113A DLG5 variant with Crohn disease in affected individuals carrying the risk-associated CARD15 alleles versus those carrying non-risk-associated CARD15 alleles. This is suggestive of a complex pattern of gene-gene interaction between DLG5 and CARD15, reflecting the complex nature of polygenic diseases. Further functional studies will evaluate the biological significance of DLG5 variants.
IBD is a spectrum of chronic relapsing inflammatory disorders affecting the gastrointestinal tract that can be classified into Crohn disease and ulcerative colitis. The identification of CARD15 (refs. 1–3) and several loci associated with susceptibility to IBD in independent linkage studies (ref. 4) documents the polygenic etiology of IBD. We previously identified a locus in the pericentromeric region of chromosome 10 that was associated with susceptibility to IBD in a genome-wide linkage scan involving 282 families of European descent (ref. 5). Fine mapping at an average distance of 5 cM using an additional 11 microsatellite markers in an extended linkage cohort (111 additional families including 422 affected sibling pairs) confirmed initial linkage findings and identified a two-peak linkage curve extending from D10S547 to D10S192 (multipoint lod score = 2.07, P = 0.0033 at D10S548) and a second peak at D10S201 (multipoint lod score = 1.6) for association with Crohn disease (Fig. 1). We used a hierarchical linkage disequilibrium (LD) study to search for the causal variant(s) in the 40-cM interval (Fig. 1). Transmission disequilibrium testing (TDT) of trios randomly drawn from each family showed a significant single-point association with Crohn disease at D10S201 (P < 0.01), located in the second linkage peak on 10q22–10q23. We finely mapped the underlying 5-Mb region at an average distance of 75–120 kb using 37 single-nucleotide polymorphisms (SNPs) selected from the TSC allele frequency project in 457 independent trios with IBD (Supplementary Table 1 online). The marker TSC0376484 (rs1344966) in this panel was significantly associated with Crohn disease (χ² = 9.00, P = 0.002) and more strongly with IBD (Crohn disease and ulcerative colitis, χ² = 11.65, P = 0.0006; Fig. 1).
TSC0376484 is located near two genes of possible (patho)physiological relevance to chronic intestinal inflammation: KCNMA1, encoding a calcium-activated potassium channel (ref. 6), and DLG5, a member of the membrane-associated guanylate kinase gene family, which is important in the maintenance of epithelial cell integrity (ref. 7). To narrow the association signal to a single candidate gene, we used LD mapping and genotyped selected publicly available SNPs from each gene in the 457 trios with IBD. The association signal was confined to DLG5, and none of the markers in KCNMA1 was significantly associated with IBD (Supplementary Table 2 online).
We sequenced coding exons 2–32 and the exon-intron boundaries of DLG5 in 47 individuals with IBD and identified or verified 33 SNPs (Supplementary Tables 3 and 4 online). We then tested all these SNPs for disease association. Association with the IBD phenotype was strongest (Table 1), with 18 markers in DLG5 showing significant association. Separate analysis of the Crohn disease and ulcerative colitis subgroups showed a strong association in the Crohn disease subgroup, which is in accordance with the original linkage observation (ref. 5). The weaker signal in the ulcerative colitis subgroup may be due to reduced power in a smaller sample. Because the combined group had the strongest association, we suggest that the signal in DLG5 reflects a factor associated with general susceptibility to IBD rather than to Crohn disease only.
Pairwise LD measures (D′) indicated strong LD across the entire gene, defining a single haplotype block of ∼85 kb with D′ values >0.8, except at TSC0000361 (located on the neighboring LD segment). We observed a sharp decline in LD at the boundaries of the haplotype block, differentiating DLG5 from the neighboring genes KCNMA1 and RPC155 (Fig. 2). Analysis of the extended DLG5 haplotype identified four common haplotypes (Fig. 3), with haplotype A tagged by eight SNPs (haplotype-tagging SNPs or htSNPs) of equivalent genetic information content. Haplotype A was significantly undertransmitted to individuals with IBD and Crohn disease, whereas haplotype D, uniquely tagged by the coding variant 113A, was significantly overtransmitted to individuals with both IBD (χ² = 8.08, P = 0.004) and Crohn disease (χ² = 4.15, P = 0.04; Fig. 3).
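For readers unfamiliar with the D′ statistic used above, the following is a minimal Python sketch of Lewontin's normalized LD coefficient for two biallelic markers. It is not the software used in the study; the function name and the haplotype frequencies in the usage lines are hypothetical and chosen only for illustration.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Return |D'| for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    # Raw disequilibrium: observed haplotype frequency minus its
    # expectation under independence of the two loci.
    d = p_ab - p_a * p_b

    # Largest |D| attainable given the allele frequencies; the bound
    # depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    if d_max == 0:
        return 0.0  # at least one locus is monomorphic; D' is undefined
    return abs(d) / d_max


# Hypothetical frequencies, for illustration only.
complete_ld = d_prime(p_ab=0.30, p_a=0.30, p_b=0.30)  # A always travels with B
no_ld = d_prime(p_ab=0.09, p_a=0.30, p_b=0.30)        # haplotype at its expected frequency
print(round(complete_ld, 3), round(no_ld, 3))
```

In practice the haplotype frequency p_ab has to be estimated from unphased genotypes or inferred from trio data, which this sketch does not attempt.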
To corroborate our initial association finding, we genotyped the DLG5 htSNPs in an independent sample consisting of trios with IBD who had not yet been analyzed (n = 485; Supplementary Table 1 online). The htSNP DLG5_e26 in haplotype A was undertransmitted to the individuals with IBD (transmitted:untransmitted (T:U) ratio of 165:214), replicating the observed association (P = 0.006 in a one-tailed test), and rs1058198 had a T:U ratio of 196:237 (P = 0.024). 113A was overtransmitted in both IBD (T:U 90:73, P = 0.09) and Crohn disease (T:U 58:43, P = 0.065) but the distortion was not statistically significant. This can be explained by the smaller proportion of trios with Crohn disease in our replication sample and the reduced power in replication situations. We therefore tested the associated markers in a second independent sample (538 Crohn disease cases and 548 controls) using the case-control study design to estimate the attributable risk in a diverse population of European descent. The 113A variant was significantly associated with the IBD phenotype (P = 0.001, odds ratio (OR) = 1.62) as was rs2289310 (4136C→A, resulting in the amino acid substitution P1371Q; P = 0.01, OR = 1.51). DLG5_e26, tagging haplotype A, was significantly associated with IBD (Table 2), providing a second independent replication of association. The combined P values for the repeated, independent associations with IBD (n = 2) were 0.029 for 113A and 0.0007 for DLG5_e26, and those for the repeated, independent associations with Crohn disease (n = 3) were P = 0.001 for 113A and P = 0.0004 for DLG5_e26.
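The one-tailed P values quoted above for the replication trios are consistent with the standard McNemar-type TDT statistic, (T − U)²/(T + U) on one degree of freedom. The sketch below is an illustration under that assumption, not the authors' GENEHUNTER analysis; the function name is ours, and halving the two-sided P value is appropriate only when the observed distortion lies in the direction specified in advance.

```python
import math


def tdt(transmitted: int, untransmitted: int):
    """McNemar-type TDT from transmission counts of heterozygous parents.

    Returns (chi2, two_sided_p, one_sided_p) with chi2 on 1 degree of freedom.
    """
    t, u = transmitted, untransmitted
    chi2 = (t - u) ** 2 / (t + u)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    two_sided = math.erfc(math.sqrt(chi2 / 2))
    # Valid only if the distortion is in the pre-specified direction.
    one_sided = two_sided / 2
    return chi2, two_sided, one_sided


# T:U ratios as quoted in the text for the replication sample.
ratios = {
    "DLG5_e26, IBD": (165, 214),
    "rs1058198, IBD": (196, 237),
    "113A, IBD": (90, 73),
    "113A, Crohn disease": (58, 43),
}
for label, (t, u) in ratios.items():
    chi2, p2, p1 = tdt(t, u)
    print(f"{label}: chi2 = {chi2:.2f}, one-tailed P = {p1:.4f}")
```

Under these assumptions the first three T:U ratios give one-tailed P values that round to the reported 0.006, 0.024 and 0.09. The Crohn disease ratio of 58:43 gives roughly 0.068, close to but not identical with the reported 0.065, so a slightly different test may have been used for that comparison.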
Because the 4136A risk allele is not included on the common haplotypes carrying 113A, but instead on a rare haplotype (frequency <1%), we calculated the global differences in genotype combinations for 113A and 4136A to estimate the risk for homozygosity or compound heterozygosity. This analysis identified a significant difference in genotype frequencies (global χ² = 13.61, P = 0.0029) in individuals with IBD compared with healthy controls. The OR was 1.74 (95% confidence interval = 1.31–2.32) for individuals carrying at least two risk alleles (113A and/or 4136A), suggesting that the overall clinical impact of rare single coding mutations such as 4136A on the IBD phenotype is limited. Our disease model that links 113A and 4136A to the positional signal detected in DLG5 is further supported by the identification of rare, coding, 'private' variants (resulting in the amino acid substitutions S121G, E514Q, R957H and P979L; frequency <0.5%) through systematic sequence analysis in 47 individuals.
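An odds ratio with a 95% confidence interval of the kind reported above can be obtained from a 2 × 2 table of carriers and non-carriers among cases and controls. The sketch below uses Woolf's logit interval, which is an assumption on our part because the interval method is not stated here; the function name is ours and the counts are hypothetical placeholders, not study data.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf (logit) confidence interval for a 2 x 2 table.

    a = cases carrying the risk genotype      b = cases not carrying it
    c = controls carrying the risk genotype   d = controls not carrying it
    z = normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (NOT taken from the study).
or_, lo, hi = odds_ratio_ci(a=120, b=400, c=80, d=460)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The interval is symmetric on the logarithmic scale, which is why reported intervals such as 1.31–2.32 look asymmetric around the point estimate on the natural scale.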
We were interested in the hypothetical impact of the associated variants, R30Q and P1371Q, on the function of the DLG5 protein. DLG5 has been implicated in regulating cell growth and maintaining cell shape and polarity (ref. 8). A recent study (ref. 9) suggested an epithelial function for DLG5 as a binding partner of vinexin at sites of cell-cell contact, and our preliminary results on expression of DLG5 mRNA in a variety of tissues confirm the presence of the transcript in the colon, the intestine and isolated intestinal epithelial cells (Supplementary Fig. 1 online). It is therefore conceivable that DLG5 has a role in maintaining epithelial structure and that genetic variants in DLG5 interfere with epithelial barrier function in the colon.
DLG5 contains one DUF622 domain, four PDZ domains and one SH3 domain followed by one guanylate kinase domain (Fig. 1; refs. 10,11). All these domains are assumed to be involved in protein-protein interactions, supporting the idea that DLG5 is a multifunctional adapter and scaffold protein. We carried out in silico analysis of the potential structural and functional implications of the variants R30Q and P1371Q (Supplementary Fig. 1 online). The results of this analysis suggested that both variants probably impair the scaffolding functions of DLG5 (Supplementary Methods and Supplementary Fig. 1 online).
Finally, we examined potential locus-locus interactions between variants of DLG5 and variants of CARD15, the first susceptibility gene identified for Crohn disease (refs. 1–3). Genetic susceptibility to Crohn disease is mainly conferred by three polymorphisms that induce structural changes in the leucine-rich repeats of CARD15 (R702W, G908R and 3020insC). The allele frequencies of these SNPs range from ∼4% to 14% in the cohorts with Crohn disease examined to date, and ∼30–40% of individuals with Crohn disease are heterozygous for at least one of the variants, compared with ∼10% of control subjects (refs. 1–3). We examined interactions between DLG5 and CARD15 by stratifying trios into two groups according to the genotype of the affected child. In trios with IBD and Crohn disease, haplotype A (represented by DLG5_e26) was undertransmitted in the groups carrying both the risk-associated and non-risk-associated variants of CARD15 (Table 3), which suggests that haplotype A reflects genetic variation that acts independently of CARD15 variants. In trios with Crohn disease, we observed significantly greater transmission of 113A in individuals carrying the risk-associated versus non-risk-associated variants of CARD15 (Table 3). This suggests that the 113A variant is of particular relevance in individuals with Crohn disease and, further, that an interaction may exist between the risk-associated haplotype of DLG5 and the risk-associated variants of CARD15.
We found replicated association between genetic variations in DLG5 and the risk of developing IBD. The risk-associated DLG5 haplotype D is uniquely distinguished by the 113A variant and is suggested to be causative, as are rare, private SNPs. The conferred risk is moderate, which is in agreement with a polygenic disease model. Genetic interaction studies suggest interactions between CARD15 variants and 113A in DLG5, but these studies are not yet conclusive and will require large, consolidated efforts by several groups to achieve appropriate statistical power. Future studies in diverse and very large samples are needed to evaluate the population relevance of variants in DLG5 in this chromosomal region. Functional studies need to define the molecular properties of DLG5 variants and their influence on the clinical presentation of IBD.
Individuals with IBD were recruited by the clinical group through the Charité University Hospital (Berlin, Germany) and at the Department of Internal Medicine I, University Hospital Kiel, Germany. Diagnosis of IBD and subsequent classification into Crohn disease or ulcerative colitis were determined by standard diagnostic criteria (refs. 12,13) and have been described previously (refs. 3,5,13). All individuals were of European descent. We carried out LD mapping in trios consisting of father, mother and child affected with IBD, in which one parent or neither parent was affected with Crohn disease or ulcerative colitis. These trios were identified for LD mapping and have been described (ref. 3). For our confirmatory cohort, we extracted trios randomly from the multicase families used in our previous linkage studies (ref. 5) and supplemented this group with 92 additional trios recruited for this purpose. For case-control association, we compared 538 additional, independent individuals with IBD (singletons) with age- and sex-matched volunteers from the Kiel University blood donation program. All study participants gave informed, written consent. The recruitment protocols and study procedures were approved by the ethics committees of the Charité University Hospital, Berlin, Germany, and the Schleswig-Holstein University Hospital, Campus Kiel, Germany, respectively.
In the first stage of microsatellite LD mapping, we genotyped 11 microsatellite markers (D10S547, D10S548, D10S211, D10S611, D10S213, D10S1780, D10S220, D10S1790, D10S609, D10S201 and D10S2470) in 393 families with IBD (422 affected sibling pairs).
Information on primer sequence, allele size range, suggested amplification conditions and genetic position can be obtained from the Genethon and Marshfield databases (see URLs). Genotypes were generated at the University of California Los Angeles using PCR and fluorescence-labeled primers on an ABI 377 sequencer.
SNP discovery in DLG5.
To identify all crucial SNPs in the coding sequence of DLG5, as well as exon-intron boundaries and the promoter region, we sequenced 47 individuals with IBD using an ABI 3700 automated sequencer as previously described (ref. 3). The primers and probes for 33 SNPs discovered or verified by resequencing, and the sequences of the new SNPs, are given in Supplementary Tables 3 and 4 online.
We selected SNP markers for the initial fine mapping experiment based on information available from the public databases. For analysis of DLG5, we used SNPs generated or verified in-house. We generated SNP genotypes using the TaqMan allelic discrimination method as previously described (ref. 3). TaqMan assays were from ABI.
We tested each marker for Hardy-Weinberg equilibrium in the control populations using a χ² test and then carried out genetic analyses at several levels. To confirm the association with Crohn disease, we first subjected each marker to single-locus tests for linkage and transmission disequilibrium testing (TDT) analysis followed by haplotype analysis as implemented in GENEHUNTER (version 2.1; ref. 14). To assess significance of the TDT results for each marker, we did permutation tests using the same genotype data, as described previously (refs. 15,16). In 100,000 permutations of the entire data set of 28 analyzed markers for DLG5, a single χ² value greater than 9.91 occurred 4,635 times (empirical P = 0.046), and 874 simulations had two markers with a χ² value greater than 14.5 (empirical P = 0.0087). We calculated pairwise LD between each marker pair and between haplotype blocks as described (refs. 15,16). For case-control analysis, we calculated χ² values using Fisher's exact test; we calculated genotype-based ORs using Fisher's contingency tables and tested association similarly. We calculated combined P values for determining the overall significance of the observed independent association findings as outlined (ref. 17).
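Two of the calculations in this paragraph are simple enough to sketch in Python. The empirical P value is the fraction of permutations whose statistic reaches the threshold, and reading the permutation count as 10⁵ (100,000) reproduces the reported values. For the combined P values we assume that the cited Fisher reference refers to his combined probability test, in which −2 Σ ln p over k independent P values follows a χ² distribution with 2k degrees of freedom under the null hypothesis. The function names are ours, and the P values passed to the combining function are hypothetical because the exact inputs combined by the authors are not listed here.

```python
import math


def empirical_p(n_exceeding: int, n_permutations: int) -> float:
    """Fraction of permuted data sets at least as extreme as the threshold."""
    return n_exceeding / n_permutations


def fisher_combined_p(p_values) -> float:
    """Fisher's combined probability test for independent P values.

    The statistic X = -2 * sum(ln p) is chi-square with 2k df under the null.
    For even df the survival function has a closed form:
    P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Counts quoted in the text, with 10**5 permutations assumed.
print(empirical_p(4_635, 100_000))  # single marker with chi2 > 9.91
print(empirical_p(874, 100_000))    # two markers with chi2 > 14.5

# Hypothetical inputs, for illustration only.
print(fisher_combined_p([0.05, 0.05]))
```

Some authors add one to both numerator and denominator of the empirical P value to avoid reporting zero; the figures quoted in the text correspond to the plain ratio used here.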
Exon 1 identification.
Because a BLAST analysis of the sequence from exon 1 as described (ref. 10) showed this sequence to be derived from human mitochondrial DNA, we concluded that this sequence probably arose as an artefact of RACE amplification. We used sequence from exon 2 instead to identify expressed-sequence tags from porcine and bovine genomes containing unique 5′ sequences (EMBL IDs BI402246, BM484383 and BI847653). These have high similarity and could be identified within the human contig containing DLG5. This new exon of at least 300 nucleotides in the 5′ untranslated region is located ∼57 kb upstream of exon 2 of DLG5.
The Marshfield database is available at http://research.marshfieldclinic.org/genetics. The Genethon database is available at http://www.genethon.fr. The National Center for Biotechnology Information's SNP database is available at http://www.ncbi.nlm.nih.gov/SNP. The SNP Consortium website is available at http://snp.cshl.org. The National Genome Research Network is available at http://www.ngfn.de.
Note: Supplementary information is available on the Nature Genetics website.
1. Hugot, J.P. et al. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn's disease. Nature 411, 599–603 (2001).
2. Ogura, Y. et al. A frameshift mutation in NOD2 associated with susceptibility to Crohn's disease. Nature 411, 603–606 (2001).
3. Hampe, J. et al. Association between insertion mutation in NOD2 gene and Crohn's disease in German and British populations. Lancet 357, 1925–1928 (2001).
4. Bonen, D.K. & Cho, J.H. The genetics of inflammatory bowel disease. Gastroenterology 124, 521–536 (2003).
5. Hampe, J. et al. A genomewide analysis provides evidence for novel linkages in inflammatory bowel disease in a large European cohort. Am. J. Hum. Genet. 64, 808–816 (1999).
6. Lu, G. et al. Inflammatory modulation of calcium-activated potassium channels in canine colonic circular smooth muscle cells. Gastroenterology 116, 884–892 (1999).
7. Nakamura, H. et al. Identification of a novel human homolog of the Drosophila dlg, P-dlg, specifically expressed in the gland tissues and interacting with p55. FEBS Lett. 433, 63–67 (1998).
8. Humbert, P., Russell, S. & Richardson, H. Dlg, Scribble and Lgl in cell polarity, cell proliferation and cancer. Bioessays 25, 542–553 (2003).
9. Wakabayashi, M. et al. Interaction of lp-dlg/KIAA0583, a membrane-associated guanylate kinase family protein, with vinexin and beta-catenin at sites of cell-cell contact. J. Biol. Chem. 278, 21709–21714 (2003).
10. Shah, G. et al. The cloning, genomic organization and tissue expression profile of the human DLG5 gene. BMC Genomics 3, 6 (2002).
11. Purmonen, S. et al. HDLG5/KIAA0583, encoding a MAGUK-family protein, is a primary progesterone target gene in breast cancer cells. Int. J. Cancer 102, 1–6 (2002).
12. Lennard-Jones, J.E. Classification of inflammatory bowel disease. Scand. J. Gastroenterol. 170 (Suppl), 2–6 (1989).
13. Truelove, S.C. & Pena, A.S. Course and prognosis of Crohn's disease. Gut 17, 192–201 (1976).
14. Markianos, K., Daly, M.J. & Kruglyak, L. Efficient multipoint linkage analysis through reduction of inheritance space. Am. J. Hum. Genet. 68, 963–977 (2001).
15. Daly, M.J. et al. High-resolution haplotype structure in the human genome. Nat. Genet. 29, 229–232 (2001).
16. Rioux, J.D. et al. Genetic variation in the 5q31 cytokine gene cluster confers susceptibility to Crohn disease. Nat. Genet. 29, 223–228 (2001).
17. Fisher, R.A. Statistical Methods for Research Workers 10th edn. (Oliver and Boyd, London, 1946).
We thank all cooperating clinical centers, clinicians, families and individuals with IBD and the German Crohn's and Colitis Foundation (DCCV) for their participation; J. Papp for microsatellite genotyping; A. Andersson, B. Petersen, A. Dellsén, T. Engler, M. van Giezen, Å. Jägervall, T. Kim, N. Tepe and T. Wesse for technical assistance; M. Will and T. Lu for bioinformatics support; F. Friedrichs for assistance in statistical analysis; O. Bengtsson and K. Forsman-Semb for discussions; and M. J. Daly for the Haploview software. This study was supported by the 5th Framework Program of the European Commission and the Federal Ministry of Science and Education through the National Genome Research Network and the Competence Network “Inflammatory Bowel Disease”, and by a coordinated research group of the German Research Foundation (DFGFOR423).
B.C., B.M., S.P. and M.L.-F. work for AstraZeneca, a pharmaceutical company, and G.H.W. and D.S. work for Conaris Research Institute AG. These authors have indirect interests in the intellectual property generated here.
Stoll, M., Corneliussen, B., Costello, C. et al. Genetic variation in DLG5 is associated with inflammatory bowel disease. Nat Genet 36, 476–480 (2004). https://doi.org/10.1038/ng1345