Introduction

Hirschsprung’s disease (HSCR, OMIM 142623) represents the main genetic cause of functional intestinal obstruction, with an incidence of 1/5000 live births. This developmental disorder is a neurocristopathy and is characterized by the absence of the intestinal ganglion cells of the nerve plexuses in variable lengths of the digestive tract. The disease usually presents in infancy, although some patients present with persistent, severe constipation later in life. Symptoms in infants include difficult bowel movements, poor feeding, poor weight gain, and progressive abdominal distention (Chakravarti and Lyonnet 2002).

The length of the aganglionic segment of the large intestine has been used to classify the disease. In short segment HSCR (S-HSCR, 80% of cases), aganglionosis does not extend beyond the upper sigmoid, while in long segment (L-HSCR, 20% of cases), the aganglionosis extends more proximally. There are also rare cases, classified as total colonic aganglionosis (TCA) or total intestinal aganglionosis. Hirschsprung’s disease most commonly presents sporadically, although it can be familial (approximately 20% of cases), and it is frequently associated with many other neurocristopathies and with chromosomal abnormalities. Males are affected fourfold more often than females, a difference most prominent in S-HSCR (Amiel et al. 2008).

Hirschsprung’s disease is a genetically heterogeneous disorder, and a number of genes have been shown to play a role in the disease etiology (Parisi and Kapur 2000; Heanue and Pachnis 2007). Much genetic and functional evidence points to the RET gene (MIM 164761), located on chromosome 10q11.21, as a major disease-causing locus in HSCR (Edery et al. 1994). RET encodes a receptor tyrosine kinase that is expressed in cell lineages derived from the neural crest. It functions as a receptor for growth factors of the glial cell line-derived neurotrophic factor (GDNF) family and plays a crucial role in the regulation of cell proliferation, migration, differentiation, and survival during embryogenesis (Iwashita et al. 2001).

Nonetheless, in most studies, mutations in the RET coding sequence are identified only in 50% of familial and 10–20% of sporadic cases, which in total is only a small fraction of the HSCR patient population (Angrist et al. 1995; Gabriel et al. 2002). Despite this, several linkage and association studies still point to RET as a major HSCR gene (Bolk et al. 2000; Gabriel et al. 2002). These studies have indicated that the RET locus is linked to the disease in almost all familial cases, regardless of their mutation status, and is also associated with HSCR in a large proportion of patients with sporadic HSCR, who do not have RET coding mutations. These results suggest the existence of common non-coding RET variants or common polymorphisms that act in HSCR pathogenesis.

Several recent reports have shown that one haplotype, which starts in the 5′-region of the RET locus and spans approximately 27 kb [4 kb encompassing the 5′-untranslated region (UTR) and exon 1, 23 kb encompassing intron 1 and exon 2], is associated with the disease. The associated haplotype has a comparable frequency in all European populations (56–62%), while in Chinese Hong Kong populations, the frequency is substantially higher (85%) (Borrego et al. 2003; Garcia-Barcelo et al. 2003, 2005; Burzynski et al. 2004; Fitze et al. 2003; Pelet et al. 2005).

The associated haplotype was reconstructed with single nucleotide polymorphism (SNP) marker alleles. Some of these SNP alleles have been studied in more detail to determine a possible functional effect, such as rs1800858 located in exon 2 and two promoter SNPs, rs10900296 and rs10900297 (also known as SNP-5 G > A and SNP-1 A > C, respectively) (Fitze et al. 2002). However, none of these have been proven to be a real functional variant or to be in linkage disequilibrium (LD) with other closely located mutation(s).

Recent studies have suggested that the HSCR-associated RET mutations may be enhancer mutations, located approximately in the middle of intron 1 (Burzynski et al. 2005; Emison et al. 2005). Our previous study was consistent with this hypothesis. In particular, we further found three SNPs (rs2506005, rs2435357, rs2506004) in the enhancer region that were significantly associated with HSCR in Han Chinese patients (Zhang et al. 2007). However, other non-coding region RET variants in Han Chinese HSCR patients are largely unknown.

In the study reported here, we tested for allelic and haplotypic associations of RET with HSCR and investigated the possible role of haplotypes formed of eight SNPs in the pathogenesis of sporadic HSCR in a Chinese population.

Materials and methods

Subjects

This study was approved by the Ethics Committee of Zhejiang University, and all subjects gave informed consent for the genetic analyses. Blood samples were obtained from 125 unrelated patients (96 males, 29 females) diagnosed with sporadic HSCR. Their diagnoses were based on the histological examination of either biopsy or surgical resection material for the absence of enteric nerve plexuses. Twenty-six patients were affected with L-HSCR (including four with TCA) and 99 with S-HSCR. One hundred and forty-eight control DNA samples were obtained from a panel of unaffected individuals of Han Chinese backgrounds.

SNP selection and genotyping

We selected eight SNPs in the RET gene from the SNP database (http://www.ncbi.nlm.nih.gov/snp/). These markers included one SNP (rs741763: C > G, located 4 kb upstream of the transcriptional start site) and six SNPs located in intron 1 [IVS1 + 1813: G > A (rs2435365); IVS1 + 2846: C > A (rs1864410); IVS1 + 3460: C > T (rs2435364); IVS1 + 6000: C > A (rs2435362); IVS1-7593: G > A (rs752975); IVS1-2863: T > C (rs2505535)] and one coding SNP (rs1800858; in exon 2, also known as SNP2) (Fig. 1). Genomic DNA was isolated from peripheral blood leucocytes by standard procedures. Genotyping was performed by the ligase detection reaction, which has been demonstrated to be a highly specific and sensitive assay for SNP detection (Xiao et al. 2006). Fluorescent dye-labeled PCR products were used with an ABI PRISM 377 DNA Sequencer for genotyping, and data were analyzed using Genemapper software (Applied Biosystems, Foster City, CA).

Fig. 1 Schematic representation of the RET gene and the locations of polymorphisms analyzed in this study. The RET gene structures are given in the upper part of the figure; the intron 1 region is magnified and approximate positions of typed single nucleotide polymorphisms (SNPs) are shown in the lower part

Statistical analysis

Deviation from Hardy–Weinberg equilibrium (HWE) was examined in controls by the χ² test. The case–control association of genotypes was tested by logistic regression under five inheritance models (co-dominant, dominant, recessive, over-dominant, log-additive), and the odds ratios (OR) and 95% confidence intervals (95% CI) were estimated using SNPstats software (available at http://www.bioinfo.iconcologia.net/SNPstats) (Sole et al. 2006). Subsequent statistical analyses were performed using SHEsis software (http://www.analysis.bio-x.cn) (Shi and He 2005). D′ and r² were calculated to evaluate the magnitude of LD. Haplotype frequencies were estimated using the implementation of the expectation-maximization (EM) algorithm coded into the haplo.stats package (http://www.mayoresearch.mayo.edu/mayo/research/biostat/schaid.cfm). The association analysis of haplotypes was similar to that of genotypes with logistic regression, and the results are shown as the OR and 95% CI. All tests were two-tailed, and statistical significance was defined as P < 0.05.
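As an illustration of the HWE check described above, the following minimal sketch computes a one-degree-of-freedom χ² goodness-of-fit test from three genotype counts. The function name `hwe_chi_square` and the example counts are hypothetical and are not taken from this study; the sketch assumes a biallelic SNP with both alleles observed and is not the code used by SNPstats or SHEsis.

```python
from math import erfc, sqrt

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions for one biallelic SNP.

    Assumes both alleles are observed, so that no expected count is zero.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts (not data from this study).
chi2, p_value = hwe_chi_square(30, 40, 30)
print(f"chi2 = {chi2:.3f}, P = {p_value:.4f}")
```

A P value at or above 0.05 in controls would be read as no evidence of deviation from HWE.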

Results

Nucleotide polymorphism analysis

For all eight SNPs, the genotypic distribution in controls conformed to HWE. Tables 1 and 2 summarize the allele and genotype frequencies of these eight SNPs. P values for Hardy–Weinberg proportions of the SNPs are also shown in Table 1. We found that a large proportion of our patients were homozygous at the markers that showed the highest association with HSCR, whereas these homozygous genotypes were comparatively less frequent in controls (Table 2). The ORs for the homozygote genotypes ranged from 3.909 to 21.523, whereas those for the heterozygote genotypes ranged from 0.982 to 6.087. Interestingly, only the A/G genotype for rs752975 had a significant difference in distribution between HSCR cases and controls.
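To make the reported ORs concrete, the sketch below computes an odds ratio and a Wald-type 95% CI from a 2 × 2 table comparing one genotype with a reference genotype. For a single binary predictor without covariates, this cross-product estimate coincides with the unadjusted logistic regression estimate. The function name `odds_ratio_ci` and the counts are hypothetical and are not values from Table 2; all four cells are assumed to be non-zero.

```python
from math import exp, log, sqrt

def odds_ratio_ci(case_geno, case_ref, ctrl_geno, ctrl_ref, z=1.96):
    """Odds ratio and Wald 95% CI for a genotype versus a reference genotype.

    case_geno / ctrl_geno: carriers of the genotype of interest.
    case_ref / ctrl_ref:   carriers of the reference genotype.
    Assumes all four counts are greater than zero.
    """
    odds_ratio = (case_geno * ctrl_ref) / (case_ref * ctrl_geno)
    # Standard error of log(OR) from the four cell counts.
    se = sqrt(1 / case_geno + 1 / case_ref + 1 / ctrl_geno + 1 / ctrl_ref)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts (not data from this study).
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

A CI that excludes 1 corresponds to a significant difference at the two-tailed 0.05 level.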

Table 1 The eight SNPs of the RET gene investigated in the cases (n = 125) and controls (n = 148)
Table 2 Genotype frequencies of polymorphic variants of RET in cases (n = 125) and controls (n = 148)

To investigate the association of RET SNPs with HSCR phenotypes, the allelic distribution of the polymorphisms was compared between cases with L-HSCR and cases with S-HSCR. The variant C allele of rs2505535 was significantly under-represented in L-HSCR patients (P = 0.015; Table 3). Other SNPs showed no significant difference between S-HSCR and L-HSCR patients.

Table 3 Allele frequencies (%) of the SNPs in S-HSCR and L-HSCR cases

When logistic regression was used to carry out association analysis after modeling of the SNP effects as additive, dominant, or recessive, the best inheritance models were obtained according to the smallest AIC (Akaike information criterion) value (Table 4). For rs741763, rs2435364, and rs752975, the recessive model was accepted as the best inheritance model; for rs1864410, rs2435362, and rs2505535, the co-dominant model was accepted as the best inheritance model; for rs2435365 and rs1800858, the log-additive model was accepted as the best inheritance model. Interestingly, the C allele of rs2505535 appeared to be a protective allele.
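The model selection step can be sketched as follows. For inheritance models that only merge genotypes into groups (co-dominant, dominant, recessive, over-dominant), the maximum-likelihood fit of an unadjusted logistic regression reproduces the observed case proportion in each group, so the log-likelihood and AIC = 2k − 2 ln L can be written in closed form. The log-additive model needs an iterative fit and is deliberately left out of this sketch. All names and counts here are hypothetical; this is not the SNPstats implementation.

```python
from math import log

# Each model merges the genotypes (coded as 0, 1, or 2 copies of the
# variant allele) into groups that share a single risk level.
MODELS = {
    "co-dominant": [(0,), (1,), (2,)],
    "dominant": [(0,), (1, 2)],
    "recessive": [(0, 1), (2,)],
    "over-dominant": [(0, 2), (1,)],
}

def group_loglik(cases, controls):
    """Maximized binomial log-likelihood for one genotype group."""
    n = cases + controls
    ll = 0.0
    if cases:
        ll += cases * log(cases / n)
    if controls:
        ll += controls * log(controls / n)
    return ll

def aic(groups, counts):
    """AIC = 2k - 2 ln L, with one parameter per genotype group."""
    ll = 0.0
    for group in groups:
        cases = sum(counts[g][0] for g in group)
        controls = sum(counts[g][1] for g in group)
        ll += group_loglik(cases, controls)
    return 2 * len(groups) - 2 * ll

def best_model(counts):
    """Return the model name with the smallest AIC and all AIC values."""
    scores = {name: aic(groups, counts) for name, groups in MODELS.items()}
    return min(scores, key=scores.get), scores

# Hypothetical (cases, controls) counts per genotype, not data from this study.
example = {0: (10, 10), 1: (20, 20), 2: (30, 5)}
name, scores = best_model(example)
print(name, {k: round(v, 2) for k, v in scores.items()})
```

In this made-up example only homozygous carriers differ from the other two genotypes, so the recessive model attains the same likelihood as the co-dominant model with one parameter fewer and is therefore preferred.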

Table 4 Association analysis of eight SNPs with HSCR using logistic regression

Pairwise linkage disequilibrium (LD)

Pairwise LD between the eight SNPs was calculated for cases and controls. D′ and r² for all possible pairs of SNPs are shown in Fig. 2a and b, respectively. We found strong LD (D′ > 0.75) among seven non-coding region SNPs (rs741763, rs2435365, rs1864410, rs2435364, rs2435362, rs752975, rs2505535).
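For reference, the two LD measures can be computed from a two-locus haplotype frequency and the corresponding allele frequencies using the standard definitions. The sketch below is illustrative only: the function name `ld_measures` and the input frequencies are hypothetical, both loci are assumed to be polymorphic, and the code is not the SHEsis implementation.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2).
    p_a, p_b: frequencies of alleles A and B; both must lie strictly between 0 and 1.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

# Hypothetical frequencies (not data from this study).
d_prime, r2 = ld_measures(p_ab=0.2, p_a=0.2, p_b=0.5)
print(f"D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

The example shows why both measures are reported: when allele frequencies differ between the two loci, D′ can reach 1 while r² stays well below 1.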

Fig. 2 Linkage disequilibrium (LD) analysis of SNPs in the RET region. The white bar above the SNP names represents the relative location of each SNP along the gene. Below the SNPs are haplotype blocks estimated by SHEsis software. a The number at the intersection of each pair of SNPs represents the pairwise D′ values between two SNPs. b The number at the intersection of each pair of SNPs represents the pairwise r² values between two SNPs

Haplotype analyses

Based on the single-locus association analyses, we estimated frequencies for the haplotypes consisting of the eight marker loci. Eighteen different haplotypes encompassing the eight SNPs were observed in cases and controls. Among these, five haplotypes showed significant differences between patients and controls (Table 5: haplotype-C, -G, -H, -P, -Q). Haplotype analysis showed that these eight risk-associated alleles were present in the same risk haplotype, haplotype-G. Its frequency was estimated to be 59.6% among patients and 18.1% among controls (P < 0.01). Another haplotype, haplotype-H, which differs from the risk haplotype only at the eighth SNP, was present in 7.4% of patients and 22.1% of controls and appeared to be a protective haplotype (P < 0.01). We also found two haplotypes, I and H, that were present only in cases and in controls, respectively, confirming the important role of non-coding markers in the development of HSCR.
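Haplotype frequencies such as these must be estimated because the phase of unphased genotypes is ambiguous. The idea behind EM-based estimation can be sketched for the simplest case of two biallelic SNPs, where only double heterozygotes are ambiguous. This is a deliberately reduced illustration with hypothetical names and genotypes; it is not the haplo.stats implementation, which handles many loci and missing data.

```python
from collections import Counter

HAPLOTYPES = ("00", "01", "10", "11")  # allele at SNP 1, then allele at SNP 2

def em_two_locus(genotypes, n_iter=100):
    """Estimate two-locus haplotype frequencies by EM.

    genotypes: list of (g1, g2), each the number of variant alleles (0, 1, 2).
    """
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    freqs = {h: 0.25 for h in HAPLOTYPES}  # uniform starting values
    for _ in range(n_iter):
        expected = {h: 0.0 for h in HAPLOTYPES}
        for (g1, g2), n in counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: split double heterozygotes between the two phases.
                cis = freqs["00"] * freqs["11"]
                trans = freqs["01"] * freqs["10"]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                expected["00"] += n * w
                expected["11"] += n * w
                expected["01"] += n * (1 - w)
                expected["10"] += n * (1 - w)
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                a = (int(g1 >= 1), int(g1 == 2))
                b = (int(g2 >= 1), int(g2 == 2))
                expected[f"{a[0]}{b[0]}"] += n
                expected[f"{a[1]}{b[1]}"] += n
        # M-step: update frequencies from the expected haplotype counts.
        freqs = {h: expected[h] / n_chrom for h in HAPLOTYPES}
    return freqs

# Hypothetical genotypes (not data from this study).
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_two_locus(sample))
```

In this toy sample the unambiguous individuals carry only the "00" and "11" haplotypes, so the algorithm assigns the double heterozygotes almost entirely to the "00"/"11" phase.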

Table 5 Haplotype frequencies of eight SNPs

Discussion

Hirschsprung’s disease is considered to have a genetic basis. To date, at least 11 genes have been associated with sporadic or syndromic forms of HSCR. These ‘HSCR genes’ are generally related to the developmental program of neural crest cells and include the RET proto-oncogene, GDNF, neurturin (NTN), endothelin 3 (EDN3), endothelin receptor type B (EDNRB), endothelin-converting enzyme 1 (ECE1), the transcription factors SOX10 and PHOX2B, Smad interacting protein 1 (SIP1), KIAA1279, and TITF1 (Parisi and Kapur 2000; Garcia-Barcelo et al. 2007; Heanue and Pachnis 2007).

Among these identified genes, RET has been proved to be a major genetic risk factor. Mutations in the other ten genes are rare and found primarily in syndromic HSCR cases. However, for the more common sporadic form of HSCR, RET coding mutations have been found in no more than 20% of patients, whereas non-coding mutations in RET are suggested to impart susceptibility in other HSCR cases (Brooks et al. 2005; Emison et al. 2005; Griseri et al. 2007).

Since the ‘haplotype’ in the 5′-region of the RET locus was identified, many studies have focused on the common variants in this haplotype (Fitze et al. 2003; Burzynski et al. 2004; Garcia-Barcelo et al. 2005; Pelet et al. 2005). These findings indicate the possible presence of a mutation(s) in the RET promoter or in its regulatory sequences. A number of studies have reported that predisposing haplotypes for HSCR are characterized by the presence of the A variant of SNP2 (Fitze et al. 1999; Garcia-Barcelo et al. 2003), suggesting that the A C A combination of the two promoter and the SNP2 alleles may represent the core haplotype associated with HSCR, acting as a modifying risk allele in the development of the condition.

To refine the mapping of such alleles and characterize their genetic actions, we investigated a sample of 125 sporadic Han Chinese HSCR cases using SNP analysis across the 5′-genomic domain of the RET locus, from the 5′-UTR to exon 2. Our results represent the first experimental evidence indicating that all seven non-coding SNPs are present in Chinese populations. Moreover, all seven markers are strongly associated with the disease, with significant differences in the frequencies of particular alleles in patients versus controls. Although our results are broadly similar to those obtained from a Caucasian population (Emison et al. 2005), they additionally indicate that the C allele of rs2505535 may play a protective role in the pathogenesis of HSCR in our Chinese population, unlike in the Caucasian population. It would appear that the T allele of rs2505535 acts as the associated allele for HSCR in our Chinese population.

Because this is the first report of these SNPs in a Chinese population, it is essential to determine the inheritance models by logistic regression analysis and to investigate the associations of RET SNPs with the HSCR phenotype as a basis for further study of HSCR in the Chinese population. We found that, except for the C allele of rs2505535, there was no significant difference in SNP allele frequencies between S-HSCR and L-HSCR patients, suggesting that the non-coding region variants have little association with the severity of the disease. One of the more interesting findings of our study is the special role of rs2505535 in the pathogenesis of HSCR in the Chinese population, which differs from previous studies. By checking allele frequency data for rs2505535 on HapMap, we found that the allele frequencies differ considerably between ethnic groups, with the G allele frequencies being 0.97, 0.8, and 0.456 among African–American, Caucasian, and Chinese populations, respectively. These results imply that even if the SNP is associated with genetic factors involved in the development of HSCR, the latter may be different and variable in different populations. To date there has been little information reported on the relationship between the non-coding variants and the phenotype of the disease. Whether this specific genetic factor is involved in the severe HSCR phenotype in the Chinese population requires further investigation.

The vast majority of informative sporadic HSCR cases have been found to be homozygous for the predisposing RET 5′-haplotype. A large proportion of our cases proved to have homozygous genotypes, whereas these homozygous genotypes appeared at a comparatively low frequency in the controls. This is especially the case for rs752975, where the A/A genotype revealed a highly increased risk for the development of HSCR (OR > 20), and for rs1864410, where the OR for the A/A genotype was 9.6. This high risk may be related to features of their structures: both are located in conserved regions that contain binding sites for relevant transcription factors. Interestingly, rs1864410 is recognized by ubiquitous transcription factors, such as SP1-erk1(1) and AP2/SP1 (Burzynski et al. 2005). These data suggest that rs1864410 and rs752975 may function as candidate disease-associated variants.

Unexpectedly, rs1800858 was not in LD with the seven other non-coding SNPs in our population. This result deviates from that found in previous studies (Burzynski et al. 2004). There are few data on these seven non-coding SNPs; however, they may account, in part, for the ethnicity-dependent prevalence of the disease. Future studies will be required to clarify this point.

Predisposing haplotypes for HSCR have been extensively studied. Burzynski et al. (2004) reported haplotypes of 13 DNA markers within and flanking RET in a European population. These researchers found six markers (rs741763, SNP-5, SNP-1, rs2435362, rs2565206, and rs1800858) in the 5′-region of RET that were strongly associated with the disease. The largest distortions in allele transmission were found at the same markers, suggesting that the 5′-UTR of RET may play a key role in HSCR, even when no RET mutation was found (Burzynski et al. 2004).

In a subsequent study, these same researchers identified 84 sequence differences by sequencing the haplotype region in a patient and in a control homozygous for the risk and the non-risk haplotypes, respectively. They found that only one region, located in intron 1, was conserved among all species (including avians) and that only one SNP, rs2506004, was present in this region (Burzynski et al. 2005). However, they neither tested larger samples nor performed a functional analysis. Fortunately, Emison et al. recently genotyped 29 SNPs in 126 families of an American population; these researchers found that 12 of these were localized in the 5′-UTR region and in intron 1 of RET. They also showed that two SNPs (rs741763 and rs2505997) in the 5′-UTR region, six SNPs (rs2435365, rs2435364, rs2435362, rs2435357, rs752975, and rs2505535) in intron 1, and one SNP (rs1800858) in the RET protein-coding region have the greatest statistical significance and the largest transmission distortions. The highest transmission frequency was observed for one of the alleles from marker rs2435357 (also known as the enhancer mutation), located in intron 1 (Emison et al. 2005). The frequency of the disease-associated haplotypes in the consortium controls corresponded to the frequency of the rs2435357 ‘T’ allele (approx. 24%) in the world population, estimated on the basis of HapMap data. A much higher frequency for the longer associated haplotype was observed in Chinese patients (72–76%) (Emison et al. 2005; Zhang et al. 2007).

In our study, we also evaluated the RET ‘risk haplotypes’ present in a Han Chinese population in the 5′-UTR of RET. Interestingly, there was one haplotype in the 5′-region of RET—G A A T A A T A—that was over-represented in Han Chinese patients with HSCR. If this haplotype is functional, 59.6% of the Han Chinese patients could share the same genetic aetiology. Our results indicate that our identification of the haplotype in the Han Chinese population was in accordance with the ‘risk haplotype’ reported in the Caucasian population, but with different markers. In addition, our results failed to show a relatively higher frequency in the ‘haplotype’ reported by Garcia-Barcelo et al. (2005). This difference may be due to the different markers we used.

In conclusion, we observed a very strong association for eight SNPs in the enhancer and 5′-region of RET. An increased risk of HSCR was most evident for patients homozygous for the associated alleles. The haplotype consisting of these markers showed similar results, indicating that a strong founder effect is present in our population. The alleles and, consequently, the haplotype, found in our study are similar to those found by others who analyzed the Caucasian population. Our results lead us to conclude that non-coding mutations in RET play important roles in the development of HSCR. The unknown functional disease variant(s) with a dosage-dependent effect in HSCR is likely located between the enhancer region and exon 2 of the RET gene.