Introduction

Hirschsprung’s disease (HSCR; MIM no. 142623) is the congenital absence of ganglion cells in the submucosal and myenteric plexuses of the gut.1 The length of the aganglionic segment is variable,2 and in 70% of cases HSCR is an isolated trait.3 Overall prevalence of HSCR is estimated at 1/5000 live births.3 HSCR is a multifactorial disorder exhibiting non-Mendelian inheritance and low, sex-dependent penetrance with male preponderance.4 The high recurrence among siblings and the occurrence of HSCR as part of the phenotype of various syndromes suggest the importance of genetic factors.1, 4

RET proto-oncogene (RET), which encodes a receptor tyrosine kinase, is the main gene implicated in HSCR.5, 6 Approximately 50% of familial cases and 7–35% of nonfamilial cases have loss-of-function germline RET mutations.7, 8 Common variants in the RET promoter (rs10900296 and rs10900297), at a SOX10-binding site in intron 1 (rs2435357) and in exon 2 (rs1800858; c.135G>A; p.A45A) have also been associated with HSCR,9, 10 suggesting that common as well as rare variants might influence the occurrence of HSCR.

HSCR is attributed to impeded migration of enteric neural crest cells (ENCCs) through the embryonic hindgut between weeks 5 and 12 of gestation.11, 12 Animal studies indicate that the GDNF-GFRA1-RET signaling pathway (in which RET forms a ligand/receptor complex with one of its ligands, GDNF, and its co-receptor, GFRA1) is important to the survival, proliferation and migration of ENCCs in the developing gut.11, 13, 14 Other genes may also be involved. Knockdown of the transcription factor achaete–scute complex homolog 1 (Drosophila; Ascl1) in mouse embryos retards the differentiation of myenteric neurons in the intestine.15 Disruption of the transcription factors homeobox B5 (Hoxb5) and paired-like homeobox 2b (Phox2b), and of the L1 cell adhesion molecule (L1cam), results in the delay or failure of migration of ENCCs to the distal intestine in mouse embryos.16, 17, 18 In cell culture, Prok1, which encodes the secreted protein prokineticin 1, induces ENCC proliferation and differentiation; this effect on proliferation is eliminated by knockdown of its receptor, Prokr1.19

Given the potential importance of common genetic variants in HSCR, and the failure to identify disease-causing rare mutations in most nonfamilial HSCR cases, our objective was to examine associations between HSCR and single-nucleotide polymorphisms (SNPs) in candidate genes (ASCL1, HOXB5, L1CAM, PHOX2B, PROK1 and PROKR1) for which there is evidence of a role in the proliferation, migration and differentiation of ENCCs. We also investigated differences in the associations between selected RET SNPs and HSCR by race/ethnicity because such differences might exist but have received little attention.

Materials and Methods

Subjects

This was a population-based, nested case–control study that included HSCR cases born from 1998 through 2005 and identified from the New York State Congenital Malformations Registry. Physicians and hospitals are mandated by law to report birth defect cases that come to their attention if the child is under 2 years of age and was born, or resides, in New York State. Cases had to have at least one British Pediatric Association code for HSCR (751300, 751310, 751320 and 751330) in the registry records. There were 420 live-born HSCR cases among 2 023 083 resident live births (1 case per 4817 live births) in New York State from 1998 to 2005. A total of 32 (7.6%) HSCR cases with chromosomal anomalies (all Down syndrome) and 81 (19.3%) cases with other major congenital malformations were excluded. The remaining 307 cases had HSCR as their only major congenital malformation (isolated HSCR cases); 1 HSCR case was subsequently excluded because of missing data on maternal race/ethnicity. A random sample of controls was frequency-matched to HSCR cases by race/ethnicity at a control:case ratio of 4:1, yielding 1216 controls. Controls had no congenital malformations and were selected from the New York State Newborn Screening Program’s records for the birth years 1998–2005.
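The prevalence and exclusion figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above; the variable names are illustrative and not part of the study.

```python
# Counts reported in the Subjects paragraph.
live_births = 2_023_083        # resident live births, 1998-2005
hscr_cases = 420               # live-born HSCR cases
chromosomal = 32               # cases with chromosomal anomalies
other_malformations = 81       # cases with other major malformations

# Prevalence expressed as "1 case per N live births".
births_per_case = live_births / hscr_cases

# Share of all HSCR cases in each excluded category.
pct_chromosomal = 100 * chromosomal / hscr_cases
pct_other = 100 * other_malformations / hscr_cases

# Cases with HSCR as the only major congenital malformation.
isolated = hscr_cases - chromosomal - other_malformations

print(f"1 case per {births_per_case:.0f} live births")
print(f"chromosomal anomalies: {pct_chromosomal:.1f}%")
print(f"other malformations: {pct_other:.1f}%")
print(f"isolated cases: {isolated}")
```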

New York State birth certificates were obtained for all study subjects and were linked to the records of the New York State Newborn Screening Program for retrieval of archived residual dried blood spots. One case could not be matched, and another case and one control were mismatched. After exclusion of these subjects, 304 cases and 1215 controls remained.

We considered the possibility that monozygous twins discordant for HSCR might have genetic differences that result in one twin, but not the other, being affected with HSCR. Therefore, the unaffected siblings from the same gestation as HSCR cases (12 twin and 2 triplet sets) were also included to permit comparison of genetic data between monozygous twin pairs discordant for HSCR. Data from unaffected siblings were not used in statistical analyses.

After records were matched and biological specimens were processed, the specimens and associated data were made anonymous. This study was approved by the Institutional Review Board of the New York State Department of Health and reviewed by the Office of Human Subjects Research at the National Institutes of Health.

DNA extraction

DNA was extracted from 3 mm-diameter segments punched from the dried blood spots. Extraction involved the removal of cellular debris and DNA precipitation with sodium hydroxide.

Identity testing

Births from the same gestation were tested for zygosity by genotyping one sex marker and 13 short tandem repeat loci using the AmpFlSTR COfiler and Profiler Plus PCR amplification kits (Applied Biosystems, Foster City, CA, USA). Four pairs of monozygous twins (all male) discordant for HSCR were identified.

RET sequencing

RET exons and flanking regions in introns were sequenced for all 304 cases and the four unaffected siblings of monozygous twin pairs discordant for HSCR (conditions and primers20, 21, 22, 23 described in Supplementary Information and Supplementary Table 1). Sequencing was also performed for 10 randomly selected controls to assess RET sequence diversity among unaffected individuals and to check that there were no systematic sequencing errors among cases. In addition, exon 1 of RET was sequenced for all controls to obtain genotypes for the rs10900296 and rs10900297 promoter SNPs. We used GenBank reference sequence NG_007489.1 for genomic DNA and NM_020975.4 for cDNA. Nucleotides were numbered with +1 representing the A of the ATG translation initiation codon (codon 1) of the reference cDNA sequence. The bioinformatic tools, PolyPhen-2 and SIFT, were used to predict the effects of novel RET missense variants.24, 25 Human Splicing Finder was used to predict the effects of novel variants on mRNA splicing.26

Genotyping

A total of 37 haplotype-tagging SNPs in the six candidate genes were genotyped (listed in Supplementary Table 2). SNPs with a minor allele frequency of ≥0.1 and r²<0.8 were selected based on the HapMap European, Chinese, Japanese and Yoruban populations to permit representation of genetic variation in the race/ethnic groups that make up the study population. In addition to the two exon 1 SNPs, five SNPs in RET were genotyped (listed in Supplementary Table 2). The seven RET SNPs were chosen because they had been reported to be associated with HSCR.9, 27, 28 Whole-genome amplification and genotyping of DNA were performed by KBiosciences (Herts, UK) (conditions described in Supplementary Information).

Tests for deviation from Hardy–Weinberg equilibrium (HWE) were performed for all 44 SNPs, separately for cases and controls and stratified by race/ethnicity within each group, considering adjustment for multiple comparisons using the Bonferroni method (352 tests: P<0.00014). In non-Hispanic white cases, PROKR1 rs6722313 and RET rs10900296, rs1864410, rs2435357 and rs1800858 were not in HWE. In non-Hispanic white and Hispanic controls, PROKR1 rs6722313 was not in HWE and was excluded from further analyses. No deviations from HWE were observed for other race/ethnic groups. The lack of HWE for selected RET SNPs in cases has been described in other reports that have examined their association with HSCR,28, 29 and is expected because of the strong relationship between RET and HSCR.
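As an illustration of the HWE screening step, the following minimal sketch applies a 1-degree-of-freedom chi-square goodness-of-fit test to genotype counts and compares the result with the Bonferroni threshold quoted above (0.05/352). The text does not state which HWE test was used, so the chi-square form is an assumption (an exact test is a common alternative), and the genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Assumes a biallelic, polymorphic SNP (both alleles observed).
    Returns (chi-square statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele a
    q = 1 - p                            # frequency of allele b
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Bonferroni-adjusted threshold for the 352 HWE tests reported above.
bonferroni_threshold = 0.05 / 352

# Hypothetical counts: one SNP in perfect HWE, one with no heterozygotes.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)

print(f"threshold = {bonferroni_threshold:.5f}")
print(f"in HWE:  chi2 = {chi2_ok:.2f}, deviates = {p_ok < bonferroni_threshold}")
print(f"no hets: chi2 = {chi2_bad:.2f}, deviates = {p_bad < bonferroni_threshold}")
```

The count of 352 is consistent with 44 SNPs tested in two groups (cases and controls) across four race/ethnic strata, although the text does not spell out this breakdown.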

For each race/ethnic group, linkage disequilibrium (LD) measures were estimated using Haploview based on the genotypes of controls.30
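For readers unfamiliar with the r² measure of LD, a minimal sketch of its definition follows. It assumes phased haplotype counts for two biallelic SNPs, which is a simplification: Haploview estimates haplotype frequencies from unphased genotypes, and that step is not reproduced here. The counts are hypothetical.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r-squared between two biallelic SNPs (alleles A/a and B/b).

    Arguments are counts of the four phased haplotypes; both SNPs
    must be polymorphic in the sample.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total                  # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total          # frequency of allele A
    p_B = (n_AB + n_aB) / total          # frequency of allele B
    d = p_AB - p_A * p_B                 # disequilibrium coefficient D
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))

# Hypothetical haplotype counts.
r2_complete = ld_r_squared(60, 0, 0, 40)       # alleles always co-occur
r2_independent = ld_r_squared(25, 25, 25, 25)  # alleles assort independently

print(f"complete LD: r2 = {r2_complete:.2f}")
print(f"no LD:       r2 = {r2_independent:.2f}")
```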

Statistical analysis

The main statistical analysis included 1215 controls and 301 unrelated, isolated cases. The case group comprised the older sibling from each of three case sibling pairs (from different gestations) and 298 unrelated cases. Data on maternal and infant characteristics were obtained from the birth certificates and compared between case and control groups using Fisher’s exact test. Characteristics that could be biologically relevant to birth defects and that had P-values <0.1 in bivariate analyses were included as covariates in regression models; because infant sex was not considered to be a cause of birth defects, it was not included as a covariate in the models. Logistic regression was used to compare genotype distributions between cases and controls and to estimate odds ratios (OR) and 95% confidence intervals. Homozygosity for the major allele was the reference group with which being heterozygous and being homozygous for the minor allele were compared. Analyses were performed for the overall group of study subjects, and separately by race/ethnic group. Analyses involving all case and control infants were adjusted for race/ethnicity. Subjects whose race/ethnicity was categorized as ‘other’ were not analyzed separately because of small numbers.
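The genotype comparison described above can be illustrated with an unadjusted calculation. The sketch below computes an odds ratio and a Wald 95% confidence interval for each non-reference genotype against major-allele homozygotes. It is a simplification of the analysis in the text: it omits the covariates and race/ethnicity adjustment that the logistic regression models included, and the genotype counts are hypothetical.

```python
import math

def odds_ratio_wald(case_exposed, case_ref, control_exposed, control_ref, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    'exposed' is the genotype of interest; 'ref' is the reference
    genotype (homozygous for the major allele). All counts must be > 0.
    """
    odds_ratio = (case_exposed * control_ref) / (case_ref * control_exposed)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_exposed + 1 / case_ref
                   + 1 / control_exposed + 1 / control_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical genotype counts (not taken from the study tables).
cases = {"major_hom": 100, "het": 120, "minor_hom": 80}
controls = {"major_hom": 600, "het": 500, "minor_hom": 100}

or_het = odds_ratio_wald(cases["het"], cases["major_hom"],
                         controls["het"], controls["major_hom"])
or_minor = odds_ratio_wald(cases["minor_hom"], cases["major_hom"],
                           controls["minor_hom"], controls["major_hom"])

for label, (est, lo, hi) in (("heterozygous", or_het), ("minor homozygous", or_minor)):
    print(f"{label}: OR = {est:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With a single genotype indicator and no covariates, a logistic regression model yields the same odds ratio as this cross-product calculation; the adjusted estimates reported in the tables will differ.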

Additional analyses included the younger case sibling from each of the three sibling case pairs; generalized estimating equations were used to account for the relatedness between siblings. Statistical analyses were performed using the SAS software, version 9.2 (SAS Institute, Cary, NC, USA).

Haplotype analyses were performed using the HPlus software (http://cdsweb01.fhcrc.org/HPlus/); these analyses involved only unrelated individuals and included the same covariates as the genotype analyses. The most frequent haplotype among controls was used as the reference for calculating ORs and 95% confidence intervals. Only haplotypes with a frequency >0.01 among cases or controls were considered in the analyses. Genotype and haplotype analyses involving SNPs in L1CAM, a gene on the X chromosome, were performed for males and females separately.

All analyses were repeated excluding subjects with rare RET variants and restricting to singleton births to determine whether these factors influenced the results. The Bonferroni method was used to adjust for multiple testing (43 tests; P<0.0012).

Results

Case mothers were more likely than control mothers to be multiparous (Table 1). The two groups did not differ significantly by maternal age, race/ethnicity, education, maternal diabetes, use of in vitro fertilization or other assisted reproductive techniques, plurality or birth year. There were more males among cases than among controls; the sex ratios were 2.46 and 1.07 for the case and control groups, respectively.

Table 1 Comparison of characteristics between HSCR cases and controls

RET coding and splice-site variants

A RET coding or splice-site variant was present in 38 (12.5%) of 304 cases; the variants were heterozygous in 37 of the 38 cases. A total of 34 cases had one variant each and 4 cases had two variants each. In all, 32 different coding and two different splice-site variants were observed (Table 2). We searched for these variants in databases of genetic variants and in previous reports7, 31, 32, 33, 34, 35, 36 to determine whether any were novel. The databases included the Human Gene Mutation Database (www.hgmd.cf.ac.uk), the Multiple Endocrine Neoplasia type 2 RET proto-oncogene database,37 dbSNP (www.ncbi.nlm.nih.gov/projects/SNP), 1000 Genomes (www.1000genomes.org) and the National Heart, Lung and Blood Institute Exome Sequencing Project database (http://evs.gs.washington.edu/EVS/).38 There were 17 coding variants and one splice-site disruption variant that had not been previously reported; each was observed in only one individual. In all, 20 of the 27 missense variants are predicted by PolyPhen-2 or SIFT or both to disrupt protein function. The nonsense and frameshift variants are potentially damaging, as are the c.1759+1G>A and c.1879+1G>A variants, which are located at the first base pair of introns and are predicted by Human Splicing Finder to disrupt a splice site. Of the 16 previously reported variants, 9 (p.L56M, p.A386V, p.G446R, p.L452I, p.Y791F, p.V804M, p.P841L, p.R886Q and p.R982C) were present in the National Heart, Lung and Blood Institute Exome Sequencing Project database. Variants in this database were identified by sequencing exomes in 5379 DNA samples obtained from European-American and African-American individuals who had participated in large epidemiological studies.38 In the database, the minor allele frequency was 1.7% for p.R982C but was <1% for the other eight variants. This indicates that the minor alleles of these nine variants are likely to be rare in the general population.

Table 2 RET coding and splice-site variants in patients with HSCR

RET variants in controls, non-twin siblings and monozygous twins

Of the 10 controls sequenced for RET, only 1 had a coding variant and none had a splice-site variant. The coding variant (c.1465G>A; p.D489N) has been reported previously (dbSNP rs9282834) and is predicted to be benign by PolyPhen-2 and SIFT. This variant was not observed in any of the HSCR cases.

RET missense, nonsense, frameshift and splice-site variants were not observed among the three pairs of case siblings (from different gestations). However, the siblings from one pair were both heterozygous for the previously unreported c.654G>A (p.P218P) variant, which is predicted by Human Splicing Finder to generate a cryptic splice site. This variant was also observed in 10 other cases.

There were no differences in either RET coding sequences or genotypes for the common variants in RET and the candidate genes between monozygous twins (N=4 pairs) discordant for HSCR. One pair had the p.Y146H variant that has been reported previously and is predicted to be benign by PolyPhen-2 and SIFT. This pair also had the c.654G>A (p.P218P) variant.

Case characteristics according to presence of RET coding and splice-site variants

Race/ethnicity, sex and other characteristics for HSCR cases with (N=38) and without (N=263) RET coding and splice-site variants are shown in Table 1. There were no statistically significant differences between controls and the cases with RET coding and splice-site variants. Cases in whom these RET variants were absent were more likely than controls to have mothers who were multiparous and smoked during pregnancy. Both groups of cases had more males than females but the comparison with controls was only statistically significant in the group without RET variants. We also calculated minor allele frequencies for the 43 SNPs in RET and the six candidate genes, and compared them between the two groups of cases (Supplementary Table 3). We found no comparisons that remained statistically significant after adjustment for multiple testing.

Associations with RET SNPs by race/ethnicity

Table 3 presents ORs and 95% confidence intervals for the associations between HSCR and RET SNPs. Having at least one copy of the minor allele of six of the seven RET SNPs was associated with HSCR in study subjects overall, and in non-Hispanic white, Hispanic and Asian subgroups (Table 3). The strongest associations were observed for having two copies of the minor allele of rs10900296, rs1864410, rs2435357 and rs1800858: all OR point estimates were >10 and P-values ranged from 10⁻³ for the smallest subgroup (Asians) to 10⁻³¹ for study subjects overall. These associations in study subjects overall, non-Hispanic whites and Hispanics, and the association with rs1800858 in Asians, remained statistically significant after adjustment for multiple testing. There was variation in the magnitude of ORs by race/ethnicity. Although some ORs were elevated for African-Americans, there were no statistically significant associations between any of the seven RET SNPs and HSCR in this subgroup. For six of the seven SNPs there was a low frequency of individuals homozygous for the minor allele among African-Americans (Supplementary Table 4).

Table 3 ORs and 95% CIs for associations between RET SNPs and HSCR, by race/ethnicity

SNPs rs1864410, rs2435357 and rs1800858 were in strong LD with each other in all race/ethnic groups (all r²≥0.80). They were also in strong LD with rs10900296 in the non-Hispanic white, Hispanic and Asian subgroups (all r²>0.70) but not in African-Americans (all r²<0.40).

Genotype–phenotype associations for other candidate genes

Table 4 shows P-values, calculated from two-degree-of-freedom tests in logistic regression, comparing SNP genotypes between cases and controls. Based on a nominal P-value <0.05, some of the SNPs in the candidate genes involved in ENCC proliferation and migration were associated with HSCR and these associations varied by race/ethnicity (number of subjects with each genotype is shown in Supplementary Table 5). ASCL1 SNPs were associated with HSCR in non-Hispanic whites (rs1874875; P=0.015) and African-Americans (rs17450122; P=0.029). In addition, PROK1 rs7513898 was associated with HSCR in African-Americans (P=0.044).

Table 4 P-values for associations between HSCR and SNPs in candidate genes for enteric nervous system development, including (+) and excluding (−) cases with RET coding and splice-site variants

As we wanted to determine whether the SNPs were associated with HSCR among cases that did not have RET variants that might cause HSCR, we repeated the logistic regression analyses excluding the 38 cases with RET coding and splice-site variants. In addition to the findings already noted for ASCL1 and PROK1 SNPs, L1CAM rs4646265 was associated with HSCR in females among study subjects overall (P=0.0094) and among non-Hispanic whites (P=0.020). Moreover, HOXB5 rs4793943 (P=0.034), rs4793589 (P=0.033), rs872760 (P=0.034) and rs1529334 (P=0.036), and PHOX2B rs6811325 (P=0.049) were associated with HSCR in Hispanics.

Among Hispanics, three of the four HOXB5 SNPs (rs4793943, rs4793589 and rs872760) were in strong LD (r²>0.9) with each other and were in moderately strong LD with HOXB5 rs1529334 (r²=0.77–0.79).

Except for RET, none of the associations in the candidate genes were statistically significant after adjustment for multiple comparisons using the Bonferroni method. Similar results were obtained after including the three younger case siblings, and after restricting the analyses to singleton births.
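The effect of the Bonferroni adjustment on the candidate-gene findings can be made concrete with a short calculation. The sketch below applies the threshold stated in the Methods (0.05 divided by 43 tests) to a selection of the nominal P-values reported above; the dictionary labels are illustrative.

```python
# Bonferroni threshold from the Methods: 43 tests at alpha = 0.05.
alpha = 0.05
n_tests = 43
threshold = alpha / n_tests

# Nominal P-values reported in the Results for candidate-gene SNPs.
nominal_p = {
    "ASCL1 rs1874875 (non-Hispanic whites)": 0.015,
    "ASCL1 rs17450122 (African-Americans)": 0.029,
    "PROK1 rs7513898 (African-Americans)": 0.044,
    "L1CAM rs4646265 (females, overall)": 0.0094,
    "HOXB5 rs4793943 (Hispanics)": 0.034,
    "PHOX2B rs6811325 (Hispanics)": 0.049,
}

# Associations that would remain significant after adjustment.
survivors = [name for name, p in nominal_p.items() if p < threshold]

print(f"threshold = {threshold:.4f}")
print(f"associations passing the threshold: {len(survivors)}")
```

Even the smallest of these nominal P-values (0.0094) is several times larger than the adjusted threshold, which is why none of the candidate-gene associations is retained.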

Haplotype–phenotype associations

Haplotypes with the RET rs10900296 minor A allele (in non-Hispanic whites), and the rs10900296–rs10900297–rs1864410 A–C–A alleles (in Hispanics and Asians) were associated with HSCR (Supplementary Table 6). RET haplotypes were not associated with HSCR in African-Americans. The HOXB5 rs4793943 minor G allele and ASCL1 rs2291854 minor T allele also differentiated risk haplotypes in Hispanics. In African-Americans, ASCL1 haplotypes associated with HSCR had the major A allele for rs9782; the haplotype with the strongest association (P=0.005) also had the minor G allele for rs17450122.

Discussion

Most previous studies of HSCR have focused on RET because of the crucial importance of RET signaling in enteric nervous system development. However, attention must be given to other genes for several reasons: our data and previous studies show that only a small proportion of HSCR cases have known RET coding sequence mutations,7, 8, 39 penetrance differs by sex4 and the correlation between specific RET mutations and HSCR severity varies.40 Genes that regulate ENCC proliferation, migration and differentiation are strong candidates because their disruption in animals leads to phenotypes that resemble HSCR in humans.15, 16, 17, 18, 19 We confirmed associations between HSCR and common variants in HOXB5 and PHOX2B, and observed that associations with RET SNPs varied by race/ethnicity. After adjustment for multiple comparisons, many associations with RET SNPs remained statistically significant but our findings for variants in other candidate genes did not. Others have reported associations between HSCR and SNPs in HOXB5 and PHOX2B,16, 41, 42 evidence that suggests that common variants in these genes could be involved in HSCR. We have extended the investigations of previous studies by using a large population-based sample of HSCR cases, examining SNPs in additional candidate genes and exploring associations in multiple race/ethnic groups. We have also provided precise estimates of the prevalence of HSCR among live births and the proportion of cases with other birth defects, based on a consecutive case group born over an 8-year period. These estimates are in the range reported by others using data collected from smaller cohorts.3, 43, 44

Animal studies suggest that there are interrelationships between the candidate genes we studied and RET expression. In cultures of rat neural crest stem cells, Ascl1 induces Ret expression and promotes neurogenesis.45 Hoxb5 disruption in mouse neural crest cells leads to reduced Ret expression and impaired migration of the cells through the embryonic gut.16 Phox2b inactivation results in downregulated expression of Ascl1 and Ret in mouse embryonic ENCCs.17 In humans, a genome-wide association study conducted in a Chinese population also found an interaction between another gene (NRG1, which encodes neuregulin 1) and RET.46 Two SNPs in NRG1 were associated with HSCR if subjects were also homozygous for the minor T allele of RET rs2435357. These interrelationships suggest that variants in the selected candidate genes could influence RET signaling in humans and affect HSCR risk. Therefore, a more comprehensive examination of both the rare and common variants in these genes is warranted.

In our population-based sample of HSCR cases, 34 RET coding and splice-site variants were identified, 18 (52.9%) of which were novel. Most of the 34 variants were heterozygous, and therefore dominant, in contrast to the recessive effects we observed for common variants in RET and the other candidate genes. Notably, there were no differences between members of monozygous twin pairs discordant for HSCR with regard to coding, splice-site and common variants in RET and common variants in the candidate genes. Possible reasons for HSCR discordance include de novo mutations in other genes involved in enteric nervous system development, the influence of epigenetic factors and differences in intrauterine insults experienced by each twin.

Emison et al.10 observed differences by race/ethnicity in the association between RET rs2435357, which disrupts an enhancer site in intron 1, and HSCR. The minor allele was twice as frequent in haplotypes transmitted to Chinese cases as in those transmitted to European cases, and this correlated with the twofold higher minor allele frequency in chromosomes from Chinese than from European individuals. We added to these findings by including other race/ethnic groups in our analysis of RET SNPs. We found that RET SNPs were associated with HSCR among all race/ethnic groups except African-Americans. For six of the seven SNPs tested, the minor allele was least frequent in African-Americans. Therefore, the small number of African-American individuals who were homozygous for the minor allele could have contributed to the lack of association between these SNPs and HSCR in this group.

A major strength of this study was the large, population-based sample of cases and controls. The case group is a consecutive sample from all live births in New York State. In a previous report, the New York State Congenital Malformations Registry ascertained at least 86.4% of cases when all types of major malformations were considered.47 Furthermore, our study included subjects of different race/ethnic groups to test for associations in each of these groups. The limitations of the study included the lack of medical record data; consequently, the extent of aganglionosis in cases could not be determined. Because of small sample sizes, there was low power to examine associations in some race/ethnic groups. In addition, we were unable to perform functional assessments of the genetic variants that we analyzed. As a result, we could not determine whether the RET coding and splice-site variants identified directly affected gene function.

In conclusion, we found that associations between common RET variants and HSCR varied by race/ethnicity: no association was present in African-Americans. We also confirmed previously reported associations with HOXB5 and PHOX2B, suggesting that interactions between RET and genes that regulate proliferation, migration and differentiation of ENCCs may be important in HSCR. From a population-based perspective, the minor alleles of the RET SNPs we studied are probably important to HSCR susceptibility in non-Hispanic whites, Hispanics and Asians but are unlikely to contribute to most cases in African-Americans, because the percentage of individuals homozygous for the minor alleles is very low. In addition, our results for monozygotic twins discordant for HSCR suggest that coding and noncoding regions of other genes, epigenetic changes and variation in the intrauterine environment need to be investigated as determinants of HSCR. Our findings for variants in HOXB5 and PHOX2B provide further evidence that genes regulating ENCC activity during gut development are key elements in the mechanism of HSCR. It is possible that SNPs in these genes could alter the penetrance of RET risk alleles; therefore future work should explore the potential functional effects of SNPs in these genes.