Introduction

Nonsyndromic oral clefts [cleft lip only (CLO, MIM 119530), cleft lip with cleft palate (CLP, MIM 119530), and cleft palate only (CPO, MIM 119540)] together represent one of the most frequently observed congenital anomalies. CLO and CLP are generally categorized into one entry, i.e., cleft lip with or without cleft palate (CL/P), because these two phenotypes are thought to have the same genetic etiology, while CPO may involve a set of genes different from those for CL/P. Oral clefts are classified into nonsyndromic and syndromic oral clefts (Schutte and Murray 1999). Approximately 70% of CL/P cases are nonsyndromic and 30% syndromic (Schutte and Murray 1999; Cobourne 2004). The prevalence of nonsyndromic oral clefts is 0.4–2.0/1,000 births (Schutte and Murray 1999; Natsume et al. 2000), with prevalence in the Japanese seemingly higher than in other populations (Vanderas 1987; Tanabe et al. 2000). The occurrence of oral clefts is explained by the “multifactorial threshold model” involving both genetic and environmental factors (Cobourne 2004), but most of these factors remain unknown. Some genes causing syndromic oral clefts as single-gene defects have been identified, e.g., MSX1 (van den Boogaard et al. 2000), IRF6 (Kondo et al. 2002), PVRL1 (Suzuki et al. 2000), and TBX22 (Braybrook et al. 2001). These are good candidate genes for some instances of nonsyndromic CL/P and CPO because it has been shown that reduced activity of the proteins they encode can affect oral development (Lidral et al. 1997; Jezewski et al. 2003; Zucchero et al. 2004). In addition, there are other potential candidate genes, although, with the exception of TGFB3, which showed a positive association with nonsyndromic oral clefts in some populations (Sato et al. 2001; Beaty et al. 2002; Kim et al. 2003), an association between these genes and oral clefts in humans has not yet been reported. Disruption of CLPTM1 by a chromosomal translocation was reported in a patient with an oral cleft (Yoshiura et al. 1998). Mutations in PAX9 were reported in some syndromic oral cleft patients (Das et al. 2003). DLX3 and TBX10 are involved in the development of oral clefts in mice (Juriloff et al. 2001; Bush et al. 2004). Although population-based, genome-wide case-control analysis and the transmission disequilibrium test (TDT) are powerful methods with which to find genes responsible for susceptibility to nonsyndromic oral clefts, a genotype–phenotype association depends on the population history (Freedman et al. 2004).

The aim of our study was to investigate the contribution of seven candidate genes (TGFB3, DLX3, PAX9, CLPTM1, TBX10, PVRL1, and TBX22) to oral clefts in the Japanese. Here, we present the results of mutation searches and association studies. This is the first report on intensive candidate gene analysis of oral clefts in the Japanese.

Materials and methods

Subjects

The subjects studied included 112 nonsyndromic Japanese CL/P patients (45 females and 67 males) and 16 CPO patients (10 females and 6 males), their parents (256; a total of 128 trios), and 192 phenotypically normal adults for the association study. To verify that the base change in PAX9 was not observed in healthy control samples, we searched for the mutation in a further 282 phenotypically normal controls. DNA from blood samples collected from patients and their parents at Tokyo Dental College Hospital and Nagasaki University Hospital, and from control volunteers in Nagasaki City, was used for mutation searches and case-control studies. All sampling was performed with written informed consent. Diagnosis of CL/P or CPO was made through clinical inspection by well-trained dentists and oral surgeons. The study protocol was approved by the Institutional Review Board (IRB) for Ethical, Legal and Social Issues (ELSI) at each university/college.

Candidate genes for nonsyndromic oral clefts

TGFB3

TGFB3, the transforming growth factor-beta 3 gene, located at 14q24, is expressed especially in medial edge epithelium cells of the palatal shelves, where it is required for normal palatal fusion through adhesion of the medial edge epithelium and elimination of the midline epithelial seam (Proetzel et al. 1995). In addition, TGFB3 knock-out mice were previously reported to have a developmental defect of the secondary palate and delayed pulmonary development (Proetzel et al. 1995). Results of previous association studies on TGFB3 remain controversial, with reports of both a significant association with nonsyndromic oral clefts in various populations (Sato et al. 2001; Beaty et al. 2002; Kim et al. 2003) and no association (Lidral et al. 1997; Tanabe et al. 2000).

DLX3

DLX (distal-less homeobox) genes encode transcription factors containing the homeodomain that plays an important role in early patterning of embryonic structures such as craniofacial tissues. Dlx3 belongs to the distal-less gene family in vertebrates, and its human homolog is DLX3 at 17q21 (Kraus and Lufkin 1999). The mouse Dlx3 gene was identified from a candidate region for mouse nonsyndromic cleft lip by a linkage study (Juriloff et al. 2001). Although point mutations in human DLX3 cause tricho-dento-osseous syndrome without CL/P (Price et al. 1998), one function of DLX3 may be to interact with MSX1, which may be a causative gene for CL/P and CPO (Bryan and Morasso 2000; van den Boogaard et al. 2000). We chose DLX3 as a candidate because of its chromosomal localization and relation to MSX1.

PAX9

PAX9, the paired box gene 9 at 14q12-q13, encodes a transcription factor containing the DNA-binding paired domain (Peters et al. 1998). Mouse Pax9 is extensively expressed in the neural-crest-derived mesenchyme of the palatal shelves and teeth (Peters et al. 1998). Pax9 knock-out mice presented with secondary cleft palate, tooth agenesis, and other abnormalities (Peters et al. 1998). PAX9 mutations in humans are reported to cause hypodontia involving molars, which is frequently accompanied by CL/P (Das et al. 2003).

CLPTM1

CLPTM1, the cleft lip and palate associated transmembrane protein-1 gene at 19q13.2, was isolated from the breakpoint of a balanced chromosomal translocation [t(2;19)(q11.2;q13.3)] in a family where a CLP phenotype co-segregated with the translocation in a mother and her children but not in the maternal grandmother (Yoshiura et al. 1998). CLPTM1 may be involved with the immune system (Takeuchi et al. 1997), but its definitive function has not yet been identified. Eight rare and nine common variants of this gene were detected by a search for mutations in 74 unrelated patients with nonsyndromic CLP, but no significant association was obtained (Yoshiura et al. 1998). Nevertheless, as previous linkage analysis and TDT suggested that a CL/P locus maps to 19q13 close to CLPTM1 (Wyszynski et al. 1997), CLPTM1 is still a candidate gene for nonsyndromic oral clefts in some populations.

TBX10

TBX10, the T-box transcription factor-10 gene at 11q13.1, is a member of the T-box gene family that encodes DNA-binding transcription factors (Law et al. 1998). Members of this family are known to play essential roles in mesoderm structures specific to early human developmental stages (Papaioannou and Silver 1998). Bush et al. (2004) found that Dancer (Dc) mice exhibiting CL/P carry a spontaneous mutation in Tbx10, which is located near the centromere of mouse chromosome 19. The localization of human TBX10 syntenic to mouse Tbx10 is 11q13, where a susceptibility locus to nonsyndromic CL/P was identified by genome-wide affected-sib pair analysis (Prescott et al. 2000). Moreover, TBX10 strongly associates with TBX22, a known causative gene for CPO (Braybrook et al. 2001; Bush et al. 2004; Marcano et al. 2004).

PVRL1

Autosomal recessive CL/P-ectodermal dysplasia syndrome (CLPED1, MIM 225000), previously called Margarita Island ectodermal dysplasia (MIM 225060), is characterized by CL/P, dental anomalies, hand anomalies, hidrotic ectodermal dysplasia, and occasionally mental retardation (Bustos et al. 1991). Suzuki et al. (2000) identified a nonsense mutation (G546A; W185X), a deletion (546delG), and a duplication (959dupG) of PVRL1, the poliovirus receptor like-1 gene at 11q23, in CLPED1 families from Margarita Island in north Venezuela, in patients from Israel, and in patients from Brazil, respectively. PVRL1 encodes a cell adhesion molecule, nectin-1, which is the principal receptor for alpha-herpes viruses (Suzuki et al. 2000). Sozen et al. (2001) found that the W185X mutation in PVRL1 is one of the risk factors for nonsyndromic CL/P in the Cumaná region of northern Venezuela and in Margarita Island.

TBX22

TBX22, the T-box transcription factor-22 gene at Xq21.1 and a member of the T-box gene family, is important for both palatal and tongue development (Braybrook et al. 2001), and was shown to be expressed in the palatal shelves and tongue by in situ hybridization in both human and mouse (Braybrook et al. 2002). Functional loss of TBX22 causes X-linked cleft palate with ankyloglossia (CPX, MIM 303400) (Braybrook et al. 2001). Marcano et al. (2004) found three TBX22 mutations in CPX and CPO patients (Brazilians, North Americans, and Filipinos) from three geographically distinct populations, i.e., 105–106delGC and 581–582insCAG in one and two Brazilian CPX patients, respectively, and 548C>T (P183L) in an American CPO patient, indicating that TBX22 contributes significantly to CPO patients. A genome-wide sib-pair analysis for nonsyndromic CL/P also identified susceptibility loci in the Xcen-q21 region in which TBX22 is located (Prescott et al. 2000).

Mutation search and genotyping

We performed mutation searches in all 128 patients with CL/P or CPO by sequencing all exons and part of the introns of the seven candidate genes. PCR amplification was performed as follows: denaturation at 94°C for 1 min, followed by 40 cycles of 94°C for 30 s, 60–65°C for 30 s, 72°C for 30 s, and a final extension at 72°C for 7 min, using TaKaRa Ex TaqHS (TaKaRa, Shiga, Japan) according to the manufacturer’s protocol. Dimethyl sulfoxide (5–10%, v/v) was added for the amplification of GC-rich regions. PCR products were sequenced using a BigDye Terminator Cycle Sequencing Ready Reaction Kit v3.1 (Applied Biosystems, Foster City, CA), and run on an ABI 3100 automated sequencer (Applied Biosystems). Sequencing primers were designed inside the amplification primers. Sequencing electropherograms were aligned with ATGC software version 3.0 (Genetyx, Tokyo, Japan), and single nucleotide polymorphisms (SNPs) or mutations were identified by visual inspection. Patients’ parents and normal control individuals were genotyped using a TaqMan assay on an ABI 7900HT (Applied Biosystems) or by direct sequencing on an ABI 3100 sequencer (Applied Biosystems).

Statistical calculation for case-control study and TDT

Case-control analysis was performed for individual SNPs detected during the mutation search, or for haplotypes constructed from SNPs identified within a linkage disequilibrium block. All statistics were calculated using the SNPAlyze version 4.0 statistical software package (Dynacom, Mobara, Japan). Because many low-frequency haplotypes tended to show significant P-values, we adopted only relatively common haplotypes (frequency >0.05) for further statistical analysis. In case-control analysis, we used the standard Bonferroni correction to adjust for multiple testing: the type I error level of 0.05 was divided by the number of independent tests to give the Bonferroni-corrected significance threshold. TDT using individual SNPs or haplotypes was performed with FBAT version 1.5.5 software (http://www.biostat.harvard.edu/~clange/default.htm).
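The Bonferroni adjustment described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of SNPAlyze or FBAT; the test counts (66 SNPs in total, 23 SNPs in TGFB3, 48 TGFB3 haplotypes) and the P-values are those reported in the Results.

```python
# Minimal sketch of a Bonferroni-corrected significance threshold.
# Function names are illustrative and do not come from any package used in the study.

ALPHA = 0.05  # nominal type I error level


def bonferroni_threshold(n_tests: int, alpha: float = ALPHA) -> float:
    """Return the per-test significance threshold for n_tests independent tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, n_tests: int, alpha: float = ALPHA) -> bool:
    """True if p_value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(n_tests, alpha)


# Test counts taken from the Results section of this paper.
for label, n_tests in [("all SNPs", 66), ("TGFB3 SNPs", 23), ("TGFB3 haplotypes", 48)]:
    print(f"{label}: threshold = {bonferroni_threshold(n_tests):.5f}")
```

With these definitions, `is_significant(0.0016, 23)` holds while `is_significant(0.0016, 66)` does not, which mirrors the comparison made for IVS1+5321.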

Results

Our mutation search revealed a possibly causative missense mutation, 640A>G, in exon 3 of PAX9 in two children from one family with nonsyndromic CL/P and their phenotypically normal mother. This mutation results in an amino acid change from serine to glycine at position 214 (S214G), and was not found among a total of 474 control samples (Fig. 1).

Fig. 1a–d A missense mutation found in PAX9 in a Japanese family. a Schematic representation of the structure of PAX9 showing the position of the mutation 640A>G (S214G) in exon 3. b Two children (JPKr21 and JPKr22) with CL/P and their mother without CL/P had the 640A>G (S214G) mutation. c Sequencing profile showing the A to G substitution in exon 3 of PAX9, resulting in a serine to glycine substitution (S214G). d Consensus amino acid sequence at the mutation site (boxed in red) of PAX9 in human, mouse, blowfish, chick, and frog

A total of 66 SNPs (23 in TGFB3, 4 in DLX3, 4 in PAX9, 12 in CLPTM1, 10 in TBX10, 9 in PVRL1, and 4 in TBX22), including 23 novel polymorphisms not registered in the NCBI (http://www.ncbi.nlm.nih.gov/SNP) or JSNP (http://snp.ims.u-tokyo.ac.jp/) databases, were found during the mutation search (Table 1). Among these SNPs, those whose genotype frequencies corresponded to a Hardy–Weinberg distribution in the normal controls, and those showing linkage disequilibrium (LD) block structures, were used for case-control analysis in CL/P cases (Fig. 2). Data for CPO cases are not presented here as the number of such samples was too small.
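As an illustration of the Hardy–Weinberg check mentioned above, the following Python sketch applies a standard one-degree-of-freedom chi-square goodness-of-fit test to genotype counts at a biallelic SNP. This is an assumption about how such a check can be done; the exact procedure implemented in SNPAlyze is not described here, and the function name is hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test (1 df) of genotype counts against Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed counts of the two homozygotes and the heterozygote.
    Returns (chi-square statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Cells with zero expectation (monomorphic SNP) contribute nothing.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A SNP would typically be retained when the P-value in controls exceeds a chosen cut-off (0.05 is a common choice, although the cut-off used in this study is not stated).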

Table 1 Single nucleotide polymorphisms (SNPs) in seven candidate genes and their allele frequencies between CL/P patients and normal control individuals
Fig. 2 Linkage disequilibrium between single nucleotide polymorphisms (SNPs) of five of the seven candidate genes in Japanese CL/P patients. Data for the two other genes and for SNPs with a minor allele frequency <0.2 in CL/P patients were excluded. Absolute values of D′ and r² calculated using the statistical software package SNPAlyze version 4.0 (Dynacom, Mobara, Japan) are shown above and below the diagonal, respectively. Values of |D′| >0.8 and of r² >0.5 are highlighted in gray
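For readers unfamiliar with the two LD measures shown in Fig. 2, the sketch below computes |D′| and r² for a pair of biallelic loci using their standard definitions. It assumes that haplotype frequencies are already known; estimating them from unphased genotypes is a separate step not shown here. The function name is hypothetical and the input values in the usage note are illustrative, not data from this study.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    # D is scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d_prime, r2
```

For example, with illustrative frequencies `p_ab=0.2, p_a=0.5, p_b=0.2`, |D′| equals 1 while r² is only 0.25, which shows why the two measures are reported side by side: r² reaches 1 only when the two SNPs carry identical information.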

The case-control analysis revealed that four TGFB3 SNPs showed a significant association (P<0.01) with CL/P (Table 1). The lowest P-value was obtained at IVS1+5321 (P=0.0016) and the second lowest at IVS1+2118 (P=0.0024). The association at IVS1+5321 was not significant after the standard Bonferroni correction for all 66 SNPs (0.0016>0.0008=0.05/66), but it remained significant when we divided by the number of SNPs in TGFB3 (0.0016<0.0022=0.05/23). Under the latter correction, IVS1+2118 did not reach statistical significance (0.0024>0.0022). We used these two SNPs for subsequent TDT, because their minor allele frequencies of >0.05 were sufficient for a statistically reliable analysis. In TDT, a significant association (P=0.0412) was obtained at IVS1+5321, but not at IVS1+2118 (P=0.7630) (Table 2). The odds ratio for each SNP genotype was calculated, assuming that SNPs in TGFB3 have dominant or recessive effects. Table 3 shows the odds ratios for TGFB3 SNPs with a minor allele frequency >0.05 among CL/P patients and with P<0.05 in case-control analysis. Eight SNPs showed significant associations between “major-homozygotes” and “major-heterozygotes plus minor-homozygotes”. Moreover, IVS1+5321 showed the highest odds ratio of 2.34 [95% confidence interval (CI) =1.40–3.93]. These results suggest that TGFB3 contributes to the development of CL/P in the Japanese through a recessive effect. In the other six candidate genes analyzed, no association was observed.
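The genotype-based odds ratios described above can be sketched as follows. The example collapses genotype counts into major-homozygotes versus all other genotypes (the recessive-effect comparison) and computes the odds ratio with a Woolf (log-based) 95% CI. Whether the authors used Woolf's method is an assumption, the function names are hypothetical, and the counts in the usage note are invented for illustration only; the actual counts are those underlying Table 3.

```python
import math
from typing import Sequence, Tuple


def collapse_recessive(counts: Sequence[int]) -> Tuple[int, int]:
    """Collapse (major-homozygote, heterozygote, minor-homozygote) counts into
    (major-homozygotes, all other genotypes)."""
    major_hom, het, minor_hom = counts
    return major_hom, het + minor_hom


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a, b: cases with / without the risk genotype
    c, d: controls with / without the risk genotype
    z:    normal quantile (1.96 for a 95% CI)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for Woolf's method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

As a usage example with invented counts, `odds_ratio_ci(*collapse_recessive((60, 30, 10)), *collapse_recessive((40, 40, 20)))` compares 60 of 100 hypothetical cases with 40 of 100 hypothetical controls carrying the major-homozygous genotype. An interval that excludes 1, as reported for IVS1+5321, indicates an association at the 5% level.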

Table 2 Results of transmission disequilibrium test (TDT) using three SNPs in TGFB3
Table 3 Odds ratios with 95% confidence intervals (CI) of SNP genotypes in TGFB3. Only those SNPs with frequencies >0.05 in CL/P patients and P<0.05 in case-control analysis are listed. Two SNPs (IVS1+5321 and IVS1+5417) and three SNPs (IVS1−1572, IVS1−1283, and IVS1−952) in TGFB3 are each in complete linkage disequilibrium (LD; r²=1)

We next performed haplotype-based association analysis by selecting SNPs with minor allele frequency >0.2 and D′-value >0.5 within an LD block at the TGFB3 locus. Table 4 shows the results for two-SNP haplotypes including IVS1+5321 with a frequency of >0.05 in CL/P patients and with P<0.01. The haplotype “A/A” for the IVS1+5321/IVS1−1572 loci gave the lowest P-value (P=0.00055), which was below the Bonferroni-corrected threshold (0.00104=0.05/48). This haplotype was more frequent in CL/P patients, although IVS1−1572 alone did not show a significant P-value in either the population-based analysis or TDT (Tables 1, 2). TDT using the haplotype “A/A” for IVS1+5321 and IVS1−1572 showed a significant association with CL/P (P=0.0252) (Table 5). The P-values obtained from haplotype-based association studies were lower than those from analyses using individual SNPs.

Table 4 Case-control analysis using haplotypes including IVS1+5321 in TGFB3. Only haplotypes with frequencies >0.05 in CL/P patients and P<0.01 are listed. Three SNPs (IVS1−1572, IVS1−1283, and IVS1−952) in TGFB3 are in complete LD (r²=1). The Bonferroni-corrected significance threshold is 0.00104, obtained by dividing 0.05 by the number of haplotypes constructed from SNPs in TGFB3 (0.05/48)
Table 5 TDT using four haplotypes consisting of IVS1+5321 and IVS1−1572 in TGFB3

Discussion

In the present study, we performed a comprehensive genetic analysis, including mutation searches and association studies, of seven candidate genes linked to CL/P and CPO in a Japanese population. The mutation search identified a missense mutation, 640A>G (S214G), in exon 3 of PAX9 in two children with CL/P and in their mother without CL/P. The mutation site in this family is located at an amino acid residue that is conserved across species, but is outside the obvious functional domain of PAX9. Because it was not found among 474 Japanese controls, the 640A>G mutation is likely causative for CL/P in these children, although the evidence is not conclusive. Since many PAX9 mutations have been reported to be associated with missing molars (Stockton et al. 2000; Das et al. 2002) or with molar hypodontia with CL/P (Das et al. 2003), PAX9 may play a role in molar genesis in humans. Unfortunately, re-evaluation of dental anomalies in the mother in this case was not possible as her clinical and examination data were unavailable. It is not surprising that a single gene exhibits variable expression affecting dental anomalies and CL/P, e.g., MSX1 mutations lead to various phenotypes of hypodontia, CL/P and/or CPO (van den Boogaard et al. 2000). Variable expression depends on penetrance, pleiotropic effects of the gene, or expression of modifier gene(s) (Carinci et al. 2003). A recently proposed concept holds that phenotypic effects in most single-gene defects may result from the combined actions of oligo-locus alleles (Badano and Katsanis 2002).

Among the seven candidate genes analyzed for a possible association with CL/P, several SNPs in TGFB3 showed significant results, especially at the IVS1+5321 (P=0.0016) and IVS1+2118 (P=0.0024) sites (Table 1). Allele A at IVS1+5321 and allele A at IVS1+2118 were both seen more frequently in CL/P patients than allele G and allele T, respectively. Moreover, IVS1+5321 showed the highest odds ratio of 2.34 (95% CI=1.40–3.93) under the assumption of a recessive effect (Table 3). SNPs surrounding IVS1+5321 also showed high odds ratios under the same assumption. These results suggest that the major SNP allele in TGFB3 has a recessive effect on the risk of CL/P in the Japanese. The association at IVS1+5321 was also confirmed by both TDT (Table 2) and the case-control analysis using two-SNP haplotypes that include IVS1+5321 (Table 4). All these results indicate that TGFB3 is one of the genes linked to susceptibility to CL/P in the Japanese. Because P-values obtained by haplotype analysis were much lower than those obtained by analysis with individual SNPs, the most important SNP affecting CL/P may be located somewhere in an LD block defined by the IVS1+5321/IVS1−1572 haplotype in TGFB3.

There was a discrepancy between the case-control analysis and TDT in the results for IVS1+2118: the former indicated a significant association (P=0.0024), while the latter gave an insignificant result (P=0.7630) (Tables 1, 2). This may reflect the difference in statistical power between case-control analysis and TDT. To examine this, a power calculation was performed for 112 trios (or cases) and 192 controls using the Genetic Power Calculator (http://statgen.iop.kcl.ac.uk/gpc/; Purcell et al. 2003). When we applied a relative risk of 1.8 with a recessive effect, the estimated power was 0.62 for case-control analysis and 0.52 for TDT. The discrepancy may therefore be due to the small number of samples examined, and more subjects may need to be collected for association analysis and TDT: 180 samples for the case-control test and 220 trios for TDT would be needed to achieve 80% power at a type I error of 0.05. Population-based association studies are more sensitive than family-based studies such as TDT; however, if a positive association is obtained from a family-based study, it provides strong evidence (Mitchell et al. 2002).
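To make the family-based test discussed above concrete, the sketch below implements the classic single-allele TDT, a McNemar-type chi-square statistic computed from heterozygous parents only. This is a simplification offered as an assumption: FBAT, the software used in this study, implements a more general family-based framework, and the function name here is hypothetical.

```python
import math
from typing import Tuple


def tdt(transmitted: int, untransmitted: int) -> Tuple[float, float]:
    """Classic allelic transmission disequilibrium test.

    transmitted:   number of heterozygous parents who transmitted the test allele
                   to the affected child
    untransmitted: number of heterozygous parents who transmitted the other allele
    Returns (chi-square statistic with 1 df, P-value).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Because only heterozygous parents contribute, the effective sample size of the TDT can be much smaller than the number of trios collected, which is one reason its power can fall below that of a case-control comparison on the same patients.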

There have been many association studies on TGFB3 (Lidral et al. 1997; Tanabe et al. 2000; Sato et al. 2001; Beaty et al. 2002; Kim et al. 2003). Among them, three reports showed a positive association (Sato et al. 2001; Beaty et al. 2002; Kim et al. 2003), while the others obtained negative results. Beaty et al. (2002) revealed a significant association between the D14S61 locus, which is 100 kb from the 3′ end of TGFB3, and nonsyndromic CPO in Caucasian, African American, and other populations by TDT. In the Japanese population, Sato et al. (2001) revealed a positive association between nonsyndromic CL/P and a CA repeat polymorphic marker 60 kb from the 5′ end of TGFB3 (Lidral et al. 1997). Kim et al. (2003) reported that allele G at the IVS5+104 site in TGFB3 increased the risk (odds ratio=15.92) of nonsyndromic CL/P in the Korean population, although their finding conflicts with the result of our case-control analysis in the Japanese (Table 1). To our knowledge, our study is the first to report a positive association between CL/P and TGFB3 SNPs reproduced by both population- and family-based analyses. Therefore, the results of our study are expected to be reliable and reproducible among Japanese CL/P patients.

Regarding the five genes examined in addition to PAX9 and TGFB3 (DLX3, CLPTM1, TBX10, PVRL1, and TBX22), mutations have been reported in some syndromic oral clefting patients, such as in familial CLPED1 patients from Margarita Island, Israel and from Brazil (Suzuki et al. 2000) as well as in patients with X-linked cleft palate with ankyloglossia (CPX) and CPO (Braybrook et al. 2001; Marcano et al. 2004). However, in the present study, no causative mutation or positive association was observed between oral clefts and individual SNPs in these genes. Because TBX22 maps to the X-chromosome, we performed statistical calculations for males and females individually, but no significant association was observed in either case. It is likely that these genes are not major factors playing a role in oral clefts in the Japanese.

In conclusion, we identified a novel mutation in PAX9 that may contribute to the development of nonsyndromic CL/P. We also demonstrated, by both population- and family-based analyses, a positive association between TGFB3 and nonsyndromic CL/P in the Japanese. Our results will assist in understanding the development and prevention of CL/P and CPO.