Introduction

Oral clefts are clinically and genetically heterogeneous disorders, involving both genetic and environmental factors. MSX1 and TGFB3 have emerged as candidate genes for oral cleft, based on expression studies1, 2 and animal models.3, 4 Their involvement is further supported by linkage and association studies in populations of different ethnic origin.5, 6, 7, 8 Although no deleterious mutation has been reported in TGFB3,7, 9 MSX1 has been suggested to function upstream of TGFB3 in the development of oral cleft.10

The aim of the current research was to investigate the genetic association of MSX1 and TGFB3 in a Malay population with non-syndromic orofacial cleft. The association study was performed using intragenic CA markers for MSX1 and TGFB3. The coding regions of MSX1 were also directly sequenced.

Materials and methods

Subjects

A total of 124 Malay families were enrolled in the study by the Reconstructive Sciences Unit of Hospital Universiti Sains Malaysia, Kelantan. Of these families, 90 complete case–parent trios had non-syndromic cleft of the lip, with or without cleft palate (CL/P), 22 incomplete case–parent trios also had non-syndromic CL/P but were missing a sample from at least one parent, and 12 families had CP only (CPO). Among all cleft cases, 41 families had a positive family history of oral cleft. Only complete families were included in genotyping of the CA markers, whereas all 124 affected cases were included in direct sequencing. All subjects have been recruited prospectively and consecutively since 2008. Our center is a tertiary referral center, and cases with variable types of orofacial clefts were referred to us from throughout the Kelantan state. Kelantan is bordered to the north by Narathiwat Province of Thailand, thus increasing the likelihood of some population mixture. However, 95% of Kelantanese, and the majority of the Narathiwat Province population, have a Malay ethnic background. Patients with a minor anomaly, such as low-set ears, mild hypertelorism, clinodactyly or single palmar crease, were included in the study. Patients identified during clinical assessment by a team of specialists in the fields of craniofacial surgery, pediatrics and genetics to have heart disease, a known syndrome or other major or more than two minor defects were excluded from this study. In total, 100 healthy volunteers, with neither cleft nor a family history of cleft, were used as controls. The study was approved by the Human Ethics Committee of Universiti Sains Malaysia, and informed consent was obtained from subjects before blood samples were taken.

Genotyping of MSX1-CA and TGFB3-CA repeats

In total, 90 patient–parent trios with non-syndromic CL/P, excluding CP only and incomplete CL/P families, were included for genotyping of MSX1-CA and TGFB3-CA repeats. The amplification of both intragenic CA repeats was performed using previously described primers for both MSX1-CA9 and TGFB3-CA repeats,11 with minor modifications. Multiplex-PCR amplification using fluorescent primers was performed using the Type-It Microsatellite Amplification Kit (QIAGEN, GmbH, Hilden, Germany). The PCR products were run on an ABI 3100 sequencer (Applied Biosystems, Foster City, CA, USA) and analyzed using GeneScan v3.0 (Applied Biosystems). Direct DNA sequencing was conducted to determine the size of the MSX1-CA repeat allele.
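The relationship between fragment size and repeat number can be sketched in a few lines. The five MSX1-CA allele sizes and repeat counts reported in the Results differ by exactly 2 bp per CA unit, which implies a constant non-repeat portion of 166 bp in the amplicon. That flank length is inferred here from the reported values and is an assumption, not a figure stated in the methods; the function name is also hypothetical.

```python
# Allele sizes (bp) and CA repeat counts as reported in the Results for MSX1-CA.
REPORTED_ALLELES = {190: 12, 188: 11, 186: 10, 184: 9, 182: 8}

REPEAT_UNIT_BP = 2  # one CA dinucleotide
# Assumption: non-repeat part of the amplicon, inferred from 190 bp = 12 repeats.
FLANK_BP = 190 - 12 * REPEAT_UNIT_BP  # 166


def ca_repeats(fragment_bp: int) -> int:
    """Convert a fragment size in bp to a CA repeat count."""
    extra = fragment_bp - FLANK_BP
    if extra < 0 or extra % REPEAT_UNIT_BP != 0:
        raise ValueError(f"{fragment_bp} bp is not a whole number of CA repeats")
    return extra // REPEAT_UNIT_BP


if __name__ == "__main__":
    for size_bp, reported in sorted(REPORTED_ALLELES.items(), reverse=True):
        print(size_bp, ca_repeats(size_bp), reported)
```

A fragment whose size does not fit the 2-bp ladder raises an error instead of being rounded, which is one reason sizing calls are usually confirmed by sequencing, as was done here.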

Mutation screening of MSX1

All 124 affected individuals with oral cleft (12 cases with CP only and 112 cases with CL/P) were included for mutation screening of MSX1. Specific primers were designed, via the Primer3 program (v. 0.4.0),12 to cover the coding regions of MSX1. Both exons were amplified by standard PCR using the following primers: exon-1 (forward: 5′-GCCAGTGCTGCGGCAGAA-3′, reverse: 5′-CGCCTGGGTTCTGGCTACTCACT-3′) and exon-2 (forward: 5′-TGATCATGCTCCAATGCTTCT-3′, reverse: 5′-ACCAGGGCTGGAGGAATC-3′). The amplification was carried out in a total volume of 25 μl, containing 40 ng of genomic DNA, 2% Q-solution (QIAGEN), 0.5 μM of each primer, 0.2 mM dNTPs, 2 mM MgCl2, 1.25 U of AmpliTaq Gold polymerase (Applied Biosystems) and 1 × PCR buffer supplied by the manufacturer. The Q-solution was applied only for the amplification of exon 1. The PCR conditions were as follows: initial denaturation for 10 min at 95 °C; followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 61 °C for 2 min (exon 2) or 64 °C for 2 min (exon 1) and elongation at 72 °C for 1 min; and final elongation at 72 °C for 10 min.
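As a compact restatement of the cycling conditions above, the program can be written as data and its total programmed hold time computed. This is only a sketch: the class and function names are hypothetical, and the total excludes ramp times between temperatures, which depend on the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Annealing temperatures per exon, as stated in the text.
ANNEAL_C = {"exon1": 64.0, "exon2": 61.0}
CYCLES = 35


def cycling_program(exon: str):
    """Return (initial step, per-cycle steps, final step) for one exon."""
    initial = Step("initial denaturation", 95.0, 10 * 60)
    cycle = [
        Step("denaturation", 95.0, 30),
        Step("annealing", ANNEAL_C[exon], 2 * 60),
        Step("elongation", 72.0, 60),
    ]
    final = Step("final elongation", 72.0, 10 * 60)
    return initial, cycle, final


def hold_minutes(exon: str) -> float:
    """Total programmed hold time in minutes, ignoring ramp times."""
    initial, cycle, final = cycling_program(exon)
    total_s = initial.seconds + CYCLES * sum(s.seconds for s in cycle) + final.seconds
    return total_s / 60


if __name__ == "__main__":
    for exon in ("exon1", "exon2"):
        print(exon, hold_minutes(exon), "min")
```

Both exons share the same step durations, so the programmed time is identical and only the annealing temperature differs.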

The PCR products were purified using the QIAquick PCR purification kit (QIAGEN), sequenced directly using the BigDye terminator V3.1 sequencing standard kit (Applied Biosystems), as recommended by the manufacturer, and run on an automated ABI Prism 3100 Genetic Analyzer (Applied Biosystems). DNA sequences were analyzed using BLAST with reference sequences for MSX1 cDNA (NM_002448.3) and genomic DNA (NT_006051.18). The samples with variations were resequenced with both forward and reverse primers.

Bioinformatics analysis

The potential consequence of each variant on the MSX1 protein structure was predicted by PolyPhen (http://www.bork.embl-heidelberg.de/PolyPhen). The multiple alignments were performed using the ClustalX program version 2.0.12 (University College Dublin, Dublin, Ireland). The Homo sapiens MSX1 protein sequence (NP_002439) was aligned with that of the chimpanzee (Pan troglodytes; XP_517087.2), pig (Sus scrofa; NP_001156359.1), mouse (Mus musculus; NP_034965.2), rat (Rattus norvegicus; NP_112321.1), cow (Bos taurus; NP_777223.1) and chicken (Gallus gallus; NP_990819.1). The ESEfinder program was used to predict exonic splicing enhancers (http://rulai.cshl.edu/tools/ESE2/).

Statistical analysis

The χ2-test, odds ratio (OR) and 95% confidence interval (CI) were calculated using an online program, available at http://statpages.org/, to compare results between affected patients and controls. The standard transmission disequilibrium test and linkage disequilibrium between markers were computed using the FAMHAP program version 19 beta (University of Bonn, Bonn, Germany). Subsequently, we used a log-linear model to estimate the single- and double-dose effects of alleles using the HAPLIN program, version 3.5.13 This software uses a previously introduced model14 to compute both the relative risk (RR) and the likelihood ratio. We used a reciprocal reference with the threshold set to 0.01 (1%) for haplotype frequency.
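For readers who wish to reproduce a case–control allele comparison of this kind, a minimal sketch follows. It assumes a 2 × 2 table of allele counts, an uncorrected Pearson χ2 statistic with one degree of freedom, and a Woolf (log-based) 95% CI for the OR; the online calculator used in the study may apply a different method. The function name and the counts in the usage example are hypothetical and are not the study's data.

```python
import math


def allele_association(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Pearson chi-square (1 df, no continuity correction), odds ratio and
    Woolf 95% CI for a 2x2 table of allele counts. All cells must be > 0."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return chi2, p_value, odds_ratio, (ci_low, ci_high)


if __name__ == "__main__":
    # Hypothetical allele counts (variant, reference) in cases and controls.
    chi2, p, odds, ci = allele_association(60, 164, 28, 172)
    print(f"chi2={chi2:.3f} P={p:.4f} OR={odds:.3f} CI={ci[0]:.3f}-{ci[1]:.3f}")
```

Counting alleles rather than individuals doubles the sample size per subject, which is why 100 controls correspond to 200 control chromosomes.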

Results

A total of 90 trios with non-syndromic CL/P were analyzed for CA repeats in the MSX1 and TGFB3 genes. Both markers were found to be in Hardy–Weinberg equilibrium (P>0.05). Five alleles were identified for MSX1-CA: allele 1, 12 CA repeats (190 bp); allele 2, 11 CA repeats (188 bp); allele 3, 10 CA repeats (186 bp); allele 4, 9 CA repeats (184 bp); and allele 5, 8 CA repeats (182 bp). Allele 4 of the MSX1-CA repeat was the most common. TGFB3-CA was less polymorphic: three alleles containing 163, 165 or 167 base pairs were identified. The results of the transmission disequilibrium test did not indicate preferential transmission for either the MSX1-CA or the TGFB3-CA markers. When the analysis was restricted to maternal or paternal transmission, TGFB3-CA exhibited a trend toward maternal transmission (P=0.058). The 163-bp allele was overtransmitted (P=0.04, transmission OR=2.1), whereas the 165-bp allele was undertransmitted (P=0.06, transmission OR=0.52) from heterozygous mothers to the affected patients. A maternal double dose of the 163-bp allele was shown to have a slightly higher RR (RR=1.18, 95% CI, 0.534–2.61) compared with the 165-bp allele (RR=0.843, 95% CI, 0.39–1.89), but this difference did not reach statistical significance. Furthermore, a fetal double dose of the 163-bp allele (RR=1.81, 95% CI, 0.814–4.04) demonstrated a higher RR compared with the 165-bp allele (RR=0.553, 95% CI, 0.249–1.26), consistent with a maternal genotype effect.
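The logic of a per-allele transmission test can be illustrated with the classical McNemar-type statistic, in which only heterozygous parents are informative. This is a sketch under stated assumptions: FAMHAP may compute its statistics differently, the function name is hypothetical, and the counts below are hypothetical values chosen for illustration, not the counts observed in this study.

```python
import math


def tdt(transmitted: int, untransmitted: int):
    """Classical TDT for one allele in heterozygous parents.

    Returns the chi-square statistic (1 df), its P value and the
    transmission odds ratio T/U."""
    t, u = transmitted, untransmitted
    if t + u == 0 or u == 0:
        raise ValueError("need informative transmissions and a non-zero U")
    chi2 = (t - u) ** 2 / (t + u)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, t / u


if __name__ == "__main__":
    # Hypothetical counts: allele transmitted 21 times and not transmitted
    # 10 times by heterozygous mothers.
    chi2, p, or_t = tdt(21, 10)
    print(f"chi2={chi2:.3f} P={p:.3f} OR={or_t:.2f}")
```

Under the null hypothesis a heterozygous parent transmits each allele with probability 0.5, so T and U are expected to be equal and the transmission OR is expected to be 1.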

Screening of all 124 affected individuals with oral cleft (112 CL/P and 12 CP only cases) identified five variants in the coding regions of MSX1. Of these variants, two common variants in exon 1, 101C>G (A34G) and 330C>T (G110G), have been previously reported and were observed in both affected patients and controls (Table 1). The difference in allele frequency was not significant for 101C>G (P=0.813), whereas the 330C>T variant showed a significant association with cleft (P=0.001, OR=2.241, 95% CI, 1.357–3.700). The A34G variant was also found in the heterozygous state in eight control samples (Table 1).

Table 1 Comparison of specific alleles of 101C>G and 330C>T among 112 cases with isolated CL/P and 100 normal controls

Of the remaining MSX1 variants, three heterozygous, non-synonymous variants, 440C>A (P147Q), 109A>C (M37L) and 800G>C (G267A), were found. With the exception of the G267A variant being found in one control sample, these variants were not observed in 200 control chromosomes.

The patient who harbored the P147Q variant had right unilateral cleft lip with CP and mild hypertelorism. Interestingly, similar to the proband, his otherwise healthy, non-cleft sister had mild hypertelorism. The P147Q variant was not found in this sister; however, his other non-cleft sister, without hypertelorism, was heterozygous for this variant. This observation is not surprising, given that disruption of normal facial development is common among cases with orofacial clefts, and hypertelorism is commonly present to some degree in both affected cases and their non-cleft relatives.15, 16 This variant was inherited heterozygously from the non-cleft father who had neither dental anomaly nor other subphenotypes of cleft. Dental anomalies and other subphenotypes of cleft were not found in either the proband or family members.

The M37L variant was found in a patient with bilateral CLP without additional anomalies. This variant was inherited from a non-cleft father who was homozygous for this variant. In this family, siblings of the paternal grandfather had oral cleft, but samples were not available for tracing the variant. Upon medical examination of the father, neither dental abnormality nor other subphenotypes of cleft were found.

The patient with G267A had incomplete cleft lip, without other associated dysmorphic features. The variant was inherited from a heterozygous non-cleft father who had a high-arched palate and torus palatinus. There was no family history of cleft.

Discussion

Several studies using MSX1-CA and TGFB3-CA markers have found significant associations with non-syndromic orofacial cleft,7, 17, 18 whereas others have found no significant association.8, 11, 19, 20 In the present research, no transmission distortion was found in the transmission disequilibrium analysis for either MSX1-CA or TGFB3-CA intragenic markers, but TGFB3-CA exhibited a trend toward excess maternal transmission. Our findings suggest that the maternal and fetal RR results are in agreement for both TGFB3-CA alleles, as the 163-bp allele had a deleterious recessive effect, whereas the 165-bp allele had a protective recessive effect. The present results are consistent with a previous report that found a parent-of-origin effect for the TGFB3 gene among central European populations.21

Upon sequencing the MSX1 coding regions in 124 patients with oral cleft, five variants were found, including three known variants (A34G, G110G and P147Q) and two novel variants (M37L and G267A), as summarized in Table 2. Furthermore, in agreement with a previous report on Asian populations,22 the G110G variant was significantly associated with non-syndromic cleft lip with or without CP when patients were compared with normal controls.

Table 2 Summarized clinical and genetic data of patients with MSX1 variants

In addition to its well-known homeodomain, six other highly conserved domains have recently been identified within the human MSX1 protein.23 It appears that amino-acid changes within the homeodomain lead to the development of tooth agenesis with high penetrance and a dominant mode of inheritance. Almost all reported mutations within the homeodomain to date have been detected in cases with tooth agenesis.23, 24 In addition, orofacial clefts have been seen as associated anomalies among cases with tooth agenesis bearing an MSX1 mutation.25 Therefore, it appears that MSX1 has an overlapping role in tooth, lip and palate development. In spite of a significant association between non-syndromic orofacial clefts and MSX1, previous studies have not identified clear Mendelian inheritance or a genotype–phenotype correlation.17, 22, 26 In contrast to tooth agenesis, most of the MSX1 mutations that have been identified among non-syndromic cleft cases are found outside the most highly conserved homeodomain.17, 22, 26

Similar to other studies, we have found three variants without clear Mendelian inheritance, each in a single cleft case. Of these variants, only the P147Q variant, which has been detected in a number of Southeast Asian populations, is a relatively common variation in the MSX1 gene. In addition to the present study, this variant has been detected in three Vietnamese families,17 two Filipino families9 and three Thai families.26 The prevalence of this variant may reflect a founder effect in these populations. The P147Q variant was previously reported as a damaging variant;9, 17 however, evidence from a Thai population suggested that it may have no disease-causing effect because it was detected in controls.26 Although some admixture of our subjects with the Thai population was expected, we did not detect the P147Q variant in 200 control chromosomes. There is no clear correlation between genotype and phenotype in individuals carrying the P147Q variant. Consistent with the present study, it has been detected across affected and non-affected members of the same family and in unrelated healthy controls.9, 17, 26 Although this variant is located within a conserved phosphorylation motif within the MH3 domain,23 the available evidence suggests that it may not be pathogenic.26 Therefore, at best, it may contribute to clefting together with other genetic and environmental risk factors in a complex inheritance pattern.

In the present study, the M37 amino acid was substituted with leucine (another hydrophobic amino acid). Methionine is a very hydrophobic amino acid and is conserved at position 37 across species as far back as amphibians.23 Although this variant was predicted to be benign using the PolyPhen program, it is located exactly one amino acid after the polyalanine tract of MSX1 at the N-terminus, which spans residues 29 to 36 (29-AAAATAAA-36). Polyalanine tracts are found in the homeodomain of many transcription factors,27 and have been shown to be associated with several human disorders.28 Current evidence suggests that polyalanine tracts function by changing the normal folding, degradation and cytoplasmic aggregation of the mutant proteins.28 Both alanine repeats and the flanking amino acids are conserved, suggesting that both are important to the function of the polyalanine tract.28 Furthermore, the alanine at position 34 is highly conserved and located inside the polyalanine tract (29-AAAATAAA-36) (Figure 1). Similar to M37, its conservation also extends back to the amphibians.23 The A34G variant was observed in both cases and controls as a non-pathogenic variant in populations of different ethnic background.17, 22, 26 However, the M37L variant was not found in 200 control chromosomes. Furthermore, the non-cleft father who was homozygous for this variant did not exhibit subphenotypes of cleft or dental anomaly. Therefore, the M37L variant appears to be very rare and may be detectable only across a large number of subjects. Most MSX1 variants have been found in only single CL/P cases, and therefore, a large control group would be needed to obtain significant results.24

Figure 1

Multiple alignment of the human MSX1 protein with its orthologs among various species, generated by using the ClustalX program version 2.0.12. The methionine at position 37 is not located in a highly conserved block, although it is conserved among different species. The glycine at position 267 is less conserved but is located in a conserved block of amino acids. The proline at position 147 is located in a highly conserved block and is highly conserved among different species.

As in the Thai study,26 we also found an amino-acid substitution at position 267 of the MSX1 protein. In the previous study, the glycine at this position was substituted with cysteine (G267C).26 The substitution at position 267 was predicted to be possibly damaging in the Thai study but benign in the present research. This difference may be due to the fact that cysteine is a more hydrophobic amino acid than glycine and alanine. This amino acid is not highly conserved but is located in a block that is conserved among species (Figure 1). The patient with this variant had a mild form of cleft and no dental abnormality. The father of the patient had a slightly high-arched palate and torus palatinus, but not cleft lip or dental anomalies. Furthermore, this variant was found in one control sample, suggesting a very rare polymorphic site.

In conclusion, with the exception of the P147Q variant, several single, rare variants have been reported in either families or specific populations with oral clefts.17, 22, 26, 29 Oral clefts are more severe than tooth agenesis, and recent data suggest that MSX1 causative mutations for tooth agenesis may not be sufficient to cause oral clefts.29 In addition, the change in amino acid itself may not be a sufficient criterion to predict disease in complex traits such as oral clefts.24 Therefore, these rare variants could, at best, contribute to clefting as part of a complex inheritance pattern, with both additional genes and environmental factors having a role.