Introduction

The mode of inheritance of nonsyndromic cleft lip with or without cleft palate (CL±P) has recently been the subject of considerable investigation. Most researchers have concluded that their results were consistent with major locus or oligogenic inheritance of the trait [reviewed in 1; also see 2, 3] although others have argued that the traditional polygenic model of inheritance is still a strong contender [e.g., see 4]. If inheritance is polygenic, it may be difficult or impossible to locate the relevant genes using linkage or association methods; if inheritance involves one primary locus or only a few loci, it should be possible to locate all or most of the loci using these methods.

In 1989, Ardinger et al. [5] published a landmark paper demonstrating significant association between CL±P and a TaqI RFLP at the transforming growth factor alpha (TGFA) locus: the C2 allele frequency was 14% in patients compared with 5% in controls. They had examined this locus because of its demonstrated role in palate development in the mouse [6, 7]. Subsequently, three of four additional association studies have demonstrated a similar and significant relationship between CL±P and the TaqI TGFA RFLP [8–10], the exception being the investigation of Qian et al. [11] and Stoll et al. [12, 13]. However, both of two family linkage studies have significantly excluded close linkage and failed to obtain any evidence in support of linkage between CL±P and TGFA [14, 15]. This is very reminiscent of the results of association/linkage studies between insulin-dependent diabetes and insulin gene region markers: association has been proven beyond reasonable doubt, while linkage has yet to be demonstrated, even using the same families that reveal the association [for review, see 16].

We undertook the present study to determine if association between CL±P and TGFA was demonstrable in subjects from West Bengal, India. Multiple-affected (‘multiplex’) families were also analyzed for genetic linkage between CL±P and TGFA. Since our quantity of DNA was limited, we analyzed a PCR-based single-strand conformation polymorphism (SSCP) marker in the TGFA region rather than RFLPs.

Materials and Methods

Subjects

Fourteen extended families with multiple nonsyndromic CL±P members were ascertained in West Bengal, India, and examined by the authors. None of the affected individuals exhibited features suggestive of hereditary syndromes which include clefting. Whenever possible, multiplex families were selected which contained distantly related affected individuals such as first or second cousins, since simulation studies prior to fieldwork had demonstrated that these family structures provided maximal information for linkage analysis using highly polymorphic markers. The genetic relationships between the affected persons in the 14 families obtained were as follows: 5 families with affected siblings (3 with 2 affected siblings, 1 with 3 affected siblings, and 1 with 3 affected siblings plus an affected third cousin), 1 with an affected parent and affected child (a second affected child had died), 1 with an affected uncle-nephew pair, 2 with affected first cousins, and 5 families in which the affected persons were more distantly related than first cousins (e.g. first cousins once-removed, second cousins, etc.).

To test for association between TGFA and CL±P, all 34 individuals affected with cleft lip with or without cleft palate were compared to a control sample of 38 unaffected family members. The control sample was composed of people unrelated to each other: it included one random unaffected family member per pedigree (e.g. sibling of proband) and all individuals marrying into that pedigree who were unrelated to that unaffected person (e.g. aunts and uncles by marriage).

TGFA Typing

DNA was extracted from blood using modifications of previously described methods [17–19]. Primer sequences for the TGFA SSCP termed ‘K’ were obtained from Dr. Jeff Murray and have since been published [20]. This primer pair amplifies a 345-bp product in the 3′ untranslated region of TGFA. PCR amplifications were performed as described by Weber and May 1989 [21] with minor modifications. Reaction volumes of 15 µl contained 25 ng genomic DNA, 200 µM each dATP, dGTP, and dTTP, 2.5 µM dCTP, 50 mM KCl, 10 mM Tris (pH 8.3), 1.5 mM MgCl2, 0.05% Tween 20, 0.05% NP-40, 170 µg/ml BSA, 15 pmol primers, 0.8 µCi 3,000 [alpha-32P]dCTP and 0.5 units Taq polymerase. Samples were processed through 30 cycles each consisting of denaturation at 94°C for 1 min, annealing at 55°C for 2 min, and primer extension at 72°C for 1 min, preceded by an initial denaturation of 7 min at 94°C and terminated by a final elongation of 7 min at 72°C. PCR products were diluted 1:2 with standard formamide dye, denatured at 94°C for 1–2 min, loaded in 2-µl aliquots onto SSCP gels consisting of 5% acrylamide:bis-acrylamide (49:1), 10% glycerol and 1 × TBE, and separated at 20 W for 5 h at room temperature using a fan for cooling. Gels were then transferred to filter paper, dried, and exposed to X-ray film overnight.

Statistical Analysis: Association

Association analysis was performed by comparing the frequencies of the SSCP TGFA alleles in all the CL±P subjects (or subgroups of these) and the control group of unrelated family members. The significance of the observed differences was determined by χ2 test and (since sample sizes were often small) also by Fisher’s exact test, using the Statistical Analysis System (SAS) software [22].
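The two tests can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts are made up (they are not the table 1 data), the function names are ours, and the analyses in this study were run in SAS rather than with this code. The χ2 p value uses a closed form that holds only for even degrees of freedom, which covers the 2 and 4 d.f. comparisons reported here. The exact test is the usual extension of Fisher's test to a 2 × c table, summing the probabilities of all tables with the observed margins that are no more probable than the observed one.

```python
from math import comb, exp, factorial


def chi_square(table):
    """Pearson chi-square statistic and d.f. for an r x c table of allele counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_p_even_df(stat, df):
    """Upper-tail p value; this closed form is valid only for even d.f."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive, even d.f.")
    half = stat / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


def _first_rows(col_totals, remaining):
    """Yield every possible first row with the given column limits and row total."""
    if len(col_totals) == 1:
        if remaining <= col_totals[0]:
            yield (remaining,)
        return
    for x in range(min(col_totals[0], remaining) + 1):
        for rest in _first_rows(col_totals[1:], remaining - x):
            yield (x,) + rest


def exact_test_2xc(table):
    """Two-sided exact p value for a 2 x c table with fixed margins."""
    row1, row2 = table
    col_totals = [a + b for a, b in zip(row1, row2)]
    row1_total = sum(row1)

    def weight(first_row):
        # Numerator of the multivariate hypergeometric probability (an integer).
        w = 1
        for x, c in zip(first_row, col_totals):
            w *= comb(c, x)
        return w

    observed = weight(row1)
    tail = sum(w for w in map(weight, _first_rows(col_totals, row1_total))
               if w <= observed)
    return tail / comb(sum(col_totals), row1_total)


# Hypothetical allele counts (alleles 1, 2, 3); rows are controls and affected.
example = [[10, 20, 10], [5, 30, 5]]
stat, df = chi_square(example)
print(stat, df, chi_square_p_even_df(stat, df), exact_test_2xc(example))
```

Comparing integer weights rather than floating-point probabilities avoids rounding problems when deciding which tables are "as extreme as" the observed one.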

To address possible inadequacies of standard association analysis when applied to family data, we also performed an alternate method of association analysis called AFBAC (for ‘affected-family based controls’), wherein each parental TGFA allele was counted only once [23, 16]. Alleles occurring in affected persons in a family (the ‘affected’ group of alleles) were compared with alleles not occurring in affected persons (the ‘control’ group of alleles) by χ2 test. The ‘affected’ group of alleles was also subdivided by whether they occurred only in persons with CL, only in persons with CL+P, or in both types of persons within a family.
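The counting rule can be sketched as follows. The sketch assumes that an earlier step (not shown) has already worked out, for each parental allele in a family, which cleft types occur among the affected relatives carrying that allele; the family in the example is hypothetical, and the function and group names are ours rather than part of any published AFBAC software.

```python
from collections import Counter

CLEFT_TYPES = {"CL", "CL+P"}


def afbac_group(carrier_types):
    """Assign one parental allele to an AFBAC group.

    carrier_types is the set of cleft types seen among affected family
    members who carry this parental allele (empty if none carry it).
    """
    carrier_types = set(carrier_types)
    if not carrier_types <= CLEFT_TYPES:
        raise ValueError(f"unknown cleft type in {carrier_types}")
    if not carrier_types:
        return "control"
    if carrier_types == {"CL"}:
        return "CL only"
    if carrier_types == {"CL+P"}:
        return "CL+P only"
    return "both"


def afbac_tally(parental_alleles):
    """Count every parental allele exactly once, keyed by (group, allele)."""
    tally = Counter()
    for allele, carrier_types in parental_alleles:
        tally[(afbac_group(carrier_types), allele)] += 1
    return tally


# Hypothetical family: four parental alleles with their affected carriers.
family = [
    (2, {"CL"}),          # carried only by a relative with CL
    (1, set()),           # carried by no affected relative -> control
    (3, {"CL+P"}),        # carried only by a relative with CL+P
    (1, {"CL", "CL+P"}),  # carried by relatives of both types
]
print(afbac_tally(family))
```

Because each parental allele enters the tally once regardless of how many affected relatives carry it, the tallies pooled over families can be passed to a χ2 test without double counting.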

Statistical Analysis: Linkage

Linkage analysis was performed using an autosomal dominant model with reduced penetrance (0, 0.4, 0.4) for the inheritance of CL±P, suggested by our previous segregation analyses [3]. Lod scores were calculated at recombination fractions of 0.0, 0.01, 0.05, 0.10, 0.20, 0.30 and 0.40 using the program MLINK in the LINKAGE program package [24] version 5.1.
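For readers unfamiliar with lod scores, the idea can be shown with the simplest textbook case: phase-known, fully informative meioses with complete penetrance, where the lod score is the log10 ratio of the likelihood at recombination fraction θ to the likelihood at θ = 0.5. This is a deliberately simplified sketch and is not what MLINK computes for extended pedigrees under the reduced-penetrance model used here; the meiosis counts are hypothetical and the function name is ours.

```python
from math import inf, log10

# Recombination fractions at which lod scores were evaluated in this study.
THETAS = (0.0, 0.01, 0.05, 0.10, 0.20, 0.30, 0.40)


def lod_phase_known(theta, recombinants, nonrecombinants):
    """log10 [ L(theta) / L(0.5) ] for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    n = recombinants + nonrecombinants
    if theta == 0.0:
        # A single recombinant is impossible at theta = 0.
        return -inf if recombinants > 0 else n * log10(2)
    return (recombinants * log10(theta)
            + nonrecombinants * log10(1 - theta)
            + n * log10(2))


# Hypothetical example: 2 recombinants among 10 informative meioses.
for theta in THETAS:
    print(theta, lod_phase_known(theta, 2, 8))
```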

Results and Discussion

Table 1 presents the TGFA SSCP-K allele frequencies in the sample of unaffected individuals and in the affected individuals, the latter as a whole and also subdivided by presence/absence of palatal involvement (CL+P or CL only) and by unilateral/bilateral involvement of the cleft lip. p values are given both for the χ2 test and Fisher’s exact test. Unaffected individuals did not have TGFA frequencies significantly different from all affected individuals considered together (χ2 with 2 d.f., p = 0.390). However, when affected people were subgrouped into those with CL only and those with CL+P, there was significant heterogeneity among these three groups (CL, CL+P, controls: χ2 with 4 d.f., p = 0.002). Further inspection showed that those with CL only had a higher frequency of SSCP-K allele 2 and lower frequencies of alleles 1 and 3 than the unaffected, a significant difference (χ2 with 2 d.f., p = 0.008; Fisher’s exact p = 0.007), while those with CL+P had higher frequencies of alleles 1 and 3, and a lower frequency of allele 2 than the unaffected, a nonsignificant difference (χ2 with 2 d.f., p = 0.198; Fisher’s exact p = 0.197). In other words, the TGFA SSCP-K frequencies in individuals with CL only and in those with CL+P deviated in opposite directions; thus, the allele frequency differences between them were highly significant (χ2 with 2 d.f., p = 0.0002; Fisher’s exact p = 0.00008).

Table 1 TGFA SSCP-K allele frequencies in unaffected and affected individuals

We were concerned that the results of the association analysis could be biased by the nature of the sample. Firstly, affected family members were often related to each other and so could represent duplicated information. Secondly, spurious association between TGFA alleles and type of cleft could result merely from familial aggregation of both marker alleles and cleft types. Thirdly, a control sample of unaffected relatives tends to be overly conservative since relatives are more likely than the general population to share alleles with the affected persons. To circumvent these concerns, we performed AFBAC association analysis. In AFBAC analysis, each independent parental TGFA allele is counted only once, so there is no redundancy in counting due to the occurrence of multiple affected persons within families, or due to aggregation of affected with particular clefting types within families. Also, the AFBAC ‘control’ alleles (those alleles in the families but not in any affected members) are representative of a random sample of alleles from the general population and therefore are not overly conservative.

Table 2 shows the results of the AFBAC analysis. The total number of alleles in each AFBAC group was usually smaller than in the association analysis based on affected/unaffected individuals (table 1), reflecting the reduction of redundancy in counting alleles. However, for the CL+P group there are 3 more alleles in the AFBAC analysis; these extra alleles resulted from inferring the genotypes of several untyped CL+P individuals. The SSCP-K allele frequencies in control, CL and CL+P AFBAC groups shown in table 2 are strikingly similar to the frequencies presented in table 1, suggesting that the standard association analysis based on individuals from the families (rather than alleles, as in AFBAC) was not greatly biased. The TGFA frequencies among the ‘control’ group of alleles (alleles not occurring in any affected person in the family) were not significantly different from the TGFA frequencies among the ‘affected’ group of alleles (those occurring in any affected person in the family) (χ2 with 2 d.f., p = 0.581). However, ‘control’ group TGFA frequencies differed significantly from CL group frequencies (χ2 with 2 d.f., p = 0.024), and CL group TGFA frequencies differed strongly from CL+P group frequencies (χ2 with 2 d.f., p = 0.002). In summary, the statistical significance of comparisons in the AFBAC analysis (table 2) paralleled that of comparisons based on analysis of affected/unaffected individuals (table 1).

Table 2 TGFA SSCP-K allele frequencies in AFBAC analysis groups

Previous association studies of TGFA and CL±P have utilized RFLPs detected by several enzymes. The RFLP most consistently associated with CL±P has been that detected by TaqI, with affected individuals having an elevated frequency of the C2 allele. This allele is in very strong linkage disequilibrium with allele 3 of the SSCP-K system, while the TaqI C1 allele is found with SSCP-K alleles 1 and 2 [J. Murray, pers. commun.]. Thus, to be consistent with results of other investigations, our study should have revealed a higher frequency of SSCP-K allele 3 in affected individuals than in unaffected controls. However, we found an increase of this allele only in affected persons with CL+P (0.18) (table 1). Our sample of CL±P subjects from India is characterized by a CL+P to CL ratio of about 1:2, while in most studies this ratio is on the order of 2:1. Elsewhere we have hypothesized that this aberrant ratio is due to a high mortality of CL+P infants in rural India as a result of problems with breast feeding and inadequate medical services [3]. Our lower proportion of CL+P individuals, who have a higher SSCP-K allele 3 frequency, could account for our not observing the significant differences between affected and unaffected individuals seen in most other studies using RFLP data.

Linkage analysis between CL±P and the TGFA SSCP in the 14 extended multiplex pedigrees produced a non-significant maximum lod score of 0.13 at a recombination fraction of 0.20 (the maximum lod score must be at least 3.0 to constitute significant evidence for linkage). The lod score was <−2 at a recombination fraction of 0.0, which significantly excludes linkage with no recombination. Thus, our multiplex families provided little evidence in support of linkage between TGFA and a locus conferring predisposition to CL±P.
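The way such a result is assembled can be sketched as follows: lod scores from independent families are added at each recombination fraction, and the totals are read against the thresholds quoted above (3.0 or more for significant linkage, −2 or less for exclusion). The per-family values below are invented for illustration and are not the lod scores of our pedigrees; the function names are ours.

```python
THETAS = (0.0, 0.01, 0.05, 0.10, 0.20, 0.30, 0.40)


def total_lods(family_lods):
    """Add lod scores over families, separately at each recombination fraction."""
    return [sum(scores) for scores in zip(*family_lods)]


def interpret(lod):
    """Read a total lod score against the conventional thresholds."""
    if lod >= 3.0:
        return "significant evidence for linkage"
    if lod <= -2.0:
        return "linkage excluded"
    return "inconclusive"


def summarize(family_lods, thetas=THETAS):
    """Return the maximum total lod, where it occurs, and a call per theta."""
    totals = total_lods(family_lods)
    best = max(range(len(totals)), key=totals.__getitem__)
    return {
        "max_lod": totals[best],
        "theta_at_max": thetas[best],
        "calls": {t: interpret(z) for t, z in zip(thetas, totals)},
    }


# Hypothetical lod scores for two families, one value per entry of THETAS.
families = [
    [-1.5, -0.6, -0.1, 0.1, 0.2, 0.1, 0.0],
    [-1.0, -0.4, -0.1, 0.0, 0.1, 0.1, 0.0],
]
print(summarize(families))
```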

Subdivision of CL±P individuals by whether or not they had cleft palate (CL versus CL+P) revealed important TGFA differences not previously reported by other investigators. These differences strongly suggest that the TGFA locus is not acting on predisposition to the CL±P trait per se, but on expression (severity) of the trait. In other words, rather than being a ‘disease’ locus with a mutant allele necessary for the occurrence of the defect, TGFA is primarily a ‘modifying’ locus that alters expression at the as yet unidentified major disease locus. This could account for the inability/difficulty of demonstrating genetic linkage between CL±P and TGFA. Family R002 illustrates both the lack of linkage between TGFA and CL±P, and the association of specific TGFA alleles with expression of the CL±P trait. In this completely informative mating, the parents have TGFA genotypes 21 and 31, a CL child has genotype 21 and a CL+P child has genotype 31. Therefore, the affected children share no TGFA alleles (evidence against linkage), yet the CL child has allele 2 (associated with CL) and the CL+P child has allele 3 (associated with CL+P).
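The reasoning for family R002 can be checked mechanically. Although both children carry an allele 1, the child with genotype 21 must have received allele 2 from the 21 parent and allele 1 from the 31 parent, whereas the child with genotype 31 must have received allele 3 from the 31 parent and allele 1 from the 21 parent, so no parental allele is shared by descent. The sketch below assumes a nuclear family with two heterozygous parents and unambiguous transmission; the function names are ours, and the labels "parent A" and "parent B" are arbitrary.

```python
def parental_origins(parent_a, parent_b, child):
    """All (allele from A, allele from B) pairs consistent with the child's genotype."""
    first, second = child
    return {
        (from_a, from_b)
        for from_a, from_b in ((first, second), (second, first))
        if from_a in parent_a and from_b in parent_b
    }


def alleles_shared_by_descent(parent_a, parent_b, sib1, sib2):
    """Number of parental alleles (0, 1 or 2) that two sibs share by descent."""
    for parent in (parent_a, parent_b):
        if parent[0] == parent[1]:
            raise ValueError("sketch requires heterozygous parents")
    origins1 = parental_origins(parent_a, parent_b, sib1)
    origins2 = parental_origins(parent_a, parent_b, sib2)
    if len(origins1) != 1 or len(origins2) != 1:
        raise ValueError("transmission is ambiguous or inconsistent")
    a1, b1 = next(iter(origins1))
    a2, b2 = next(iter(origins2))
    # With heterozygous parents, equal labels mean the same parental allele.
    return int(a1 == a2) + int(b1 == b2)


# Family R002: parents 21 and 31, CL child 21, CL+P child 31.
shared = alleles_shared_by_descent((2, 1), (3, 1), (2, 1), (3, 1))
print(shared)
```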

It is not clear what aspect of expression is influenced by the TGFA locus. In our sample, presence or absence of palate involvement was associated with significant TGFA frequency differences (χ2 p = 0.0002; Fisher’s exact p = 0.00008), while there was only a borderline significant association between TGFA and severity of expression as assessed by uni/bilateral affection (table 1: χ2 with 2 d.f., p = 0.059; Fisher’s exact p = 0.057). Bilateral clefting tends to occur more often in those with CL+P than in those with CL only (27% versus 9% in our sample). Among the affected individuals in our sample, there were two inbred children from different consanguineous matings in which the parental genotyping showed the opportunity for TGFA region homozygosity in the children. However, both of these affected inbred children were heterozygous at TGFA, suggesting that the TGFA region genes act on CL±P expression in a non-recessive manner.

There are no previous reports of TGFA RFLP frequencies in CL±P individuals subdivided by presence or absence of palate involvement. Stoll et al. [12, 13] reported no association between CL±P and either the TaqI or BamHI TGFA RFLP, but found a significant relation between the BamHI RFLP and bilateral versus unilateral affection. Thus, their BamHI result is similar to our SSCP result in that it provides evidence for TGFA effects on expression of, rather than predisposition to, CL±P. Stoll et al. [12, 13] also examined TaqI and BamHI TGFA RFLPs in subjects with cleft palate only (CP), a disorder which by family studies has been shown to be genetically distinct from CL±P, but they found no significant differences between CP and control subjects. Shiang et al. [25, 26] have performed case/control studies of the TGFA SSCP-K in unrelated subjects affected either with CL±P or with isolated cleft palate (CP). They found significant association of the SSCP-K allele 3 with both CL±P and CP. Their result is consistent with our finding of a higher frequency of SSCP-K allele 3 in patients with CL+P than in patients with CL only. Their result also supports our conclusion that TGFA is a modifying locus and not a ‘necessary’ CL±P disease locus. Thus, TGFA may modify expression of both the CL±P trait and the genetically distinct CP trait. It is of interest that TGFA was first suggested as a candidate locus for human CL±P due to its demonstrated role in mouse palate development.

In conclusion, the results of the present study strongly suggest that a locus in the region of TGFA (probably TGFA itself) modifies expression/severity of the CL±P trait, while predisposition to CL±P per se is controlled by a major locus located elsewhere in the genome. The fact that the effects of a minor modifying locus are detectable by association analysis provides encouragement to our continuing efforts to identify the major locus.