Introduction

Cleft lip with or without cleft palate (CL/P) is a common birth defect that represents a major public health burden, both socially and medically. The prevalence of CL/P varies from 0.6 to 1.7 per 1000 livebirths among Caucasians, with African Americans showing a lower prevalence (0.4/1000) and Japanese having a higher frequency (2.1/1000).1 Cleft lip and palate (CLP) and cleft lip (CL) comprise 45 and 25%, respectively, of all children born with an oral cleft.2 A population-based study in California showed that 62% of CL/P births had no other major malformation.3 In all, 10–15% of all CL/P cases report a positive family history.4

Despite strong familial clustering, segregation analyses have not revealed consistent evidence for any single mode of inheritance for nonsyndromic CL/P. Various studies have suggested that CL/P may follow a dominant or recessive model (with incomplete penetrance), a multifactorial threshold model and/or an oligogenic epistatic model.5,6 For example, Marazita et al,7 in a large study from England, suggested that CL/P may be controlled by an autosomal major gene with additional multifactorial contributions. Alternatively, Scapoli et al8 found evidence for a two-locus model with a dominant major gene and a recessive minor gene.

The study of CL/P is further complicated by the fact that a combination of genetic and environmental factors contributes to its etiology.9 Linkage analysis of multiplex families and association studies using either case–control or family-based designs have become primary methods for identifying potential genes for CL/P.10,11 These two approaches do not always give consistent results, however. For example, the transforming growth factor alpha (TGFA) gene, located on chromosome 2p13, is the most widely studied candidate gene in CL/P and has shown association with CL/P in a number of case–control studies, but not in all.12 Most linkage studies of TGFA using multiplex families with nonsyndromic CL/P, on the other hand, have failed to show any evidence of linkage.13,14

Linkage studies have revealed several other candidate regions, recently reviewed in detail.11 Scapoli et al15 found significant linkage disequilibrium between the GABRB3 gene and CL/P. Evidence for linkage to chromosome 1p near the 5,10-methylenetetrahydrofolate reductase locus and in the 1q21 and 1q32–42.3 regions16 has been reported for CL/P. Regions on chromosomes 6p, 2p, 4q and 17q have all shown some evidence of linkage to CL/P.17 Beiraghi et al18 found linkage to 4q in one family and Mitchell et al19 showed evidence of association to this same region, but a later study showed evidence against linkage in this region.20 Pezzetti et al21 reported a possible interaction between two regions that mapped in 6p23 and 2p13 in 38 CL/P multiplex families from Italy. However, Wong et al22 studied Swedish multiplex CL/P families, and found no evidence of linkage to selected candidate genes on chromosomes 2, 4, 6 or 19. Similarly, Marazita et al23 did not find significant linkage in any of these regions in 36 CL/P multiplex Chinese families. Stein et al24 found significant linkage with BCL3 on chromosome 19q in a fraction of their families. Wyszynski et al25 and Martinelli et al26 failed to find further evidence of linkage to this marker, but they did find a significant association for an allele at this marker using the transmission disequilibrium test (TDT).

Using affected sib-pairs in a genome-wide screen, Prescott et al27 identified 11 regions on eight chromosomes yielding nominally significant evidence of linkage (i.e. P-values <0.05). These regions were on chromosomes 1p, 2p, 6p, 8q, 11cen, 12q, 16p and Xcen-q. In the genome scan of their Chinese families, Marazita et al23 had positive multipoint results for regions on chromosomes 1, 2, 3, 4, 6, 18 and 21. Statistically significant associations using the TDT were also found on chromosomes 3, 4, 5, 6, 7, 11, 12, 16, 20 and 21 in these Chinese multiplex families.23

Identifying a genetic component involved in the etiology of CL/P remains a challenge. Results of previous linkage studies have been largely inconclusive and often contradictory, hampered by the availability of only small pedigrees, modest numbers of multiplex families, varying racial and ethnic groups and, perhaps, the unspecified role of environmental factors in the etiology of oral clefts.17 Here, we present the results of a genome-wide screen on 10 multiplex families with nonsyndromic CL/P and the subsequent fine mapping of regions on chromosome 2 in a total of 26 multiplex families.

Methods

Multiplex families

Multiplex families were recruited as part of our ongoing studies of nonsyndromic CL/P from a variety of sites, some of which have been described previously.25 In total, 10 multiplex families with sufficient biological samples for a genome-wide screen were available. Of these families, six were recruited from the Hospital Infantil de Mexico ‘Federico Gomez’,14,25 two were from Buenos Aires (Argentina) and two were from the University of Iowa. In all, 59 individuals from these 10 families were genotyped (29 affected and 30 unaffected). A total of 368 microsatellite markers with an average intermarker distance of 9 cM and an average heterozygosity of 0.76 were genotyped by the Center for Inherited Disease Research (CIDR). After the analysis of the genome-wide screen data, additional microsatellite markers were identified for fine mapping in regions showing suggestions of linkage. In all, 16 additional multiplex families were added for these fine mapping efforts: six families from Maryland, one family from Argentina and nine families from the Czech Republic. In these 26 multiplex families, a total of 137 individuals were genotyped. The mean family size was 9.2 individuals, ranging from a minimum of three family members to a maximum of 24. There were a total of 74 affected individuals (40 male: 34 female) and 169 unaffected individuals (83 male: 86 female) in these 26 families.

Statistical methods

Since there is no clear model of inheritance for CL/P, nonparametric multipoint linkage analyses were carried out using GENEHUNTER (v2.0) and ASM (v1.0).28,29 The Zlr scoring function under the exponential model was used to assess statistical significance of observed allele sharing identical-by-descent between all affected members in a family. When there are missing genotype data and/or small numbers of pedigrees, the Zlr score provides a good measure of excess allele sharing.

Under the null hypothesis of no linkage, 1000 replicate sets of these families were simulated from the framework map using the Merlin program30 and Z-all scores were computed with GENEHUNTER (v2.0) to compare against this test statistic from the observed data. This approach provided an empirical estimate of statistical significance tailored to the number of markers and the number of families considered here. Chromosome-wide empirical P-values were estimated separately for the original 10 families used in the genome-wide screen and for the 26 families used for fine mapping. In addition, single-point TDTs were carried out on the final sample of 26 families. These TDT statistics were computed using the sib-pair31 (v0.99) program for each of the 31 individual markers on chromosome 2.
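The empirical P-value described above can be sketched in a few lines of Python. This is only an illustration of the counting step, not the pipeline used in the study: the function names are hypothetical, and the null replicates here are drawn as independent standard normal scores, a crude stand-in for gene-dropping simulations in Merlin followed by rescoring in GENEHUNTER (real replicates would preserve the correlation between linked markers).

```python
import random


def empirical_p_value(observed_max, null_maxima):
    """Fraction of null replicates whose chromosome-wide maximum score
    is at least as large as the observed maximum."""
    if not null_maxima:
        raise ValueError("need at least one simulated replicate")
    exceed = sum(1 for z in null_maxima if z >= observed_max)
    return exceed / len(null_maxima)


def simulate_null_maxima(n_replicates, n_positions, seed=0):
    """Hypothetical placeholder for the simulation step.

    Assumption: each replicate is a set of independent N(0, 1) scores,
    one per map position; only the maximum per replicate is kept.
    """
    rng = random.Random(seed)
    return [
        max(rng.gauss(0.0, 1.0) for _ in range(n_positions))
        for _ in range(n_replicates)
    ]


# 1000 replicates and 31 positions mirror the numbers quoted in the text;
# 2.56 is the observed peak Zlr reported for the 26 families.
null_maxima = simulate_null_maxima(n_replicates=1000, n_positions=31)
p = empirical_p_value(2.56, null_maxima)
print(f"chromosome-wide empirical P-value (toy null): {p:.3f}")
```

Because the placeholder null ignores linkage between neighbouring markers, the printed value would not be expected to reproduce the P-values reported in this paper.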

Results

The initial genome-wide screen showed Zlr scores above 2.0 on chromosomes 2, 6, 17 and 18, and minor peaks on chromosomes 7 and 16. Subsequent fine mapping of these regions with additional markers failed to confirm evidence for linkage on chromosomes 6, 7, 16, 17 and 18. Results of mapping on chromosome 2 in the initial 10 families are shown in Figure 1. For these 10 families (Argentinean, Iowan and Mexican), there were two peaks, one on 2p at map position 26 (Zlr=3.19, chromosome-wide empirical P-value=0.058) near markers D2S262 and D2S1400 and the other on 2q at position 248 (Zlr=2.76, chromosome-wide empirical P-value=0.158) near marker D2S338.

Figure 1

NPL scores for chromosome 2 on 10 multiplex nonsyndromic CL/P families used in the genome-wide screen (25 markers).

All 26 CL/P multiplex families were used for fine mapping with 31 markers. The highest Zlr score for all families combined (thick solid line) occurred on 2q at position 247 near marker D2S338 (Zlr=2.56) (Figure 2). Marker D2S2968 on 2p (at map position 69) gave a Zlr score of 1.74. Chromosome-wide empirical P-values were calculated for both of these regions by generating 1000 replicates of these 26 families under the null hypothesis of no linkage to determine how often a Zlr score would equal or exceed the observed Zlr score merely by chance. The peak on 2q showed an empirical P-value of 0.028, whereas the empirical P-value for the peak on 2p was 0.49.

Figure 2

NPL scores for chromosome 2 on 26 multiplex nonsyndromic CL/P families using 31 markers.

Conditional linkage analysis was performed using the ASM (v1.0) program29 where families were weighted based on their nonparametric linkage (NPL) score at map position 247. Families with a positive NPL score at position 247 on chromosome 2q were assigned a weight of 1, while families with a negative NPL score were assigned a weight of zero. As seen in Figure 2, Zlr scores in the 2p region were distinctly negative in those families yielding some evidence for linkage in the 2q region (dashed line). Families with negative NPL scores in the 2q region had positive NPL scores in the 2p region (thin solid line).
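The 0/1 weighting scheme can be illustrated with a short Python sketch. It is a simplification rather than a description of how ASM works internally: the family labels and scores are invented for illustration, a score of exactly zero is assigned weight 0 (the text only covers positive and negative scores), and the combined score is a plain normalized weighted sum rather than the likelihood-based Zlr statistic.

```python
import math


def binary_weights(npl_by_family):
    """Weight 1 for families with a positive NPL score at the conditioning
    locus, 0 otherwise (assumption: a score of exactly 0 gets weight 0)."""
    return {fam: 1 if z > 0 else 0 for fam, z in npl_by_family.items()}


def weighted_combined_score(scores_by_family, weights):
    """Normalized weighted sum: sum(w_i * Z_i) / sqrt(sum(w_i ** 2))."""
    num = sum(weights[f] * z for f, z in scores_by_family.items())
    denom = math.sqrt(sum(weights[f] ** 2 for f in scores_by_family))
    if denom == 0:
        raise ValueError("all weights are zero")
    return num / denom


# Hypothetical per-family NPL scores (not data from this study).
npl_2q = {"F1": 1.2, "F2": -0.4, "F3": 0.8}   # at the conditioning locus
npl_2p = {"F1": -0.5, "F2": 1.1, "F3": -0.3}  # at the locus being tested

w_pos = binary_weights(npl_2q)                # families "linked" to 2q
w_neg = {f: 1 - w for f, w in w_pos.items()}  # the complementary subset

print(weighted_combined_score(npl_2p, w_pos))
print(weighted_combined_score(npl_2p, w_neg))
```

With these invented numbers the 2q-positive subset gives a negative combined score at the second locus and the complementary subset gives a positive one, which is the qualitative pattern described for Figure 2.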

Two individual markers in the 2p region yielded some evidence of linkage and disequilibrium in these 26 multiplex families. Allele 8 at marker D2S168 (map position 27) was transmitted to an affected child 15 times and not transmitted two times; allele 1 at marker D2S1400 (map position 27.6) was transmitted 23 times and not transmitted eight times (empirical global P-values, 0.022 and 0.006, respectively).
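For orientation, the transmission counts above can be plugged into the standard biallelic TDT statistic, (b - c)^2 / (b + c), where b and c are the numbers of transmissions and non-transmissions of the allele from heterozygous parents. The Python sketch below is an assumption-laden illustration: it treats each allele against all others and uses the asymptotic chi-square distribution with 1 degree of freedom, whereas the P-values reported in the text are empirical global values from the sib-pair program, so the two would not be expected to agree.

```python
import math


def tdt_statistic(transmitted, not_transmitted):
    """McNemar-type TDT statistic for a single allele."""
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative transmissions")
    return (b - c) ** 2 / (b + c)


def chi2_1df_p_value(x):
    """Upper-tail probability of a chi-square variable with 1 df,
    using P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


# Transmitted / not-transmitted counts quoted in the text.
counts = {
    "D2S168 allele 8": (15, 2),
    "D2S1400 allele 1": (23, 8),
}

for label, (b, c) in counts.items():
    stat = tdt_statistic(b, c)
    p = chi2_1df_p_value(stat)
    print(f"{label}: chi2 = {stat:.2f}, asymptotic P = {p:.4f}")
```

With counts this small an exact binomial test would be a reasonable alternative to the chi-square approximation.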

Discussion

A genome-wide screen of 10 multiplex nonsyndromic CL/P families, and subsequent fine mapping of these 10 families plus an additional 16 families, showed some evidence for linkage to two regions on chromosome 2. There was also evidence of linkage heterogeneity, however, with a subset of these 26 families showing evidence of linkage to 2q near marker D2S338 (map position 247 cM), while the original 10 families had given stronger evidence for linkage in the 2p region.

Marazita et al23 found LOD scores of 1.45 and 1.91 at D2S2944 (210 cM) and D2S1363 (227 cM), respectively, in 36 Chinese multiplex families with nonsyndromic CL/P. While the total NPL values in these regions were small, HLOD scores were >1, suggesting only a subset of their Chinese families was responsible for this evidence of linkage. Additionally, in another study, two families exhibiting CLP with mild facial dysmorphism showed evidence of linkage to 2q35–36.32 Whether these two studies have identified the same or different susceptibility genes remains to be determined.

D2S338 lies 10 Mb distal to D2S1363 and the intervening region does not include any previously reported candidate genes for CL/P. It is possible that the linkage observed in this study is being driven by only one or two families, with one family (from Mexico) yielding evidence for the 2p region and another family (from Iowa) for the 2q region. It is clear that linkage heterogeneity exists among these multiplex families, where certain genes may be important in different racial, ethnic or geographic groups, further complicating the effort to map genes for CL/P.

It is interesting to note that different results were obtained using TDT and NPL analyses. The strongest evidence of linkage was observed at 247 cM near marker D2S338; however, the TDT results yielded two markers showing excess transmission of an allele in the 2p region (map position 27 cM). Zlr scores in this region were <1.0. This phenomenon has been observed elsewhere in linkage studies of CL/P. While Stein et al24 found evidence of linkage to BCL3 on chromosome 19, Wyszynski et al25 and Martinelli et al26 found no evidence from linkage analysis, but did show evidence of disequilibrium with alleles at BCL3 among CL/P cases using the TDT. Thus, it remains a challenge to resolve conflicting signals of linkage and linkage disequilibrium in family studies of CL/P, likely because multiple genes control its etiology.