Introduction

Fanconi anemia (FA; MIM 227650) is a rare autosomal recessive disorder characterized clinically by progressive bone marrow failure associated with several congenital abnormalities, including skeletal and renal malformations, microcephaly, and abnormal skin pigmentation (Giampietro et al. 1993; Altay et al. 1997). Patients have an increased risk of cancer, most notably acute myeloid leukemia (Alter 2003). Cells from FA patients grown in vitro exhibit increased spontaneous chromosomal aberrations and are hypersensitive to DNA cross-linking agents such as diepoxybutane (DEB) and mitomycin C (MMC), a characteristic of strong diagnostic value (Berger et al. 1993). FA shows extensive locus heterogeneity, with at least eight complementation groups (FAA–FAG) determined by somatic cell hybridization and corresponding to distinct disease genes (Joenje et al. 1997; Timmers et al. 2001). Six genes, viz., FANCA, FANCC, FANCD2, FANCE, FANCF, and FANCG, have been identified (Fanconi Anaemia/Breast Cancer Consortium 1996; Strathdee et al. 1992a, 1992b; Whitney et al. 1995; Hejna et al. 2000; Timmers et al. 2001; Waisfisz et al. 1999; de Winter et al. 1998, 2000a, 2000b; Saar et al. 1998). The products of these six FA genes show no homology to other known proteins or to each other. However, several studies have demonstrated that the FANCA, FANCC, FANCE, FANCF, and FANCG proteins assemble to form an FA nuclear protein complex (Kupfer et al. 1997; Garcia-Higuera et al. 1999; de Winter et al. 2000c; Siddique et al. 2001; Medhurst et al. 2001; Taniguchi and D'Andrea 2002). This complex is required for the mono-ubiquitination of FANCD2 on lysine 561 and for the translocation of the FANCD2 protein to nuclear foci to interact with BRCA1 in response to DNA damage (Garcia-Higuera et al. 2001).

A recent study has shown that cell lines derived from FAD1 and possibly from FAB patients have biallelic mutations in the BRCA2 gene and express truncated BRCA2 protein. This study confirms that BRCA2 is a FA gene and that mutations in this gene lead to the FA-D1 group (Howlett et al. 2002).

The relative prevalence of each FA subtype varies widely with the geographic or ethnic origin of the studied populations (Whitney et al. 1993; Verlander et al. 1994; Joenje et al. 1995; Joenje 1996; Savoia et al. 1996; Tachibana et al. 1999). FANCA, which maps to chromosome 16q24.3 (Pronk et al. 1995), is the most commonly mutated FA gene, accounting for 60%–65% of all cases, although this proportion depends on ethnic background. The FAA subtype is predominant in North America, South Africa (Afrikaner population), Italy, Japan, and Turkey (Jakobs et al. 1997; Tipping et al. 2001; Savoia et al. 1996; Tachibana et al. 1999; Balta et al. 2000). FANCA has an open reading frame of 4.3 kb distributed among 43 exons spanning over 80 kb of genomic DNA (Lo Ten Foe et al. 1996; Ianzano et al. 1997). A large spectrum of mutations has been identified in the FANCA gene, including microdeletions, large deletions, microinsertions, and point mutations (Levran et al. 1998; Centra et al. 1998; Morgan et al. 1999; Wijker et al. 1999; Tachibana et al. 1999). One "Tunisian" mutation in exon 10 (890–893del) and two "Moroccan" mutations in exons 24 and 43 (2172–2173insG, 4275delT) of the FANCA gene have been identified in Israeli non-Ashkenazi Jewish patients (Tamary et al. 2000). Except for the aforementioned report, FA has not yet been investigated in North African populations, particularly the Maghrebian population, which is characterized by its heterogeneous ethnic background and a high rate of consanguinity.

To determine the complementation group to which Tunisian patients belong, 39 unrelated families were investigated by using haplotype analysis and homozygosity mapping. Thirty-four families were assigned to the FAA group, whereas one family was probably not linked to the FANCA gene or to any known FA gene. For the patients assigned to the FAA group, we screened the FANCA DNA or cDNA by direct sequencing. Four novel mutations in four patients and two new polymorphisms were identified.

Patients and methods

Patients

Thirty-nine unrelated families with a total of 49 FA patients were investigated (Table 1). All patients were diagnosed on the basis of clinical symptoms in combination with hypersensitivity to MMC, as determined in a standard cytogenetic chromosomal breakage test. Nine families were multiplex with at least two affected children; 20 families were simplex with one affected and at least one healthy child, and 10 singleton families had only one affected child. Seven patients, belonging to seven unrelated families, received an allogeneic bone marrow transplant. Clinical data were obtained by reviewing the medical records from the referring physicians; some of these data have been reported by Frikha et al. (1998). Malformations were present in 96% of cases and consisted mainly of skeletal malformations. Abnormal skin pigmentation was present in 92% of cases. The ages of patients ranged from 1 year to 30 years. The mean age at diagnosis was 9 years. Profound aplastic anemia developed in 43% of cases. At the time of analysis, 64% of the patients were still living. The mean age at death for patients who had died was 16 years. Most FA families (92%) were consanguineous, and the degree of kinship was generally first cousins (60%) or second cousins (25%).

Table 1. Characteristics of 39 Tunisian FA families. Families were denoted as TF, CF, and SF according to the regions in which they live, corresponding respectively to the north, center, and south of Tunisia (+ linked to the FANCA gene by haplotype analysis and homozygosity mapping, − not linked to FANCA, NI non-informative families). Families in bold type were selected for mutation screening (for each multiplex family, only one patient was investigated). Mutations identified are given in parentheses. Carrier haplotypes segregating with the 49 patients for the closest markers (D16S3407-D16S303-D16S3121-D16S3026) linked to the FANCA gene are reported on the right

Genotyping

After informed consent, blood samples were collected, and genomic DNA was extracted by standard procedures (Sambrook et al. 1989). Microsatellite markers were selected from the Genethon mapping panel (Dib et al. 1996). Seven polymorphic markers were amplified with the corresponding primer pairs for linkage analysis at 16q24.3. The markers span an interval of about 10 cM and have the following order from centromere to telomere: D16S520-D16S498-D16S3123-D16S3026-D16S3121-D16S303 (Pronk et al. 1995). D16S3407 is an intragenic marker located 16 kb upstream of the 5' untranslated region of the gene (Morgan et al. 1999). D16S303, D16S3121, and D16S3026 have been mapped within the critical region of the FANCA gene (Fanconi Anaemia/Breast Cancer Consortium 1996). For families not linked to the FANCA gene, microsatellite markers corresponding to the FANCC, FANCD2, FANCE, FANCF, and FANCG genes were tested. Polymerase chain reaction (PCR) was performed in a total volume of 50 μl containing 200 ng genomic DNA, 10 μM each primer, 250 μM each dNTP, 1.5 mM MgCl2, and 1 U Taq DNA polymerase. Amplification buffer contained 20 mM TRIS-HCl (pH 8.8) and 50 mM KCl.

The reactions were performed by using a "hot-start" procedure; initial denaturation was for 5 min at 96°C followed by amplification for 40 cycles with denaturation at 94°C for 1 min, annealing for 1 min at the appropriate temperature, and extension at 72°C for 1 min, followed by a final extension at 72°C for 7 min.
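
The cycling programme described above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is only an illustrative sketch: the names `Step`, `pcr_program`, and `total_minutes` are introduced here, the 55°C annealing default is a placeholder (the text specifies only "the appropriate temperature" for each marker), and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float


def pcr_program(annealing_temp_c: float = 55.0, cycles: int = 40) -> List[Step]:
    """Expand the hot-start programme into a flat list of steps.

    annealing_temp_c is a placeholder value, not taken from the study;
    the annealing temperature was chosen per marker.
    """
    program = [Step("initial denaturation", 96.0, 5.0)]
    for _ in range(cycles):
        program.append(Step("denaturation", 94.0, 1.0))
        program.append(Step("annealing", annealing_temp_c, 1.0))
        program.append(Step("extension", 72.0, 1.0))
    program.append(Step("final extension", 72.0, 7.0))
    return program


def total_minutes(program: List[Step]) -> float:
    """Sum of programmed hold times, excluding temperature ramps."""
    return sum(step.minutes for step in program)


if __name__ == "__main__":
    print(total_minutes(pcr_program()))  # 5 + 40 * 3 + 7 = 132.0
```

Under these assumptions, the programme amounts to 132 min of hold time per run.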

The amplified products were run on a 6% polyacrylamide gel and then transferred to a nylon membrane by a contact blotting procedure. Amplified fragments were revealed by hybridization at 42°C for 3 h in hybridization buffer with a radioactively labeled poly(AC) probe [dCTPα-32P] according to Hazan et al. (1992).

Mutation screening

Genomic DNA amplification and direct sequencing

Exons and their flanking regions were amplified from genomic DNA. Primer pairs were generated from intron sequences (Ianzano et al. 1997; Levran et al. 1997; Savino et al. 1997) in order to amplify the 43 investigated exons in 27 PCRs with products ranging from 220 bp to 1340 bp. PCR conditions were as described above. PCR products were purified on QIAquick spin columns (Qiagen) and directly sequenced with the Big Dye terminator kit (Applied Biosystems) with PCR primers on an ABI Prism 377 DNA sequencer (Applied Biosystems).

cDNA isolation, amplification, and direct sequencing

In order to identify alterations at the RNA level, total RNA was isolated from whole blood by means of the TRIzol reagent (Gibco BRL Life Technologies). First-strand cDNA was synthesized by using the ProSTAR first-strand reverse-transcriptase PCR (RT-PCR) kit (Stratagene). The complete cDNA corresponding to the FANCA gene was amplified in 10 overlapping fragments (primer sequences are available upon request). RT-PCR products were purified and directly sequenced.

Linkage analysis

Linkage analysis was performed by using the computer program Genehunter v2.1 (Kruglyak et al. 1996). The frequency of the disease is estimated to be 0.00001, assuming a fully penetrant autosomal recessive mode of inheritance. The allele frequencies of the markers were calculated from a sample of unrelated individuals of the studied population. Multipoint parametric LOD scores were calculated under a hypothesis of genetic homogeneity (LOD) or genetic heterogeneity (HLOD) by using Genehunter. A single locus and a multilocus (up to four consecutive markers) TDT using the tdtx (x=.,2,3 or 4) commands of Genehunter were also performed.

Results

Haplotype analysis and homozygosity mapping

Forty-nine patients, belonging to 39 unrelated families, were subtyped by using linkage analysis with markers linked to previously known FA genes. In view of the higher prevalence of the FAA complementation group, all families were first genotyped with markers overlapping the FANCA gene region. For each family, the most likely haplotypes were constructed by visual inspection and by use of Genehunter (genotyping results available on request). Haplotype analysis and homozygosity mapping showed that, except for one multiplex family (SFA6; Fig. 1), all patients belonging to eight multiplex consanguineous families were homozygous by descent for all the markers overlapping the FANCA gene region. The affected children of each family had inherited identical haplotypes from their respective parents, whereas the unaffected children had inherited either the maternal or the paternal haplotype carrying the disease or the haplotypes non-transmitted to the affected children. In these families, segregation of the FANCA markers and the FANCA gene was observed. Among the 20 simplex families, 17 were consanguineous. Haplotype analysis showed that, except for three families (TFA1, TFA4, and SFA10), all patients belonging to these families were homozygous for the intragenic marker and for at least two other markers closely linked to FANCA gene. Furthermore, haplotypes transmitted to affected and to unaffected offspring were different for each family. This result suggested a likely assignment of these simplex families (except for TFA1, TFA4, and SFA10) to complementation group A. TFA4 and SFA10 were non-consanguineous and non-informative families. Among the 10 families with only one affected child, one was non-informative, and patients from the remaining families were homozygous by descent for all markers linked to FANCA gene. This result suggested that the FANCA gene was likely to be responsible for FA in these families. 
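
The homozygosity criterion used above can be sketched in a few lines. This is a minimal illustration, not the procedure actually run in the study: the function names and the allele numbers in the usage example are hypothetical, and the check only establishes identity by state at each marker; homozygosity by descent is inferred from it in the context of a consanguineous pedigree.

```python
from typing import Dict, Iterable, List, Tuple

# marker name -> (allele on one chromosome, allele on the other)
Genotype = Dict[str, Tuple[int, int]]

# The four closest markers named in the text.
FANCA_MARKERS = ("D16S3407", "D16S303", "D16S3121", "D16S3026")


def homozygous_markers(genotype: Genotype, markers: Iterable[str]) -> List[str]:
    """Return the typed markers at which both alleles are identical."""
    return [m for m in markers
            if m in genotype and genotype[m][0] == genotype[m][1]]


def is_homozygous_across(genotype: Genotype, markers: Iterable[str]) -> bool:
    """True only if every listed marker is typed and homozygous."""
    markers = list(markers)
    return len(markers) > 0 and homozygous_markers(genotype, markers) == markers


if __name__ == "__main__":
    # Hypothetical genotypes, for illustration only.
    patient = {"D16S3407": (2, 2), "D16S303": (5, 5),
               "D16S3121": (2, 2), "D16S3026": (3, 3)}
    sibling = {"D16S3407": (2, 3), "D16S303": (5, 5),
               "D16S3121": (2, 5), "D16S3026": (3, 4)}
    print(is_homozygous_across(patient, FANCA_MARKERS))  # True
    print(is_homozygous_across(sibling, FANCA_MARKERS))  # False
```
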
Multipoint parametric LOD scores were calculated for all families between the FANCA gene and the markers overlapping the FANCA region. Significant LOD scores were obtained for all markers, with a maximum multipoint LOD score of 8.89 at marker D16S3026 (Table 2), whereas the highest HLOD score value of 9.86 was observed at D16S303 (α=0.9; α being the proportion of linked families). All HLOD scores were above statistical significance with the smallest value of 5.10 at marker D16S520. When genetic homogeneity was assumed, the LOD scores at these markers dropped, indicating significant genetic heterogeneity (Table 2). These statistical results were not surprising, as the consanguineous family SFA6 excluded linkage to the FANCA gene (Table 3). Homozygosity mapping excluded linkage with the other known FA genes.
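
The relation between the homogeneity LOD and the HLOD can be illustrated with the standard admixture formulation, in which a proportion α of families is linked: HLOD(α) = Σ log10(α·10^LODi + 1 − α), summed over families and maximized over α. The sketch below assumes that formulation and uses a simple grid search rather than Genehunter's own routine; the per-family LOD values in the usage example are invented for illustration and are not taken from Table 2 or Table 3.

```python
import math
from typing import Sequence, Tuple


def hlod(family_lods: Sequence[float], alpha: float) -> float:
    """Admixture LOD for a given proportion alpha of linked families."""
    return sum(math.log10(alpha * 10.0 ** z + (1.0 - alpha))
               for z in family_lods)


def max_hlod(family_lods: Sequence[float], steps: int = 1000) -> Tuple[float, float]:
    """Grid search over alpha in [0, 1]; returns (best alpha, HLOD)."""
    best_alpha, best_value = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for i in range(1, steps + 1):
        alpha = i / steps
        value = hlod(family_lods, alpha)
        if value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value


if __name__ == "__main__":
    # Hypothetical per-family LOD scores: three linked families
    # and one family that argues against linkage.
    lods = [1.2, 0.9, 1.5, -2.5]
    print(hlod(lods, 1.0))   # homogeneity LOD (sum of family LODs)
    print(max_hlod(lods))    # alpha below 1 and a higher score
```

With α fixed at 1, the expression reduces to the sum of the family LOD scores, so a single strongly unlinked family lowers the homogeneity LOD while leaving the HLOD largely intact.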

Fig. 1. Pedigree and haplotype analysis for the SFA6 family with the closest markers linked to the FANCA gene. Each generation is designated by a Roman numeral (I-V), and the members of generations IV and V are designated by Arabic numerals. Affected individual IV-2 has not been investigated (squares males, circles females, slashed symbols deceased individuals, shaded symbols affected individuals)

Table 2. Multipoint parametric LOD scores between FANCA and the various markers for all the studied families (α proportion of families linked to the FANCA gene). The intermarker distances are estimates taken from Pronk et al. (1995), Fanconi Anaemia/Breast Cancer Consortium (1996), and Morgan et al. (1999)
Table 3. Multipoint parametric LOD scores between FANCA and the various markers for the SFA6 family

The patient born to the consanguineous family TFA1 was heterozygous for all markers linked to FANCA gene and for markers of the FANCC, FANCD2, FANCE, FANCF, and FANCG genes. The affected and unaffected children did not inherit the same chromosomes from the parents.

In order to identify a possible founder effect in Tunisian families, haplotypes were analyzed in the families consistent with linkage to the FANCA gene. We found that the most common haplotype, 2-5-2 (D16S3407-D16S303-D16S3121), was shared by 12 families, 10 of which were from southern Tunisia, one from central Tunisia, and one from northern Tunisia. A second haplotype, 3-5-5-4 (D16S3407-D16S303-D16S3121-D16S3026), was shared by four families from central Tunisia (Table 1, Fig. 2). Single-locus and multilocus TDTs were performed for markers D16S3407, D16S303, D16S3121, and D16S3026. Several haplotypes were found to be in linkage disequilibrium with the disease locus. Haplotype 2-5-2-3 was particularly interesting, with 12 transmissions yielding a TDT value of 12 (Table 4).

Fig. 2. Geographic distribution and characteristics of 39 Tunisian FA families. The geographic origins of the patients' ancestors were mainly based on the birth places of the patients' grandparents, as provided by the families participating in this study. Squares families, asterisks multiplex families, n non-consanguineous families, a families with patients who had received an allogeneic bone marrow transplant, filled squares families who carried the first haplotype (2-5-2), hatched squares families who carried the second haplotype (3-5-5-4). For the non-consanguineous families, the paternal and the maternal grandparents of the patients were from the same area

Table 4. Single-locus and multilocus TDT for the four proximal markers from the FANCA region. T and NT are respectively the numbers of transmissions and non-transmissions of the allele or the haplotype. The TDT value is thus equal to (T−NT)²/(T+NT)
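
The TDT value defined in the legend of Table 4 is straightforward to compute. The sketch below is illustrative: the function names are introduced here, and the p-value uses the usual chi-square approximation with one degree of freedom, which is an assumption of this example rather than something stated in the text. For haplotype 2-5-2-3, the example takes T = 12 and assumes NT = 0, which is consistent with the reported TDT value of 12.

```python
import math


def tdt_statistic(transmitted: int, not_transmitted: int) -> float:
    """TDT value (T - NT)^2 / (T + NT)."""
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("at least one informative transmission is required")
    return (transmitted - not_transmitted) ** 2 / total


def chi2_1df_p_value(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    value = tdt_statistic(12, 0)    # NT = 0 is an assumption
    print(value)                    # 12.0
    print(chi2_1df_p_value(value))  # well below 0.001
```
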

Mutation screening

The complete coding sequence of the FANCA gene was amplified either from genomic DNA in 26 fragments or from cDNA and screened for mutations by direct sequencing. The search for mutations was started by screening all patients (a total of 43 patients) who were assigned to the FAA group by linkage analysis and homozygosity mapping, for the previously described recurrent mutations (1115–1118del and 3788–3790del) found within FANCA in exons 13 and 38, respectively. These mutations were identified in many unrelated patients from various populations (Wijker et al. 1999). We also screened the "Tunisian" (890–893del) and the "Moroccan" (2172–2173insG, 4275delT) mutations occurring in exons 10, 24, and 43, respectively (Tamary et al. 2000). None of these mutations or other mutations was observed in these exons in any of our patients. Screening for mutations in the remaining 38 exons was performed in 16 patients from different families. Patients were selected on the basis of their belonging to the more informative families. A total of four novel homozygous mutations were identified in four unrelated FA patients (Fig. 3).

Fig. 3. Sequence analysis of the four mutations identified. A1 Genomic DNA sequence for the partial exon 19 (underlined 4-bp deletion), A2 1751–1754del in patient TFA11, A3 parents are heterozygous carriers of the mutation, B1 1606delT in patient TFA9, B2 parents are heterozygous carriers of the mutation, C1 513G→A in patient TFA29, C2 parents are heterozygous carriers of the mutation, D1 IVS24+166A→G, D2 parents are heterozygous carriers of the mutation, A4, B3, C3, D3 DNA sequence flanking each mutation. The deletion consensus sequence CCTG and the direct repeats are underlined. The deleted/substituted sequence is in bold. The sequence deleted in patient TFA11 (A4) is also surrounded by a highly GC-rich region (>70%), a feature that has frequently been observed at the site of human DNA deletions (Krawczak and Cooper 1991)

The first mutation detected in patient TFA11 was a deletion of 4 bp of the sequence (TCCC) in exon 19 (1751–1754del) and created a premature termination codon 20 amino acids downstream (Fig. 3, A2). The second mutation, a deletion of T at position 1606 in exon 17 (1606delT) was found in patient TFA9 in the homozygous state (Fig. 3, B1). This deletion results in a shift of the reading frame and the production of a termination codon 21 amino acids downstream. For patient TFA29, the G to A transition at position 513 (513G→A) led to a substitution of the tryptophan codon with a stop codon (W171X) and, therefore, to a premature truncation of the FANCA protein in exon 5 (Fig. 3, C1). RNA was available from patient CFA2, and sequencing of cDNA revealed an insertion of 166 bp between exons 24 and 25. Sequence analysis of genomic DNA revealed an A to G transition 166 bp downstream from exon 24 (IVS24+166A→G; Fig. 3, D1). This transition results in the use of a downstream cryptic splicing donor site and the insertion of an intronic segment. The score for the cryptic donor splice site was calculated according to the method of Shapiro and Senapathy (1987) and was higher (s=86) than the score for the normal signal (s=78). The inserted sequence creates a stop codon 40 bp downstream from the tail of exon 24, resulting in a truncated protein of 835 amino acids. Patient CFA2 belonged to the second haplotype, which was shared by three other unrelated patients. Sequencing of exons 24–25 and the intronic region showed that none of these patients carried this mutation, indicating that the second haplotype was not associated with a recurrent mutation and suggesting a mutational heterogeneity affecting Tunisian patients, even those sharing the same haplotype.
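
The correspondence between the nucleotide change 513G→A and the protein change W171X can be verified with simple codon arithmetic. The sketch below assumes that cDNA positions are numbered from the A of the initiation codon (position 1), which is consistent with position 513 falling in codon 171; the function names are introduced here for illustration.

```python
from typing import Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
TRP_CODON = "TGG"  # the only tryptophan codon in the standard genetic code


def codon_position(cdna_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding position to (codon number, base within codon)."""
    if cdna_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitute(codon: str, base_in_codon: int, new_base: str) -> str:
    """Return the codon with one base (1-based index) replaced."""
    bases = list(codon)
    bases[base_in_codon - 1] = new_base
    return "".join(bases)


if __name__ == "__main__":
    codon, base = codon_position(513)
    print(codon, base)                       # 171 3
    mutant = substitute(TRP_CODON, base, "A")
    print(mutant, mutant in STOP_CODONS)     # TGA True
```

A G→A change at the third base of TGG gives TGA, a stop codon, in agreement with the W171X designation. The same arithmetic places 1606delT at the first base of codon 536.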

For the four mutations found in the four unrelated patients, linkage analysis and haplotype data were consistent with segregation of the mutation in the parents and in those healthy sibs who carried the mutation in the heterozygous state.

In addition to the predicted pathogenic mutations, 11 polymorphisms were detected; nine of these were previously reported and are described in the Fanconi Anemia Mutation Database (http://www.rockefeller.edu/fanconi/mutate). Two new polymorphisms were identified within intron 24 (IVS24–5G/A and IVS24–6C/G). They might be common variants in our population, because they appeared in the homozygous state in the majority of patients and of their family members who were studied (80 of 100 chromosomes analyzed).

Discussion

In a total of 39 unrelated families living in various geographic locations in Tunisia, linkage analysis and homozygosity mapping have shown that 94% of these families are likely to belong to the complementation group FAA (Table 1). There is evidence of a high prevalence of this group in Tunisian FA families. A similar situation has been observed in Italy and Japan (Savino et al. 1997; Tachibana et al. 1999).

For one family (SFA6), haplotype and LOD score analyses did not show linkage to the FANCA gene. Homozygosity mapping excluded linkage with the other known FA genes, suggesting that patients born to this consanguineous family could possess mutations in the BRCA2 gene or may belong to a new complementation group. Nevertheless, because of the hypermutability and the instability of the FANCA gene, we cannot exclude the hypothesis of an intrafamilial heterogeneity of mutations. Indeed, in this family, one of the affected children is homozygous for the markers overlapping the FANCA gene (including the intragenic marker), and the other is heterozygous for the same markers but does not share the same chromosomes with the unaffected sibs (Fig. 1). This result suggests that the second affected sib may be a compound heterozygote for two different mutant alleles within the FANCA gene, one of which would have been inherited from one of the two parents and would correspond to the mutation present in the homozygous state in the first affected sib, and the other corresponding to a neomutation. Furthermore, the affected child who is homozygous for the FANCA markers carries the most common haplotype (2-5-2) in the homozygous state, whereas the other FA child carries the same haplotype in the heterozygous state. A similar situation of intrafamilial heterogeneity of mutations has been observed for several other disorders within Arab populations (Carrasquillo et al. 1997; Rinat et al. 1999; Ben Arab et al. 2000).

Another patient born to a consanguineous family (TFA1) was heterozygous for all the markers linked to the FANCA, FANCC, FANCD2, FANCE, FANCF, and FANCG genes. The affected and unaffected children did not inherit the same chromosomes from the parents. For this family, the patient may be a compound heterozygote for two different mutant alleles inherited from the two parents, and thus none of the tested genes can be excluded. For the two unclassified families, TFA1 and SFA6, only complementation analysis and screening for mutations will allow a molecular diagnosis.

In view of the high prevalence of the FAA group in Tunisia as revealed by linkage analysis, the presence of only two common haplotypes suggests the occurrence of a founder effect for FA in our population, with specific mutations segregating within each haplotype. This situation has been reported in a molecular study of FA families from the Afrikaner population of South Africa, a study showing the presence of five haplotypes associated with four different mutations (Tipping et al. 2001). Somewhat surprisingly, in spite of a high consanguinity rate, our results suggest that patients sharing the same haplotype probably carry different mutations (at least for the second haplotype), providing no evidence of common ancestry. Thus, several independent mutational events have probably affected the FANCA gene and are responsible for the disease.

Screening for mutations in the FANCA gene has led us to identify four novel mutations in the homozygous state in four unrelated FA patients. These mutations are likely to be pathogenic and would be predicted to generate truncated proteins, although residual function for part of the protein cannot be excluded. Three of these mutations are flanked by motifs (CCTG) and direct repeats (Fig. 3, A4, B3, C3), which have previously been identified as mutation hot-spot consensus sequences, particularly for spontaneous small deletions (<20 bp; Krawczak and Cooper 1991; Smith and Adair 1996). We could amplify FANCA cDNA only for patients who were diagnosed at an early age and who did not show profound aplastic anemia. For the others, we failed to amplify the FANCA cDNA, although we succeeded in amplifying cDNAs of other genes (e.g., IFNGR). The presence of mutations probably reduces transcription of the FANCA gene or alters the stability of its mRNA, as reported by others (Wijker et al. 1999). On the other hand, RNA was extracted from 10–20 ml whole blood, and the transcript of the FANCA gene has been reported to be undetectable in unstimulated peripheral blood leukocytes from non-leukemic individuals (J.P. de Winter, unpublished; Joenje and Patel 2001).

We have screened a panel of 16 patients from different geographic areas of Tunisia by direct sequencing of genomic DNA or cDNA to identify the spectrum of mutations in the FANCA gene. We have identified only four mutations and 11 polymorphisms, nine of which are common and have been described previously. These data indicate an overall detection rate of 25% (4 of 16 patients); this means that a large proportion of mutations are not detected by current scanning protocols. Several reasons may account for this low detection level. First, FA is a genetically heterogeneous disease, and hence, some of the patients (particularly those belonging to simplex or singleton families) may indeed not belong to the FAA group. Second, some patients may carry mutations that are not detected by our screening protocol, such as (1) large intragenic deletions, (2) some mutations that are identified only in cDNA, or (3) mutations localized in either introns or the promoter region.

To our knowledge, this investigation is the first systematic genetic and molecular analysis of FA in the Maghreb region and particularly in Tunisia. The North African population is characterized by a high degree of consanguinity in spite of its various ethnic backgrounds. The rate of consanguineous mating is estimated at 33% in northern Tunisia (Riou et al. 1989) and, depending on the area studied, can exceed 60%. This can be explained by the fact that endogamy is culturally favored and serves to keep property within families (Khlat 1997; Steensma et al. 2001); in addition, in some areas, geographic isolation may favor consanguinity. As a consequence of the high rate of inbreeding, the prevalence of recessive genetic disorders is increased (Hoodfar and Teebi 1996). This may explain the relatively high incidence of FA in Tunisia (1.4/million/year; M. Frikha, personal communication/in preparation).
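
The effect of inbreeding on the frequency of a recessive disorder can be made concrete with the standard population-genetic expression q²(1 − F) + qF, where q is the disease allele frequency and F the inbreeding coefficient of the child (1/16 for the offspring of first cousins, 1/64 for second cousins). The sketch below is a textbook illustration only: the allele frequency of 0.001 is a hypothetical value chosen for the example and is not an estimate for FA in Tunisia.

```python
def affected_frequency(q: float, f: float) -> float:
    """Expected frequency of homozygotes for a recessive allele.

    q: disease allele frequency; f: inbreeding coefficient of the child.
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("q and f must lie between 0 and 1")
    return q * q * (1.0 - f) + q * f


F_FIRST_COUSINS = 1.0 / 16.0
F_SECOND_COUSINS = 1.0 / 64.0

if __name__ == "__main__":
    q = 0.001  # hypothetical allele frequency, for illustration only
    random_mating = affected_frequency(q, 0.0)
    first_cousins = affected_frequency(q, F_FIRST_COUSINS)
    print(first_cousins / random_mating)  # about 63-fold under these assumptions
```

The rarer the allele, the larger the relative increase, which is why rare recessive disorders are seen disproportionately in the offspring of consanguineous unions.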

The present study has led to several improvements in patient management: (1) genetic counseling in informative families, (2) better identification of bone marrow graft donors, and (3) follow-up of grafted patients by evaluating chimerism (C. Bouchlaka et al., in preparation).