Cystic fibrosis (CF) is a common, lethal, autosomal recessive disorder among Caucasians. The frequency of the disease in the general population is 1:2,500 live births, which corresponds to a carrier frequency of 1:25. However, the frequency varies between different ethnic groups and distinct geographic populations [1]. The disease is characterized by progressive lung disease, pancreatic dysfunction, elevated sweat electrolytes, and male infertility [1]. However, there is substantial variability in disease expression in all clinical parameters. In 1989 the gene causing CF was cloned and its protein product, designated cystic fibrosis transmembrane conductance regulator (CFTR), was identified [2–4]. The most common CFTR mutation was found to be a 3-bp deletion, which leads to the removal of a phenylalanine residue at amino acid 508 of the protein, designated ΔF508. This mutation is carried by 70% of the CF chromosomes worldwide, but its frequency varies greatly among different populations [5]. In Europe, the frequency of the ΔF508 mutation decreases from northwest to southeast, ranging from 90% in Denmark to about 30% in Turkey. In many European populations, like those in Belgium, Germany, The Netherlands, England and France, the frequency of the ΔF508 mutation is over 70% [5, 6]. Among Ashkenazi Jews the frequency of the ΔF508 mutation is less than 30% [7].
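The step from a disease frequency of 1:2,500 to a carrier frequency of about 1:25 can be reproduced with a short calculation. The sketch below assumes Hardy-Weinberg equilibrium (random mating, no selection), an assumption the text does not state explicitly; the function name is illustrative only.

```python
import math


def carrier_frequency(disease_incidence: float) -> float:
    """Expected heterozygote (carrier) frequency 2pq for an autosomal
    recessive disease whose incidence is q**2.

    Assumption: Hardy-Weinberg equilibrium holds in the population.
    """
    q = math.sqrt(disease_incidence)  # frequency of all mutant alleles combined
    p = 1.0 - q                       # frequency of the normal allele
    return 2.0 * p * q


incidence = 1 / 2500
carriers = carrier_frequency(incidence)
print(f"mutant allele frequency q = {math.sqrt(incidence):.3f}")
print(f"carrier frequency 2pq    = {carriers:.4f} (about 1 in {1 / carriers:.1f})")
```

With an incidence of 1:2,500 the combined mutant allele frequency is 1:50, and the heterozygote frequency is close to 4%, that is, roughly 1 in 25.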

A recent extended haplotype analysis using highly polymorphic microsatellite markers examined the origin and evolution of the ΔF508 mutation in Europe [8]. The results indicated that the ΔF508 mutation occurred more than 52,000 years ago, in a population genetically distinct from the present European group. The mutation spread throughout Europe in chronologically distinct expansions, which might be responsible for the different frequencies of the ΔF508 mutation in Europe.

CFTR Gene Structure and Mutation Distribution

The CFTR gene consists of 27 exons and spans over 250 kb on chromosome 7q31. The CFTR protein consists of 1,480 amino acids arranged in a repeated motif structure: 12 membrane-spanning regions (TM1 to TM12), two ATP-binding domains (NBF1 and NBF2) and a highly polarized domain named the R domain, which is thought to have a regulatory function [3]. The CFTR protein is anchored in the apical membrane of normal epithelial cells by its two hydrophobic membrane-spanning domains, while the two nucleotide-binding domains and the R domain are oriented toward the cytoplasm. It forms a chloride channel regulated by cyclic AMP [9–13]. In addition, the CFTR appears to have other functions, including the regulation of other channel proteins [13].

The focus of this review is the genetic basis for disease expression and phenotypic variability among different patients. In addressing the question of how the different mutations disrupt the normal functions of the CFTR protein and how these defects affect the disease, it is important initially to consider the nature and the frequency of the mutations causing the disease.

So far over 500 different CFTR mutations have been identified [The CF Genetic Analysis Consortium, pers. commun.]. Most of these mutations are rare, found in only one family. Only 10 mutations were found in more than 100 patients, but in specific ethnic groups mutations other than ΔF508 were found at relatively high frequencies [6]. For example, the stop codon mutation, W1282X, which is carried in the Ashkenazi Jewish population by 50% of the CF chromosomes, appears worldwide in only 2% of the CF chromosomes [6, 7]. Among the Finnish, Italian, Amish, Hutterite, and non-Ashkenazi Jewish populations, relatively high frequencies of specific mutations were also found [6, 14]. These high frequencies are probably the result of a founder effect or genetic drift. The identification of the majority of mutations causing disease enables carrier testing of the general population for these mutations, but the abundance of additional CFTR mutations limits the availability and accuracy of such a test. It is generally accepted that, as of today, only 85–90% of the mutant alleles in most populations can be detected by mutation screening.
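The practical consequence of an 85–90% allele detection rate can be illustrated with Bayes' rule. The sketch below is not taken from the cited sources: it assumes a prior carrier risk of 1:25, no false-positive results, and independent detection in the two partners, and the function names are hypothetical.

```python
def residual_carrier_risk(prior: float, detection_rate: float) -> float:
    """Probability of still being a carrier after a negative screen.

    Assumptions (not from the source): the test never reports a mutation
    that is absent, and `detection_rate` is the fraction of mutant alleles
    covered by the screening panel.
    """
    missed_carrier = prior * (1.0 - detection_rate)  # carrier whose allele is not on the panel
    non_carrier = 1.0 - prior                        # true negative
    return missed_carrier / (missed_carrier + non_carrier)


def couple_detection_rate(detection_rate: float) -> float:
    """Fraction of carrier couples in which both partners are identified,
    assuming detection is independent in the two partners."""
    return detection_rate ** 2


prior = 1 / 25
for rate in (0.85, 0.90):
    risk = residual_carrier_risk(prior, rate)
    print(
        f"detection {rate:.0%}: residual risk about 1 in {1 / risk:.0f}, "
        f"both partners detected in {couple_detection_rate(rate):.0%} of carrier couples"
    )
```

Under these assumptions a negative screen lowers, but does not eliminate, the carrier risk, and a noticeable fraction of carrier couples would have at least one undetected partner.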

Spectrum of Mutations

In addition to genetic analysis and prenatal counselling, the information on mutations causing disease may provide important insight into CFTR function. Spectrum analysis of the mutations revealed that about 50% of the mutations are missense mutations, 20% are frameshift mutations caused by small insertions or deletions, and 15% are nonsense mutations. The rest are mutations affecting splicing and other variations [CF Genetic Analysis Consortium, pers. commun.]. The mutations are located along the entire CFTR gene, but there are several mutational hotspots. For example, in exons 4, 7 and 17b, which are parts of the two transmembrane domains, over 30 mutations in each exon were identified. A missense mutational hotspot in NBF1 is also apparent. In exon 11, within a small region of 15 bp, 11 different mutations were identified. This pattern suggests that ATP binding or hydrolysis is critical for normal CFTR function. In addition, there is a nonsense mutational hotspot in exon 13: within 100 codons, 10 nonsense mutations were identified. This might indicate that the first half of the protein is insufficient to confer normal CFTR activity.

Insight into the functions of the individual domains of the CFTR protein has come from studies of the outcome of specific mutations. The TM1–TM12 regions appear to contribute to the formation of the chloride channel pore, since mutations of residues within these regions alter the anion selectivity of the channel. NBF1 and NBF2 control channel activity through an interaction with cytosolic nucleotides. The regulatory domain also controls channel activity: phosphorylation of the regulatory domain, usually by cAMP-dependent protein kinase, is required for the channel to open [reviewed in detail in 1].

Association between Genotype and Phenotype in CF

CF is characterized by a wide variability of clinical expression: patients are diagnosed with various modes of presentation at different ages, from birth to adulthood, with considerable variability in the severity and rate of disease progression of the involved organs. Although most CF patients are diagnosed in the first year of life with typical lung disease and/or pancreatic insufficiency (PI), an increasing number of patients have in recent years been diagnosed with atypical disease only in adulthood. Furthermore, although progressive lung disease is the most common cause of mortality in CF, there is great variability in the age of onset and the severity of lung disease in different age groups. Variability is also found in male infertility. Almost all male CF patients are infertile due to congenital bilateral absence of the vas deferens (CBAVD); however, fertile male CF patients have recently been reported [15, 16]. The extent of the pancreatic disease also varies. Most affected individuals suffer from PI; however, approximately 15% of the patients possess sufficient exocrine pancreatic function to permit normal digestion (pancreatic sufficiency, PS) [17–19]. A remarkable concordance of pancreatic function status was found among affected family members [20], suggesting that genetic factors could influence the severity of pancreatic disease and possibly its rate of progression. Evidence for a genetic basis for the severity of the pancreatic disease came from studies of the distribution of haplotypes linked to the CFTR locus and from the distribution of the ΔF508 mutation among PS and PI patients. More than 50% of the CF-PI patients were homozygous for the ΔF508 mutation, whereas none of the CF-PS patients were, suggesting that the ΔF508 mutation is associated with PI, and that mutations associated with PS would be dominant over those associated with PI [4, 21].

The cloning of the CFTR gene and the identification of its mutations have promoted extensive research into the association between genotype and phenotype, which has contributed to our understanding of the molecular mechanisms of the remarkable clinical heterogeneity of CF. The clinical presentation of patients homozygous for the ΔF508 mutation was more severe than that of patients carrying milder mutations, and was associated with an earlier age of onset, higher sweat Cl levels, younger age and PI [22]. Despite the severe nature of this mutation, the severity of pulmonary disease varied considerably among the patients. This association of homozygosity for the ΔF508 mutation with a generally more severe disease presentation was subsequently confirmed by other investigators [23–26]. Several subsequent studies analyzed the genotype-phenotype correlation for mutations for which a large enough number of patients (>10 patients) was available [15, 16, 27–37]. These studies have shown that there are mutations other than the ΔF508 mutation that are associated with severe disease presentation. The clinical presentation of patients carrying two of these severe alleles is similar to that of ΔF508 homozygotes.

Only a few mutations were found to be associated with the milder phenotype [15, 16, 29]. Patients carrying at least one mild mutation were diagnosed at a later age, were older at the time of the study, had lower sweat chloride levels, had a better nutritional status, and most of them had PS. Interestingly, some of these patients had normal or intermediate sweat chloride levels (40–60 mEq/l). In these patients, the CF diagnosis could be made only after genotype analysis was available. Table 1 lists the CFTR mutations which were shown to be associated with either the severe or the mild phenotype.

Table 1 Classification of mutations according to severity of phenotype

Higher frequencies of CFTR mutations were found among patients with incomplete CF expression. For many years, elevated sweat electrolyte levels were the gold standard for CF diagnosis. Although patients with atypical CF presentation and intermediate sweat electrolyte levels were reported, it remained unknown whether they had CF or another, similar disease. The cloning of the CFTR gene and the identification of CFTR mutations enabled, in many cases, a definite diagnosis of CF. Patients suspected of having atypical CF presentation were tested for the presence of CFTR mutations, and certain mutations, usually those associated with the milder phenotype, were found. The clinical characteristics of atypical CF patients are diagnosis above 10 years of age, survival into adulthood, chronic sinopulmonary disease, pancreatic sufficiency, and sweat chloride < 60 mEq/l. It is recommended that such patients be referred for CFTR genotyping. Table 2 lists clinical conditions with increased frequencies of CFTR mutations; however, their direct association with CF is still unclear. Absence of a known common mutation does not rule out CFTR-associated disease, since in all populations there are still unidentified mutations.

Table 2 Patients with increased CFTR mutation frequencies

In summary, the results of these genotype-phenotype studies indicate that the variability in disease presentation is the result of the type of the CFTR mutation. Furthermore, these studies confirmed the previous hypothesis that the milder mutations are dominant over the severe and therefore will determine the phenotype.
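The dominance rule summarized above can be written as a small lookup. This is an illustrative sketch only: the severity table contains just two alleles whose severity is stated in the text, the function and variable names are hypothetical, and the rule describes a population-level tendency rather than a prediction for an individual patient.

```python
from enum import Enum
from typing import Optional


class Severity(Enum):
    SEVERE = "severe"
    MILD = "mild"


# Illustrative entries only: the two alleles whose severity the text states.
# The full classification is the one given in Table 1 of the article.
ALLELE_SEVERITY = {
    "ΔF508": Severity.SEVERE,
    "3849+10kb C→T": Severity.MILD,
}


def predicted_pancreatic_status(allele_a: str, allele_b: str) -> Optional[str]:
    """Apply the 'mild is dominant over severe' rule to a genotype.

    Returns "PS" if at least one allele is mild, "PI" if both are severe,
    and None if the rule cannot be applied because an allele is not
    classified (for example G85E, which the text describes as neither
    clearly mild nor clearly severe).
    """
    severities = (ALLELE_SEVERITY.get(allele_a), ALLELE_SEVERITY.get(allele_b))
    if Severity.MILD in severities:
        return "PS"
    if None in severities:
        return None
    return "PI"


print(predicted_pancreatic_status("ΔF508", "ΔF508"))
print(predicted_pancreatic_status("ΔF508", "3849+10kb C→T"))
print(predicted_pancreatic_status("ΔF508", "G85E"))
```

Returning None for unclassified alleles keeps the sketch from implying a phenotype where the genotype-phenotype studies do not support one.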

Congenital Bilateral Absence of the Vas Deferens

Almost all men with CF are infertile, presenting with azoospermia but normal spermatogenesis, as a result of CBAVD [1]. CBAVD is also found in otherwise healthy infertile men, although detailed clinical evaluation of these men was not reported. Holsclaw et al. [38] hypothesized that CBAVD is a mild form of CF. Mutations in the CFTR gene were analyzed in men with CBAVD [39–48]: 10–20% were found to carry two mutated CFTR alleles (with at least one of the mutations being mild), 40–60% were found to carry one known CFTR mutation, and in 30–50% no CFTR mutations were found. Several of these mutations were found only in infertile males with CBAVD. This significantly higher frequency of CFTR mutations among men with CBAVD indicates that in many cases CBAVD is caused by defective CFTR alleles and might be considered an atypical form of CF. However, the genetic basis of CBAVD in the remaining males, and its association with CF, remains unclear. Extended haplotype analysis of polymorphic sites in the CFTR locus in familial CBAVD supported the hypothesis that in many families CBAVD is associated with two CFTR mutations; however, in others it might be caused by other mechanisms such as homozygosity or heterozygosity for partially penetrant CFTR mutations [48]. More recently it has been shown that 30–40% of the chromosomes of males with CBAVD carry a splice variant, the 5T allele, in the acceptor/branch site of exon 9 [43–45]. This variant was found to cause high levels of exon 9 skipping, leading to a nonfunctional protein. The high frequency of the 5T allele among males with CBAVD is significantly different from its frequency in the general population (3–5%), suggesting that the 5T allele is associated with CBAVD [43–45].
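The size of the difference between the 5T allele frequency in males with CBAVD and in the general population can be expressed as an allelic odds ratio. The sketch below is plain arithmetic on the percentage ranges quoted in the text, not a reanalysis of the cited studies; because no sample sizes are given here, no confidence interval or significance test is attempted, and the function name is illustrative.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds of carrying the allele on a case chromosome divided by the
    odds on a control chromosome, from allele frequencies alone."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Extremes of the ranges quoted in the text: 30-40% of CBAVD chromosomes
# versus 3-5% of chromosomes in the general population.
lowest = allelic_odds_ratio(0.30, 0.05)   # least extreme contrast
highest = allelic_odds_ratio(0.40, 0.03)  # most extreme contrast
print(f"allelic odds ratio between {lowest:.1f} and {highest:.1f}")
```

Across the quoted ranges the odds ratio lies between roughly 8 and 22, consistent with the statement that the 5T allele is strongly enriched among males with CBAVD.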

Males with CBAVD are currently participating in programs of sperm aspiration and subsequent in vitro fertilization. It is therefore important to perform CFTR mutation analysis of these men and their female partners prior to their participation in such programs.

Molecular Mechanisms Underlying Disease Variability among Patients Carrying Different CFTR Mutations

A different approach to understanding the genotype-phenotype correlation involved in vitro studies of CFTR function. Sheppard et al. [49] studied the expression of normal and ΔF508 CFTR cDNA in epithelial cells. They found that the wild-type CFTR transfected with the recombinant vaccinia virus generated a mature, fully glycosylated form of the protein, consistent with delivery to the plasma membrane. The mutant ΔF508 produced only the immature core-glycosylated form of the protein, indicative of defective processing. The partially glycosylated protein is degraded instead of trafficking to the apical cell membrane. As a result, the ΔF508 CFTR cannot be detected at the cell surface, which leads to a loss of the cAMP-mediated chloride conductance. In another mutation, G551D, the protein can be processed and correctly targeted to the plasma membrane but lacks full responsiveness to stimulation by cAMP [50]. Recombinant adenoviruses were used to transduce normal and variant forms of CFTR into surface epithelial cells of human bronchial xenografts grown in mice. Using immunoperoxidase electron microscopy analysis, it was shown that the G551D protein is predominantly localized to the apical plasma membrane of the cells. In cells expressing ΔF508 CFTR the peroxidase staining was localized to the nuclear envelope and the endoplasmic reticulum. These studies suggested that different mutations cause different defects of protein production and function. This led Tsui [51] and subsequently Welsh and Smith [52] to suggest four mechanisms by which mutations disrupt CFTR function.

Class I mutations cause defective protein production. This group includes mutations containing premature termination signals and mutations causing truncated or unstable protein. These mutations are expected to produce few or no CFTR chloride channels. Class II mutations are associated with defective protein processing, as shown for the ΔF508 mutation. The abnormally processed protein, which fails to progress through the biosynthetic pathway, is degraded. As a result it cannot be detected at the cell surface. However, the mutant ΔF508 protein was found to have near-normal activity when it reaches the cell surface [50]. Class III mutations are associated with defective regulation. Analysis of some mutant proteins that do reach the apical membrane shows that several have mutations in the nucleotide-binding domains. Some mutations cause total loss of the ability to be stimulated by ATP, and in others this ability is reduced. Class IV mutations are associated with defective conductance. Patch-clamp studies have shown that these mutant CFTR chloride channels are capable of generating cAMP-regulated chloride currents, though the amount of current is reduced. Table 3 lists mutations according to these classes.

Table 3 Classes of CFTR mutations

Another class of mutations, class V, was recently introduced [53, 54]; it includes mutations affecting the level of normal mRNA transcript and protein required for normal function. This class might include mutations affecting correct splicing of pre-mRNA transcripts, either by exon skipping or by inclusion of extra cryptic exon(s), and mutations affecting mRNA and/or protein stability (table 3).
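The five classes described above can be summarized as a small lookup structure. This sketch is illustrative and is not a reproduction of Table 3: the example alleles are limited to those discussed in the text, the class assignments of W1282X (a stop codon mutation) and G551D (reaches the membrane but responds poorly to cAMP) are inferred from those descriptions, class IV is left without examples because the text names none, and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MutationClass:
    name: str
    defect: str
    examples: Tuple[str, ...]  # illustrative alleles mentioned in the text


CFTR_MUTATION_CLASSES = (
    MutationClass("I", "defective protein production", ("W1282X",)),
    MutationClass("II", "defective protein processing", ("ΔF508",)),
    MutationClass("III", "defective regulation", ("G551D",)),
    MutationClass("IV", "defective conductance", ()),
    MutationClass(
        "V",
        "reduced level of normal transcript or protein",
        ("3849+10kb C→T", "5T"),
    ),
)


def class_of(mutation: str) -> Optional[MutationClass]:
    """Return the class listing `mutation` as an example, or None if the
    mutation is not in this illustrative table."""
    for mutation_class in CFTR_MUTATION_CLASSES:
        if mutation in mutation_class.examples:
            return mutation_class
    return None


for allele in ("ΔF508", "G551D", "3849+10kb C→T", "R334W"):
    found = class_of(allele)
    label = f"class {found.name}: {found.defect}" if found else "not in this table"
    print(f"{allele}: {label}")
```

Keeping the defect description next to each class makes it straightforward to attach a class-specific therapeutic strategy later, which is the use the article makes of this classification.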

Genotype-Phenotype Variations among Patients Carrying the Same CFTR Mutations

As has previously been shown, CFTR mutations can be classified as associated with either severe or mild disease; however, there is substantial variability in disease expression among patients carrying the same mutation, mainly in the extent of pulmonary involvement. This variability was found both among patients carrying the same severe mutations and among patients carrying the same mild mutations. Several mutations show extreme variability among patients. One such mutation is the missense mutation G85E [55–57]. All the tested clinical parameters were significantly more variable than those found among patients with severe or mild mutations [57]. It is important to note that the variability associated with the G85E mutation was also found among siblings and among patients from the same extended family. For example, in one family, 2 children were diagnosed before 1 year of age with typical CF presentation, had a severe course and died at a relatively early age. Their younger brother developed liver disease at the age of 12 with no other signs of CF, and with a sweat Cl level below 60 mEq/l. His diagnosis was made by genotyping only. Thus, this mutation cannot be classified as either a severe or a mild mutation. The missense mutation R334W was also associated with extremely variable expression [37]. Discordance between siblings was found in pancreatic and lung disease. Another genetic variation associated with high variability of phenotypic expression is the 5T allele. This variant was found to cause high levels of exon 9 skipping, leading to a nonfunctional protein [58]. As discussed above, a substantial portion of infertile males with CBAVD carry this allele. However, fertile males with the same genotypes were also identified [43–45]. In addition, the 5T allele was shown to affect the disease severity of patients carrying the mutation R117H [59]. In patients carrying the R117H mutation on the 5T allele the phenotype was mild CF, and in patients carrying the mutation on a 7T allele the phenotype was CBAVD only. More recently, the 5T allele was found among individuals with atypical and typical CF with no other identified CFTR mutation [44, 60]. Analysis of their clinical presentation indicated that most patients suffered from respiratory disease presenting as asthma-like symptoms, nasal polyposis, chronic sinusitis, chronic bronchitis or bronchiectasis. Several patients had pancreatic insufficiency, 2 of them with meconium ileus. Sweat Cl levels ranged from normal to elevated. Of the males with respiratory disease who were old enough to be evaluated for fertility status, several were fertile, one of whom had pancreatic insufficiency [60]. Thus the 5T allele might be considered a disease-causing mutation associated with extreme variability of clinical presentation: from normal healthy fertile individuals or males with CBAVD to the atypical or typical clinical phenotype of CF. Again, this extreme variability was found among members of the same family.

Molecular Mechanisms Underlying Disease Variability among Patients Carrying the Same CFTR Mutations

The molecular mechanisms underlying the variability found among patients carrying the same CFTR mutations are not well understood. The results of expression studies of the ΔF508 mutation might suggest that among patients homozygous for the ΔF508 mutation the variable lung disease might result from allelic differences in genes associated with trafficking the mutant protein to the cell surface. Patients with a more severe lung disease might have low levels of protein successfully transported to the cell membrane while patients with a milder lung disease might have higher levels, which confers partial functioning of epithelial respiratory cells.

The variability found among patients carrying class V mutations might shed new light on the molecular mechanisms underlying disease severity. In cases where the mutation affects the level of normal splicing and leads to the production of both normally and aberrantly spliced transcripts, the disease might be caused by insufficient levels of the normally spliced CFTR transcripts. Several mutations were shown to lead to the production of both aberrantly and normally spliced transcripts [15, 36, 58]. The mutation 3849+10kb C→T creates a partially active splice site in intron 19 which can lead to the insertion of a new 84-bp ‘exon’ containing an in-frame stop codon. Patients carrying the 3849+10kb C→T mutation have a milder clinical presentation with variable disease severity [15, 16]. Variability in the level of the aberrantly spliced transcripts was found in respiratory epithelial cells from CF patients carrying this mutation. The level of the aberrantly spliced transcripts correlated with the severity of pulmonary disease. Patients producing undetectable levels of aberrantly spliced transcripts had normal lung function or minimal lung disease, whereas patients with high levels of aberrantly spliced transcripts had severe lung disease. No correlation was found with sweat chloride levels, pancreatic status or age [61], consistent with the hypothesis that tissue-specific splicing factors affect the level of alternative splicing differently in different tissues. The 5T allele should also be included in this group. As mentioned above, the frequency of this allele in the normal population is nearly 5%; however, several studies found that among males with CBAVD the frequency of the 5T allele was nearly 30% [43–45]. These results imply that insufficient levels of normally spliced CFTR transcripts might be associated with infertility. More recently, the level of the aberrantly spliced transcripts lacking exon 9 was studied in respiratory epithelial cells of different patients. As for the 3849+10kb C→T mutation, the level of the aberrantly spliced transcripts correlated with the severity of pulmonary disease [62]. Furthermore, the level of the aberrantly spliced transcripts in epididymal epithelial cells of males with CBAVD was consistently low. A significant difference was found between the level of the aberrantly spliced transcripts in respiratory and epididymal epithelial cells of the same individual. Allelic differences in splicing factors might contribute to the variability in disease expression in different tissues of the same individual and in the same tissue of different patients.

Understanding the mechanisms of CFTR dysfunction may suggest therapeutic strategies. For class I and II mutations, pharmacological agents designed to enhance opening of mutant channels may not be very effective, since protein is not produced or little is present in the correct cellular location. Strategies to relocate class II mutant proteins, including the most common ΔF508 mutant, could be beneficial. In vitro studies have shown that this is possible. Denning et al. [63] showed that reducing the incubation temperature of epithelial cells containing ΔF508 CFTR from 37 to 23–30°C caused some of the mutant protein to escape from the endoplasmic reticulum, become fully glycosylated in the Golgi complex and be delivered to the cell membrane. Presumably the folding process is able to occur, at least partially, at reduced temperature. For patients with class IV mutations, stimulation of the opening of the mutant CFTR channels is required. Further understanding of the mechanisms regulating alternative splicing will contribute to potential therapy for patients carrying class V mutations. Thus, if gene therapy is not available, each mechanism by which mutations cause disease may require a different therapeutic approach.