Introduction

Autosomal recessive polycystic kidney disease (ARPKD [MIM 263200]) is an important cause of renal- and liver-related morbidity and mortality in neonates and infants, occurring in about 1 in 20,000 live births (Guay-Woodford et al. 1996; Zerres et al. 2003). The principal histological manifestations are fusiform dilation of the renal collecting ducts and hepatobiliary ductal plate malformation. Severely affected neonates display massively enlarged, echogenic kidneys resulting in a “Potter” oligohydramnios phenotype. About 30–50% of affected newborns die shortly after birth from respiratory insufficiency. Those who survive the neonatal period or present later in life express widely variable disease phenotypes with systemic hypertension, end-stage renal disease and sequelae of portal hypertension (Guay-Woodford and Desmond 2003; Bergmann et al. 2005a).

The PKHD1 gene (MIM 606702) for ARPKD on chromosome 6p12 is an exceptionally large gene with a minimum of 86 exons (Ward et al. 2002; Onuchic et al. 2002). The longest open reading frame (ORF) comprises 66 exons encoding a protein of 4,074 amino acids. There is increasing evidence that PKHD1 and its murine orthologue undergo a complex pattern of alternative splicing (Onuchic et al. 2002; Nagasawa et al. 2002) pointing to its biological impact. Recent data suggest the existence of various, partly secreted polyductin isoproteins with putatively different functions and Notch-like post-translational processing (Menezes et al. 2004; Hiesberger et al. 2005; Kaimori et al. 2005; Masyuk et al. 2005). The predicted full-length protein polyductin/fibrocystin represents a novel putative integral membrane protein with a possible function as a receptor. Protein expression in kidney and liver is consistent with the sites affected by the disorder. In common with most other cystoproteins, polyductin has been shown to be localised to primary cilia with concentration in the basal body area (Masyuk et al. 2003; Ward et al. 2003; Menezes et al. 2004; Wang et al. 2004; Zhang et al. 2004).

Currently, almost 300 different PKHD1 micromutations (point mutations and small deletions/duplications/insertions) on about 700 mutated alleles are listed in the locus-specific database (as of 9 May 2006) (http://www.humgen.rwth-aachen.de; Onuchic et al. 2002; Ward et al. 2002; Bergmann et al. 2003, 2004a, 2004b, 2005a, 2005b and own unpublished data; Furu et al. 2003; Rossetti et al. 2003; Losekoot et al. 2005; Sharp et al. 2005). The large size of the gene, the lack of knowledge of the function(s) of the encoded protein(s), and the supposedly complex pattern of splicing pose significant challenges to predicting the functional consequences of PKHD1 sequence alterations. Moreover, the wide variety of PKHD1 mutations, most of which are unique to single families in “non-isolate” populations, places further demands on their investigation. Thus, special care is warranted in differentiating pathogenic nucleotide substitutions from harmless sequence variants.

Recent studies yielded mutation detection rates of about 80% for the entire clinical spectrum of ARPKD patients (Bergmann et al. 2004b, 2005a; Losekoot et al. 2005; Sharp et al. 2005). However, the molecular defect still remains to be determined in a considerable proportion of chromosomes. While genetic heterogeneity is unlikely for the vast majority, important causes of missing mutations may be pathogenicity of changes currently categorized as non-pathogenic as well as of variants residing outside the coding region in introns and other regulatory elements. It is widely accepted that truncating mutations as well as changes affecting the invariant canonic splice sites should be considered as disease-associated; however, the biological effects of exonic base-pair substitutions (missense and silent changes) and intronic alterations outside the splice consensus sites are often difficult to assess without further functional analyses. It is still a matter of debate how frequently sequence changes involving splicing cause disease (Baralle and Baralle 2005). A proportion of 15% has been proposed in a study considering only changes affecting core splice consensus sequences (Krawczak et al. 1992). However, other surveys even found approximately 50% of mutations resulting in aberrant splicing and demonstrated that most of these splicing mutations did not involve the conserved AG/GT dinucleotides (Teraoka et al. 1999; Ars et al. 2000). These data indicate the importance of studying mutations at both the genomic and RNA levels. As regards PKHD1, investigations at the transcript level are severely hampered as polyductin is not widely expressed in peripheral blood lymphocytes, which is the material usually available from patients for mRNA analyses. To circumvent this restriction, we determined the functional significance of a novel putative PKHD1 splice site mutation using expression constructs for the mutant allele.

Materials and methods

Patient and DNA studies

The female patient analysed was the first-born child of a French non-consanguineous couple. After a reportedly uneventful pregnancy and delivery at term, the girl perinatally developed severe respiratory insufficiency with a massively distended abdomen and died a few hours after birth. Post-mortem examination confirmed the diagnosis of ARPKD. In view of the couple’s request for prenatal diagnosis in a future pregnancy, we attempted to identify the underlying PKHD1 mutations in this family. Because only poor-quality paraffin-extracted DNA was available from the index patient, we screened the parental DNA samples for mutations in the 66 exons encoding the longest open reading frame (ORF; GenBank: NM_138694, AY129465) by denaturing high-performance liquid chromatography (DHPLC) on a Wave Fragment Analysis System (Transgenomic, Crewe, UK) and subsequent direct sequencing of abnormal elution profiles (http://www.humgen.rwth-aachen.de; Bergmann et al. 2004b). DNA from this family and 200 apparently unrelated, healthy control individuals was obtained after informed consent had been given. To exclude c.53-3C>A as a polymorphism or rare sequence variant, DNA samples from these 200 normal controls were tested under appropriate DHPLC conditions.

Splicing analyses

The mother’s DNA sample (known to be heterozygous for the allele c.53-3C>A) was amplified using genomic primers located in introns 1 and 4 previously used for DHPLC analysis. This resulted in a genomic DNA fragment of 2.8 kb. PCR was performed in a 50-μl volume using AccuPrime Taq DNA Polymerase (Invitrogen, Karlsruhe, Germany). PCR primers and conditions are available on request. After purification with the QiaQuick PCR Purification Kit (Qiagen, Hilden, Germany), the construct was inserted into the pTargeT vector (Promega, Mannheim, Germany) according to the manufacturer’s protocol and subsequently transformed into Escherichia coli strain JM109 (Promega). After overnight culture, the recombinant vectors were selected by boiling preparation and subsequent restriction digestion. Plasmids containing the wild-type (WT) and the mutant allele were prepared using the QIAprep Spin Miniprep Kit (Qiagen). Samples were run and analysed on an ABI PRISM 377 fluorescent DNA sequencer (Applied Biosystems, Darmstadt, Germany) using the ABI PRISM® BigDye Terminator Cycle Sequencing Ready Reaction Kit, version 2.0 (Applied Biosystems) according to standard protocols.

Purified plasmids were used for transfections of COS7 and HEK293 cells. Cells were grown in 35-mm tissue culture dishes to 95% confluency and transfected with 4 μg plasmid and 10 μl Lipofectamine 2000 (Invitrogen) per dish. After 24 h, cells were harvested and prepared for RNA extraction. Total RNA was isolated using the QIAamp RNA Blood Mini Kit (Qiagen). Reverse transcription (RT)-PCR was performed with 1 μg of total RNA in a 20-μl reaction using an RT system (AMV reverse transcriptase; Promega). The resulting cDNA was PCR-amplified using primers hybridising to exons 2 and 4; afterwards, all products were sequenced. PCR primers and conditions are available on request. For both alleles the experiments were analysed in triplicate.

To estimate whether the cDNAs obtained by the minigene approaches are also detectable in vivo and to exclude the possibility that the mutant allele merely represents an alternatively spliced product also present in some WT transcripts, we additionally analysed human total kidney RNA (Invitrogen).

Results and discussion

The locus-specific database (as of 9 May 2006) (http://www.humgen.rwth-aachen.de) currently contains almost 300 different PKHD1 mutations; however, experimental or functional data have not been reported for any of these mutations. This is mainly because investigations at the transcript level are severely hampered, as PKHD1 is not widely expressed in peripheral blood lymphocytes, which is the material usually available from patients for mRNA analyses. Given that the biological effects of intronic changes beyond consensus splice sites are difficult to assess without further functional analyses, only PKHD1 changes affecting core splice consensus sequences have been considered pathogenic so far.

In line with this, we classified the novel PKHD1 variant c.53-3C>A in IVS2 as a change of unknown significance despite several lines of evidence in favour of its putative pathogenicity. First, the change was shown to reside on the maternal chromosome of the perinatally deceased index patient, whereas the frame-shifting mutation c.5895dupA (p.Leu1966fs) in exon 36 segregated paternally. Moreover, the novel variant was not detected among 200 normal controls, and bioinformatic programs (http://www.fruitfly.org/seq_tools/splice.html and http://www.cbs.dtu.dk/services/NetGene2) predicted a significant lowering of the strength of the acceptor splice site of intron 2 (scores falling from 0.81 to 0.20 and from 0.63 to 0.17, respectively). Although these prediction programs may be quite useful, their significance should not be overestimated and the respective data have to be interpreted with caution. Moreover, the prevalence of any individual PKHD1 mutation can be expected to be very low in the general population, given an estimated carrier frequency of approximately 1:70 in non-isolated populations (Zerres et al. 1998) and the vast number of different alleles (http://www.humgen.rwth-aachen.de). Thus, despite screening of a large number of normal chromosomes, evidence from segregation analysis, and lack of further changes in the rest of the gene, we did not feel confident about postulating a pathogenic impact of c.53-3C>A on the mature PKHD1 RNA, given that its location at position −3 did not affect the splice consensus dinucleotides.
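The population-genetic reasoning above can be illustrated with a short calculation. The following Python sketch is an illustration only and not part of the original analysis. It first derives the carrier frequency expected under Hardy-Weinberg equilibrium from the incidence of about 1 in 20,000 quoted in the Introduction, and then estimates how likely it is that one specific rare allele would be absent from 200 controls by chance. The function names are hypothetical, and the allele share of 1/300 (one of roughly 300 listed mutations, treated as equally frequent) is a deliberately crude assumption.

```python
import math


def carrier_frequency(incidence: float) -> float:
    """Heterozygote frequency 2pq under Hardy-Weinberg equilibrium,
    given the incidence q**2 of a recessive disease."""
    q = math.sqrt(incidence)
    return 2.0 * (1.0 - q) * q


def prob_absent_in_controls(carrier_freq: float,
                            allele_share: float,
                            n_controls: int) -> float:
    """Probability that none of n_controls carries one specific mutation.

    allele_share is the ASSUMED (hypothetical) fraction of all mutant
    alleles in the population that is accounted for by this mutation.
    """
    p_specific_carrier = carrier_freq * allele_share
    return (1.0 - p_specific_carrier) ** n_controls


# Incidence of about 1 in 20,000 live births (from the Introduction)
cf = carrier_frequency(1 / 20000)
print(f"Expected carrier frequency: about 1 in {1 / cf:.0f}")

# Assumption: the variant is one of ~300 equally frequent mutant alleles;
# carrier frequency of 1:70 and 200 controls are taken from the text.
p_absent = prob_absent_in_controls(1 / 70, 1 / 300, 200)
print(f"P(absent from 200 controls): {p_absent:.3f}")
```

Under these assumptions the derived carrier frequency is close to the cited 1:70, and a specific rare allele would be absent from 200 controls about 99% of the time. Absence from controls is therefore expected for almost any rare allele and carries limited weight on its own.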

Thus, we attempted to determine the functional significance of this change by transfection of renal epithelial cells with expression constructs. As depicted in Fig. 1, PCR and sequencing analyses of the resulting cDNAs using primers in exons 2 and 4 revealed a single band of approximately 370 bp, corresponding to the full-length cDNA, for the WT construct in both human embryonic kidney (HEK) and green monkey kidney (COS) cells. In contrast, for the patient’s mutant construct a smaller band of approximately 290 bp, corresponding to skipping of exon 3, was detected in both cell lines, while the full-length cDNA band also appeared in a small proportion of products. We were able to exclude endogenous mRNA production by the cells by repeating the experiments with untransfected cells, which did not yield a product. Moreover, by analysing human total kidney RNA we could exclude the possibility that the mutant allele merely represents an alternatively spliced product, a conceivable option given the complex and extensive array of PKHD1 splice variants (Onuchic et al. 2002; unpublished data). Furthermore, by analysing human total kidney RNA we demonstrated that the cDNA obtained by the WT minigene assay was also detectable in vivo.

Fig. 1

After transfection of the WT and mutant constructs into HEK293 and COS7 cells, the resulting transcripts were analysed by RT-PCR and subsequent cDNA sequencing using primers in exons 2 and 4. As shown on the gel, a single band of 365 bp, corresponding to the full-length cDNA, was present for the WT construct in HEK293 as well as in COS7 cells. In contrast, for the patient’s mutant construct a smaller band of 287 bp, corresponding to skipping of exon 3, was detected in both cell lines. The patient’s sequence encompassing the fused exons 2 and 4 is depicted below the gel. The arrow indicates the fusion site. The full-length cDNA band also appeared in a proportion of products.

From what is known from compiled human mutation data, it is most likely that exon 3, with a length of 78 bp, is simply missing and that the flanking exons 2 and 4 are fused in the mature mRNA (Black 2003). In this scenario, the corresponding transcript is expected to preserve the reading frame. While one may argue that an in-frame deletion may just represent a rare non-pathogenic genetic variant, it is widely believed that deletion of exons encoding non-repetitive parts of a protein is not expected to remain without clinical consequences (Michael Krawczak, personal communication). Rather, it may be hypothesized that the missing exon 3 is crucial for polyductin function(s). Given the predicted protein structure (http://www.au.expasy.org/), the skipping of exon 3 can be expected to disrupt the last residues of the presumed signal peptide, resulting in defective cotranslational transport of the polyductin protein. A decisive role of exon 3 is further supported by the fact that it harbours the most frequent PKHD1 mutation, c.107C>T (p.Thr36Met) (http://www.humgen.rwth-aachen.de).
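The size and reading-frame arithmetic behind this interpretation can be checked in a few lines. The Python sketch below is illustrative only. It uses the band sizes given in the legend to Fig. 1 (365 bp and 287 bp) and the exon 3 length of 78 bp stated in the text; the constant and function names are hypothetical.

```python
WT_PRODUCT_BP = 365      # full-length RT-PCR product (exons 2 to 4), Fig. 1
MUTANT_PRODUCT_BP = 287  # shorter product from the mutant construct, Fig. 1
EXON3_BP = 78            # length of exon 3 as stated in the text


def skipped_length(wt_bp: int, mutant_bp: int) -> int:
    """Number of base pairs missing from the shorter product."""
    return wt_bp - mutant_bp


def is_in_frame(deleted_bp: int) -> bool:
    """A deletion keeps the reading frame if its length is a multiple of 3."""
    return deleted_bp % 3 == 0


deleted = skipped_length(WT_PRODUCT_BP, MUTANT_PRODUCT_BP)
print("Size difference matches exon 3:", deleted == EXON3_BP)
print("Deletion is in frame:", is_in_frame(deleted))
print("Codons removed:", deleted // 3)
```

The 78-bp difference equals the length of exon 3 and is a multiple of three, so skipping this exon removes 26 codons without shifting the reading frame, in keeping with the in-frame deletion discussed above.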

It is presently unknown how many alternative PKHD1 transcripts are actually translated into protein and have biological function(s). If various mRNAs are translated, the PKHD1 gene may encode numerous distinct polypeptides differing in size and amino acid sequence. This attractive hypothesis is supported by different lines of evidence. First, multiple bands have been detected in Western blot analyses by several groups (Masyuk et al. 2003; Ward et al. 2003; Menezes et al. 2004; Wang et al. 2004; Zhang et al. 2004). Moreover, Masyuk and colleagues showed the translation of different, partly secreted polyductin isoproteins in cholangiocytes (Masyuk et al. 2005). The Germino group detected the expected full-length polyductin product (>400 kDa) and a C-terminally tagged 80–90 kDa product in the plasma membrane when using a cell surface biotinylation assay (Kaimori et al. 2005). Intriguingly, Hiesberger et al. (2005) demonstrated that regulated intramembrane proteolysis (RIP) is induced by primary cilia-dependent Ca2+ signalling and generates a C-terminal polyductin fragment that can signal directly to the nucleus. Finally, it will be of paramount importance to establish which isoforms are essential for renal and hepatobiliary integrity to better understand the role of polyductin in the etiology of ARPKD. The identification of PKHD1 mutations is therefore one means of deciphering those exons whose presence in a transcript is essential for the function of polyductin. Accordingly, and in addition to what has been said above, the presence of exon 3 seems to be crucial for proper polyductin function(s).

In conclusion, taking all data together (400 control chromosomes negative for this change, compatible segregation of the mutations in this pedigree, a significant lowering of the predicted strength of the acceptor splice site, and the minigene results), it can be postulated that the mutant allele c.53-3C>A does in fact have biological relevance. To the best of our knowledge, this is the first study that functionally demonstrates a pathogenic effect on PKHD1 splicing. Furthermore, our study highlights the usefulness and importance of studying mutations at both the genomic and the RNA levels, with minigene assays as a means of assessing splicing.

Data access

OMIM: 606702 (PKHD1), 263200 (ARPKD); GDB: 433910; GenBank: NM_138694.2, AF480064, AY074797, AY129465; http://www.humgen.rwth-aachen.de (PKHD1 Mutation Database).