Introduction

Polycystic liver disease (PLD) is part of the phenotype of two inherited disorders: autosomal dominant PLD (ADPLD) and autosomal dominant polycystic kidney disease (ADPKD). Polycystic livers are seen in 83–94% of ADPKD patients.1, 2 Variants in Protein Kinase C Substrate 80 K-H (PRKCSH), SEC63 homolog (Saccharomyces cerevisiae; SEC63), and Low-density lipoprotein Receptor-related Protein 5 (LRP5) cause ADPLD, and are present in ~25% of cases,3, 4 whereas variants in Polycystic Kidney Disease 1 (PKD1) and Polycystic Kidney Disease 2 (PKD2) are responsible for ADPKD in virtually all cases.5 Protein products of genes underlying PLD are located in the endoplasmic reticulum (ER) or primary cilium.6, 7 Experimental data favor a genetic interaction network between ER-localized protein products of PRKCSH and SEC63, and cilium-localized PKD1 and PKD2.7, 8 The finding of LRP5 variants in PLD suggests that Wnt signaling may be disrupted downstream of this interaction network. Genes that underlie PLD thus function in distinct organelles and pathways, despite a final common cystogenic effect. Furthermore, ciliopathy-associated genes act outside of the PKD1/PKD2 genetic interaction network,7, 9 and may also cause liver cysts. The search for new genes should therefore not be limited to currently known genomic sites. At a tissue level, PLD appears to be a recessive disease. Complete loss of cyst gene expression from diseased epithelium follows loss-of-heterozygosity (LOH),10, 11, 12, 13, 14, 15 which may be related to cyst genetic instability.16, 17 The proportion of somatic variants varies with the gene that is affected in the germline. Recent studies found that second, somatic variants or LOH occurred in 56/71 liver cysts (79%) from patients with PRKCSH variants,11 in 4/5 (80%) PKD2 variant carriers,18 but only 1/14 cysts (7%) from a patient with a SEC63 variant.10 This indicates that LOH incidence depends upon the genetic and phenotypic background.
We hypothesize that a ‘two-hit model’ is a general principle for the development of hepatic cysts. Therefore, somatic LOH regions in cyst epithelium may harbor novel candidate PLD-causing genes, which harbor heterozygous germline variants in the respective cases. Considering the genetic interaction network in PLD,7, 8 digenic or transheterozygous variants at two genetic loci may also have a role. Transheterozygous PKD1/PKD2 variants have been described in renal cysts,14, 15 whereby a variant in one cyst gene is succeeded by a variant in a second cyst gene. Cysts with heterozygous variants in PRKCSH and SEC63 continue to express the relevant proteins.10, 11 It is reasonable to hypothesize that transheterozygosity may be another mechanism in hepatic cyst formation. This study aims to determine novel genetic loci that are involved in cystogenesis both at germline and somatic level. To this end, we followed an unbiased approach and assessed copy-number variations (CNVs) and LOH regions in PLD cyst epithelium using a genome-wide high-resolution cytogenetic array analysis.

Methods

Patient material

We obtained DNA from liver cyst cholangiocytes of 23 newly included patients who underwent either laparoscopic cyst fenestration or aspiration sclerotherapy from 2011–2014 because of large cysts. All patients except one were females, and had single or multiple liver cysts. All patients had severe symptoms and the mean age was 54 (range 42–83) years. Seventeen patients had ADPLD, three had ADPKD, and three had solitary or sporadic cysts. Use of this tissue for research was reviewed and approved by the regional ethics review board ‘Commissie Mensgebonden Onderzoek regio Arnhem-Nijmegen’.

Cyst work-up

We isolated cholangiocytes by four methods (Supplementary Figure S1; Table 1). First, as described for 23 previously studied laparoscopy-derived liver cysts (six patients) obtained from 2010 to 2012,18 we collected cells from fresh tissue by ethylenediaminetetraacetic acid detachment. Keratin (KRT)-19 staining indicated the purity of each sample. Second, we collected cells from 30 laparoscopy-derived liver cysts of eight patients from 2012 to 2014. We expanded these cells into adult liver organoids using conditions suitable for their expansion.19 Under these conditions, only stem cells with a cholangiocyte-like phenotype expressing KRT19 persisted. DNA from one cyst per patient was studied using high-density single-nucleotide polymorphism (SNP) microarrays (Affymetrix Cytoscan HD, Santa Clara, CA, USA). DNA from the remaining 22 cysts was used to assess somatic loss of the wild-type allele of heterozygous PRKCSH germline variants by Sanger sequencing. Third, symptomatic cyst patients were referred to our hospital for aspiration sclerotherapy. We collected 68 cyst fluid aspirates from 50 patients in 2011 and 2012. We subjected all samples to centrifugation, KRT19 staining, and fluorescence-activated cell sorting (FACS) of cholangiocytes (Supplementary Methods). This yielded eight additional samples for SNP microarray studies. Fourth, we grew eight cultures from 30 aspiration sclerotherapy fluids collected from 2012 to 2014, using conditions suitable for the expansion of adult liver stem cells.19 We obtained cyst fluid and epithelium samples, and stored them in the course of treatment following the Dutch Code for the proper secondary use of human tissue. Use of this tissue for research was reviewed and approved by the regional ethics review board ‘Commissie Mensgebonden Onderzoek regio Arnhem-Nijmegen’.

Table 1 Patient characteristics

DNA isolation from cyst cholangiocytes

We isolated DNA from the cyst cholangiocytes using the QIAamp DNA Micro kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. For samples with low DNA yields (which included all FACS obtained samples), whole-genome amplification (WGA) using the Qiagen REPLI-g Mini kit (Qiagen) was performed.

Genetic analysis by microarray and genotyping

We assessed CNVs, LOH regions, and regions of homozygosity using genome-wide high-resolution cytogenetic array analysis (CytoScan HD, Affymetrix). We screened whole-blood PLD patient DNA for germline variants in PKD1 (NG_008617.1, NM_001009944.2), PKD2 (NG_008604.1, NM_000297.3), PRKCSH (NG_009300.1, NM_002743.3), SEC63 (NG_008270.1, NM_007214.4), and LRP5 (NG_015835.1, NM_002335.3) using direct sequencing as described previously.11 Briefly, we isolated DNA from whole blood using the PureGene DNA isolation kit (Gentra Systems/Qiagen, Minneapolis, MN, USA) or High Pure Polymerase chain reaction (PCR) template preparation kit (Roche, Mannheim, Germany), and stored it at 4 °C. We amplified the exons and flanking intronic sequences of PKD2, PRKCSH, SEC63, and LRP5 by PCR with specific primers (Supplementary Table S1). Screening for germline variants in PKD1 (NG_008617.1, NM_001009944.2) was carried out by a method adapted from Tan et al.20 In short, primer sequences (Supplementary Table S1) for long range PCR were chosen on specific regions of the PKD1 gene preventing the amplification of the known duplications of the first 33 exons at the proximal side of the gene. PCR reactions were performed according to the manufacturer’s manual using FastStart Taq DNA Polymerase System supplemented with GC-RICH solution (Roche) for exon 1 or GeneAmp High Fidelity PCR System (Life Technologies, Carlsbad, CA, USA) supplemented with 5% DMSO for all other fragments. Annealing temperatures during PCR were carefully selected for each amplicon to amplify the desired regions. After purification of PCR amplicons from gel using the QIAEXII Gel Extraction Kit (Qiagen), a total of 500 ng of equimolar amounts of the PCR amplicons of each sample were sequenced in a single run using Ion Torrent Next-Generation Sequencing (Life Technologies).

All cysts from patients with PRKCSH variants were analyzed for LOH using direct sequencing of this variant. We screened the entire coding region of BUB1 Mitotic Checkpoint Serine/Threonine Kinase (BUB1) (NG_012048.1, NM_004336.4) in a cohort of unrelated patients with ADPLD (n=100) that met the Reynolds’ criteria.21, 22 We performed high-resolution melting curves (RotorGene-Q; Qiagen) to reveal differences in melting curve shape that correlate with BUB1 genotype variants and validated these findings by Sanger sequencing. Adenylate Cyclase 1 (ADCY1) (NG_034198.1, NM_021116.2), Insulin-Like Growth Factor Binding Protein (IGFBP) 1 (NC_000007.14, NM_000596.2), and IGFBP3 (NG_011508.1, NM_001013398.1) were analyzed in a patient with cyst chromosome 7 copy-number loss. Exons and flanking intronic sequences were amplified using PCR with specific primers (Supplementary Table S1). We purified PCR amplicons from gel using the QIAEXII Gel Extraction Kit (Qiagen) and sequenced them with the BigDye terminator kit (Applied Biosystems, Foster City, CA, USA) and ABI3730, ABI310, or ABI3100 Genetic Analyzers (Applied Biosystems, Boston, MA, USA), or (from 2014 onwards) Ion Torrent sequencing (Life Technologies).

Data analysis

We considered large regions of homozygosity on the autosomes (>3.0 Mb) and large CNVs (>1.0 Mb) for further analysis. For samples not derived by FACS/WGA, we placed no limits on the size of LOH or CNVs around known cyst genes. We only assessed X chromosomes for whole-chromosome abnormalities. For selected cases, we compared the array results derived from cyst cell DNA (somatic) and germline DNA (genomic) to ensure identification of somatic events. We selected cases for germline analysis based on FACS/WGA origin of cyst DNA or presence of non-mosaic >1.0 Mb CNVs. For analysis of individual CytoScan HD data, we used Chromosome Analysis Suite (ChAS, Affymetrix) V2.1. For analysis involving multiple SNP array data, we used Nexus Copy Number (Biodiscovery, El Segundo, CA, USA) V6.0.
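The size criteria above can be read as a simple segment filter. The following is a minimal sketch under stated assumptions, not the procedure actually run in ChAS or Nexus: the `Segment` record, the `keep_segment` function, and its flags (`facs_wga`, `near_known_cyst_gene`, `whole_chromosome`) are hypothetical names introduced for illustration, and segment size is taken as end minus start in base pairs.

```python
from dataclasses import dataclass

MB = 1_000_000

# Size thresholds stated in the text (applied to autosomes).
MIN_SIZE_MB = {"LOH": 3.0, "CNV": 1.0}


@dataclass(frozen=True)
class Segment:
    """Hypothetical record for one array segment (not a ChAS/Nexus export format)."""
    chrom: str  # "1".."22" or "X"
    start: int  # base-pair coordinate
    end: int    # base-pair coordinate
    kind: str   # "LOH" (region of homozygosity) or "CNV" (gain or loss)

    @property
    def size_mb(self) -> float:
        return (self.end - self.start) / MB


def keep_segment(seg, *, facs_wga, near_known_cyst_gene=False,
                 whole_chromosome=False):
    """Return True if a segment would be considered for further analysis."""
    # X chromosomes were only assessed for whole-chromosome abnormalities.
    if seg.chrom == "X":
        return whole_chromosome
    # No size limit around known cyst genes, but only for non-FACS/WGA samples.
    if near_known_cyst_gene and not facs_wga:
        return True
    # Otherwise apply the size threshold for the segment type.
    return seg.size_mb > MIN_SIZE_MB[seg.kind]


# Example: a 12.8 Mb region of homozygosity on chromosome 3p passes the filter.
loh_3p = Segment("3", 39_874_567, 52_653_645, "LOH")
print(keep_segment(loh_3p, facs_wga=False))  # True
```

Keeping the thresholds in one dictionary makes it easy to see how a different cut-off would change which segments are retained.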

SNP array data

The data discussed in this publication have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession number GSE78808 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=kjipyeauhnwbzqv&acc=GSE78808). Described variants have also been deposited at NCBI’s database of genomic structural variations (Wills et al, 2016) and are accessible through dbVar (http://www.ncbi.nlm.nih.gov/dbvar) accession number nstd125.

Results

Patient characteristics

We included DNA from 24 liver cysts from 23 patients for Affymetrix CytoScan HD SNP array analysis in this study (Figure 1, Table 1; Supplementary Figure S1 for inclusion flow chart), together with 23 cysts from six previously described patients (Supplementary Table S2; Supplementary Figure S2). In the new cohort, all patients except one were females, and had single or multiple liver cysts. All patients had severe symptoms and the mean age was 54 (range 42–83) years. Seventeen patients had ADPLD, three had ADPKD, and three had solitary or sporadic cysts. Six carried PRKCSH variants, one carried a variant in SEC63, and two had PKD1 variants.

Figure 1

Overview of abnormal regions found in this study by SNP array.

Germline variants

In the total cohort, three patients displayed germline abnormalities detectable by our criteria outside of PRKCSH, SEC63, LRP5, PKD1, and PKD2 genomic regions. We found a heterozygous 3.0 Mb 2q13 complex rearrangement in germline and cyst DNA of a PRKCSH variant carrier (patient #8; Figure 2). This rearrangement constituted a copy-number gain as well as a copy-number loss, and contained the genes BUB1, ACOXL, BCL2L1, ANAPC1, MERTK, TMEM87B, FBLN7, and ZC3H8* (Figure 3). Peripheral blood and cyst DNA of another patient (#1) displayed a large copy-number-neutral LOH region (12.7 Mb) on chromosome 3p containing Wnt signaling effector catenin (cadherin-associated protein) beta 1 (CTNNB1), among others (Figure 4). We additionally identified a large copy-number loss (2.9 Mb) on chromosome 9 in patient #1, which contained the genes SYK, NFIL3, and ROR2**. This CNV occurs with low frequency in a normal, healthy cohort. A third ADPLD patient carried a gain of the entire X chromosome (Triple-X) in her germline and somatic DNA (patient #19; Figure 2).

Figure 2

Upper panel: 2q13 complex rearrangement present in genomic blood DNA of a PRKCSH variant patient (#8). Lower panel: presence of three X chromosomes in genomic blood DNA of another PRKCSH variant patient (#19).

Figure 3

BUB1 is present at the centromeric part of the 2q13 loss region.

Figure 4

Left: CNN LOH on chromosome 3p of patient #18, 19, and 1, from above to below in that order. Right upper panel: mosaic whole-chromosome 7 gain in an ADPLD patient (#11) with unknown variant, chromosome 7p shown. Right lower panel: 1.7 Mb loss on chromosome 7p of a PRKCSH patient cyst (#8.2).

* Acyl-CoA Oxidase-Like (ACOXL); BCL2-Like 1 (BCL2L1); Anaphase Promoting Complex Subunit 1 (ANAPC1); MER Proto-Oncogene, Tyrosine Kinase (MERTK); Transmembrane protein 87B (TMEM87B); Fibulin 7 (FBLN7); Zinc-Finger CCCH-Type Containing 8 (ZC3H8); ** Spleen Tyrosine Kinase (SYK); Nuclear Factor, Interleukin 3 Regulated (NFIL3); Receptor Tyrosine Kinase-Like Orphan Receptor 2 (ROR2).

Germline BUB1 variants

The 2q13 rearrangement pointed to the well-known mitotic checkpoint gene BUB1.23 To test whether this gene had a more general role in ADPLD, we studied the gene in genomic DNA from 100 severely affected ADPLD patients. Using high-resolution melt curves and subsequent Sanger sequencing on abnormal melting curves, we identified one synonymous SNP (rs370559107: hg38 chr2:g.110672765C>G, silent variant) in these 100 patients. In addition, a single, nonsynonymous SNP (rs61730706; hg38 chr2:g.110667649G>A, c.677C>T, p.(Ala226Val)) was present in the DNA of one patient of this cohort, which occurs with a minor allele frequency of 0.002–0.005 in the general population. Polyphen and SIFT showed conflicting effects (SIFT: 0.28, tolerated; Polyphen (HumDiv): possibly damaging 0.749; Polyphen (HumVar) benign 0.219; Variant taster variant (P=0.617)). We did not consider this SNP as causative or disease-related, as it is normally present in the general population.

LOH surrounding PRKCSH, SEC63, LRP5, PKD1, and PKD2

To validate our approach, the SNP array cohort included seven cysts from six patients with known PRKCSH variants. The SNP arrays of five cysts of four patients revealed copy-number-neutral (CNN) LOH on chromosome 19 at the position where PRKCSH is located (Supplementary Figure S4). Sanger sequencing of additional cyst DNA from patients (#12, 17, and 19) with known PRKCSH variants revealed somatic loss of the respective wild-type alleles in 17 out of 22 cysts (Supplementary Table S3). DNA of a cyst of patient #7 contained a >30 Mb homozygous region on chromosome 16p. Although initially classified as having isolated PLD with a single renal cortical cyst, a germline PKD1 variant was found (hg38 chr16:g.2090007T>G, c.12632A>C, p.(Glu4211Ala)). No CNN LOH was found in cyst DNA of the other two ADPKD patients by SNP array, and we proceeded by directly sequencing PKD1 and PKD2. Patient #18 had a germline PKD1 variant (hg38 chr16:g.2135508G>A, c.182C>T, p.(Pro61Leu)). In the cyst, the second allele was likely affected by a somatic deletion (hg38 chr16:g.[2135508G>A(;)2090517_2090558delinsGGAC], c.[182C>T(;)12171_12212delinsGTCC], p.[Pro61Leu(;)Ser4057Argfs*87]). We detected no variants in germline or somatic DNA of the final ADPKD patient.

LOH across ADPLD genes was a frequent occurrence in somatic DNA of our total cohort, regardless of variant status of germline DNA. Approximately 40% of 34 cysts with high-quality SNP array data displayed LOH or allelic imbalance (AI) across PRKCSH, whereas >25% of cysts had PKD1 LOH or AI. PKD2, SEC63, and LRP5 had lower percentages of LOH/AI (~15%, ~15%, and ~10%, respectively). Copy-number losses and gains in ADPLD occurred less frequently.

Novel, digenic abnormalities in cysts

As described above, patient #8 presented with two germline events. We identified a PRKCSH variant and a complex rearrangement resulting in BUB1 haploinsufficiency. DNA from her second cyst displayed multiple second hits when compared with peripheral blood. A 1.7 Mb copy-number loss region on chromosome 7 was present as well as CNN LOH surrounding PRKCSH on chromosome 19 (Figure 4). The loss region includes the genes ADCY1, IGFBP1, and IGFBP3, as well as the pseudogene Septin 7B2 (SEPT7B2). It was not detected in the SNP array profile of germline DNA, nor were germline variants detected in any of these genes using Sanger sequencing. Sanger sequencing also did not reveal variants in these genes in the cyst. Another PRKCSH mutant cyst (#11) displayed mosaic trisomy 7, present in only a subset of cells. Copy-number-neutral LOH of chromosome 19 (1.2 Mb), encompassing PRKCSH, and chromosome 11 (1.3 Mb), encompassing LRP5, was present in cyst DNA of an ADPKD patient (#18, germline PKD1 variant). In this patient, an additional copy-number-neutral LOH region was detected on chromosome 3p (hg19 chr3:g.49,735,745_53,133,526), partially overlapping with a region of a PRKCSH variant carrier (#19; hg19 chr3:g.46,715,645_52,852,488). Notably, the larger copy-number-neutral LOH region on chromosome 3p in germline and cyst DNA of patient #1 also overlapped (hg19 chr3:g.39,874,567_52,653,645). The minimal region of LOH contained >60 genes, including multiple tumor suppressor genes (Supplementary Table S4).
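The minimal region of overlap follows directly from the three hg19 intervals reported above: it runs from the largest start coordinate to the smallest end coordinate. A minimal sketch of that calculation is given below; the function name `minimal_overlap` and the dictionary layout are illustrative choices, not part of the original analysis pipeline.

```python
def minimal_overlap(regions):
    """Return the (start, end) interval shared by all regions, or None."""
    regions = list(regions)
    start = max(s for s, _ in regions)  # latest start
    end = min(e for _, e in regions)    # earliest end
    return (start, end) if start <= end else None


# Chromosome 3p CNN LOH intervals (hg19) as reported in the text.
loh_3p = {
    "#18": (49_735_745, 53_133_526),
    "#19": (46_715_645, 52_852_488),
    "#1": (39_874_567, 52_653_645),
}

shared = minimal_overlap(loh_3p.values())
print(shared)                                # (49735745, 52653645)
print(round((shared[1] - shared[0]) / 1e6, 2))  # 2.92 (Mb)
```

The shared interval is bounded by the start of the region in patient #18 and the end of the region in patient #1, and spans roughly 2.9 Mb.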

Sporadic cysts

Two sporadic cysts of the same patient showed novel chromosomal LOH (patient #28; Supplementary Table S2). These cysts had been previously described without abnormalities encompassing the then known cyst genes PRKCSH, SEC63, PKD1, or PKD2. Copy-number-neutral LOH (2.0 Mb) was found around LRP5, a novel cyst gene (Supplementary Figure S5). Germline DNA revealed no variant in LRP5, and the presence of two germline heterozygous LRP5 SNPs (hg38 chr11:g.68403545T>C, rs545382; hg38 chr11:g.68425222G>A, rs556442) indicated that LOH was only present at a somatic level. Unfortunately, no cyst DNA remained, and it could not be confirmed whether this copy-number-neutral LOH led to loss of wild-type alleles by variant. Finally, the sporadic cyst from patient #20 displayed a 3.8 Mb mosaic copy-number gain at the telomeric part of chromosome 16p on the CytoScan HD array. The signal was relatively weak, yet may suggest the presence of a cellular subpopulation of this cyst with PKD1 gain.

Discussion

Here, we show that germline and somatic abnormalities outside of known cyst genes frequently occur in PLD, and many cysts of PLD patients have a unique genetic signature. LOH in known regions was confirmed for PRKCSH (22/29) and PKD1/PKD2 (2/3) variants. In the SNP array cohort of 23 new patients (24 cysts), we detected 12 cysts with sizable copy-number losses or LOH outside of earlier identified genomic regions. In three patients, we found the presence of unique germline aberrations in PLD. In addition, a patient with a primary liver phenotype had a germline PKD1 variant. Cyst DNA showed recurrent copy-number-neutral LOH on chromosome 3, with overlap between three patients. On chromosome 7, a 1.7 Mb copy-number loss and a mosaic whole-chromosome gain were found. Sporadic cysts displayed LOH around LRP5 and a mosaic gain surrounding PKD1 at a somatic level.
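The PRKCSH figure of 22/29 appears to combine the two sets of counts given in the Results: five of seven cysts with CNN LOH on SNP array and 17 of 22 cysts with loss of the wild-type allele on Sanger sequencing. This pooling is our reading of the numbers rather than something stated explicitly, so the short sketch below should be taken as an arithmetic check under that assumption.

```python
from fractions import Fraction

# Counts from the Results section (PRKCSH variant carriers).
array_hits, array_total = 5, 7      # CNN LOH on SNP array
sanger_hits, sanger_total = 17, 22  # wild-type allele loss on Sanger sequencing

# Assumption: the Discussion pools both detection methods.
combined = Fraction(array_hits + sanger_hits, array_total + sanger_total)

print(combined)                     # 22/29
print(round(100 * float(combined)))  # 76 (%)
```

Under this reading, roughly three quarters of cysts from PRKCSH variant carriers showed a second hit at the PRKCSH locus, close to the 79% reported previously for this gene.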

Although many genomic abnormalities were present beyond genomic regions known for second hits, most were unique. Chromosomal mapping suggests that each cyst follows an independent genetic pathway, similar to mosaicism observed in tissues of other somatic second-hit disorders.24, 25 This likely reflects significant heterogeneity that is at the genomic root of cyst development. It also suggests that general genomic instability precedes cyst development.16, 17 BUB1 haploinsufficiency is a known driver of chromosomal instability resulting in LOH and tumorigenesis,23, 26 and a 2q13 microdeletion has previously been described.26, 27 Although we show that it is unlikely that this specific gene is generally involved in LOH in PLD, other drivers of genetic instability may be present in the disease. Wnt signaling abnormalities appear to be common in PLD. Wnt signaling abnormalities are associated with tumorigenesis in the liver and kidney, and cause aberrant cyst development in mice.28, 29 CTNNB1 deletions and APC variants are also frequently found in hepatoblastoma.30, 31 Variants in canonical Wnt signaling component LRP5 cause ADPLD in man,4 whereas the C-terminal tail of polycystin-1 might interact with beta-catenin.32 In the same vein, our finding of copy-number-neutral LOH around CTNNB1 and copy-number loss of ROR2 (ref. 33) in the germline of an ADPLD patient, together with copy-number-neutral LOH around LRP5 in two sporadic cysts of another patient, further implicates this pathway as crucial for cystogenesis.

Trisomy X, the final germline abnormality we found, is unlikely to be involved in cystogenesis. Although female gender is a risk factor for cyst development and complications,34 this is most likely owing to the sex hormones.35, 36 Furthermore, gene expression of triple-X cells is largely limited to one copy by X-inactivation, excluding 5–10% of genes on the X chromosome located in pseudoautosomal regions.37 We see this as a chance finding, considering its prevalence of 1 in 1000 females. On a somatic level, possible digenic or transheterozygous variants and a PKD1 gain were present in cysts. The 1.7 Mb chromosome 7 copy-number loss of a PRKCSH variant carrier might point toward a novel transheterozygous modifier region. Haploinsufficiency of genes such as IGFBP1, IGFBP3, and ADCY1 might be relevant in cyst development, as these genes are related to the IGF and adenylyl cyclase pathways in which the anti-cystogenic somatostatin analogs are involved.38, 39, 40, 41 Surprisingly, another cyst displayed a mosaic gain over the whole of chromosome 7, indicating that overexpression of these or other genes may also be relevant for cyst development. More difficult to interpret is the recurrent, overlapping LOH region on chromosome 3 (49,735,745-52,653,645). Although two cysts display clear LOH, the third may have had a partially normal cellular subpopulation. Over 60 genes are present at this location, none of which are known cyst-related genes. This 3p21.3 region does contain a cluster of tumor suppressor genes,42 such as tumor susceptibility gene RASSF1A.43 The cluster frequently undergoes LOH in early formation of different tumors,42 including hepatocarcinomas and cholangiocarcinomas.44 The mosaic gain of PKD1 in one cyst may be consistent with overexpression of Pkd1 in mice leading to cystogenesis.45 The mosaicism of this region was relatively low, however, and the gain may represent an artifact.
Given the substantial investment that went into genotyping this cohort, it is disappointing that we found no more specific leads to clarify the origin of multiple cysts in the liver. In conclusion, our chromosomal mapping indicates significant genetic heterogeneity outside of known second-hit regions of liver cysts. We identified unique cystogenic regions, as well as characteristics of general genomic instability in hepatic cyst DNA. These findings may explain the large number of ADPLD cases without a known variant, as well as phenotypic dissimilarities between similar cyst germline variants.