Introduction

Classical Hodgkin lymphoma (cHL) represents about 1% of all neoplasms occurring worldwide. Although generally rare, it is one of the most frequent cancers affecting young adults. The incidence is ~3/100,000 in Western countries and it accounts for 12% to 15% of all malignant lymphomas [1, 2].

By morphology, cHL is sub-classified into four subtypes: mixed cellularity, lymphocyte rich, lymphocyte depleted, and nodular sclerosis [3]. The unifying morphologic and diagnostic feature of all cHL subtypes is the presence of characteristic large cells termed Hodgkin- and Reed-Sternberg cells (HRSC). These cells are scarce, usually accounting for 1% to 5% of the cellular composition in the affected tissues. Virtually all HRSC arise from germinal center B-cells, as proven by the demonstration of clonally rearranged and hypermutated immunoglobulin heavy- (IGH) and light-chain genes [4]. More than 25% of cHL cases bear loss-of-function IG gene mutations leading to lost expression of the immunoglobulin receptor. Under physiologic circumstances, this defect would lead to apoptosis of the respective B-cell, but the HRSC manage to escape apoptosis by deregulating key cellular signaling pathways. It has been shown that Epstein-Barr virus (EBV) infection is one of several possible rescue mechanisms [5, 6]. HRSC, like most other tumor cells, often bear evidence of past or ongoing genetic instability, as shown by DNA content analysis by flow cytometry and array-comparative genomic hybridization (aCGH) [7,8,9]. Multiple genes encoding nuclear factor-κB (NF-κB) pathway components, including REL, TNFAIP3, NIK, TRAF, IκBα, IκBε, and CYLD, were found altered in cHL, causing aberrant pro-survival signaling [10,11,12,13]. Another signaling cascade affected in cHL is the JAK-STAT pathway. JAK2 is amplified in about 20% of cases and SOCS1, encoding the main inhibitor of STAT activity, is inactivated by mutations in 40% of cHL [8, 14, 15]. In addition, evidence exists that deregulated expression of the polycomb group proteins and NOTCH1 as well as several micro-RNAs contribute to the neoplastic transformation of the crippled germinal center B-cells [16,17,18]. Finally, overexpression of the immune checkpoint molecule PD-L1 and loss of β2-microglobulin expression due to inactivating B2M mutations point towards the role of immune evasion in cHL pathogenesis [19, 20].

Tumor cell scarcity has substantially impaired cHL investigation at the genetic level. This is because the majority of high-throughput molecular research methods, such as gene expression and copy number microarrays, RNA sequencing and, to a lesser extent, high-throughput sequencing, require medium to high tumor cell purity to obtain robust results. For this reason, most information about the molecular pathogenesis of cHL was obtained from investigations of cell lines, HRSC-rich cases and laser-microdissected primary tumor samples. The laser capture microdissection (LCM) technique is still widely used and until very recently has been an essential part of virtually all cHL genetic studies. However, it has important disadvantages that limit its use. First, the microdissection process involves high-energy laser radiation that can seriously damage the integrity of DNA, thus reducing the success rate and quality of downstream analyses. Second, this method is very labor-intensive: the majority of genetic studies require at least 1000 microdissected cells, which have to be manually collected. Finally, the technique operates on tissue sections, meaning that sliced cells containing truncated nuclei, rather than intact cells, are collected. Depending on the number of cells harvested, this might lead to biased results in assays where good sample representation is important, such as aCGH.

An alternative to LCM is fluorescence-activated cell sorting (FACS). It is a high-throughput process that enables the automated collection of thousands of cells with high purity. In 2006, a protocol for cytometric analysis and FACS of HRSC from fresh or frozen tissue was developed [21], and it was recently applied in the first whole-exome sequencing study of primary cHL [20]. However, until now no protocol has been developed for HRSC enrichment from formalin-fixed and paraffin-embedded (FFPE) tissues.

Methods for dissociating FFPE tissue into single-cell nuclei suspensions were established several decades ago and were then primarily used for DNA-content analysis of tumor cells [22]. Methods for specific labeling of some protein markers in archival tissues have also been developed [23,24,25,26]. However, broader application of this technique, especially multi-color analysis, was hampered by the limited choice of specific antibodies suitable for formalin-fixed antigens, poor antigen preservation after enzymatic tissue digestion and high autofluorescence of fixed cells, which makes marker-specific fluorescence signals difficult to detect. This meant that individual protocols with optimized procedures for nuclei extraction, antigen retrieval, marker labeling, and signal detection had to be developed for each individual application.

Here, we describe a novel FACS-based technique to enrich HRSC nuclei from FFPE tissues and apply it to perform comprehensive genetic characterization of ten archival cHL clinical samples.

Materials and methods

Patient samples

The clinical samples of ten primary cHL (all CD30+, CD15+, PAX5dim+, CD20−, MUM1+) were retrieved in the form of FFPE tissue blocks from the archive of the Institute of Pathology at the University Hospital Basel, Switzerland, based on the following main criteria: (1) good fixation and preservation of antigenicity for CD30 and MUM1 at the time of processing, and (2) lymphadenectomy specimens with sufficient tissue and tumor cell content of >0.5% (for exact case selection strategy see Supplementary Table S1). Retrieval of tissue and data was performed according to the regulations of the local institutional review boards and data safety laws. The study was approved by the ethics committee of Central and North-Western Switzerland.

Immunohistochemistry

Immunohistochemistry (IHC) on whole tissue sections was performed using the automated immunostainer Benchmark XT (Ventana/Roche, USA) with a biotin-streptavidin-peroxidase detection system according to the manufacturer’s protocols. For antigen retrieval, tissue sections were immersed and microwaved in Cell Conditioning Solution 1 (CC1, Ventana/Roche). Upon nuclei extraction, IHC stainings were performed manually using biotinylated secondary antibodies and a streptavidin-peroxidase detection system. Antibodies used for IHC stainings are listed in Supplementary Table S2.

Nuclei extraction, staining and sorting

We use the term “nuclei extraction” throughout this paper to designate the process of mechanical and enzymatic tissue dissociation, which results in a suspension that contains un-nucleated debris of membranes and cytoplasms, intact cell nuclei with remnants of attached cytoplasms as well as intact cell nuclei without cytoplasms. For simplicity, the latter two fractions are called “nuclei” in the text. Nuclei extraction from FFPE tissue was performed as described previously, with modifications [27]. All steps were carried out at ambient temperature unless indicated otherwise. In all, 55 µm-thick sections were cut and deparaffinized with several xylene washes. Tissue was then rehydrated by serial washes of decreasing ethanol concentration [1 ml each, 100% (2×), 96%, 70%, 50%, 20%]. To remove remaining ethanol, the tissue was washed once with 1 ml of citrate buffer, pH 6.0. After the wash, 1 ml of citrate buffer was added again and samples were incubated for 30 min at 90 °C in order to break fixation-induced protein crosslinks and to facilitate subsequent enzymatic digestion. After cooling to room temperature for 10 min, citrate buffer was carefully removed and samples were washed twice with 1 ml PBS pH 7.4/0.5 mM CaCl2. Two types of enzymatic cocktails, here named C-C-H and C-D for brevity, based on publications by Holley et al. [28] and Corver et al. [29], respectively, were used for sample digestion. Enzymes for cocktail C-C-H (collagenase type 3, purified collagenase and hyaluronidase, all purchased from Worthington Biochemical Corp., NJ, USA) were dissolved in PBS pH 7.4/0.5 mM CaCl2 buffer at 1000 units/ml concentration and kept frozen at −20 °C until use. The working solution of cocktail C-C-H contained 10 units/ml of collagenase type 3, 80 units/ml of purified collagenase, and 100 units/ml of hyaluronidase in PBS pH 7.4/0.5 mM CaCl2 buffer. Enzymes for cocktail C-D (collagenase Ia from Sigma-Aldrich, Steinheim, Germany, and dispase from Life Technologies, Austin, TX, USA) were dissolved in standard RPMI 1640 cell culture medium at 20 mg/ml concentration and kept frozen at −20 °C until use. The working solution of cocktail C-D was diluted in PBS pH 7.4/0.5 mM CaCl2 buffer and contained 2 mg/ml of collagenase Ia and 2 mg/ml of dispase.
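The stock and working concentrations given above are related by the usual C1·V1 = C2·V2 dilution rule. The following is only a minimal sketch of that arithmetic: the function names, the 1 ml final volume, and the assumption that each enzyme is kept as a separate stock at the stated concentration are illustrative choices, not part of the published protocol.

```python
def stock_volume_ul(stock_conc, working_conc, final_volume_ul):
    """Volume of stock (in µl) needed to reach working_conc in final_volume_ul.

    stock_conc and working_conc must share one unit (units/ml or mg/ml).
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if working_conc > stock_conc:
        raise ValueError("working concentration exceeds stock concentration")
    return working_conc * final_volume_ul / stock_conc


def buffer_volume_ul(stock_volumes, final_volume_ul):
    """Buffer volume (in µl) that tops the mix up to final_volume_ul."""
    return final_volume_ul - sum(stock_volumes.values())


# Concentrations are taken from the text; the 1000 µl batch size is assumed.
FINAL_UL = 1000.0

# Cocktail C-C-H: stocks at 1000 units/ml each (assumed separate stocks).
cch = {
    "collagenase type 3": stock_volume_ul(1000, 10, FINAL_UL),
    "purified collagenase": stock_volume_ul(1000, 80, FINAL_UL),
    "hyaluronidase": stock_volume_ul(1000, 100, FINAL_UL),
}

# Cocktail C-D: stocks at 20 mg/ml each (assumed separate stocks).
cd = {
    "collagenase Ia": stock_volume_ul(20, 2, FINAL_UL),
    "dispase": stock_volume_ul(20, 2, FINAL_UL),
}

if __name__ == "__main__":
    for name, mix in (("C-C-H", cch), ("C-D", cd)):
        print(name)
        for enzyme, volume in mix.items():
            print(f"  {enzyme}: {volume:.1f} µl stock")
        print(f"  buffer: {buffer_volume_ul(mix, FINAL_UL):.1f} µl")
```

Scaling the batch only requires changing `FINAL_UL`; the stock-to-working ratios stay the same.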

Using cocktail C-C-H, up to three 55 µm-thick tissue sections were digested in one 2 ml microtube in a thermal block set at 37 °C and 1000 rpm agitation for at least 12 h or until complete digestion. To facilitate tissue dissociation by mechanical force, one 4 mm diameter glass bead was added to the microtube. Cocktail C-D was used for fast tissue dissociation. Samples were digested for 0.5–2 h, as appropriate for the individual sample, until full dissociation. In case of prolonged digestion time (>30 min), fractions of separated nuclei were collected at 30 min intervals to avoid over-digestion and the loss of cell membranes and antigenicity. Fractionated nuclei were kept on ice until full dissociation of the remaining sample was completed. After fraction collection, 1 ml of fresh enzyme mix was added to the remaining undigested tissue.

Samples dissociated with either cocktail C-C-H or C-D performed equally in subsequent antibody staining, cytometry analysis, and enrichment steps; therefore, both methods can be used. Cocktail C-D was preferred if fast sample preparation was desirable; however, it is more labor-intensive because samples have to be inspected regularly to avoid over-digestion.

Following full dissociation, nuclear suspension was passed 10–20 times through a 25-G needle and filtered through 30 µm nylon mesh. Then samples were centrifuged for 5 min at 1500 × g, supernatant containing enzymes was discarded and pellets were re-suspended in 500 µl of PBS pH 7.4/10% fetal bovine serum (FBS) and placed for 30 min at room temperature (RT) to block unspecific antibody binding sites. Samples were again centrifuged for 5 min at 1500 × g and the pellets were re-suspended in 500 µl of PBS pH 7.4/5 mM EDTA/0.5% FBS.

For immunocytochemical staining, nuclei were split into staining tubes and appropriate control tubes. Washing buffer (PBS pH 7.4/5 mM EDTA/0.5% FBS) was used for antibody dilution and intermediate washes. Unlabeled mouse monoclonal antibody against CD30 (clone Ber-H2, Dako, Glostrup, Denmark) was diluted 1:500 (0.38 µg/ml) and incubated with digested nuclei for at least 10 h at +4 °C with mild occasional agitation. Following incubation, the sample was washed twice and an Alexa647-conjugated secondary anti-mouse IgG antibody F(ab) fragment (Invitrogen, Eugene, OR, USA) diluted 1:2000 was added. After 30 min incubation in the dark, the sample was washed twice and an unlabeled rabbit monoclonal antibody against MUM1 (clone EPR5653, Abcam, UK) diluted 1:200 (0.015 µg/ml) was added and incubated for at least 10 h at +4 °C with mild occasional agitation. After incubation, samples were washed twice and subsequently labeled for 30 min with biotinylated secondary anti-rabbit IgG antibody (3.75 µg/ml, Vector Laboratories, CA, USA) and streptavidin-conjugated phycoerythrin (PE) (0.5 µg/ml, Jackson ImmunoResearch Laboratories, USA). Finally, the samples were suspended in 2–3 ml of PBS/5 mM EDTA/0.5% FBS supplemented with 10 µg/ml of 4′,6-diamidino-2-phenylindole (DAPI) to enable simple discrimination between debris and intact nuclei. The samples were transferred to 5 ml polypropylene round-bottom tubes and kept on ice. They could either be used immediately or stored in the dark at +4 °C for up to 4 weeks without any noticeable decrease of nuclei integrity or fluorescent signals.

Sample analysis and sorting were performed with either BD Influx or Aria III instruments with standard configurations (Becton-Dickinson, San Jose, CA, USA). Nozzles of 85 µm or 100 µm diameter were used with appropriate sheath fluid pressure and drop formation parameters. A flow rate of 3000–6000 events/second was used and sorting was performed in high purity mode, typically collecting ~5000–50,000 positively gated events per sample, depending on sample cellularity and size. For each sample, at least 500,000 non-malignant tumor-infiltrating cell nuclei [gated as MUM1- and CD30-negative, 2 N DNA content, low forward scatter (FSC) and side scatter (SSC)] were sorted. DNA extracted from these populations was used as a germline control in sequencing analysis. Aliquots of all sorted nuclei populations were mounted on microscopy slides and stained with hematoxylin and eosin (H&E) for morphological evaluation and purity estimation.

DNA extraction and whole-genome amplification

Sorted nuclei were centrifuged for 10 min at 5000 × g and the supernatant was discarded. Due to the small amount of available nuclei, direct lysis without fine purification was applied for genomic DNA (gDNA) isolation from HRSC populations. Sorted HRSC nuclei were suspended in 15 µl of lysis buffer containing 10 mM Tris-HCl, 50 mM KCl, 0.08% Tween 20 and 0.001% of ATL buffer from the QIAamp® Micro Kit (Qiagen, Hilden, Germany). Then 2 µl of proteinase K was added and samples were incubated for 14 h at 56 °C. Proteinase K was then inactivated by a 5 min incubation at 95 °C. The concentration of gDNA was measured with the Qubit dsDNA HS Assay Kit (Thermo Fisher, Eugene, OR, USA). Typical DNA recovery after direct proteinase K digestion was 30–50% (calculated assuming a theoretical 6.6 pg of DNA per diploid genome). gDNA from non-malignant control populations was extracted with the Maxwell®16 FFPE Plus LEV DNA Purification Kit (Promega, Madison, WI, USA) following the manufacturer’s recommendations.
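The recovery figure quoted above can be turned into a quick planning estimate of how much gDNA a given sort is likely to yield. The sketch below assumes the 6.6 pg per diploid genome stated in the text and the reported 30–50% recovery; the function names are hypothetical, and because HRSC are frequently polyploid, the diploid assumption probably understates the theoretical DNA content per HRSC nucleus.

```python
import math

PG_PER_DIPLOID_GENOME = 6.6       # theoretical value stated in the text
RECOVERY_RANGE = (0.30, 0.50)     # typical recovery reported in the text


def theoretical_dna_ng(n_nuclei, pg_per_genome=PG_PER_DIPLOID_GENOME):
    """Theoretical DNA content (ng) of n_nuclei diploid nuclei."""
    return n_nuclei * pg_per_genome / 1000.0


def expected_recovery_ng(n_nuclei, recovery=RECOVERY_RANGE):
    """(low, high) expected gDNA yield in ng for the given recovery range."""
    total = theoretical_dna_ng(n_nuclei)
    return total * recovery[0], total * recovery[1]


def observed_recovery(measured_ng, n_nuclei):
    """Fraction of the theoretical DNA amount that was actually measured."""
    return measured_ng / theoretical_dna_ng(n_nuclei)


def nuclei_needed(target_ng, recovery_fraction):
    """Minimum number of diploid nuclei to obtain target_ng at a given recovery."""
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in (0, 1]")
    per_nucleus_ng = PG_PER_DIPLOID_GENOME / 1000.0 * recovery_fraction
    return math.ceil(target_ng / per_nucleus_ng)


if __name__ == "__main__":
    low, high = expected_recovery_ng(5000)
    print(f"5000 nuclei: {low:.1f} to {high:.1f} ng expected")
    print("nuclei for 10 ng at 30% recovery:", nuclei_needed(10, 0.30))
```

Under these assumptions, a sort at the lower end of the reported yield range (about 5000 events) lands close to the 10 ng used as the minimum input for whole-genome amplification.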

At least 10 ng of gDNA per case was used for whole-genome amplification (WGA) with the GenomePlex WGA2 kit (Sigma Aldrich) according to protocol provided by the manufacturer with minor modifications: the first fragmentation step was skipped and the number of PCR cycles was increased from 14 to 20. Also, 10 ng of commercially available germline female DNA (Promega, Southampton, UK) was amplified exactly according to the manufacturer’s instructions and used as reference for array-CGH. Amplified DNA was purified with GeneJet PCR purification kit (Thermo Fisher, Vilnius, Lithuania) and quantified with Qubit.

Array-comparative genomic hybridization

Following whole-genome amplification, the DNA from FFPE samples was within the 100–300 bp size range. The average size of the reference WGA DNA was ~5000 bp; therefore, it was heat-fragmented for 35 min at 95 °C to achieve a size distribution similar to that of the amplified DNA extracted from the HRSC. aCGH hybridizations with Agilent SurePrint G3 CGH 180k arrays (Agilent Technologies, Santa Clara, CA, USA) were performed with 500–1000 ng of whole-genome amplified sample and reference DNA, according to the protocol provided by the manufacturer and as previously described [30]. After slide scanning, data were assessed with a series of QC metrics and subsequently analyzed using Agilent Cytogenomics software with the aberration detection algorithm ADM2 [31]. Aberrant genomic intervals were exported and processed with a custom workflow programmed in R.

Library preparation and sequencing

High-throughput sequencing (HTS) libraries were prepared in duplicate for each sorted tumor sample using the IonTorrent AmpliSeq (Life Technologies, Carlsbad, CA, USA) library preparation workflow according to the manufacturer’s instructions with adjustments. Sorted tumor cell-depleted nuclei populations were used for germline control library preparation. An in-house custom lymphoma panel assaying 68 genes and consisting of four primer pools was used as published previously [32]. For initial target enrichment, at least 2 ng of unamplified sample DNA was used per primer pool. The reaction volume was 5 µl and PCR amplification continued for 22 cycles. After the initial PCR, products from different pools were combined and processed according to standard protocols, which also included secondary re-amplification of barcoded fragments. After purification, library quality was checked with the Bioanalyzer High Sensitivity assay (Agilent Technologies); libraries were then quantified by qPCR using the IonTorrent library quantification kit and loaded onto Ion530 chips with the automated IonChef instrument. Sequencing was performed with the IonTorrent S5XL sequencer, achieving average 1000× and 250× coverage for tumor and germline samples, respectively.

Variant identification in each sequenced sample was performed with the Ion Torrent Variant Caller with default low-stringency parameters. At initial filtering, only non-synonymous exonic and splice site variants with Phred quality score >50, strand bias <0.75, and variant-supporting reads >5 were retained. Germline variants were removed in each sample by subtracting variants identified in the sorted matched non-tumoral population. Further, to prevent false identification of variants due to comparably low gDNA input for library preparation and fixation artifacts, all mutations that were detected in only one of the duplicate tumor libraries but not in the other were removed. Positions of all remaining mutations were manually evaluated in aligned BAM files using the IGV viewer to exclude sequencing artifacts due to homopolymer sequences or mispriming. Mutation sites were also investigated in the BAM files of matched non-tumoral control samples to confirm absence of variant reads in the germline.
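The automated part of this filtering cascade can be sketched as follows. This is an illustration under stated assumptions, not the authors’ pipeline: the `Variant` record, the effect labels in `KEPT_EFFECTS`, and the function names are hypothetical; the thresholds are those given in the text; and the concordance step assumes that the two tumor call sets come from the duplicate libraries of one sorted sample. The manual IGV review is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str          # hypothetical annotation label, e.g. "missense"
    qual: float          # Phred-scaled quality score
    strand_bias: float
    alt_reads: int       # number of variant-supporting reads

    @property
    def key(self):
        """Identity of the variant, independent of per-library metrics."""
        return (self.chrom, self.pos, self.ref, self.alt)


# Assumed labels for "non-synonymous exonic and splice site" variants.
KEPT_EFFECTS = {"missense", "nonsense", "frameshift", "inframe_indel", "splice_site"}


def passes_initial_filter(v):
    """Thresholds from the text: quality >50, strand bias <0.75, reads >5."""
    return (
        v.effect in KEPT_EFFECTS
        and v.qual > 50
        and v.strand_bias < 0.75
        and v.alt_reads > 5
    )


def somatic_candidates(replicate_1, replicate_2, germline):
    """Variants passing filters in both duplicates and absent from germline."""
    keys_1 = {v.key for v in replicate_1 if passes_initial_filter(v)}
    keys_2 = {v.key for v in replicate_2 if passes_initial_filter(v)}
    germline_keys = {v.key for v in germline}
    return sorted((keys_1 & keys_2) - germline_keys)
```

The surviving keys would then go to manual inspection in the aligned reads, as described above.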

Results

Clinical characteristics

Primary cHL of ten patients (four female, six male) were processed. Median age at diagnosis was 50 years ranging from 12 to 76 years and represented the typical age distribution for cHL. By morphology most tumors were classified as belonging to the nodular sclerosis subtype (6) followed by the mixed cellularity (3) and the lymphocyte depleted (1) subtype. EBV infection was detected in four cases. Cases’ characteristics are summarized in Table 1.

Table 1 Clinical characteristics

Nuclei extraction and labeling

At the beginning, we focused on identifying working antibody–antigen combinations for proteins specifically expressed by the HRSC or the surrounding microenvironment that would be useful for the enrichment of the cells of interest. A pair of nuclear transcription factors, MUM1 (IRF4) and PAX5, was chosen because of the availability of robust monoclonal antibodies designed for FFPE tissues and a favorable expression profile in cHL. MUM1 is a marker of plasmacytic differentiation [33] and is strongly expressed by HRSC and non-neoplastic plasma cells in cHL. PAX5 is a B-cell lineage marker [34] expressed by mature B-cells and also present in HRSC. In addition to nuclear markers, two cytoplasmic markers, CD30 and CD138, were tested. CD30 is expressed specifically by HRSC [35] in cHL, whereas CD138 is expressed by plasma cells [36] but is negative in almost all cHL [37], and was thus intended to help separate plasma cell nuclei from HRSC nuclei in MUM1-positive populations.

After optimization of the antigen retrieval, tissue dissociation, and antibody incubation conditions, successful reactivity was achieved for three markers: MUM1, PAX5 and CD30 (Fig. 1). We found that prolonged incubation with primary antibodies (at least 10 h at +4 °C) was important for antigen recognition, because shorter incubation times resulted in weaker staining or complete absence of it (data not shown). Of note was the successful antibody recognition of the membrano-cytoplasmic marker CD30. This demonstrates that while the majority of other cell types lose their cytoplasm after tissue dissociation, HRSC keep membranous and cytoplasmic fragments with retained antigenicity. Antibody labeling against CD138 was not successful despite the fact that most of the plasma cell nuclei retained small membranous and cytoplasmic remnants. This is most probably due to the sensitivity of the CD138 epitope to the proteolytic treatments applied. PAX5 was found to be sensitive to extensive digestion with collagenase type 3. Incubation with this enzyme totally abolished recognition of PAX5 epitopes by the anti-PAX5 antibody (clone EPR3730(2)), while the exclusion of this enzyme from the enzymatic cocktail C-C-H resulted in reliable and specific antigen recognition.

Fig. 1

Labeling of HRSC for nuclear and membranous/cytoplasmic markers. Representative cHL case showing characteristic positivity of the large HRSC for the membrano-cytoplasmic marker CD30 and the nuclear markers MUM1 and PAX5 (left). After enzymatic tissue dissociation cell nuclei with membranous and cytoplasmic remnants retain the targeted epitopes and can be specifically labeled with monoclonal antibodies (right)

Even if a specific epitope-paratope interaction is achieved, this does not guarantee successful translation into a recognizable signal in flow cytometry. The problem is that formalin fixation changes the constituent molecules of the tissue in a way that significantly increases autofluorescence [38]. We compared FFPE and fresh frozen tissue autofluorescence in multiple channels of the BD Influx instrument (Fig. 2). A significant increase of autofluorescence was observable throughout the entire emission spectrum used for signal detection. This means that, compared to unfixed tissue, a stronger specific signal has to be generated to overcome sample autofluorescence. Possible strategies to achieve this include: (1) signal amplification, (2) choice of the brightest fluorophores emitting at non-overlapping spectral ranges that are the least affected by autofluorescence [39], (3) autofluorescence subtraction [40, 41], and (4) treatment of samples with chemicals aiming to diminish autofluorescence [42].

Fig. 2

FFPE tissue-derived nuclei have higher autofluorescence throughout the detection spectrum of a flow cytometer. Fresh frozen and FFPE nuclei derived from the same biopsy were analyzed. Detector voltage and gain settings were kept stable during acquisition. A significant increase of autofluorescence (shift of the curve peaks to the right) is observable for FFPE material in all the measured detection channels of the BD Influx instrument

We chose the first two approaches to deal with this problem. To overcome high autofluorescence in the yellow channel (580/30 nm range), we used one of the brightest fluorophores, phycoerythrin (PE). Moreover, two-level signal amplification was employed by the use of a biotinylated secondary antibody and streptavidin-conjugated PE. The use of this second level of amplification was crucial for MUM1 labeling, as a PE-conjugated secondary antibody did not result in detectable signals (data not shown). To detect a second marker, we chose a detector channel in the far-red spectrum (670/30 nm), which had the lowest autofluorescence increase, and the relatively bright and stable Alexa647 fluorophore.

This set-up gave satisfactory results and allowed separation of positive from negative cell nuclei using MUM1 and CD30 as markers. Detection of PAX5 expression by HRSC nuclei, however, was problematic irrespective of the detection set-up used; on cytometry analysis, the separation between weakly positive and negative events was poor. Because normal B-cells with high levels of PAX5 expression could be consistently and clearly separated, we attributed this to the known weak PAX5 expression in HRSC nuclei.

Gating strategy for isolation of HRSC cells

Following successful labeling, we analyzed each case by flow cytometry. A gating approach detailed in Fig. 3a was applied to select and sort HRSC. Depending on the level of antigen expression and preservation, variable degrees of separation of positive and negative populations by the specific signal were achievable. Most often, specifically stained nuclei partially overlapped with negative, highly autofluorescent events, making the distinction difficult. For gating, each fluorescently labeled marker was plotted against autofluorescence in the green channel (530/40 nm). Gate positions were determined by fluorescence intensity in isotype controls and were additionally guided by the appearance of an unstained sample. In addition to protein markers, we used large nucleus size (reflected by increased forward and side scatter parameters in flow cytometry) to refine the gating and to select HRSC nuclei, which are consistently larger than the dominant nuclei population. This typically improved sort purity by 10–15%. In contrast, we found that cell ploidy as determined by staining with DAPI, although expected to be >2 N due to the frequent polyploidy in HRSC, was not a useful parameter to increase purity of the sorts. This observation was based on experiments in which MUM1+/CD30+ populations were further gated according to 2 N or >2 N DNA content. When sorted and evaluated by microscopy, both contained approximately similar fractions of HRSC nuclei. Finally, applying our gating strategy, MUM1- and CD30-based sorting resulted in an average 37× (range 7×–85×) enrichment of HRSC as compared to the initial purity estimation from the tissue slide (Table 2 and Fig. 3b). Sorted non-malignant infiltrating cell nuclei populations, gated as negative for CD30 and MUM1, 2 N, and SSC and FSC low, were free of HRSC nuclei and were therefore used as germline controls in sequencing experiments.
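The sequential logic of this gating strategy, and the fold-enrichment figure derived from it, can be sketched in a few lines. This is a deliberately simplified illustration: real gates are two-dimensional regions drawn against green-channel autofluorescence and include doublet discrimination, whereas the sketch uses one-dimensional thresholds. The `Event` and `Thresholds` records, all function names, and any numbers passed to them are hypothetical and do not come from the study data.

```python
from dataclasses import dataclass


@dataclass
class Event:
    dapi: float   # DNA stain signal, separates intact nuclei from debris
    mum1: float   # PE channel signal
    cd30: float   # Alexa647 channel signal
    fsc: float    # forward scatter
    ssc: float    # side scatter


@dataclass
class Thresholds:
    dapi_min: float
    mum1_min: float
    cd30_min: float
    fsc_min: float
    ssc_min: float


def gate_hrsc(events, t):
    """Apply the gates in the order described: DAPI, MUM1, CD30, scatter."""
    intact = [e for e in events if e.dapi >= t.dapi_min]
    mum1_pos = [e for e in intact if e.mum1 >= t.mum1_min]
    double_pos = [e for e in mum1_pos if e.cd30 >= t.cd30_min]
    return [e for e in double_pos if e.fsc >= t.fsc_min and e.ssc >= t.ssc_min]


def gate_germline_control(events, t):
    """Intact nuclei negative for both markers with low scatter.

    The 2 N DNA-content restriction used in the study is omitted here.
    """
    return [
        e for e in events
        if e.dapi >= t.dapi_min
        and e.mum1 < t.mum1_min
        and e.cd30 < t.cd30_min
        and e.fsc < t.fsc_min
        and e.ssc < t.ssc_min
    ]


def fold_enrichment(sorted_purity_pct, initial_purity_pct):
    """Ratio of HRSC purity after sorting to purity in the tissue section."""
    if initial_purity_pct <= 0:
        raise ValueError("initial purity must be positive")
    return sorted_purity_pct / initial_purity_pct
```

For example, a hypothetical case with 2% HRSC in the tissue section and 60% HRSC nuclei in the sorted fraction would correspond to a 30-fold enrichment by this definition.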

Fig. 3

Gating strategy to select and sort HRSC nuclei. a At the first step, DAPI staining was used to discriminate intact nuclei from cell debris (1). DAPI signals were also used in pulse geometry gating to select single cells and to discriminate doublets (2). Then MUM1-positive events were selected and subsequently gated according to positivity for CD30 (3, 4). MUM1- and CD30-positive events mainly consisted of HRSC nuclei, but to further refine the selection, an additional gate selecting events that have high SSC and FSC was applied, utilizing the physical properties of HRSC nuclei (large size and high complexity). The overlay plot shows that CD30+/MUM1+ events have higher SSC and FSC as compared to cells that are only positive for MUM1 (5). b H&E and immunohistochemical stainings for MUM1, CD30, PAX5, and CD15 of sorted populations were used to evaluate the HRSC nuclei enrichment efficacy by conventional light microscopy. HRSC nuclei were primarily identified by their gigantic size and characteristic pale chromatin. Remnants of the cytoplasm can be seen on some nuclei. Black arrows show nuclei of non-neoplastic lymphocytes representing unwanted contamination of the enrichment. Small non-neoplastic cell nuclei were negative for the investigated markers, confirming the specificity of the stainings

Table 2 HRSC enrichment in processed cHL cases

Genetic analysis identifies recurrently altered cellular processes in cHL

In total, 43 mutations affecting 15 genes were identified by targeted HTS (Fig. 4a and Supplementary Table S3). A cHL case had on average four mutations (range 1–9) and 19 (range 7–47) copy number aberrations. The median variant allelic frequency (VAF) of detected somatic mutations was 33.8% (IQR 30.1%), showing that the achieved degree of enrichment was more than sufficient for reliable variant identification. In our cohort, the most frequently mutated gene was SOCS1, a negative regulator of the JAK-STAT signaling pathway, which was affected in all but one case (9/10, 90%). This gene was not targeted by chromosomal copy number alterations, but in all cases mutations were functionally disruptive, confirming their somatic origin and the status of SOCS1 as a tumor suppressor. Interestingly, in two cases SOCS1 alterations coincided with STAT6 missense mutations, which localized to the DNA-binding domain of the protein. This shows that mutations affecting the same pathway are not necessarily mutually exclusive and can potentially complement each other’s activity. In agreement with this, aCGH detected frequent 9p gains affecting JAK2 on 9p24.1 (6/10 cases), which were concomitant with SOCS1 and STAT6 mutations (Fig. 4b). A JAK2 gain was also detected in case 4, which was devoid of SOCS1 and STAT6 mutations; thus, the JAK-STAT signaling pathway was deregulated in all investigated cHL tumors. In agreement with the genetic data, IHC stainings of phosphorylated STATs, particularly of STAT5 and STAT6, substantiated JAK-STAT pathway activation in all studied cases (Table 3 and Fig. 5). The 9p gains also affected the genes encoding PD-L1 and PD-L2, ligands for the T-cell PD-1 receptor. Overexpression of PD-L1 helps HRSC to suppress the immune response and avoid destruction by cytotoxic T-cells. Immune evasion in cHL is also exemplified by B2M mutations found in two investigated cHL cases. Notably, B2M and PD-L1 defects were mutually exclusive.

Fig. 4

Genetic aberrations in cHL cases. a Somatic mutation profile of each case as detected by targeted HTS. Most somatic mutations occur with allelic frequency >10%, showing sufficient malignant cell enrichment for reliable clonal and subclonal mutation identification. b Summary of copy number aberrations detected by aCGH. Known recurrently affected regions were identified, such as gains of 2p and 9p, and losses of 6q23.3. Somatic mutations detected by HTS are overlaid and color-coded according to type to provide a comprehensive picture of genetic alterations in cHL

Table 3 Immunohistochemical staining of phosphorylated STATs evaluated as percentage of positive HRSC

Fig. 5

Immunohistochemical stainings for phosphorylated STATs. Strong specific staining of giant HRSC nuclei is visible, indicating activity of this signaling pathway and functionally confirming the results of the genetic characterization. See Table 3 for more detailed information on the expression of phosphorylated STATs in the entire study cohort

Other frequently mutated genes included TNFAIP3 (4/10 cases) and TP53 (3/10 cases). TNFAIP3 is a tumor suppressor that acts as a negative regulator of NF-κB signaling. In all four mutant cases, the frameshift or missense mutation of TNFAIP3 was accompanied by a heterozygous deletion of 6q23.3, suggesting biallelic inactivation (Fig. 4b). Additionally, six cases had gains of the entire short arm of chromosome 2, which, among many other genes, contains the REL proto-oncogene encoding a subunit of the NF-κB complex. Finally, in three cases CARD11, another member of the NF-κB pathway, was disrupted by deletions and a damaging missense mutation.

Two out of three TP53-mutated cases had the highest overall number of mutations, suggesting that p53 function deficiency may have contributed to increased genetic instability and accumulation of genetic defects in these cHL. Most nucleotide alterations were detected in case 7, which also contained a mutation in the ATM gene, which is likewise involved in the maintenance of genome stability.

Discussion

Tumor cell scarcity in cHL has posed a great challenge for understanding its molecular pathogenesis. Most genetic investigations to date have relied on cultivated cHL cell lines, rare primary clinical samples with higher malignant HRSC content, or laborious enrichment by LCM. Despite these challenges, the main cellular pathways and genetic mechanisms of cHL pathogenesis have been uncovered. These include frequent genetic defects in components of the JAK-STAT and NF-κB signaling pathways, activation of immune escape mechanisms, as well as a change of the cell transcriptional program following EBV infection [43]. In spite of this crucial basic knowledge, many questions remain unanswered. For example, it is unknown how genetic factors influence heterogeneous disease outcomes. In addition, the prognostic and/or predictive value of known genetic aberrations is currently unclear. Moreover, genetic characterization of cHL relapses (modes of tumor evolution, degree of intratumoral genetic heterogeneity and genetic events leading to recurrence) is also lacking. To gain a deeper understanding, larger, well-designed studies involving serial samples as well as samples representing the entire spectrum of the disease are necessary. Such studies require large patient cohorts. However, such cohorts are difficult to acquire due to low disease frequency and often limited biopsies, which are justifiably processed to suit routine diagnostic procedures. Therefore, the possibility of using FFPE material for such studies can potentially greatly expand the spectrum of samples available for cHL research.

Despite recent advances in sequencing technology and bioinformatics tools, such as single cell genomics and circulating tumor cell as well as cell-free DNA interrogation [44, 45], the tumor cell fraction of a sample remains a crucial parameter with a strong influence on the success of the analysis. It is widely accepted that a tumor cell fraction of ~70% is preferable for reliable copy number estimation by aCGH. This threshold is much lower for mutational profiling by HTS, which is generally sensitive enough to detect variants present in 1-5% of the cells in a sample. However, the highest possible tumor purity is still preferred, as it simplifies data analysis, enables identification of subclonal mutations, and helps to filter out false-positive mutation calls caused by sequencing errors, which typically occur at low allelic frequency.
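The effect of tumor purity on mutation calling can be illustrated with a short calculation. The following is a minimal sketch, not part of our analysis pipeline: it assumes a clonal mutation and, by default, a heterozygous mutation in a diploid tumor genome admixed with diploid normal cells. The function names and parameters are hypothetical and chosen for illustration only; HRSC frequently deviate from diploidy, so real samples require the actual copy number at the locus.

```python
def expected_vaf(purity, mutant_copies=1, tumor_copy_number=2, normal_copy_number=2):
    """Expected variant allele frequency (VAF) of a clonal somatic mutation.

    purity             -- fraction of tumor cells in the sample (0..1)
    mutant_copies      -- mutated allele copies per tumor cell (assumption: 1)
    tumor_copy_number  -- total copies of the locus per tumor cell (assumption: 2)
    normal_copy_number -- total copies of the locus per normal cell (assumption: 2)
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    mutant_alleles = purity * mutant_copies
    total_alleles = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    return mutant_alleles / total_alleles


def expected_alt_reads(purity, depth, **locus):
    """Expected number of variant-supporting reads at a given sequencing depth."""
    return expected_vaf(purity, **locus) * depth


if __name__ == "__main__":
    # Compare unenriched tissue (2% and 5% tumor cells) with an enriched fraction (65%).
    for purity in (0.02, 0.05, 0.65):
        vaf = expected_vaf(purity)
        reads = expected_alt_reads(purity, depth=1000)
        print(f"purity={purity:.2f}  expected VAF={vaf:.3f}  alt reads at 1000x={reads:.0f}")
```

Under these idealized assumptions, a heterozygous clonal mutation is expected at a VAF of about 0.01 in a sample with 2% tumor cells, which lies in the range where sequencing errors accumulate, and at about 0.33 in a fraction enriched to 65%.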

Our method for HRSC enrichment from FFPE tissue is based on previous developments enabling fixed tissue dissociation and interrogation by flow cytometry [28, 29, 46]. Several authors have used vimentin and keratin staining of FFPE tissue-derived cells to separate adenocarcinoma from the surrounding stroma [29, 47, 48]. A major advantage of this approach is that a highly abundant structural protein is used as a marker for enrichment, which greatly improves signal intensity and separation of positive and negative populations. However, we and others have previously shown that specific labeling and sorting of FFPE tissue-derived nuclei based on less abundant nuclear transcription factors and membranous molecules is also feasible [27, 46, 49]. The major obstacles to overcome are preservation of tissue antigenicity and development of a bright and specific signal that enables recognition of marker expression in cytometry analysis. Regarding the first issue, the mildest possible enzymatic and antigen retrieval treatments are generally preferable. Highly efficient but unspecific enzymes such as pepsin or trypsin have been used successfully for tissue dissociation prior to cytometric ploidy analysis, but they preclude specific protein labeling because they damage tissue antigens and increase the amount of cell debris [22]. We show here that two enzymatic cocktails, one consisting of dispase and collagenase, the other consisting of two different collagenases and a hyaluronidase, dissociate tissue efficiently and preserve antigenicity, which is crucial for every subsequent step of HRSC isolation. In addition, careful selection of primary antibodies that are specifically designed and tested on FFPE tissues for diagnostic purposes increases the chances of high-quality labeling. In the particular case of HRSC enrichment, the combination of MUM1 and CD30 proved to be successful.

Regarding the second issue, the known problem of high autofluorescence in fixed tissue can be tackled by a rational selection of fluorophores and signal amplification techniques. In our experience, one or two levels of amplification using labeled secondary antibodies or streptavidin-conjugated fluorophores, together with avoidance of spectral overlap, proved to be successful. We suggest readouts in the far-red spectrum, as autofluorescence is lowest in that range.

Our enrichment method has several advantages over the LCM technique, which has been used extensively in previous genetic studies of cHL. First, although initial sample preparation is more laborious, once completed it allows high-throughput processing of tissue samples and collection of large numbers (i.e., tens of thousands instead of hundreds) of HRSC nuclei. In this way, the tumor cell complexity of the sample is more adequately represented and can be captured by the subsequent genetic analyses. Second, it allows investigation of the entire sample, because the tissue is homogenized and regional or phenotypic selection biases are therefore avoided. Third, in contrast to LCM, where a high-energy laser beam catapults the cells of interest, FACS does not inflict additional damage on the sample DNA. Therefore, higher quality nucleic acids can be extracted. In our experience with FFPE samples, this difference can be decisive, as the additional damage may preclude genetic analysis when LCM is used. Fourth, our method allows simultaneous collection of cytometric data about the sample, such as ploidy and scatter parameters, which can be useful in some contexts. Finally, our method allows simultaneous separation not only of HRSC, but also of non-tumoral cells, which can be used as germline controls.

Good fixation and preservation of the tissue and its antigens is an important prerequisite for successful enrichment. This represents the main weakness of our method, as some samples cannot be enriched to the desired degree because labeling of MUM1 and/or CD30 cannot be achieved due to loss of antigenicity. In addition, the enrichment to 60–70% typically achieved by our technique is inferior to the theoretical precision of LCM. Another limitation is the modest efficiency of the enrichment procedure. We do not recommend it for extremely small samples, such as small core needle biopsies, because part of the nuclei (up to 20%) is inevitably lost during sample processing due to extensive washing and changes of pipette tips and microtubes. Further, some of the HRSC nuclei do not retain cytoplasmic remnants and are negative for CD30, and the standard gating procedure therefore excludes them. To remedy this, alternative gating strategies, e.g., sorting of MUM1-positive/CD30-negative events with high FSC and SSC, can be used; these yield less pure but sufficiently enriched populations. Despite the above-mentioned limitations, our HRSC enrichment technique offers a very useful tool enabling genetic analysis by aCGH as well as by HTS in the vast majority of cHL samples, which were until now inaccessible.
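The logic of the standard and alternative gating strategies can be sketched as follows. This is an illustrative sketch only: the `Event` structure, the function names, and all threshold values are hypothetical placeholders in arbitrary units, not the gates used in this study. In practice, gates are drawn on the cytometer for each sample against appropriate controls.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One cytometry event (a nucleus); intensities in arbitrary units."""
    mum1: float  # MUM1 fluorescence
    cd30: float  # CD30 fluorescence
    fsc: float   # forward scatter
    ssc: float   # side scatter


# Hypothetical thresholds, for illustration only.
MUM1_MIN = 1000.0
CD30_MIN = 1000.0
FSC_MIN = 50000.0
SSC_MIN = 50000.0


def standard_gate(event):
    """Standard strategy: MUM1-positive and CD30-positive events."""
    return event.mum1 >= MUM1_MIN and event.cd30 >= CD30_MIN


def alternative_gate(event):
    """Alternative strategy: MUM1-positive/CD30-negative events with high FSC and SSC."""
    return (
        event.mum1 >= MUM1_MIN
        and event.cd30 < CD30_MIN
        and event.fsc >= FSC_MIN
        and event.ssc >= SSC_MIN
    )


def select_events(events, include_alternative=False):
    """Return events passing the standard gate, optionally adding the alternative gate."""
    return [
        e for e in events
        if standard_gate(e) or (include_alternative and alternative_gate(e))
    ]


if __name__ == "__main__":
    events = [
        Event(mum1=5000, cd30=4000, fsc=60000, ssc=70000),  # double positive
        Event(mum1=5000, cd30=100, fsc=80000, ssc=90000),   # MUM1+/CD30-, high scatter
        Event(mum1=5000, cd30=100, fsc=10000, ssc=10000),   # MUM1+/CD30-, low scatter
        Event(mum1=50, cd30=50, fsc=20000, ssc=15000),      # double negative
    ]
    print(len(select_events(events)))                            # standard gate only
    print(len(select_events(events, include_alternative=True)))  # both gates combined
```

The alternative gate recovers large MUM1-positive nuclei that lack CD30 signal, at the cost of lower purity, because it relies on scatter properties instead of a second specific marker.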

The somatic genetic alterations that we detected in our cohort of cHL, such as mutations and copy number aberrations affecting members of the JAK-STAT and NF-κB pathways and mechanisms of immune evasion through B2M mutations and PD-L1 locus amplifications, are in good agreement with previous genetic studies [8, 20, 43, 50,51,52]. In particular, the recent and, so far, most comprehensive whole-exome study of cHL matches our results by showing widespread JAK-STAT pathway mutations in most cHL cases, including STAT6 mutations that had previously been noted only anecdotally [51].

In this work, we present detailed instructions and a proof-of-principle of a novel technique for HRSC enrichment and genetic analysis from archival FFPE clinical cHL samples. This high-throughput method makes the vast resources of archived clinical cHL samples accessible for interrogation by modern genetic approaches. Partly realizing the promise of our enrichment technique, we are studying a cohort of clinically well-annotated primary and relapsed cHL cases, aiming to shed light on the prognostic importance of the HRSC mutational landscape. Our preliminary observations on these cases further support the feasibility of the methodology described here.