Introduction

Lymphoproliferative disorders are a large and heterogeneous group of hematological malignancies. Mature B-cell lymphoproliferative syndromes comprise 80% of all lymphomas1. Diffuse large B-cell lymphoma (DLBCL) and follicular lymphoma (FL) are the two most common types of lymphoma1. The diagnosis, prognosis assessment and efficacy evaluation mainly depend on tissue biopsy, laboratory data, and imaging tests. Classically, the biopsy assessment includes the immunohistochemical detection of markers and fluorescence in situ hybridization studies (MYC, BCL6, and BCL2 rearrangements)2,3,4,5,6. Continuous efforts are being made to identify genomic biomarkers to better understand the behavior of lymphomas, predict their evolution and bring solutions to clinical practice. Next-generation sequencing (NGS), which allows for massive, parallel, high-throughput DNA sequencing, has emerged over the past decade and has provided new insights into the genomic and transcriptomic characterization of mature B-cell malignancies8. NGS has become a useful tool for the complete characterization of the spectrum of genetic variants in non-Hodgkin’s lymphoma (NHL). Research on molecular profiles in NHL has advanced significantly in recent years. Various groups have attempted to establish prognostic scores3 and genetic risk clusters based on genetic characteristics4 or by combining the characteristics with clinical and analytical data5,6. The results of these studies are promising; however, the means to apply these technologies are still limited in most centers, and validation is required for implementation into clinical practice. Thus, while NGS lymphoid panels should be implemented in clinical practice, there is as yet no standard approach, and features such as gene selection, sequencing platform, read depth, and variant analysis can differ among laboratories.

Although tissue biopsy is the gold standard for identifying genetic variants, it might not reflect the entire molecular complexity of every patient with lymphoma7,8. Once the diagnosis of lymphoma is reached based on a tissue biopsy, a liquid biopsy can be applied to complement the tissue findings. Liquid biopsy, which is non-invasive, can also be used to explore the entire mutational landscape of the lymphoma, given that this approach has the potential to collect circulating tumor DNA (ctDNA) derived from most, and potentially all, tumor locations in the body. Liquid biopsy has thereby progressively transformed cancer diagnosis, prognosis, and therapy, both in oncology in general and in lymphoma in particular9. This technique is expected to lead to important improvements in initial risk stratification, response evaluation at the end of induction therapy, surveillance strategies, and targeted therapy selection in patients with lymphoma.

The aim of our study was to validate an NGS lymphoid panel for solid and liquid biopsy in the most common NHLs (DLBCL and FL) and to assess the concordance between genetic mutations detected in solid and liquid biopsies.

Patients and methods

Clinical cohort

The study included 47 nonconsecutive patients diagnosed with NHL, 32 with DLBCL [20 DLBCLs-not otherwise specified (NOS), 4 high-grade double-hit lymphomas, 5 high-grade NOS lymphomas and 3 primary mediastinal B-cell lymphomas] and 15 with FL, from 2014 to 2019 at Gregorio Marañón General University Hospital. The study also included formalin-fixed paraffin-embedded (FFPE) tissue samples from the time of diagnosis, along with matched same-day plasma samples from 26 of these patients (14 DLBCL-NOS, 4 high grade) (Supplementary Tables 1 and 2).

Ethics approval was obtained from the Research Ethics Committee of Gregorio Marañón General University Hospital. All patients provided written informed consent according to the principles of the Declaration of Helsinki.

DNA extraction

All FFPE sections (n = 47) were subjected to DNA extraction with the QIAGEN GeneRead DNA FFPE Kit (QIAGEN, Germany) according to the manufacturer’s guidelines. Peripheral blood samples were collected from 26 patients, placed in 10 mL EDTA tubes and centrifuged at 1800 × g for 10 min to isolate plasma, which was aliquoted into 1.5–2 mL tubes and stored at −80 °C. Cell-free DNA (cfDNA) was extracted with a QIAamp Circulating Nucleic Acid Kit (QIAGEN, Germany). FFPE DNA and cfDNA were quantified using a Qubit dsDNA BR Assay (THERMO FISHER SCIENTIFIC, Waltham, MA, USA).

NGS experiments and data analysis

We selected The Lymphoma Solution (SOPHIA GENETICS, Switzerland) targeted panel, given that it targets 54 relevant genes in lymphomagenesis (193 kb) (Supplementary Table 3). For each FFPE tissue sample, 32–100 ng of total DNA was used to prepare the library according to the manufacturer’s protocol. Pools of up to 12 purified libraries were captured. For each circulating tumour DNA (ctDNA) sample, 2.5–55 ng of circulating DNA was used to prepare the library. Due to the intrinsic characteristics of the ctDNA samples, adapter ligation was performed directly without initial DNA fragmentation, followed by hybridization with the capture probes, also in pools of up to 12 purified libraries. Lastly, two capture pools (24 samples) were sequenced on a NextSeq platform (ILLUMINA, US; Paired-end 2 × 151 bp; mid-output kit).

We used the Sophia DDM platform (SOPHIA GENETICS, Switzerland) to analyze single nucleotide variants and small insertions and deletions. FASTQ files were uploaded to the data portal and aligned with the human reference genome (GRCh37/hg19). After annotation in DDM, non-synonymous variants located in exonic or ± 1.2 intronic splice regions were retained, and variants with a minor allele frequency < 0.01 (based on the ExAC, gnomAD, and 1000 Genomes databases) were selected for the downstream analysis. There is currently no standardized cut-off for the variant allele frequency (VAF). We therefore set the cut-off at 5% for the FFPE samples, which have a high tumor content, in an attempt to avoid false positives. The cut-off was reduced to 1% for the plasma samples, which have a lower tumor fraction and in which a higher threshold could miss mutations.
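
The filtering criteria described above can be illustrated with a minimal sketch. It is not the Sophia DDM implementation: the `Variant` fields, the region labels, and the use of "greater than or equal to" for the plasma cut-off are assumptions made for illustration. Only the thresholds (5% for FFPE, 1% for plasma, minor allele frequency < 0.01) come from the text.

```python
from dataclasses import dataclass

# Thresholds stated in the text; VAF is expressed as a fraction (0-1).
VAF_CUTOFF = {"FFPE": 0.05, "plasma": 0.01}
MAX_POPULATION_MAF = 0.01


@dataclass
class Variant:
    """Hypothetical record for one annotated variant (field names are assumptions)."""
    gene: str
    vaf: float             # variant allele frequency as a fraction
    population_maf: float  # highest minor allele frequency across population databases
    synonymous: bool
    region: str            # hypothetical labels, e.g. "exonic", "splice", "intronic"


def passes_filters(variant: Variant, sample_type: str) -> bool:
    """Return True if the variant is kept for the downstream analysis."""
    if variant.synonymous:
        return False
    if variant.region not in ("exonic", "splice"):
        return False
    if variant.population_maf >= MAX_POPULATION_MAF:
        return False
    # Sample-type-specific VAF cut-off (assumed inclusive).
    return variant.vaf >= VAF_CUTOFF[sample_type]
```

Under these assumptions, a rare non-synonymous exonic variant with a VAF of 3% would be retained in a plasma sample but discarded in an FFPE sample.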

We used the Integrative Genomics Viewer (Broad Institute, USA) to visualize the variants aligned against the reference genome and to confirm the accuracy of the variant calls by checking for possible strand biases and sequencing errors. Copy number variations (CNVs) were not analyzed in this study.

The ctDNA concentrations were expressed in haploid genome equivalents (hGE) per mL of plasma (hGE/mL) and were calculated by multiplying the mean VAF for all mutations used for detection calling by the concentration of cfDNA (pg/mL of plasma) and dividing by 3.3, using the assumption that each haploid genomic equivalent weighed 3.3 pg, as previously described by Scherer et al. (Supplementary Table 5).
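
The calculation described above can be written as a short function. This is a minimal sketch: the function and parameter names are hypothetical, VAFs are assumed to be fractions (0 to 1) rather than percentages, and returning 0 when no mutation is available is an assumption.

```python
# Assumed mass of one haploid genome equivalent, as stated in the text.
PG_PER_HAPLOID_GENOME = 3.3


def ctdna_hge_per_ml(vafs, cfdna_pg_per_ml):
    """Estimate ctDNA in haploid genome equivalents per mL of plasma.

    vafs: VAFs (as fractions) of the mutations used for detection calling.
    cfdna_pg_per_ml: cfDNA concentration in pg per mL of plasma.
    """
    if not vafs:
        # Assumption: no detected mutation is reported as 0 hGE/mL.
        return 0.0
    mean_vaf = sum(vafs) / len(vafs)
    return mean_vaf * cfdna_pg_per_ml / PG_PER_HAPLOID_GENOME
```

For example, two mutations with VAFs of 10% and 20% in a sample containing 3300 pg of cfDNA per mL of plasma give a mean VAF of 15% and approximately 150 hGE/mL.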

Test validation

For the technical validation of input DNA requirements, library generation, and sequencing, two rounds of validation were performed consecutively. Three previously characterized samples with known single nucleotide variants and/or indels, as in the 24 FFPE tissue samples, were analyzed. Multiple intercapture and intracapture replicates, as well as inter-run and intra-run replicates, were included (data not shown).

Statistical analysis

The patient characteristics are presented as frequencies (n) and percentages (%) for categorical variables or as medians and ranges for continuous variables. Categorical data were compared with Fisher’s exact test or the chi-squared test, as appropriate, and continuous data were compared using a two-tailed paired Mann–Whitney U test. R Statistical Software was used for all statistical tests. Probability values < 0.05 were considered significant.
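
As an illustration of how a categorical comparison of this kind works, the following is a minimal sketch of a two-sided Fisher's exact test for a 2 × 2 table. It is written in Python with the standard library only and is not the R code used in the study. The two-sided p-value is taken as the sum of the probabilities of all tables that are no more likely than the observed one, which is one common definition. The counts in the usage note are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are mutated / not mutated.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def weight(x):
        # Number of ways to obtain x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Integer weights avoid floating-point ties when comparing probabilities.
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / total_tables
```

With hypothetical counts of 8 mutated and 2 unmutated cases in one group versus 1 mutated and 5 unmutated cases in the other, `fisher_exact_two_sided(8, 2, 1, 5)` returns 400/11440, approximately 0.035.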

Results

Gene panel features

A total of 73 samples (47 FFPE and 26 ctDNA) were sequenced, resulting in a median of 8,290,518 reads in the FFPE samples and 11,071,271 in the ctDNA samples. The median percentage of mapped reads was 97% in both types of samples (Supplementary Table 4). The median percentage of mapped base pairs on-target was 83% in the FFPE samples and 73% in the ctDNA samples. The median percentage of duplicate fragments per sample was 35% in the FFPE samples and 62% in the ctDNA samples. The median depth of coverage of the target regions was 2101x (range 231x–6518x) in the FFPE samples (median coverage heterogeneity of 0.04%) and 3678x (range 1906x–9270x) in the ctDNA samples (median coverage heterogeneity of 0.24%) (Supplementary Table 4).

Mutational data from the FFPE samples (n = 47)

The gene panel was performed on 47 patients with NHL; 93.6% (44/47) presented at least one variant in the FFPE tissue samples with VAF ≥ 5%. In total, 372 somatic alterations were detected (Table 1). The patients presented a median of 6 mutations per sample (range 0–37). Missense mutations were the most frequent at 253/372 (67.6%), followed by 48/372 (12.9%) frameshift mutations, and 34/372 (9.1%) nonsense mutations (Table 1). Figure 1 and Supplementary Figs. 1 and 2 present the gene frequencies by NHL subtype detected in the total cohort.

Table 1 Mutational analysis of formalin-fixed paraffin-embedded tissue samples.
Figure 1. Frequencies of mutated genes in the cohort (n = 47). Significant differences between follicular lymphoma and diffuse large B-cell lymphoma (p < 0.05*) (p < 0.1**).

In FL, 83% of the patients presented BCL2 rearrangement. A total of 93 somatic alterations were detected, with a median of 7.4 mutations per sample (range 2–22). The most frequently mutated genes were KMT2D (80%), TNFRSF14 (48%), CREBBP (40%), BCL2 (40%), TNFAIP3 (32%), SOCS1 (32%), CARD11 (28%) and EZH2 (28%) (Fig. 1). A total of 87% (13/15) of the FL samples presented mutations in epigenetic modifier genes.

In contrast, 28% of the patients with DLBCL presented BCL6 rearrangement, 25% presented c-MYC rearrangement, and 16% presented BCL2 rearrangement, with 16% presenting double-hit lymphomas. Furthermore, the patients presented a total of 279 somatic variants with a median of 8.6 mutations (range 0–35). In the overall cohort (n = 32), the most frequently mutated genes were SOCS1 (40%), KMT2D (40%), EP300 (40%), and c-MYC (32%) (Fig. 1). Sixty-eight percent (22/32) of the patients presented mutations in epigenetic modifier genes.

When comparing the germinal center B-cell (GCB) DLBCLs (n = 17) with the activated B-cell (ABC) DLBCLs (n = 9), PIM1 mutations were present only in the patients with GCB DLBCL (41% vs. 0%; p = 0.03), a statistically significant difference, whereas XPO1 mutations were present only in the patients with ABC DLBCL (22% vs. 0%; p = 0.08), a difference that did not reach the 0.05 significance threshold (Supplementary Fig. 1).

When we analyzed the patients with high-grade DLBCLs-NOS (n = 5), those with double-hit/triple-hit (n = 4) DLBCLs, and those with DLBCL-NOS (n = 20), c-MYC and TCF3 mutations were more frequent in the high-grade DLBCLs-NOS than in the DLBCLs-NOS (44% vs. 15%, p = 0.1; and 22% vs. 0%, p = 0.089, respectively). Mutations in EZH2 and MAL were more frequent in the high-grade double-hit DLBCLs (50% vs. 4%, p = 0.04; 75% vs. 12%, p = 0.02). Mutations in TP53, TCF3, and CD58 were more frequent in the high-grade DLBCLs-NOS (60% vs. 20%, p = 0.11; 40% vs. 0%, p = 0.025; 40% vs. 8%, p = 0.12) (Supplementary Fig. 2).

When we compared the mutations in FL (n = 15) with those in DLBCL (n = 32), we found that variants in the following genes were significantly more frequent in FL than in DLBCL: BCL2 (p = 0.003), CREBBP (p = 0.003), KMT2D (p = 0.012), and TNFRSF14 (p = 0.015). In contrast, PIM1 variants (p = 0.033) were more frequent in DLBCLs (Fig. 1).

Recurrent mutations (1–3) were found in ARID1A, B2M, BCL2, CIITA, CREBBP, EP300, EZH2, FOXO1, KMT2D, MAL, MYD88, NFKBIE, PIM1, SOCS1, STAT6, TP53, and XPO1 (Table 1). Only EZH2 (p.Tyr641Asn/His/Phe), CIITA (p.Ser781_Val782delinsLeuAla), EP300 (p.Gly211Ser), and MAL (p.Phe33Cys) presented more than 4 recurrent mutations (Supplementary Fig. 3).

The presence of more than 1 mutation in the same gene was detected in several genes including MYC, SOCS1, PIM1, CIITA, KMT2D, and BCL2 (Table 1). Nine patients presented mutations in the c-MYC proto-oncogene; 4 of them presented more than 1 mutation, concomitant with MYC translocation. Eighty-three percent (27/33) of the MYC mutations occurred in exon 2. The other genes with more than one variant were SOCS1 (59 mutations in 17 patients), PIM1 (28 mutations in 6 patients), CIITA (12 mutations in 8 patients), and KMT2D (36 mutations in 22 patients); all of these patients were diagnosed with DLBCL. In addition, 4 patients with FL presented more than 2 mutations in BCL2, all with BCL2 rearrangement.

Mutational data in FFPE and cfDNA (n = 26)

The cfDNA samples collected from the study patients at diagnosis were subjected to targeted sequencing (n = 26) (Table 2). In 92% (24/26) of the samples, we detected at least one variant in the plasma cell-free DNA. A total of 386 variant detections were recorded (174 in the ctDNA samples and 212 in the FFPE samples). Of these, 123 mutations were detected in both types of samples (63.7% of the detections, with each shared mutation counted once per sample type), 51 mutations were detected only in the ctDNA samples (13.2%), and 89 mutations were detected only in the FFPE samples (23%) (Fig. 2). The variants detected in both types of samples had higher VAFs in the FFPE samples (28%) than in the ctDNA samples (17.9%). When considering only those mutations with VAFs > 10% in the FFPE samples, the percentage of mutations identified in both samples was 86%; specifically, the ctDNA samples in which fewer than 50% of the mutations were identified had an input ctDNA concentration < 0.5 ng/µL (Supplementary Table 4). Overall, 96% (25/26) of the patients had at least one alteration observed in the ctDNA sample that was identical to that in the FFPE tissue sample.

Table 2 Mutational analysis of circulating tumor DNA and formalin-fixed paraffin-embedded tissue samples.
Figure 2. Concordance of mutations between solid and liquid biopsies. Allele frequencies (AF) are provided for the solid tissue biopsies (green bar plot) and for the liquid biopsies (yellow bar plot).
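
The concordance figures reported for the paired samples can be reproduced with simple set operations. The sketch below is illustrative only: the function name and the variant keys are hypothetical, and using the total number of detections (FFPE plus ctDNA, so that each shared mutation counts twice) as the denominator is an assumption that happens to match the reported percentages.

```python
def concordance_summary(ffpe_variants, ctdna_variants):
    """Summarize the overlap between FFPE and ctDNA variant calls.

    Each variant is any hashable key, e.g. (patient, gene, protein_change).
    """
    ffpe, ctdna = set(ffpe_variants), set(ctdna_variants)
    shared = ffpe & ctdna
    ffpe_only = ffpe - ctdna
    ctdna_only = ctdna - ffpe
    # Assumed denominator: every detection in either sample type.
    detections = len(ffpe) + len(ctdna)

    def pct(count):
        return 100.0 * count / detections if detections else 0.0

    return {
        "shared": len(shared),
        "ffpe_only": len(ffpe_only),
        "ctdna_only": len(ctdna_only),
        "detections": detections,
        "pct_shared": pct(2 * len(shared)),
        "pct_ffpe_only": pct(len(ffpe_only)),
        "pct_ctdna_only": pct(len(ctdna_only)),
    }
```

Applied to 123 shared, 89 FFPE-only, and 51 ctDNA-only mutations, this yields 386 detections and percentages of about 63.7%, 23.1%, and 13.2%, respectively.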

We found that the median number of mutations detected in ctDNA was higher among the stage III and IV patients than the early-stage patients (6 vs. 2.5 mutations, p = 0.05) and in the patients with bulky disease (7 vs. 3 mutations, p = 0.04) (Supplementary Fig. 4).

For the 51 variants detected only in ctDNA (12 patients, 9 DLBCL and 3 FL), the median VAF was lower than that of the variants also identified in the FFPE samples (2.5% vs. 9.1%). Interestingly, 5 patients harbored more than 2 mutations in the ctDNA samples that were not detected in their matched FFPE samples (UPN 19, 24, 28, 43, and 46). Four of these patients presented bulky disease and were stage III at diagnosis.

The mean baseline ctDNA concentration was 42.803 hGE/mL (range 0–635.152) at diagnosis (Supplementary Table 5). Higher ctDNA levels were also correlated with bulky disease (4.369 vs. 15.852 hGE/mL, p = 0.016). There were no differences based on the stage (Supplementary Fig. 4).

Discussion

The optimal assessment of NHL includes morphological and immunophenotypic studies and chromosome and molecular analyses. NGS techniques provide relevant additional data for diagnosis, prognosis, and therapeutic management. Although NGS data on lymphomas require further validation before being implemented in daily practice, their clinical application is just around the corner. Numerous studies over the past decade have analyzed hundreds of tumor genomes of DLBCLs and FLs to better understand the molecular pathogenesis of these diseases3,4,5,11,12,13. In this study, we validated an NGS panel for DLBCL and FL in FFPE and ctDNA samples at diagnosis.

As one might expect of a cancer derived from cells and an environment of combinatorial diversity, heterogeneity is a defining characteristic of FL and DLBCL. We detected 372 pathogenic variants in 54 genes in 47 of the FFPE samples (93 in FL [median of 7.4 variants] and 279 in DLBCL [median of 8.6 variants]). In our study, 83% of the patients with FL presented BCL2 rearrangements, and the genes most frequently mutated were KMT2D, TNFRSF14, CREBBP, BCL2, TNFAIP3, SOCS1, CARD11, and EZH2. Eighty-seven percent of the FL samples presented mutations in epigenetic modifier genes. These results agree with those from previous studies in the literature14,15,16. Twenty-eight percent of the patients with DLBCL presented BCL6 rearrangement, 25% presented c-MYC rearrangement, and 16% presented BCL2 rearrangement, with 16% presenting double-hit lymphomas. However, c-MYC rearrangement might be over-represented in our cohort compared with that described in the literature, given that the cases were not selected consecutively17,18. The variants most frequently detected were present in SOCS1, KMT2D, EP300, c-MYC and TP53, and 68% of the samples presented mutations in epigenetic modifier genes.

The mutational profile of DLBCL differs depending on the cell of origin. While GCB DLBCL is characterized by frequent translocations of BCL2 and mutations of the epigenetic modifiers CREBBP and EZH2, these abnormalities are rare in ABC DLBCL. In contrast, mutations in genes encoding proteins implicated in B-cell receptor signaling and the nuclear factor kappa-light-chain-enhancer of activated B cells pathway (such as CD79b and MYD88) and genes involved in the regulation of the cell cycle (such as CDKN2A) contribute to the molecular pathogenesis of ABC DLBCL5,19,20. Our study found differentiated genetic profiles according to the GCB and ABC subtype. BCL2 rearrangement, EZH2, PIM1, CD58, and NFKBIE were present only in the GCB subtype while XPO1 was present only in ABC. Also, different profiles were observed in those patients classified as having high-grade lymphomas, where mutations in EZH2 and MAL were more frequent in high-grade double-hit lymphomas and mutations in TP53, TCF3 and CD58 in high-grade NOS lymphomas. More extensive and complex panels than the ones used in this study are needed to adequately perform the molecular classification4,5. However, it is not entirely clear which strategy will be the most appropriate for clinical practice: large panels of genes, exomes, or whole genomes. What is clear is that, by including genetic analyses of lymphomas, we will be able to reach a much more certain diagnosis by establishing genetic risk profiles, as is the case for other hematological neoplasms such as acute leukemia, thus bringing us closer to more personalized care.

Undoubtedly, the paradigm of lymphoma diagnosis has changed since the incorporation of ctDNA. In addition to the genetic studies already performed on solid biopsies, we have the option of performing these genetic studies on non-invasive samples such as liquid biopsies. This type of sample has been increasingly used for a variety of applications in oncology, including diagnosis, prognosis, and the identification of therapeutic targets10. In addition, ctDNA provides information on tumor burden and the dynamics of treatment response21,22. Our study assessed the utility of liquid biopsy in B-cell lymphomas in routine clinical practice through the validation of a commercial gene panel in patients with lymphoma at diagnosis. In a cohort of 26 patients, we showed that the use of liquid biopsies is feasible in routine clinical practice for DLBCL and FL. Specifically, ctDNA was detectable in 92% of the patients, and in 96% of the cases we were able to identify at least 1 alteration in ctDNA that was identical to one in the FFPE sample at diagnosis, indicating the potentially universal applicability of ctDNA. We believe that the differences found between the FFPE and plasma samples are mainly due to the quality of the sample and the characteristics of the tumor. In our study, some mutations present in the FFPE samples were not detected in the plasma samples, probably because the total amount of plasma used was low (< 5 mL) and the quantity of ctDNA obtained was therefore insufficient in a few cases. Localized disease or disease with a low tumor burden could also release only a small amount of ctDNA into plasma, so we have learned that a volume of at least 10 mL of plasma should be used for optimal analysis. Conversely, mutations detected in plasma but not in the FFPE samples may be due to tumor heterogeneity: the tissue analysis covers only a small fragment and not the entire tumor, so not all clones are represented. The liquid biopsy differs in that DNA is released into the bloodstream from all existing lesions.

Although various studies have shown the usefulness of these techniques in specialized centers8,23, particularly in clinical trials, the applicability of this technique in routine clinical practice has rarely been reported. Numerous reviews on the subject have listed the potential benefits of liquid biopsy20,23,24, both in the diagnosis and follow-up of NHL; however, the standardization of these tools is not yet a reality.

As previously described, we found a correlation between advanced stage and bulky disease and the number of ctDNA mutations23,25. Our analysis also found mutations in the liquid biopsy from patients at localized stages and with low tumor burden, which means that this tool can also be used in this patient group. As previously mentioned, at least 10 mL of plasma should be used in order to obtain a greater amount of DNA and thus be able to identify all mutations. Moreover, we found that patients with bulky disease had more mutations found only in ctDNA (i.e., not in the FFPE samples), which could indicate that ctDNA samples better represent the tumor’s genetic variability than standard biopsies. The possibility of finding a different mutational profile when comparing liquid biopsies and FFPE samples from the same patient has already been demonstrated by Scherer et al.8, who identified transformed FL in a liquid biopsy sample from a patient with low-grade FL, which had not been previously identified in the paraffin biopsy. Liquid biopsy could therefore be a useful strategy when looking for specific mutations that serve as therapeutic targets, especially in patients with bulky disease.

In conclusion, our results confirm that NGS techniques provide additional relevant data at the time of diagnosis, not only in FFPE samples but also in ctDNA. The two sample types are complementary, and liquid biopsy has the added advantage of being easy to obtain. ctDNA samples are useful not only in patients with advanced stages and large masses but also in patients with localized disease and low tumor burden. Although standardization is still lacking, it is important that we begin to incorporate these techniques into clinical practice, given the valuable information they can offer about the lymphoma.