CREBBP and STAT6 co-mutation and 16p13 and 1p36 loss define the t(14;18)-negative diffuse variant of follicular lymphoma

The diffuse variant of follicular lymphoma (dFL) is a rare variant of FL lacking t(14;18) that was first described in 2009. In this study, we use a comprehensive approach to define unifying pathologic and genetic features through gold-standard pathologic review, FISH, SNP-microarray, and next-generation sequencing of 16 cases of dFL. We found unique morphologic features, including interstitial sclerosis, microfollicle formation, and rounded nuclear cytology, confirmed absence of t(14;18) and recurrent deletion of 1p36, and showed a novel association with deletion/CN-LOH of 16p13 (inclusive of CREBBP, CIITA, and SOCS1). Mutational profiling demonstrated near-uniform mutations in CREBBP and STAT6, with clonal dominance of CREBBP, among other mutations typical of germinal-center B-cell lymphomas. Frequent CREBBP and CIITA codeletion/mutation suggested a mechanism for immune evasion, while subclonal STAT6 activating mutations with concurrent SOCS1 loss suggested a mechanism of BCL-xL/BCL2L1 upregulation in the absence of BCL2 rearrangements. A review of the literature showed significant enrichment for 16p13 and 1p36 loss/CN-LOH, STAT6 mutation, and CREBBP and STAT6 comutation in dFL, as compared with conventional FL. With this comprehensive approach, our study demonstrates confirmatory and novel genetic associations that can aid in the diagnosis and subclassification of this rare type of lymphoma.


Introduction
Follicular lymphoma (FL) is the second most common nodal non-Hodgkin lymphoma accounting for~20% of all lymphomas 1 . The proliferation of germinal center B-cells (GCB) forming abnormal follicles coupled with translocation of the antiapoptotic gene BCL2 with IGH resulting in t(14;18)(q32;q21) are diagnostic hallmarks of FL 1 . However, there are exceptions, as~5% of low-grade follicular lymphoma (LGFL) show a predominantly diffuse growth pattern 2,3 , and~10% of FL lack t(14;18) 1 , most of which represent high-grade disease.
The 2016 WHO classification recognizes several variants and related entities of FL, the latter of which is designated as conventional follicular lymphoma (cFL). The morphologically low-grade spectrum includes in-situ follicular neoplasia, duodenal-type FL, and the diffuse FL variant (dFL) with the former two entities consistently demonstrating t(14;18) BCL2/IGH rearrangements. The morphologically high-grade spectrum includes testicular FL and pediatric-type FL (pFL), neither of which carry BCL2/IGH rearrangements. Genomic analysis of cFL has shown that in addition to t(14;18), a number of recurrent copy number variants (CNVs) 4-10 and somatic mutations can be found [11][12][13][14][15][16][17][18][19] , such as CNVs of 1p36, mutations of epigenetic regulators KMT2D, CREBBP, and EZH2, and mutations of TNFRSF14. The genetic abnormalities found in cFL serve as the basis against which variant subtypes can be compared.
dFL is the only LGFL variant lacking t (14;18). This entity was first described in 2009 in 35 cases as an unusual type of LGFL with a predominantly diffuse growth pattern, characteristic immunophenotype, and near-uniform deletion of chromosome 1p36 3 . This variant of FL was distinguished from LGFL with a predominantly diffuse growth pattern, as the former consistently lack the characteristic BCL2 rearrangement, whereas the latter consistently demonstrate t (14;18). Besides the genetic difference, the 2009 description of dFL also found characteristic clinical features, such as frequent groin/inguinal site of presentation, bulky low clinical stage disease, and good prognosis. Subsequent to this description, two other series evaluating 11 cases 20 and 6 cases 21 of dFL confirmed recurrent 1p36 abnormalities and/or TNFRSF14 mutations 20 , as well as mutations of CREBBP and STAT6 20,21 . The aims of the present study were to use a comprehensive approach to build upon existing, yet incomplete, literature, and determine unifying pathologic and genetic abnormalities. Findings from this study improve our understanding of the relationship between dFL and cFL, the molecular pathogenesis of dFL, and identifies potential molecular markers that may aid in the diagnosis and accurate subclassification of this rare variant of FL.

Pathologic case selection
Excisional biopsies were selected from the pathology archives of the Johns Hopkins Hospital (JHH) and the National Cancer Institute (NCI) after appropriate institutional review board approval. The JHH archives were searched from 1984 to 2013 for all cases of LGFL with a predominantly diffuse growth pattern (≥75%), occurring in an inguinal/groin site, and showing coexpression of CD23. The NCI archives were searched from 2000 to 2014 for cases citing the original Katzenberger et al. description 3 . BCL2 protein expression by immunohistochemistry and BCL2 rearrangement status for FISH were not used as selection criteria. Cases with components of histologic grade 3 or diffuse large B-cell lymphoma (DLBCL), or with high proliferation indices (>30%) were excluded. The histologic and immunohistochemical stains, and available clinical and ancillary data, were reviewed in concert by the study authors with consensus agreement of the final diagnoses.
Fluorescence in-situ hybridization FISH using the Vysis LSI IGH/BCL2 dual color dual fusion probe (Abbott Molecular, Des Plaines, IL) was performed on all cases (per manufacturer's protocol) without clinically available FISH analysis using formalinfixed paraffin embedded (FFPE) tissue. Since histologic sectioning results in overlapping cells, only dual fusion signals identified in the same plane were considered as true fusions. Detailed methods can be found in Supplementary Information.

SNP-microarray analysis
DNA was extracted using 4-10 unstained slides of FFPE tissue using the Pinpoint Slide DNA Isolation system (Zymo Research, Orange, CA). In brief, unstained slides were deparaffinized with xylene followed by tissue dissection and Proteinase K digestion. DNA cleanup was performed using the QIAamp DNA mini kit with the QIAcube instrument (QIAGEN, Valencia, CA). SNPmicroarray using the HumanCytoSNP12 BeadChip platform (Illumina, San Diego, CA), which assesses~300,000 polymorphic loci, was performed per manufacturer's protocol. Analysis was completed using KaryoStudio (Illumina, San Diego, CA) and Nexus Copy Number (BioDiscovery, Hawthorne, CA). SNP and gene annotations were compared against the National Center for Biotechnology Information (NCBI) genome build 37 (GRCh37/hg19). CNVs were determined by independent and consensus review by R.R.X and C.D.G. Detailed interpretive criteria can be found in Supplementary Information.

Targeted next-generation sequencing analysis
Using 200 ng of extracted DNA from the above analysis, DNA hybrid capture libraries were prepared using an Agilent SureSelect-XT (Agilent Technologies, Santa Clara, CA) custom-designed target enrichment kit evaluating fullgene sequences of 641 cancer-related genes (see Supplementary Information), as previously described 22 . Following DNA quality control, shearing and library preparation, next generation sequencing was performed on an Illumina HiSeq 2500 (Illumina Biotechnology, San Diego, CA) using 2 × 100 bp Rapid Run v2 paired end chemistry. Using vendor supplied software, FASTQ files were generated. All reads were aligned to NCBI GRCh37/hg19 using the Burrows-Wheeler alignment algorithm v0.7.10. Piccard Tools v1.119 was used for SAM to BAM conversion. The final BAM file was used for variant calling using a custom pipeline MDLVC v.6 22 and HaplotypeCaller v3.3. Variants with low variant allele frequency (VAF) (<5%) and variants found in a reference pool of normal samples were excluded. Samples (cases 5 and 16) with lower quality sequencing data had an additional VAF filter of 9% applied. Variants meeting quality criteria were then annotated using the COSMIC database v82, dbSNP v150, Annovar (07042018), and Ensembl variant effect predictor 23 . Manual review of variant calls was performed using the Broad Institute's Integrated Genomics Viewer v.2.3.4. Tumor mutation burden (TMB) was calculated using a subset of the sequenced genes per published methods 24 . Detailed NGS and bioinformatics methods can be found in Supplementary Information. Variant significance was determined by crossreferencing COSMIC (v88), gnomAD (r2.0.2) and ClinVar databases factoring in VAF, functional consequence, level of evidence in the respective databases, evidence of the variant in hematolymphoid malignancies, presence of other variants affecting the same amino acid, and mutational frequency of the gene in hematolymphoid malignancies. Using this rubric, each variant was assigned into one of four categories: likely somatic, cannot exclude somatic/possibly somatic, cannot exclude somatic/possibly germline, and likely germline. Within the "cannot exclude somatic" category, variants were grouped into the possibly germline category if the gene in question had not been reported to be mutated in either dFL 20,21 , pFL 25 , cFL 14,19 , or MZL 26 . Variants assigned to the "likely somatic" and "cannot exclude somatic/possibly somatic" categories were included for further analyses.

Tumor clonality and cellularity analysis
Tumor clonality and subclonality analysis was assessed based on several formulas that take into account the admixture of lymphoma cells with normal cells, the presence of clonal and subclonal mutations, and the combined impact of CNVs and coding mutations (see Supplementary Information and supplementary Figures S1-S3). The most dominant mutation in each tumor, which accounted for the impact of co-occurring CNVs, was used to estimate tumor purity/cellularity. All other variants were divided by this number to derive the normalized subclonal representation of the mutation within the tumor. If tumor purity estimates based on VAF greatly exceeded the morphologic estimate, variants contributing to the overestimation were rereviewed for the likelihood of germline derivation and potential for undetected cooccurring CNVs. Should these variants be found, tumor purity was recalculated accordingly. If this resulted in reassignment of variants to the possibly/likely germline category, these variants were then excluded from subsequent analyses.

Statistical analysis and graphing
Statistical analyses were performed using GraphPad Prism version 8 (GraphPad Software, La Jolla, CA) and Microsoft Office Excel 2010 (Microsoft, Redmond, WA). Continuous variables were compared using parametric unpaired two-tailed t tests, while categorical variables were compared using Fisher's exact test. Detailed statistical analyses are described in Supplementary Information. Mutation representation within protein domains was mapped using MutationMapper 27 and Lollipop 28 .

dFL shows unique pathologic features
In total, 16 cases of LGFL meeting the inclusion and exclusion criteria were identified. All cases underwent consensus review by the study authors. Summary patient and pathologic findings are detailed in Table 1. Histologically, all cases showed ≥75% diffuse growth with many demonstrating microfollicle formation ( Fig. 1), which are miniaturized abnormal follicles predominantly composed of centrocytes lacking follicular dendritic networks. Other notable features include frequent sclerosis and interstitial fibrosis, focal preservation of normal lymph node structures, including normal germinal centers, and more rounded nuclear cytology of the lymphoma cells. Cases with microfollicles tended to have lymphoma cells with centrocyte-like nuclei within microfollicles, and lymphoma cells with more rounded nuclei outside. Immunohistochemical studies confirmed that all lymphomas expressed BCL6, CD10, and CD23, and most expressed variable BCL2. Two cases showed equivocal BCL2 staining (cases 7 and 8) due to extensive T-cell admixtures. One case was BCL2 negative (case 15). Some cases also demonstrated disparate staining patterns for BCL2 (cases 6 and 9) and CD10 (case 9) within and outside of microfollicles.
Chromosome 16p13 and 1p36 are recurrently altered in the absence of BCL2/IGH FISH for IGH/BCL2 was completed for 15 of 16 cases. All interpretable results showed two green (IGH) and two orange (BCL2) signals without evidence of fusion (Fig. 2c). SNP-microarray studies were performed on all cases (  Supplementary Table S1. CREBBP and STAT6 are highly recurrently comutated Next-generation sequencing (NGS) was performed in all cases. A total of 161 "likely somatic" and "cannot exclude somatic" variants were identified in 56 genes. 157 (97.5%) of these variants were classified as likely somatic, while 4 (2.5%) were classified as cannot exclude somatic. Clonality and cellularity analysis (see below) reclassified two "likely somatic" variants (KMT2C and SPEN) and two "cannot exclude somatic" variants (KMT2C and NOTCH1) as possibly germline, and reclassified the remaining two "cannot exclude somatic" variants (CSMD3 and CARD11) as possibly somatic. Once the possibly germline variants were removed, along with other "likely germline" variants, a total of 157 likely/possibly somatic mutations were identified in 56 genes (Fig. 4). The number of mutations identified in each case ranged from 6 to 18 (median 9.5, 95% CI 7-11). Potential aberrant somatic hypermutation, suggested by the presence of multiple nondeleterious mutations with similar variant allele frequencies occurring within a single exon and allele, was identified in 3 cases (cases 6, 10 and 11) involving BCR, SOCS1, and ACTB, respectively. TMB was calculated for 14 of (Fig. 5). Thirteen cases demonstrated CREBBP and STAT6 comutation (13/16, 81.2%). Lymphomas carrying mutations in both genes harbored fewer total alterations compared with lymphomas lacking comutations (Fig. 4a). This observation held true for the number of mutations (median 8, ranging from 6 to 11 vs. median 18, ranging from 12 to 18; p value < 0.0001), CNVs (median 2, ranging from 0 to 5 vs. median 6, ranging from 3 to 9, p value of 0.0191), and total alterations (median 10, ranging from 7 to 16 vs. median 24, ranging from 14 to 27, p value of 0.0006). Clonality and cellularity assessment (see below) showed that these differences could not be accounted for by lower tumor purity in the comutated cases (Fig. 6a). Even though the number of mutations and CNVs differed between these two groups, TMB did not differ    Figure S4).

CREBBP mutations are clonally dominant
Integrating both mutation VAFs and (co-occurring) CNV B-allele frequencies, the cellular representation of individual alterations was calculated (Fig. 6b), which enabled estimation of tumor purity/cellularity (Fig. 6a), and more accurate variant significance classification. This analysis showed that CREBBP mutations are dominant clonal events in most cases (Fig. 6c)  CREBBP and STAT6 comutation and 16p13 and 1p36 loss represent unique features of dFL In order to determine if the recurrent CNV and mutational findings from the present study are enriched in dFL, a detailed literature review was performed ( Table 2). Each recurrent, and select combinations of, alteration found in the current report was pooled with less comprehensive analyses from three previous studies of dFL 3,20,21 to identify unique features of dFL. Of note, two cases from the Siddiqi et al. study were not included, as those cases had demonstrable BCL2/IGH rearrangements. The aggregate frequencies of particular alterations found in dFL were contrasted with previously published reports for cFL [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19] and MZL 26,[29][30][31][32][33][34][35][36][37][38][39][40][41] . Although the previous studies describing CNVs in cFL used a variety of techniques, most of these studies (8/13, 61%) were performed using SNParray platforms similar to the present method indicating that the results obtained in these prior studies should be comparable to our findings. Compared with cFL and MZL, 16p13 and/or 1p36 abnormalities are far more frequent in dFL. CREBBP mutations are slightly more common in dFL, and STAT6 mutations are much more common in dFL. CREBBP and STAT6 comutation is particularly enriched in dFL. All recurrent alterations found in dFL are statistically significantly underrepresented in MZL (Table 2).

Discussion
The present study of 16 cases of dFL is the largest series to include detailed pathologic, chromosomal, and NGS analyses that reveal novel, and unifying, pathologic and genetic findings. Not only do our findings support continued classification of dFL a variant of cFL, our findings also show how comprehensive molecular profiling can aid in the differential diagnosis and workup of low-grade Bcell lymphoma (LGBCL).
Pathologic analyses identified novel morphologic features of dFL, such as frequent sclerosis, microfollicle formation, and rounded nuclear cytology, in addition to the known features of diffuse histology, focal preservation of normal lymph node structures, coexpression of CD23, and variable expression of BCL2 3,20 . Microfollicles lack follicular dendritic networks rendering them distinct from typical follicles/nodules found in cFL. To our knowledge, this growth pattern has only been associated with dFL 1 , and has not been described in any other type of LGBCL to date. Although CD23 was used as a selection criteria for 6 of 16 cases, all cases showed CD23 coexpression suggesting that this may be a unifying feature of dFL, whereas cFL is only occasionally CD23 positive 42,43 . Variable BCL2 expression in the absence of BCL2/IGH rearrangements suggests alternative mechanisms of BCL2 up-regulation on the DNA 44,45 or transcriptional 46,47 level, although we did not find either copy number gains of the BCL2 locus or mutations of BCL2 in dFL. Our data demonstrated new associations of loss/CN-LOH of 16p13 and CN-LOH of 1p36, and confirmed the reported absence of t(14;18) and recurrent loss of 1p36 3,20 . However, the frequency 1p36 abnormalities in our series was far lower than originally published 3 , but is similar to the rate reported by Siddiqi et al. 20 . This difference may be related to selection, sampling, and/or technical biases. Unlike the Katzenberger et al. 3 report where CD23 positivity was found in approximately two-thirds of the lymphomas, there was uniform expression of CD23 in this and the Siddiqi et al. series, which could skew the distribution of the genetic findings. Alternatively, with larger numbers of dFL being studied, the full spectrum of chromosomal abnormalities is emerging unmasking a lower prevalence of 1p36 deletion. Finally, technical bias could also account for these differences, as the original report used FISH, which has a superior analytical sensitivity (5% of nuclei) to both aCGH and SNP-microarray. While all of the cases we studied had at least 15% tumor cells, it is plausible that subclonal loss of 1p36 may be missed by our approach. Irrespective of the reason for this discordance, combined data suggest that loss of 1p36 alone is not sufficient, or specific, for dFL, especially if array-based techniques or NGS are used. Unlike previous studies, the most predominant CNV observed in our series was loss/CN-LOH of 16p13, which was only found in two cases (22.2%) in the Siddiqi et al. study 20 . This apparent discrepancy may, again, be technique related, as array CGH (aCGH) used by Siddiqi et al. typically shows inferior analytical sensitivity, and cannot detect CN-LOH. Not only are 16p13 abnormalities a novel association in dFL, we also found that the minimally altered region(s) encompassed CREBBP, CIITA, and SOCS1, which suggests a possible cooperative mechanism for tumor immune evasion 19,48 .
Targeted NGS showed near-uniform mutations of CREBBP and STAT6 with clonal dominance of the CREBBP mutations suggestive of a founder event. The mutational profiles of dFL in our series showed frequent mutations in genes implicated in GCB derived lymphomas 11,12 , including CREBBP, TNFRSF14, KMT2D, and EZH2, which offers genetic confirmation for the current classification of dFL as a FL variant. We did not identify MAPK pathway mutations associated with pFL 25,49 indicating dFL shares more genetic similarities with cFL than pFL. Unlike cFL, where t(14;18) represents the founder event 13 and CREBBP mutations represent subsequent driver events, our data suggest CREBBP mutations represent founder events in dFL in the absence of BCL2/ IGH rearrangements.
With regard to CREBBP mutations, the enrichment for non-truncating mutations within the HAT domain, which leads to enzymatic loss of protein function 12 , is similar to what has been previously described in cFL 50 . Unlike previous reports of cFL or GCB DLBCL 12,14 , which show majority mono-allelic loss of CREBBP, our series identified majority bi-allelic loss of CREBBP. In mice, heterozygous/haploinsufficient loss of CREBBP coupled with BCL2 overexpression in B-cells leads to the development of GCB lymphomas 50 . Without BCL2/IGH, however, other antiapoptotic mechanisms may be implicated in dFL, such as STAT6 comutation. STAT6 is commonly mutated in classical Hodgkin lymphoma (32%) 51 and PMBL (36%) 52 , but is not typically mutated in GCB lymphomas [11][12][13][14][15][16][17][18][19] . Our data show that STAT6 mutations are always comutated with CREBBP, or EP300 that forms the CREBBP/EP300 complex, in dFL, and are frequently associated with concurrent loss of SOCS1. The conspicuous co-occurrence of these alterations suggest a degree of cooperativity. Similar to previous studies of GCB lymphoma 53,54 , all of the detected STAT6 mutations in dFL were missense changes occurring in the DNA binding domain, which has been shown to activate JAK/ STAT signaling 53,54 . An important STAT6 target is the BCL-xL/BCL2L1 (BCL2-like antiapoptotic protein) gene 55 , which is often amplified in epithelial malignancies 56 . In PMBL, overexpression pSTAT6 leads to accumulation of BCL-xL 57 , a phenomenon that may be reversed by inducing the STAT6 negative regulator SOCS1 51,57 . The concurrent gain of function of STAT6 and loss of its negative regulator, SOCS1, in dFL may drive high levels of BCL-xL that could serve as a functional surrogate for BCL2 excess to cooperate with CREBBP bi-allelic loss in the development of dFL. Future studies could evaluate the possibility that CREBBP loss and STAT6 gain, possibly through BCL-xL, are sufficient to induce dFL-like lymphomas. b Prevalence of the corresponding abnormality in the entire group.*Case 16 was excluded from this analysis due to absence of SNP-array data.
A major limitation of this study is that it is correlative, and lacks functional confirmation of the findings and the proposed interactions. Another limitation is the small sample size and lack of clinical follow-up, which is a consequence of the exceedingly rare occurrence of this lymphoma, and the frequent extramural consultative nature of the pathology review. As described earlier, the uniform inguinal location and CD23 positivity found in the present study may bias the results towards apparent unifying pathologic and molecular features. Given that these two features are commonly used as criteria to diagnose this variant of FL, only a large-scale screen of diffuse-pattern LGFL would allow identification of sufficient numbers of CD23-negative/noninguinal cases to investigate this possibility. Although some t(14;18)-negative FL have BCL6 abnormalities (translocations or amplification), we did not pursue BCL6 translocations since that was not a criterion used in the original Katzenberger et al. 3 definition, and we did not find BCL6 amplification in our series. Additional limitations are technical in nature. There may be false negativity, in particular for subclonal 1p36 deletion, due to low tumor cellularity seen in a small number of cases. The lack of matched germline tissue can confound tumor-only SNPmicroarray and NGS analysis, although we have detailed conservative and comprehensive interpretive guidelines to limit misattribution of germline variants as somatic mutations. Last, we did not perform detailed genetic analyses of a control group comprising cFL and MZL to determine if the CNVs and mutations found in dFL are truly enriched by a direct case-control comparison. Since a broad range of techniques and analysis methods were used by the referenced studies, there may be apparent differences in chromosomal and mutational patterns that is simply methodology-related. However, since many of the referenced studies used very similar techniques to the ones used in the present study, and reproducible molecular patterns were identified through this review, the presented aggregate reanalysis of the literature should represent a reliable estimate of the true rates of chromosomal and molecular abnormalities found in cFL and MZL, from which dFL differ.
Combined with the previously published studies, 66 dFL cases have now been pathologically and genetically characterized. As the WHO classification moves towards molecularly-defined lymphoma entities, such as pFL, the unifying pathologic and genetic features described herein may aid in the accurate subclassification of LGFL. The diagnostic distinction between the dFL from cFL with prominent diffuse growth is specifically recommended by the 2016 WHO 1 when an excisional biopsy is available, as the former will consistently lack t(14;18) BCL2/IGH rearrangements. In diagnostically challenging cases, the ancillary work-up should begin with FISH. Once absence of BCL2 rearrangement is confirmed, NGS and CNV detection should follow. Identification of the characteristic 1p36 and/or 16p13 abnormalities along with CREBBP and STAT6 comutations would support a diagnosis of a t(14;18)-negative dFL. The present literature, including our findings, has identified genetically distinct profiles of subtypes of LGBCL, which support the incorporation of genomic studies in the routine lymphoma workup, as the field moves towards molecular classification of lymphoma subtypes.  Other mutations are represented as black dots c Individual mutations found in the top ten recurrently mutated genes shown as a proportion of the respective tumor % on the y-axis. Error bars represent median and interquartile range. Statistical analysis of clonal dominance show statistically significant differences between clonal dominance of CREBBP vs. subclonality of STAT6, TNFRSF14, KMT2D, EZH2, and CARD11 and EP300. Table 2 Recurrently detected copy number variants and mutations found in diffuse follicular lymphoma (dFL) compared with previously published studies of conventional follicular lymphoma (cFL) and marginal zone lymphoma (MZL). CNVs in dFL detected in the present report and previously published studies* 3,20 were grouped and compared with previously published studies of cFL 4-10 and MZL 26,29,[38][39][40][41] . Mutations in dFL detected in the present report and previously published studies* 20,21 were grouped and compared with mutations found in previously published studies of cFL [11][12][13][14][15][16][17][18][19] and MZL 26,[29][30][31][32][33][34][35][36][37] . The prevalence of each alteration in dFL, cFL, and MZL are shown along with the total number of samples studied. Statistical significance for each alteration found in dFL is tested against the same rates in cFL and MZL, and the resultant p values are shown. *Two cases were removed from a previously published study 20 of dFL in this aggregate analysis due to the presence of t(14;18) in those cases.