Molecular heterogeneity in peripheral T-cell lymphoma, not otherwise specified revealed by comprehensive genetic profiling

Peripheral T-cell lymphoma, not otherwise specified (PTCL, NOS) is a diagnosis of exclusion, being the most common entity in mature T-cell neoplasms, and its molecular pathogenesis remains significantly understudied. Here, combining whole-exome and targeted-capture sequencing, gene-expression profiling, and immunohistochemical analysis of tumor samples from 133 cases, we have delineated the entire landscape of somatic alterations, and discovered frequently affected driver pathways in PTCL, NOS, with and without a T-follicular helper (TFH) cell phenotype. In addition to previously reported mutational targets, we identified a number of novel recurrently altered genes, such as KMT2C, SETD1B, YTHDF2, and PDCD1. We integrated these genetic drivers using hierarchical clustering and identified a previously undescribed molecular subtype characterized by TP53 and/or CDKN2A mutations and deletions in non-TFH PTCL, NOS. This subtype exhibited different prognosis and unique genetic features associated with extensive chromosomal instability, which preferentially affected molecules involved in immune escape and transcriptional regulation, such as HLA-A/B and IKZF2. Taken together, our findings provide novel insights into the molecular pathogenesis of PTCL, NOS by highlighting their genetic heterogeneity. These results should help to devise a novel molecular classification of PTCLs and to exploit a new therapeutic strategy for this group of aggressive malignancies.


Introduction
PTCLs represent a clinically, histologically, and molecularly heterogeneous group of non-Hodgkin lymphomas (NHLs) derived from mature post-thymic T cells [1,2]. Among them, the most common entity is PTCL, NOS, accounting for ~30% of all PTCLs [3]. Patients with PTCL, NOS generally show an aggressive clinical course and are often refractory to standard therapy. By definition, PTCL, NOS includes cases that do not meet the criteria for any specific PTCL subtype and has been considered a "wastebasket" category.
It has been recognized that a subset of PTCLs classified as PTCL, NOS has a T-follicular helper (TFH) cell phenotype (i.e., positive for CD4, PD-1, CD10, CXCL13, BCL6, and so on) and some pathological features of angioimmunoblastic T-cell lymphoma (AITL) [4][5][6]. In addition, recent genetic studies revealed that these cases share some of the recurrent genetic alterations found in AITL, such as mutations affecting TET2, DNMT3A, and RHOA [7][8][9][10][11]. Among these, the RHOA G17V mutation is highly specific to both PTCL subtypes and, when expressed in mouse T cells, induces TFH-cell specification and, together with TET2 loss, results in the development of AITL-like tumors [12]. On the basis of these findings, the revised World Health Organization (WHO) classification of hematological malignancies recommended that this subset of PTCL, NOS should be classified as PTCL with a TFH-cell phenotype as a provisional entity (referred to as "TFH PTCL, NOS") [5]. However, the molecular pathogenesis of the remaining cases in the PTCL, NOS category is still poorly understood. The currently available genetic data from several small series reported different recurrent mutations and copy-number alterations (CNAs) [13][14][15][16], which preclude a solid conclusion as to the genomic landscape of the tumor. Systematic characterization of genetic alterations should significantly contribute to refining the molecular classification, improving prognostication, and identifying candidate therapeutic targets in this entity, as demonstrated in other lymphomas [6,17].
Here, we conducted a comprehensive genetic analysis to determine the spectrum of mutations, CNAs, and structural variations (SVs) in PTCL, NOS with and without TFH-cell phenotype. In particular, our efforts focused on genetically dissecting the molecular pathogenesis and identifying a new molecular subgroup of PTCL, NOS, showing unique genetic and clinicopathological features.

Patient samples
A total of 142 patients diagnosed with PTCL, NOS at six institutions were enrolled in this study according to the protocols approved by the Institutional Review Boards. This study was approved by the institutional ethics committees of the Graduate School of Medicine, Kyoto University and other participating institutes. All cases were reviewed, and a consensus diagnosis was made by expert hematopathologists according to the criteria of the 2008 WHO classification [18]; of these, 94 cases were examined for tumor content. HTLV-1 infection was examined by anti-HTLV-1 antibody detection and/or Southern blotting for HTLV-1 proviral DNA. HTLV-1-positive cases were considered as adult T-cell leukemia/lymphoma (ATL) and excluded from this study before analysis. Based on the recent revision of the WHO classification [5], TFH markers, including PD-1, CD10, CXCL13, and BCL6, were evaluated, and PTCL, NOS cases positive for at least two TFH markers were diagnosed as TFH PTCL, NOS. Because the minimum criteria for assignment of TFH phenotype are not well established, we considered PTCL, NOS cases expressing only one TFH marker as unclassifiable PTCL, NOS. Age, sex, and other clinical characteristics are summarized in Supplementary Table S1. Genomic DNA was extracted from fresh frozen tumor tissues or buccal swabs (as normal control) using the QIAamp DNA Mini kit (QIAGEN, Hilden, Germany) or commercially prepared by SRL Inc. (Tokyo, Japan). RNA was extracted from fresh frozen or formalin-fixed paraffin-embedded (FFPE) tumor tissues with RNeasy Mini Kit (QIAGEN).

RNA sequencing (RNA-seq)
Libraries for RNA-seq were prepared from the total RNA extracted from fresh frozen tumor tissues using the NEBNext Ultra RNA Library Prep kit for Illumina (New England BioLabs, Beverly, MA, USA), and subjected to sequencing using the HiSeq 2500 instrument with a standard 125-bp paired-end read protocol. The sequencing reads were aligned to the human reference genome (hg19) using STAR (v2.5.3) [34]. The mapped reads per gene were counted with featureCounts (v1.5.3) from the R-package "Rsubread", and normalized to counts per million (CPM) (R-package "edgeR") [35]. To identify significantly enriched pathways in each group, Gene Set Enrichment Analysis (GSEA: v2.2.4) with the Molecular Signatures Database-curated gene sets (hallmark and C2: v6.1) was performed for genes expressed at >1 CPM in two or more samples.
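The normalization and expression filter described above can be illustrated with a minimal sketch. It assumes plain library-size scaling (counts divided by the per-sample total, times one million) and does not reproduce any additional normalization that edgeR may apply; the function names and the toy count table are hypothetical and are not taken from the study.

```python
def counts_per_million(counts):
    """Convert raw read counts to CPM.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    Assumption: simple library-size scaling, without normalization factors.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total number of counted reads in each sample
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def expressed_genes(cpm, threshold=1.0, min_samples=2):
    """Keep genes with CPM > threshold in at least min_samples samples."""
    return [
        gene for gene, row in cpm.items()
        if sum(value > threshold for value in row) >= min_samples
    ]


# Toy example (hypothetical counts; each sample totals 1,000,000 reads)
toy_counts = {
    "GENE_A": [2, 3, 0],
    "GENE_B": [1, 0, 0],
    "GENE_C": [999997, 999997, 1000000],
}
toy_cpm = counts_per_million(toy_counts)
kept = expressed_genes(toy_cpm)
```

In this toy table, GENE_B never exceeds 1 CPM and is dropped, whereas GENE_A exceeds 1 CPM in two samples and is retained for enrichment analysis.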

nCounter gene-expression assay
Details of the nCounter assay (NanoString Technologies, Seattle, WA, USA) have been reported previously [36]. Briefly, the GATA3, TBX21, and 20 housekeeping gene probes (NanoString Technologies) were hybridized to 300 ng of the total RNA for 16 h at 65°C, and applied to the nCounter Preparation Station for automated removal of excess probe and immobilization of probe-transcript complexes on a streptavidin-coated cartridge. The data were analyzed by using the nSolver 4.0 software (NanoString). To test the validity of nCounter analysis, a linear regression analysis was performed between normalized counts and CPM for GATA3 and TBX21 expressions, respectively.

SV and CNA analysis
SVs and CNAs were detected using the Genomon pipeline and the CNACS algorithm, respectively, as previously described [19,32,37]. Putative SVs were manually curated and further filtered by removing those (i) with Fisher's exact P-value >0.01; (ii) with <6 supporting reads in tumor; (iii) with allele frequency in tumor <0.02; or (iv) present in any of the control samples. SV breakpoints were visually inspected using IGV. Candidate focal CNAs (shorter than half a chromosome arm, except for 17p deletions involving TP53) were assessed in genomic regions where sequencing coverage was sufficient in unmatched control samples, and then manually reviewed and further filtered by removing those with <3 probes. Frequencies of focal CNAs were calculated for 49 genes (i) with recurrent mutations or SVs (found in ≥3 cases) in our cohort (47 genes) and/or (ii) with focal homozygous deletions or high-level (copy number ≥4) amplifications in at least two samples (2 genes: CDKN2A and ARID2). To confirm CNAs detected by the CNACS algorithm, we conducted SNP array karyotyping for 24 samples using the Affymetrix GeneChip Human Mapping 250K NspI array (Affymetrix, Santa Clara, CA, USA), as previously described [19,23]. Microarray data were analyzed to estimate the total and allele-specific copy numbers using CNAG/AsCNAR algorithms. Significantly recurrent arm-level CNAs were identified using a binomial distribution test, as previously described [38].
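The SV filtering criteria can be sketched as a simple predicate. This is only an illustration under stated assumptions: the read-support criterion is interpreted as a minimum (calls with fewer than six supporting tumor reads are removed), the record fields are hypothetical, and nothing here reflects the actual Genomon output format or the manual curation step.

```python
from dataclasses import dataclass


@dataclass
class SVCall:
    """Hypothetical record for one putative structural variation."""
    fisher_p: float             # Fisher's exact P-value (tumor vs. normal)
    tumor_support_reads: int    # supporting reads in the tumor sample
    tumor_allele_freq: float    # variant allele frequency in the tumor
    seen_in_controls: bool      # present in any control sample


def passes_sv_filters(sv, max_p=0.01, min_reads=6, min_af=0.02):
    """Return True if the call survives all four removal criteria."""
    if sv.fisher_p > max_p:                  # (i) weak statistical support
        return False
    if sv.tumor_support_reads < min_reads:   # (ii) assumed minimum read support
        return False
    if sv.tumor_allele_freq < min_af:        # (iii) very low allele frequency
        return False
    if sv.seen_in_controls:                  # (iv) recurrent in controls
        return False
    return True


# Hypothetical calls for illustration only
calls = [
    SVCall(0.001, 12, 0.15, False),  # kept
    SVCall(0.050, 12, 0.15, False),  # removed by (i)
    SVCall(0.001, 3, 0.15, False),   # removed by (ii)
    SVCall(0.001, 12, 0.01, False),  # removed by (iii)
    SVCall(0.001, 12, 0.15, True),   # removed by (iv)
]
kept_calls = [sv for sv in calls if passes_sv_filters(sv)]
```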

Mutation analysis
Pairwise correlations between alterations (present in ≥10 cases) were assessed by Fisher's exact test with Benjamini-Hochberg correction (q < 0.1). Mutational signature was determined by pmsignature (version 0.2.1), as previously described [39]. The RHOA G17V mutations were separately analyzed, because they behaved differently from other RHOA mutations.
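As an illustration of the pairwise association testing, the following sketch implements a two-sided Fisher's exact test on a 2×2 table and the Benjamini-Hochberg adjustment in plain Python. It is not the code used in the study (which relied on R); the function names and example numbers are hypothetical, and the two-sided P-value is defined here as the sum of probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    For two alterations X and Y: a = both altered, b = only X,
    c = only Y, d = neither.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme (small tolerance for float ties)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonic q-values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical example: three gene pairs tested, then filtered at q < 0.1
p_pairs = [fisher_exact_two_sided(3, 0, 0, 3),
           fisher_exact_two_sided(8, 2, 1, 9),
           fisher_exact_two_sided(5, 5, 5, 5)]
q_pairs = benjamini_hochberg(p_pairs)
significant = [q < 0.1 for q in q_pairs]
```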

Hierarchical clustering
Unsupervised hierarchical clustering of recurrent somatic alterations, including 49 genes (affected by mutations and/or focal CNAs) and 14 arm-level CNAs, was performed with Spearman's rank correlation and Ward's linkage algorithm (function "heatmap.2" in the R-package "gplots").

Detection of HTLV-1 genome
For the detection of HTLV-1 sequence, sequencing reads were mapped to the HTLV-1 genome (AB513134), and the number of HTLV-1-aligned reads was counted and divided by the total number of reads mapped to the human reference genome (GRCh37). The obtained ratio was then compared against a cutoff value of 0.01%, which was determined so that all confirmed ATL cases were included (data not shown).
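The read-ratio criterion amounts to a one-line calculation, sketched below. The function names and read counts are hypothetical, and treating a ratio equal to the cutoff as positive is an assumption, since the text does not specify how ties are handled.

```python
def htlv1_read_ratio(htlv1_reads, human_reads):
    """Ratio of HTLV-1-aligned reads to reads mapped to the human genome."""
    if human_reads <= 0:
        raise ValueError("human_reads must be positive")
    return htlv1_reads / human_reads


def is_htlv1_positive(htlv1_reads, human_reads, cutoff=1e-4):
    """Flag a sample when the ratio reaches the 0.01% cutoff (1e-4).

    Assumption: ratio >= cutoff counts as positive.
    """
    return htlv1_read_ratio(htlv1_reads, human_reads) >= cutoff


# Hypothetical read counts for two samples
low_sample = is_htlv1_positive(htlv1_reads=50, human_reads=1_000_000)
high_sample = is_htlv1_positive(htlv1_reads=2_000, human_reads=1_000_000)
```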

Survival analysis
Survival data were available for 46 patients with PTCL, NOS. Observations were censored at the last follow-up. The median follow-up was 22.6 months in surviving patients, and 24 patients were alive at the last follow-up. The Kaplan-Meier method was used to estimate overall survival, and the log-rank test was used to assess differences in overall survival between patient groups (R-package "survival").
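For readers unfamiliar with the Kaplan-Meier estimator, a minimal sketch is given below. It is a plain-Python illustration rather than the R "survival" implementation used in the study, it omits confidence intervals and the log-rank test, and the follow-up times in the example are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 (or True) if death was observed, 0 (or False) if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += int(bool(data[i][1]))
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up (months); the second patient is censored at 8 months
km_curve = kaplan_meier(times=[5, 8, 8, 12], events=[1, 0, 1, 1])
```

Censored observations lower the number at risk without producing a step in the curve, which is how censoring at the last follow-up enters the estimate.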

CRISPR-mediated gene targeting
Human IRF2BP2 sgRNA-targeted sites were designed manually and checked in silico. The pSpCas9(BB)-2A-GFP (pX458) vector expressing Cas9 (Addgene plasmid 48138) was digested with BbsI and ligated to annealed and phosphorylated sgRNA oligonucleotides. Jurkat cells, obtained from the RIKEN Cell Bank, were transfected with indicated vectors using the Amaxa Nucleofector system (Lonza, Basel, Switzerland) according to the manufacturer's instructions. CRISPR/Cas9-mediated targeting was confirmed by PCR-based deep-sequencing and expression analysis by real-time quantitative PCR, as previously described [37]. The sgRNA sequences and PCR primers are listed in Supplementary Table S4.

Real-time quantitative PCR
cDNA synthesis from the total RNA was performed with ReverTra Ace qPCR RT Kit (TOYOBO, Tokyo, Japan), and the cDNA was subjected to quantitative reverse transcription PCR with SYBR Premix Ex TaqII (Tli RNaseH Plus) (TaKaRa, Shiga, Japan) and LightCycler 480 System (Roche, Basel, Switzerland), according to the manufacturer's instructions. All assays were performed in three technical replicates for each biological replicate, and relative expression was normalized to 18S rRNA.

Immunoblot analysis
Cells were lysed, subjected to SDS-PAGE, and transferred to a PVDF membrane (Millipore, Bedford, MA, USA). The blot was incubated with the antibodies listed in Supplementary Table S5, and visualized by Immobilon Western Chemiluminescent HRP Substrate (Millipore).

Luciferase assay
Jurkat cells were collected 48 h after transfection with pX458 and pGL4.30 (luc2P/NFAT-RE/Hygro, Promega, Madison, WI, USA) vectors and assayed for NFAT luciferase activity using the Dual-Luciferase Reporter Assay System (Promega) and Wallac ARVO SX 1420 Multilabel Counter (PerkinElmer, Waltham, MA, USA). Firefly luciferase activity was normalized to Renilla luciferase activity (phRL-TK vector, Promega) in each sample, and is presented on a logarithmic scale relative to the activity in mock-transfected cells.

Cell proliferation assay
Five thousand cells transduced with indicated vectors were inoculated into 96-well culture plates, and their growth was monitored using Cell Counting Kit-8 (DOJINDO LABORATORIES, Kumamoto, Japan), according to the manufacturer's protocol.

Statistical analysis
Statistical analyses were performed with R 3.1.3 software (The R Foundation for Statistical Computing). Comparisons between groups were based on the Wilcoxon rank-sum test for continuous data with Bonferroni correction (if necessary), and Fisher's exact test with Benjamini-Hochberg correction (if necessary) for categorical data. For functional assays, normality of data distribution and homogeneity of variance were assessed by the Shapiro-Wilk test and F-test, respectively. Student's two-tailed t test was used to compare two groups, and Welch's correction was applied when comparing groups with unequal variance (F-test, P < 0.05). In box plots, the center line and lower and upper hinges correspond to the median and the first and third quartiles (25th and 75th percentiles), respectively. The upper and lower whiskers extend from the upper and lower hinges to the largest or smallest values, no further than 1.5× interquartile range from the hinges.
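The box-plot convention described above can be made concrete with a short sketch. It assumes quartiles computed by linear interpolation between order statistics (Python's "inclusive" method), which may differ slightly from the hinge definition used by a given plotting package; the function name and example values are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Median, hinges, whisker ends, and outliers for one group of values."""
    data = sorted(values)
    # Quartiles by linear interpolation (assumed quantile definition)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical data with one extreme value
stats = boxplot_stats([1, 2, 3, 4, 5, 100])
```

In this example the value 100 lies beyond 1.5× the interquartile range from the upper hinge, so the upper whisker stops at 5 and 100 is reported as an outlier.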

WES of patients with PTCL, NOS
To delineate the entire picture of genetic alterations in PTCL, NOS, we initially performed WES analysis of tumor and normal samples from 14 PTCL, NOS patients (Supplementary Fig. 1A, B), including seven TFH, three unclassifiable, and four non-TFH PTCL, NOS cases from our cohort. In total, we detected 1068 somatic mutations (1.5 mutations/Mb/sample), including 971 SNVs and 97 insertions and deletions (indels), as well as 42 SVs (Fig. 1a; Supplementary Tables S6-S8). These mutations mainly consisted of age-related C > T transitions at CpG sites, followed by C > A substitutions at the CpCpT context, whose etiology remains unknown (Fig. 1b). Approximately half of the patients exhibited a low overall mutation frequency (<0.5 mutations/Mb), whereas four samples showed a moderate-to-high mutation rate (2-10 mutations/Mb) (Fig. 1a). Additional targeted sequencing not only validated the somatic mutations detected by WES but also captured multiple previously reported mutations with low allele frequencies, such as those involving TET2 and RHOA (colored in red in Fig. 1c) [7][8][9][10][11], suggesting that some driver mutations were overlooked by WES analysis. In addition to TFH-related mutations, the observed alterations included recurrent mutations and deletions of TP53 (n = 7), a well-known tumor suppressor gene [40], which were associated with a higher tumor mutation burden (Fig. 1d). Moreover, a number of mutations frequently observed in other subtypes of lymphomas, such as those in CDKN2A, VAV1, and TBL1XR1 [5,17,19,41], were also detected (Fig. 1a; Supplementary Fig. S2A, B). Unexpectedly, 18 cases had a substantial number of sequencing reads mapped to the HTLV-1 proviral genome (Supplementary Fig. S2C). After excluding these cases, which were considered to have ATL, we analyzed the remaining 124 cases and identified 438 non-silent somatic mutations (333 SNVs and 105 indels), with a median of three per sample (range 0-13) (Supplementary Table S9).
These included numerous driver mutations which did not appear to be readily identifiable by WES due to low allele frequency (Supplementary Fig. S2D), suggesting that deep-targeted sequencing would be required to delineate the entire landscape of driver alterations in PTCL, NOS. When the results from targeted-capture sequencing and WES were combined, a total of 41 genes were found to be recurrently mutated (in ≥3 cases), of which 12 (such as TET2, TP53, RHOA, and DNMT3A) were affected in more than 5% of a total of 133 cases (Supplementary Fig. S2E). Copy-number analysis based on the sequencing method identified 222 focal CNAs in 41 recurrently altered genes (6 amplified and 35 deleted genes) (Supplementary Fig. S2F; Supplementary Table S10). Among them, ten (such as TP53, CDKN2A, CD28, HLA-B, and IKZF2) were affected in more than 5% of the cases, some of which showed high-level amplifications or homozygous deletions (Supplementary Figs. S2F, S3A, B). In addition, 251 arm-level CNAs were detected in 14 significantly altered chromosome arms (seven gains and seven losses) (Supplementary Fig. S3C). SNP array karyotyping was also performed for 24 samples, in which 25 out of 27 focal CNAs (93%) and 136 of 141 arm-level CNAs (96%) were confirmed (Supplementary Fig. S3D; Supplementary Table S10).

TP53/CDKN2A-altered PTCL, NOS shows marked chromosome instability
Consistent with these findings, hierarchical clustering of recurrent somatic alterations revealed three molecular subtypes with discrete genetic features: those with TFH-related alterations (TET2, RHOA G17V, and IDH2) (group 1), those with TP53/CDKN2A alterations (group 2), and those lacking any of the above alterations (group 3) (Fig. 2). While group 1 shows similar immunophenotype and genetic alterations to TFH PTCL, NOS [5], group 2 is thought to represent a novel molecular subtype in PTCL, NOS (Fig. 5a). As revealed by genome-wide copy-number profiling, almost all group 2 cases exhibited extensive chromosomal abnormalities, which were rarely seen in other groups (Fig. 5b), pointing to a discrete genetic feature of group 2 tumors. This difference was quantitatively substantiated by a higher number of abnormal genomic segments in TP53- or CDKN2A-altered cases than in those harboring TFH-related or other alterations (Fig. 5c; Supplementary Fig. S5A, B). Although group 3 showed a lower number of genetic alterations, ATM mutations and deletions were detected in a subset of group 3, some of which had extensive CNAs, similar to group 2 cases (Figs. 2b, 5b; Supplementary Fig. S5A, B). Given that ATM regulates the ARF-TP53 tumor suppressor pathway in response to DNA damage [46] (Fig. 3), ATM-altered tumors may exploit a shared oncogenic mechanism with group 2, which is characterized by tumor suppressor inactivation. GSEA analysis with RNA-seq data from 16 fresh frozen tumor tissues using curated gene sets showed that AITL-related genes were the second most enriched signature in group 1, confirming the validity and reliability of expression analysis (Fig. 5d; Supplementary Table S12). Among hallmark gene sets, genes associated with stromal response and inflammation were enriched in group 1, whereas cell cycle-related genes were overrepresented in group 2 (Fig. 5d; Supplementary Table S12), suggesting that differences of genetic features among molecular subtypes are reflected in gene-expression profiles. Although it has been reported that PTCL, NOS can be classified into two subgroups by GATA3 and TBX21 expressions [47], the expression levels of these genes measured by RNA-seq, nCounter analysis, or IHC were similar among subtypes (Fig. 5e; Supplementary Fig. S5C-F).
Figure caption (fragment): ... shown with their odds ratios (OR). Orange colors depict gene pairs that are co-mutated more than expected by chance, and blue colors depict mutually exclusive gene pairs. c Two molecular subtypes [group 1 (TFH-related) and group 2 (TP53/CDKN2A)] and their major determinants in PTCL, NOS. Orange and blue lines represent co-occurring and mutually exclusive associations, respectively. The line width is proportional to the statistical significance (q value) of the association. Genes showing at least two significant associations in (b) are shown.
Histologically, group 2 showed a higher tumor content than group 1 (Fig. 6a), which is consistent with gene-expression profiling data showing strong immune or stromal cell-related signatures in group 1. With regard to clinical outcome, group 2 showed the worst survival, followed by group 1, whereas group 3 had an excellent outcome (Fig. 6b), suggesting that TP53/CDKN2A alterations and associated chromosomal instability confer an adverse prognostic impact. When compared with other PTCL subtypes, group 1, corresponding to TFH PTCL, NOS, showed a similar pattern of mutations and CNAs to AITL (Fig. 6c; Supplementary Fig. S6; Supplementary Tables S13-S21), consistent with previous reports [9][10][11]. By contrast, group 2 showed a unique profile of somatic alterations, although it shared a number of aberrations with other PTCL subtypes (Fig. 6c). These findings indicate that TP53/CDKN2A-altered cases have a molecular pathogenesis distinct from other PTCL subtypes, which may underlie their different clinical behavior.

Frequent genetic alterations associated with immune evasion in TP53/CDKN2A-altered PTCL, NOS
Recurrent alterations in different PTCL, NOS subtypes affected a number of discrete functional pathways. Among these, uniquely overrepresented in group 2 were the pathways involved in immune surveillance (Fig. 6d; Supplementary Fig. S7), which include the components of the class I major histocompatibility complex (MHC) (HLA-A and HLA-B), the transactivator of MHC class II (CIITA), immune checkpoints (CD274), and molecules engaged in cell adhesion (CD58) and death signaling (FAS) (Figs. 2, 3). In group 2, most of these genes were affected by loss-of-function alterations, particularly by focal deletions (Fig. 7a; Supplementary Fig. S8A, B), suggesting a possible link to genomic instability characteristic of this subgroup. Intriguingly, we identified recurrent loss-of-function mutations involving the PDCD1 gene (3% in the entire cohort), the gene encoding an inhibitory receptor, PD-1 (Fig. 7a). In addition to focal deletions found in ATL and other T-cell lymphomas, which were recently reported to induce T-cell malignancies in mice [48], frameshift and nonsense mutations of PDCD1 were observed in PTCL, NOS, suggesting that loss of PD-1 function takes place through multiple mechanisms.

Enrichment of somatic lesions in transcriptional and post-transcriptional regulators in TP53/CDKN2A-altered PTCL, NOS
Another significant finding was the enrichment of somatic alterations in transcriptional and post-transcriptional regulators in group 2 (Fig. 6d). These regulators included transcription factors (IKZF2, PRDM1, and ETV6), transcriptional co-repressors (TBL1XR1 and IRF2BP2), and RNA-binding proteins (DDX3X and YTHDF2) (Figs. 2, 3). IKZF2, also known as HELIOS, is one of the major regulators of T-cell development and was affected in 29% of group 2 tumors exclusively through SV/CNAs, such as intragenic deletions, duplications, and inversions, most likely leading to dominant-negative spliced variants [49] (Supplementary Figs. S7, S8C). IRF2BP2, which encodes an IRF2-dependent transcriptional co-repressor [50,51], was another common genetic target in group 2 tumors (Supplementary Fig. S7) and is also affected in other lymphoma types [19,52], in which frequent nonsense or frameshift mutations are thought to lead to loss of function of IRF2BP2 (Fig. 7a). To assess the functional consequence of IRF2BP2 mutations on T-cell lymphomagenesis, we evaluated the effect of IRF2BP2 disruption on cellular growth and the transcriptional activity of NFAT, a major downstream target of T-cell receptor (TCR) signaling, using CRISPR/Cas9-mediated gene editing (Fig. 7b-d). Although IRF2BP2 disruption did not affect cell proliferation, it caused an enhanced transcription from an NFAT response element in a human T-cell line (Jurkat), regardless of co-transfection with NFAT1 or CD3/CD28 stimulation, suggesting that loss-of-function alterations of IRF2BP2 lead to TCR signaling activation (Fig. 7e; Supplementary Fig. S8D). By contrast, recurrent deleterious deletions and mutations in YTHDF2 were detected in both groups 1 and 2 (8% of the entire cohort) (Fig. 7a; Supplementary Fig. S7). This gene encodes a reader protein that recognizes N6-methyladenosine, the most abundant internal modification in mammalian mRNA, and reduces the stability of target transcripts [53], which may suggest the functional importance of deregulated mRNA stability in the pathogenesis of T-cell lymphoma.
Figure caption (fragment): The heatmap shows somatic CNA segments (copy-number gains/amplifications, losses/deletions, and uniparental disomies (UPD) with ≥10 probes) in each sample (horizontal axis) plotted by chromosomal location (vertical axis). Samples are vertically aligned in the same order as in Fig. 2b. c The number of abnormal chromosomal segments identified in cases with indicated alterations, regardless of presence or absence of other alterations. Each dot represents a single case. **P < 0.005, ***P < 0.0005, Wilcoxon rank-sum test with Bonferroni correction. d Significantly enriched gene signatures for group 1 (left) and 2 (right) in GSEA analysis with RNA-seq data from 16 cases (four, six, and six cases in group 1, 2, and 3, respectively) using the hallmark (top) and C2 (bottom) gene sets. e Dot plots of normalized counts of GATA3 (left) and TBX21 (right) expressions measured by nCounter analysis in each group (10, 13, and 11 cases in group 1, 2, and 3, respectively). Wilcoxon rank-sum test.
Figure residue: panel "Other signaling pathway"; axis "Number of alterations" compared among groups 1, 2, and 3; gene labels TET2, RHOA, DNMT3A, IDH2, VAV1, CD28, YTHDF2, NOTCH1, TP53, CDKN2A, HLA. Caption fragment: ... [19], ALCL (n = 23) [21], and ENKTL (n = 25) [22]. Diagonal lines represent no data available.

Other commonly affected pathways and molecules in TP53/CDKN2A-altered PTCL, NOS
Signal transduction molecules were also common mutational targets in group 2 tumors, including NOTCH1 and SOCS1 (Fig. 6d; Supplementary Fig. S7). Although activating mutations in genes related to TCR signaling are reported in TFH-cell-derived lymphomas [11], in our cohort, more than two-thirds of both groups 1 and 2 cases harbored somatic changes in the components of TCR-NF-κB signaling and their downstream pathways (Fig. 6d). However, the spectrum of target genes substantially differed between group 1 and 2 tumors. In group 1, RHOA mutations represented by far the most predominant alterations. By contrast, the alterations in group 2 involved a broader spectrum of genes than those in group 1, such as CD28, PLCG1, CARD11, TNFAIP3, and PTPRC, which were frequently affected by focal CNAs, including high-level amplifications or homozygous deletions, rather than missense mutations (Supplementary Figs. S3B, S7). In addition to TFH-related mutations, such as those affecting TET2, IDH2, and DNMT3A, recurrent mutations and CNAs/SVs were also present in a variety of epigenetic regulators, including histone modifiers (KMT2C (MLL3), KMT2D (MLL2), SETD1B, SETD2, and CREBBP) and SWI/SNF-mediated chromatin remodelers (ARID1A and ARID2), in our entire cohort (Figs. 2, 3). Among these, two histone 3 lysine 4 methyltransferases, KMT2C and SETD1B, were frequently inactivated by loss-of-function mutations or focal deletions in group 2 (Supplementary Figs. S7, S8E). The remaining group of molecules affected in PTCL, NOS were G protein-coupled receptors involved in T-cell trafficking, such as CCR4 and CCR7, which are also commonly mutated in other T-cell neoplasms (Figs. 2, 3).

Discussion
Through extensive genetic analyses using high-throughput sequencing, we have delineated a comprehensive registry of genetic alterations in PTCL, NOS. It includes not only known mutational targets in PTCL, NOS and other lymphoma subtypes but also novel recurrently altered genes previously unreported in this tumor type, such as KMT2C, SETD1B, YTHDF2, and PDCD1. As expected from a highly variable clinical presentation and prognosis, as well as pathological findings, PTCL, NOS is shown to be a heterogeneous entity in terms of genetic profile [3]. However, it should be underscored that PTCL, NOS does not represent a mere wastebasket category, but comprises several discrete subtypes of mature T-cell neoplasms on the basis of unique genetic profiles.
Group 1 tumors, characterized by TFH-related mutations, such as TET2, RHOA G17V, and IDH2 mutations, correspond to a provisional entity of TFH PTCL, NOS, according to the revised WHO classification [5]. These tumors also exhibit a variety of somatic alterations at low frequencies, such as those in VAV1, CD28, and YTHDF2, most of which are shared by other PTCL, NOS subtypes, suggesting overlapping mechanisms of lymphomagenesis. Group 2 tumors are a previously unrecognized molecular subtype, which harbors frequent TP53 and/or CDKN2A alterations. This subtype shows unique genetic features characterized by an increased burden of CNAs, which preferentially target molecules involved in immune surveillance and transcriptional regulation, including HLA-A/B and IKZF2. The high prevalence of TP53/CDKN2A alterations demonstrates the biological relevance of tumor suppressor inactivation and resultant genomic instability during T-cell lymphomagenesis, which is supported by the fact that T-cell lymphoma is one of the most common malignancies observed in p53-deficient mice [40]. Except for ATM alterations in a subset, group 3 tumors lack a subtype-defining alteration, suggesting the necessity for further molecular investigation in this subtype.
Many efforts have been undertaken to further molecularly characterize and subdivide the heterogeneous group of tumors classified in this category. Microarray analysis of gene expression identified a biologically distinct entity showing a proliferation signature associated with a shorter survival [54]. More recently, large-scale gene-expression profiling enabled the characterization of two different molecular subgroups related to high expression of either TBX21 or GATA3 [47]. However, the molecular categorization of PTCL, NOS still remains controversial due to the inadequate understanding of the genetic landscape of the tumor. Therefore, the identification of the TP53/CDKN2A-altered molecular subtype with distinct genetic features and prognosis can offer a clue to understanding the genetic heterogeneity of PTCL, NOS and provide novel insights into its molecular classification and patient stratification, hopefully leading to improved diagnostic and therapeutic strategies for this deadly disease.