Introduction

Genome-wide studies have recently elucidated the mutational landscape of chronic lymphocytic leukemia (CLL), which is characterized by a few genes mutated at moderate frequency and a larger number altered in less than 5% of the cases.1, 2, 3, 4, 5, 6 The remarkable genomic plasticity of this disease has been further emphasized by the subclonal composition,4, 6 the identification of convergent mutational evolution in a few patients7, 8 and the different patterns of clonal diversification upon disease progression.9, 10, 11, 12 This inter- and intra-tumor mutational diversity may be a relevant cause of the heterogeneous clinical outcome of these patients.

Different individual mutated genes have demonstrated their prognostic value13, 14, 15, 16, 17, 18, 19, 20, 21 and some models integrating gene mutations and chromosomal alterations have been proposed.16, 22, 23 However, the results are still controversial, probably due in part to the complex mutational composition of the tumors and the possible interactions between mutated genes and chromosomal alterations, which may not be well captured in studies of a limited number of genes and samples.5, 6 The global perspective of the whole-genome/exome sequencing studies has provided new insights into the influence of genomic complexity on the evolution of the disease. These studies uncovered that both the subclonal composition4, 6 and the mutational complexity of the tumors, characterized by the accumulation of driver alterations,5 impair the prognosis of the patients. Moreover, initial studies using high-coverage next-generation sequencing (NGS) have revealed the prognostic impact of mutations present at very low allelic frequency.24, 25, 26, 27 Together, these studies suggest that understanding the heterogeneous evolution of CLL may require the integration of the subclonal architecture and mutational complexity of the tumors. Therefore, the aims of this study were to define the deep mutational architecture of the most frequently altered driver genes in CLL, and to determine its relevance in the progression of the disease.

Materials and methods

Patients and samples

We studied 406 previously untreated CLL patients (Table 1). Tumor cells were purified from fresh or cryopreserved mononuclear cells.1 The median final fraction of tumor content was 98% (85% of the samples had >90%) as determined by flow cytometry. DNA was also extracted from purified normal blood cells from the same patients (purity >97%, median 99.8%). In 48 patients, longitudinal samples obtained at different time points of the disease were also examined (Supplementary Table S1). Informed consent was obtained from all patients according to the International Cancer Genome Consortium (ICGC) guidelines.28 This study was approved by the Hospital Clínic of Barcelona Ethics Committee.

Table 1 Patients' baseline characteristics at the time of sampling

Copy number analysis

Copy number alterations (CNA) were investigated using Genome-wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA) in 376 cases and 44 sequential samples (Supplementary Methods).5 The proportion of tumor cells (or cancer cell fraction, CCF) carrying each CNA was estimated from the SNP array data (Supplementary Methods and Supplementary Figure S1). CNA were considered clonal if their CCF was ≥85% and subclonal otherwise.6 CNA drivers were previously described.5

Sequencing approach

The mutational status of 28 CLL driver genes was examined using a deep-targeted NGS strategy in the 406 patients and 48 sequential samples. The genes were TP53, SF3B1, BIRC3, NOTCH1 and ATM, recently analyzed for the same cohort,26 and 23 additional genes (POT1, NFKBIE, ZNF292, XPO1, EGR2, FBXW7, MGA, KLHL6, RPS15, MYD88, DDX3X, BRAF, NXF1, DTX1, BCOR, CCND2, KRAS, IRF4, MED12, ZMYM3, NRAS, TRAF3 and PIM1), which were selected among the most frequently mutated in prior whole-genome/exome sequencing studies (Supplementary Table S2).5, 6 Deep-targeted NGS libraries were prepared using the Access Array system (Fluidigm, South San Francisco, CA, USA) (Supplementary Table S3) and/or the Nextera XT DNA library preparation kit (Illumina, San Diego, CA, USA) (Supplementary Table S4) before sequencing on a MiSeq instrument (Illumina) (Supplementary Methods).

Mutational analysis

A mean coverage >1500 × was obtained for nearly all targeted regions (Supplementary Table S1). A previously validated bioinformatic pipeline26 allowed the detection of mutations down to 0.3% of variant allele frequency (VAF) (Supplementary Methods and Supplementary Figure S2). Synonymous variants and known polymorphisms (dbSNP142, 1000 Genomes Project, custom CLL database5) were automatically removed (Supplementary Methods). Variants were considered somatic if they were truncating or identified as somatic mutations in COSMIC (v72) or in our custom CLL database.5 Variants not fulfilling the previous criteria were investigated in the germ line DNA of the patients by NGS, Sanger sequencing or allele-specific (AS)-PCR (Supplementary Tables S5 and S6 and Supplementary Methods). Overall, only somatic and/or truncating mutations were considered. All mutations reported at low VAF (<12%) were verified by AS-PCR and/or a second independent round of NGS (Supplementary Methods).
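The filtering rules described above can be read as a small decision procedure. The following Python sketch is only an illustration of that reading, not the validated pipeline itself: the dictionary keys (`synonymous`, `known_polymorphism`, `truncating`, `in_cosmic`, `in_cll_somatic_db`) and both function names are hypothetical, and the annotation step that would populate them is not shown.

```python
def triage_variant(variant):
    """Triage one annotated variant following the rules stated in the text.

    `variant` is a dict of booleans; all key names are hypothetical.
    Returns "discard", "somatic" or "check_germline".
    """
    # Synonymous variants and known polymorphisms are removed first.
    if variant["synonymous"] or variant["known_polymorphism"]:
        return "discard"
    # Truncating variants, or variants already catalogued as somatic
    # (COSMIC or the custom CLL database), are accepted as somatic.
    if variant["truncating"] or variant["in_cosmic"] or variant["in_cll_somatic_db"]:
        return "somatic"
    # Everything else must be tested in the patient's germline DNA.
    return "check_germline"


def needs_verification(vaf, low_vaf_threshold=0.12):
    """Mutations reported at low VAF (<12%) were verified by a second method."""
    return vaf < low_vaf_threshold
```

For example, a missense variant absent from every database would be returned as `"check_germline"`, and a mutation reported at 5% VAF would be flagged for verification by AS-PCR or a second round of NGS.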

Estimation of the CCF of the mutations

The CCF carrying each specific mutation was calculated as follows: $\mathrm{CCF}_{\mathrm{mut}} = \left(\left((q-2)\,\mathrm{CCF}_{\mathrm{CNA}} + 2\right)\mathrm{VAF}_{\mathrm{mut}}\right)/p$, where $q$ is the copy number of the locus in the sample, $\mathrm{CCF}_{\mathrm{CNA}}$ the CCF of the copy number alteration (0 to 1), $\mathrm{VAF}_{\mathrm{mut}}$ the VAF of the mutation, and $p$ the tumor purity of the sample (0 to 1). As for the CNA, mutations were classified as clonal or subclonal if their CCF was ≥85% or <85%, respectively. Given that mutations with very low CCF were frequently identified in this study, subclonal mutations were further classified as subclonal with high or low CCF (hereafter referred to as ‘subclonal-high’ and ‘subclonal-low’) using 25% as cutoff. This cutoff value, which corresponds to a VAF of 12.5% for a diploid locus in a pure sample, represents the common detection threshold of mutations by Sanger sequencing.24, 26
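As a worked illustration of the formula and cutoffs above, the following Python sketch computes the CCF of a mutation and assigns it to one of the three categories. The function and parameter names are illustrative only. Two choices are assumptions not stated in the text: the result is capped at 1.0, and a CCF of exactly 25% is counted as subclonal-high.

```python
def mutation_ccf(vaf, purity, copy_number=2, ccf_cna=0.0):
    """CCF_mut = (((q - 2) * CCF_CNA + 2) * VAF_mut) / p.

    vaf, purity and ccf_cna are fractions between 0 and 1;
    copy_number is q, the copy number of the locus.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    ccf = (((copy_number - 2) * ccf_cna + 2) * vaf) / purity
    # Assumption: cap at 1.0, since a fraction of cells cannot exceed 100%.
    return min(ccf, 1.0)


def classify_ccf(ccf, clonal_cutoff=0.85, low_cutoff=0.25):
    """Clonal if CCF >= 85%; otherwise subclonal-high or subclonal-low."""
    if ccf >= clonal_cutoff:
        return "clonal"
    # Assumption: the 25% boundary itself is treated as subclonal-high.
    if ccf >= low_cutoff:
        return "subclonal-high"
    return "subclonal-low"
```

With a diploid locus and a pure sample, a VAF of 12.5% gives `mutation_ccf(0.125, 1.0) == 0.25`, which reproduces the stated correspondence between the 25% CCF cutoff and the Sanger detection threshold.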

Inferring the temporal acquisition of alterations

First, we measured the variability in the estimation of the CNA and mutation CCFs due to the SNP array and sequencing methodologies (Supplementary Methods and Supplementary Figures S1 and S3). Next, we tested for each alteration the enrichment of out-going edges (instances where the alteration was present at a higher CCF than other alterations in the same tumor) compared with in-going edges (the alteration was present at a lower CCF than other alterations) and classified them as early, late or intermediate (not powered to be classified as either early or late) events, as previously described.6 Temporal pairwise relationships were analyzed for each pair of alterations connected by at least five out-/in-going edges.6
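The edge-counting idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published procedure: a two-sided exact binomial test against 0.5 stands in for the enrichment test, the `min_diff` parameter is a hypothetical stand-in for the measured CCF variability, and all function names are invented for the example.

```python
from collections import Counter
from math import comb


def count_edges(tumors, min_diff=0.0):
    """Count out-going and in-going edges per alteration.

    `tumors` is a list of dicts mapping alteration name -> CCF (0 to 1).
    An edge a -> b is counted when CCF[a] exceeds CCF[b] by more than
    `min_diff` (hypothetical allowance for measurement variability).
    """
    out_edges, in_edges = Counter(), Counter()
    for ccfs in tumors:
        for a in ccfs:
            for b in ccfs:
                if a != b and ccfs[a] - ccfs[b] > min_diff:
                    out_edges[a] += 1
                    in_edges[b] += 1
    return out_edges, in_edges


def binom_two_sided(k, n):
    """Two-sided exact binomial P-value for k successes in n trials, p = 0.5."""
    if n == 0:
        return 1.0
    m = min(k, n - k)
    tail = sum(comb(n, i) for i in range(m + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def classify_timing(out_n, in_n, alpha=0.05):
    """Early if out-going edges are enriched, late if in-going, else intermediate."""
    if binom_two_sided(out_n, out_n + in_n) >= alpha:
        return "intermediate"
    return "early" if out_n > in_n else "late"
```

An alteration seen at a higher CCF than its partner in eight of eight tumors would be classified as early here, whereas three out-going versus two in-going edges would leave it intermediate for lack of power.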

Statistical methods

The prognostic impact was evaluated for time to first treatment (TTFT) and overall survival (OS) from the time of sampling. Deaths before any treatment were considered as competing events for the TTFT analysis. Gray’s test was used for comparing cumulative incidence curves of TTFT, while the log-rank test was used to compare Kaplan–Meier curves of OS. Variables that were significant in the univariate analyses were subsequently included in the multivariate analyses modeled using Fine-Gray and Cox regression models for TTFT and OS, respectively.29 Backward-stepwise elimination was used to identify variables with an independent prognostic value. No differences were observed for TTFT and OS when comparing the subset of samples collected within the first year after diagnosis vs samples obtained after the first year (Supplementary Figure S4). Consequently, all clinical analyses were performed using the whole series of patients.
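For readers less familiar with the Kaplan–Meier curves used for OS, the product-limit estimator can be sketched as below. This is a generic textbook illustration in Python, not the R code used in the study, and the function name is illustrative. It does not cover the competing-risk setting of TTFT, which requires cumulative incidence functions instead.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= leaving
    return curve
```

Censored patients contribute to the risk set until their last follow-up but never trigger a step in the curve, which is why the estimate drops only at observed event times.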

Associations between variables were assessed by Fisher’s exact test, Student’s t-test, Wilcoxon rank-sum test or Spearman's rank correlation coefficient, as appropriate. Kolmogorov–Smirnov test was used to test for uniform distribution of the mutated CCFs. Maximally selected rank statistics30 was applied to find thresholds for continuous variables with good prediction of clinical outcome (maxstat R package). P-values were adjusted using the Benjamini–Hochberg correction (Q-value). P-values <0.05 were considered significant. All calculations were performed using R (v3.2.4).31
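The Benjamini–Hochberg adjustment mentioned above can be sketched as follows. This is a generic implementation of the standard step-up procedure written in Python for illustration; the study itself used R, and the function name is an assumption of this example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values (Q-values), in input order."""
    n = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        qvalues[idx] = running_min
    return qvalues
```

Each P-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller P-value never receives a larger Q-value than a bigger one.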

Results

Subclonal driver mutations are more common than clonal mutations in CLL

We identified 609 mutations in 258 (64%) patients, 149 clonal, 201 subclonal-high and 259 subclonal-low (Supplementary Table S7). Clonal mutations were seen in 28% of the patients, which were the sole alterations in 10% and associated with additional subclonal mutations in 18% (Figure 1a). Isolated subclonal mutations were found in 36% of the cases, 22% with high CCF (subclonal-high) and 14% with low CCF (subclonal-low) (Figure 1a). Remarkably, subclonal-low mutations were identified as a sole abnormality in nearly all studied genes (26/28, 93%) accounting for 6–63% of mutations per gene (median 38%) (Figure 1a). Most genes showed a uniform continuous spectrum of mutated CCFs with the exception of NOTCH1 and FBXW7 in which most mutations were either clonal or subclonal-low; NXF1 and DTX1 that were predominantly clonal; and TP53, NFKBIE and BIRC3 that were mostly subclonal (Figures 1a and b). Both clonal and subclonal (high or low) mutations were located in the same regions of the gene (Supplementary Figure S5), suggesting they confer a similar selective advantage to the cell. Similarly, no significant differences were seen in the clinico-biological features of the patients according to the clonal/subclonal distribution of the mutations (Supplementary Figure S6). Since subclonal-low mutations were frequent, the mutation rates observed for most genes were significantly higher than in previous whole-genome/exome sequencing studies of untreated CLL patients (Figure 1c).5, 6

Figure 1

Deep characterization of the mutational architecture of 28 CLL driver genes. (a) Pie chart of the proportion of cases grouped according to their mutational clonality in the entire cohort of 406 patients (top-right corner). Percentage of cases carrying subclonal-low, subclonal-high and clonal mutations in each gene. Only the mutation present at a higher CCF is represented in patients with multiple mutations affecting the same gene. Genes with a Q-value <0.1 in the Kolmogorov–Smirnov test applied to test for uniform distribution of the mutated CCFs are indicated. (b) Distribution of the CCF where each dot corresponds to the mutation of one patient. (c) Comparison of the mutational frequency of each gene identified in this study (deep NGS, blue) with the previously published data from the CLL-ICGC project5 (only CLL cases considered (n=428), orange), and DFCI/Broad series6 (only 123 pretreatment cases considered, yellow). Genes in which the mutational frequency observed in the different studies statistically differs are indicated.

Convergent mutational evolution is a common phenomenon in CLL driver genes

Convergent mutational evolution (CME), defined as the acquisition of more than one mutation in the same gene (ranging from 2 to 5), was identified in 19 (68%) of the 28 genes studied (Figure 2a). The number of cases with CME for a particular gene significantly correlated with its global mutational frequency (ρ=0.72, P<0.001). Overall, CME was observed in 66/406 (16%) of patients, accounting for 26% (66/258) of all mutated cases. Of note, multiple CME events affecting different genes within the same tumor were found in eight cases. Patients with CME had a trend towards a higher number of mutations in other genes (mean 2.23 vs 1.90, P=0.072), but not CNA (mean 1.25 vs 1.14, P=0.501), compared with mutated cases without CME (Supplementary Figure S7). The presence of CME was similarly observed across patients with mutated or unmutated IGHV and was not associated with age, Binet stage or clinical outcome (Supplementary Table S8 and Supplementary Figure S7).

Figure 2

Convergent mutational evolution (CME) in CLL driver genes. (a) Bar plots of the percentage of mutated cases carrying one or more mutations in each gene. The Ph on the KLHL6 bar denotes that multiple mutations in this gene are mainly identified in the same allele (that is, phased events). (b) Graphical representation of ATM (CLL296) and BCOR (CLL385 and CLL335) mutations identified in three different patients. The CCF of each mutation is represented as a dot and the intervals show the sequencing variability. Histogram (bottom) shows the number of CME events with mutations at similar CCFs, different CCFs or both. (c) Patterns of CME in the longitudinal analysis. Representation of two cases in which mutations conferring CME for NOTCH1 and TP53, respectively, are acquired at different time points (left), and two cases with stable CME (that is, similar CCF of the mutations in the two samples analyzed) for SF3B1, and NOTCH1 and ATM, respectively (right).

A more detailed analysis of the CCFs of the mutations involved in CME showed that mutations had similar CCFs in 50 CME events, suggesting that they could represent either biallelic events or true CME (Figure 2b). Our methodology could not completely distinguish these two situations. A phasing analysis showed that mutations were mostly found in independent alleles, with the exception of KLHL6 in which virtually all mutations were present in the same allele (data not shown). On the other hand, different CCFs were observed in 24 CME events (6 of them carrying a mix of similar and different CCFs), suggesting that these mutations represented true CME events.

Besides, putative CME was identified in 10 of the 48 cases with sequential samples (Supplementary Tables S1 and S9). In six cases the CME were stable in both samples, whereas in four cases new CME were observed in the second sample, confirming that these mutations may be acquired at different moments of the disease (Figure 2c).

Subclonal architecture and evolutionary pathways in CLL

CNA were identified in 295/376 (78%) cases (range 1–26, median 2) (Supplementary Table S10), with no correlation between the CNA and the number of mutations of the tumors (ρ=0.18). Clonal driver CNA (79%) were more frequently found than subclonal (21%). Thus, 59% of the cases carried clonal driver CNA whereas only 8% of the patients harbored isolated subclonal alterations (Supplementary Figure S8).

Combining mutations and CNA, 86% (350/406) of patients carried at least one driver alteration (range: 1–8, median: 2), which was clonal in 66% (267/406) of cases (Figure 3a). Both clonal and subclonal driver alterations were detected in 41% (166/406) of patients whereas isolated subclonal driver alterations were seen in 19% (76/406) of cases (Figure 3a).

Figure 3

CLL architecture and temporal acquisition of driver alterations. (a) Graphical representation of the mutational and CNA status of the 406 untreated CLL cases studied. Cases are sorted based on their clonality as shown by the outer and innermost layers. The outer bar plot represents the number of genes mutated with clonal and subclonal, only clonal, subclonal-high and subclonal-low mutations for each case. The following inner layer represents the total number of driver alterations per case and the IGHV mutations. In the innermost layers, the basic genetic alterations and the total number of driver CNA are shown. (b) Representation of CLL driver alterations according to their classification as early, late or intermediate events. Temporal relationships between specific pairs of alterations are represented by arrows. (c) Evolutionary patterns observed in the longitudinal analysis regarding the driver CNA and gene mutations. Evol, evolution; mut, mutations (top). Mutations acquired during the course of the disease before or after treatment (bottom). The P-value of the Wilcoxon test applied to compare the number of mutations acquired in genes predicted as late events vs intermediate is shown.

The deep sequencing with detailed information on the spectrum of clonal–subclonal distribution of the mutations together with CNA provided a thorough framework to refine their temporal relationship and hierarchical acquisition in CLL. The distribution of CCFs of the CNA and mutations suggested a scenario in which driver CNA are acquired earlier (mostly found clonal), whereas gene mutations may be acquired at any time during CLL evolution (found clonal and subclonal, indistinctly) (Supplementary Figure S9). To provide a detailed estimation of the temporal acquisition of individual alterations, we performed a specific statistical analysis which confirmed that CNA, particularly tri(12), del(13q), del(11q) and del(17p), but also other less recurrent CNA, are usually earlier events (Figure 3b and Supplementary Table S11). On the other hand, gene mutations were either late (NOTCH1, BIRC3, TP53, ZNF292, NFKBIE) or intermediate (ATM, POT1, SF3B1, RPS15, among others), supporting the idea that most mutations may be acquired at any time in the evolution of CLL and frequently later than CNA (Figure 3b). The low mutational rate of CLL driver alterations as well as the much lower sensitivity for detecting subclonal CNA are limitations of this analysis. To overcome these limitations, we repeated the analysis considering only mutations identified above the detection sensitivity for CNA (CCF ≥25%). This analysis confirmed CNA as early or intermediate events (only two CNA were classified as potentially late), while all gene mutations were classified as late or intermediate (Supplementary Table S12).

In this temporal study, tri(12), del(17p) and del(11q) were initial hits preceding the acquisition of NOTCH1, TP53 and ATM mutations, respectively (Figure 3b, Supplementary Table S13 and Supplementary Figure S10). The only hierarchic relationship in individual gene mutations was found between RPS15 and NFKBIE. Regarding the temporary acquisition of alterations in specific pathways, only mutations in the NF-κB pathway could be defined as a later event than other driver mutations (Figure 3b and Supplementary Table S14).

To confirm this model we analyzed both the CNA and mutational profile in 44 sequential samples (Supplementary Table S1). Most cases (39 cases, 89%) had a stable CNA profile in both samples. However, mutational evolution, considered as the expansion of a preexisting mutated subclone or acquisition of new mutations, was seen in 17 of them, 5 post-treatment and 12 previous to any treatment (Figure 3c). In three (7%) cases there was a concomitant evolution of CNA and mutations corresponding to del(17p) and TP53 mutations. Only two (5%) cases had stable mutations and evolution of CNA (increase of tri(12) and a heterozygous del(13q) becoming homozygous, respectively). Moreover, acquisition of mutations in subsequent pretreatment samples was mainly observed in genes predicted as late events in the previous analysis (NOTCH1, NFKBIE, TP53, and BIRC3) compared with intermediate events (P=0.002; Figure 3c). Appearance of new mutations in post-treatment samples was only observed in TP53 (n=5) and BIRC3 (n=1). All these analyses together suggest that different CNA are the main initial events in CLL followed by an increasing number of somatic mutations, which are mostly acquired without any particular order among them (Supplementary Figure S9).

The clinical relevance of mutated genes is related to their CCF

To determine whether the CCF of the mutations may influence the outcome of the patients, we analyzed those genes mutated in more than 10 cases with an algorithm that integrates maximally selected rank statistics together with univariate continuous and categorical analyses (Supplementary Methods and Supplementary Figure S11). We identified three gene-specific CCF patterns influencing prognosis (TTFT and/or OS): (1) CCF-independent pattern: the mere detection of the mutation, even at very low CCF, had prognostic impact (NFKBIE, BRAF, MGA, DDX3X, XPO1 and POT1 for TTFT; TP53 and SF3B1 for OS) (Figure 4a and Supplementary Figure S12); (2) CCF-gradual pattern: the prognostic impact was related to the CCF of the mutated gene as a continuous variable (RPS15, ATM, NOTCH1 and SF3B1 for TTFT) (Figure 4a and Supplementary Figure S12); and (3) CCF-dominant pattern: the mutated gene had prognostic impact only when its CCF was above a certain threshold (FBXW7, and NOTCH1 for OS) (Figure 4a). A summary of the clinical impact and CCF-based pattern identified for each gene is shown in Supplementary Table S16.

Figure 4

CCF-based patterns with prognostic impact. (a) Time to first treatment (TTFT) or overall survival (OS) curves of some representative mutated genes that follow a CCF-independent (top), CCF-gradual (middle) or CCF-dominant pattern (bottom) with impact on the outcome of the patients. The cutoff obtained by maxstat is shown on the top of the curves included in the CCF-dominant pattern. P-values for all pairwise comparisons are shown inside the plot areas. P, P-values by Gray’s test (TTFT) or log-rank test (OS). (b) Heat map of the co-occurrence of the driver alterations identified in ≥10 cases and IGHV mutational status by representing the adjusted P-value (Q-value) of the Fisher’s exact test. Mutated genes with clinical impact in the univariate analysis are depicted in bold.

Several mutated genes with prognostic impact were significantly co-occurring in the same tumors (Figure 4b). To identify which of them had an independent value we performed a backward-stepwise regression analysis including the mutated genes with prognostic impact in the univariate analysis together with high-risk CNA (del(17p), del(11q)), IGHV mutational status and clinical parameters (gender, age, Binet stage). This analysis revealed that SF3B1, BRAF, ATM, NOTCH1 and MGA mutations had independent prognostic impact for TTFT, while mutations in FBXW7, NOTCH1, SF3B1 and TP53 had independent value for OS (Table 2).

Table 2 Mutated genes with independent prognostic value for TTFT and OS

Tumor architecture predicts CLL progression and outcome

Finally, we explored the prognostic value of the global tumor architecture compared with individual alterations and standard parameters of poor prognosis. First, we confirmed the unfavorable outcome of patients carrying subclonal driver alterations (mutations and/or CNA),4, 6 and the progressively worse effect on outcome of the accumulation of driver alterations (1 to 4) (Supplementary Figure S13).5 However, patients with subclonal populations harbored a higher number of driver alterations, suggesting that these variables may be correlated (Figure 5a). Therefore, we tested separately the number of driver alterations (0 to 4) in clonal tumors (that is, all aberrations were clonal), and the accumulation of clonal and subclonal driver alterations in cases that had at least one subclonal driver aberration (0 to 4) (number of driver alterations in subclonal tumors).

Figure 5

Role of the subclonal architecture and mutational complexity in CLL evolution. (a) Boxplots of the number of driver alterations in patients with or without a subclonal driver alteration (left). Boxplots dividing the group of patients with a subclonal driver present in three groups regarding their clonality: cases with clonal and subclonal, subclonal-high and only subclonal-low alterations (right). (b) Comparison of TTFT between patients carrying 0, 1, 2, 3 or 4 driver alterations in the subgroup of patients with subclonal (left) or clonal tumors (right). (c) Survival curves according to the number of driver alterations in the subgroup of patients carrying subclonal (left) or clonal tumors (right).

The number of drivers in both clonal and subclonal tumors gradually shortened the TTFT of the patients with a similar prognostic value (Figure 5b). A multivariate model including also other markers of poor prognosis (IGHV mutations, Binet stage, age, gender, and SF3B1, TP53 and ATM status) revealed that the number of driver alterations retained its independent prognostic value for TTFT (Table 3). These results suggest that the number of drivers, rather than their clonal/subclonal representation, is the main predictor for short TTFT.

Table 3 Independent prognostic value of the accumulation of driver alterations for TTFT

Regarding OS, the number of drivers in subclonal tumors, but not in clonal tumors, was steadily associated with a worse outcome (Figure 5c). However, a multivariate analysis showed that the prognostic value of this parameter was not independent of the age of the patients and the IGHV and TP53 status of the tumors (Supplementary Table S17).

Of note, these results were unaffected by the use of different CCF cutoffs (70–95%) for defining the category of clonal/subclonal alterations (data not shown). All these findings suggest that in untreated patients the accumulation of driver alterations influences the rapid need for treatment independently of the subclonal composition of the tumors and standard prognostic parameters. In contrast, the increasing subclonal diversity, rather than the simple accumulation of driver alterations, is associated with a shorter OS of the patients, although this is mainly explained by their age, IGHV and TP53 status.

Discussion

The highly sensitive NGS strategy used in this study identified frequent small subclonal mutations in virtually all genes, including MYD88 and RPS15, previously considered as early clonal events in CLL.4, 6, 19 These small subclonal mutations were undetected in previous WG/WE/Sanger sequencing studies. Consequently, the frequency of mutations for most of these drivers is higher than previously considered.5, 6 Intriguingly, isolated subclonal mutations were more common than clonal mutations (55 vs 45% of mutated cases), suggesting that these aberrations are not initiating events in most CLL cases. These results were concordant with the analysis of the temporal acquisition of genomic alterations which confirmed CNA as frequent early events6 usually followed by the acquisition of somatic mutations. The longitudinal analysis of 44 cases confirmed this model accentuating that CNA tend to be stable during the course of the disease, whereas gene mutations are continually acquired during CLL evolution, which may evolve until becoming the major clone, even without treatment pressure. Although particular CNA tend to precede the acquisition of specific mutations (for example, tri(12) and NOTCH1 mutations, del(11q) and ATM mutations, among others), we did not find a stringent hierarchical pattern that could define the temporal order of acquisition of mutated genes. This is in contrast to myelodysplastic syndromes, in which early gene mutations seem to dictate the future acquisition of certain alterations.32 Intriguingly, contrary to previous observations,6 our clonal analysis revealed that del(17p) may precede TP53 mutations in CLL, as we did not find any case with del(17p) at lower CCF than mutations. 
On the other hand, concomitant evolution of del(17p) and TP53 mutated subclones was observed in longitudinal samples confirming previous hypotheses.6 Of note, isolated del(17p) are rare compared with isolated TP53 mutations (3 vs 28 cases) emphasizing that just performing FISH analysis will underestimate the number of TP53 alterations (Supplementary Figure S10). The interpretation of these findings is complex but may suggest that at least some of the mutations may be dominant negative with less pressure to select deletions of normal alleles.33, 34

CME has been identified in occasional CLL cases, particularly involving TP53, SF3B1, NOTCH1, BIRC3 and DDX3X.7, 8, 26 Our study shows that CME is a common phenomenon in CLL (26% of mutated cases) and occurs in most studied genes (68%). The CCF analysis of the mutations conferring CME highlights the need to differentiate among true CME, biallelic or phased events. Our longitudinal analysis suggests that mutations conferring CME are acquired at different stages of the disease and may evolve to reach an interclonal equilibrium. The high incidence of CME together with the frequent detection of small mutated subclones reflects the plasticity of CLL and emphasizes the relevance of specific driver genes for the evolution of the disease.

Previous studies have shown the clinical impact of small mutated subclones of TP53 or NOTCH1.24, 25, 26, 27, 35 The current study extends these observations to other CLL driver genes (NFKBIE, BRAF, MGA, RPS15 or POT1, among others) and identifies that the prognostic impact of some of them may be related to the quantitative representation of the mutated subclone. The quantification of the mutational CCF of the tumors, once confirmed in independent cohorts, may need to be considered in the development of prognostic models based on the mutational profile of the tumors.16, 22

Different studies have identified several mutated genes with prognostic impact in CLL.1, 2, 5, 6, 13, 14, 15, 16, 17, 18, 19, 20, 21 We have observed that a number of them tend to occur simultaneously in the same tumors. Here, we have identified the mutated genes that independently shortened TTFT (SF3B1, BRAF, ATM, NOTCH1, MGA) and OS (FBXW7, NOTCH1, SF3B1, TP53). This information may be useful to design the panel of relevant genes for future studies. In addition to individual genes, recent genomic studies have suggested that understanding the whole tumor architecture, rather than individual driver alterations, may be crucial to assess the prognosis of the patients. These studies identified the cumulative number of alterations per tumor5 or the presence of driver subclones4, 6 as promising parameters to improve the evaluation of CLL outcome. Our results showed that the total number of driver alterations (mutational complexity), regardless of whether they were clonal or subclonal, steadily shortened the TTFT and this was independent of the IGHV, Binet stage, and ATM, TP53 and SF3B1 status. The number of driver alterations in our study includes both mutated genes and CNA. Therefore, this finding expands the prognostic value of the karyotype complexity observed in previous studies.20, 36, 37, 38, 39 This observation is similar to the impact of driver mutations in myelodysplastic syndromes, in which the outcome of the patients progressively deteriorated as the number of driver mutations increased.32

Conversely, our analysis reveals that the OS of the CLL patients seems more influenced by the subclonal diversity of the tumors than by the number of driver alterations. Thus, the increasing accumulation of driver alterations only shortened the OS in tumors with subclonal populations (subclonal tumors). In contrast, patients with clonal tumors, independently of their number of alterations, had an outcome similar to that of cases without driver alterations. However, the prognostic value of the subclonal diversity was not independent of age, IGHV and TP53 status. Our findings highlight the relevance of these factors in the survival of the patients treated with the available strategies. However, novel treatments are influencing the outcome of the patients and these parameters will need to be reevaluated in these new contexts.

In conclusion, we identified the relevance of the subclonal architecture and mutational complexity in the evolution of CLL. The progressive accumulation of driver alterations gradually shortened the TTFT independently of the clonal architecture, whereas the OS of the patients was influenced by the increasing diversity of the subclonal composition of the tumors, although this phenomenon seemed to be related to the IGHV and TP53 status. Our study has also identified relevant mutated genes that may orient the design of specific gene panels for future studies.

Accession number

The sequencing data have been deposited in the European Nucleotide Archive (ENA, accession number ERP020894).