Original Article


Comprehensive gene expression profiling and immunohistochemical studies support application of immunophenotypic algorithm for molecular subtype classification in diffuse large B-cell lymphoma: a report from the International DLBCL Rituximab-CHOP Consortium Program Study

Leukemia volume 26, pages 2103–2113 (2012)

  • A Corrigendum to this article was published on 09 April 2014


Gene expression profiling (GEP) has stratified diffuse large B-cell lymphoma (DLBCL) into molecular subgroups that correspond to different stages of lymphocyte development, namely germinal center B-cell like and activated B-cell like. This classification has prognostic significance, but GEP is expensive and not readily applicable to daily practice, which has led to the proposal of immunohistochemical algorithms as surrogates for GEP analysis. We assembled tissue microarrays from 475 de novo DLBCL patients who were treated with rituximab-CHOP chemotherapy. All cases were successfully profiled by GEP on formalin-fixed, paraffin-embedded tissue samples. Sections were stained with antibodies reactive with CD10, GCET1, FOXP1, MUM1 and BCL6, and cases were classified following a rationale of sequential steps of differentiation of B cells. Cutoffs for each marker were obtained using receiver-operating characteristic curves, obviating the need for any arbitrary method. An algorithm based on the expression of CD10, FOXP1 and BCL6 was developed that had a simpler structure than other recently proposed algorithms and 92.6% concordance with GEP. In multivariate analysis, both the International Prognostic Index and our proposed algorithm were significant independent predictors of progression-free and overall survival. In conclusion, this algorithm effectively predicts prognosis of DLBCL patients matching GEP subgroups in the era of rituximab therapy.


Diffuse large B-cell lymphoma (DLBCL), not otherwise specified, is classified by the World Health Organization (WHO) as a single-disease entity on the basis of morphological and clinical criteria.1 The standard therapy for patients with DLBCL is rituximab combined with cyclophosphamide, doxorubicin, vincristine and prednisone (R-CHOP), and this results in a long-term disease-free survival of 50%.2 Gene expression profiling (GEP), however, has identified molecularly and clinically distinct subgroups of disease within DLBCL, known as germinal center B-cell like (GCB) and activated B-cell like (ABC), and unclassified DLBCL.3, 4 The DLBCL gene-expression subgroups differ by the expression of more than 1000 genes, making them at this level as different as acute lymphoid and myeloid leukemias.5 The mechanisms of malignant transformation of the two subgroups involve distinct and specific pathways, with BCL2 rearrangement and C-REL amplification seen mostly in GCB–DLBCL and constitutive activation of the NF-κB pathway characterizing ABC-DLBCL.4, 6 These novel insights into the pathogenesis of the DLBCL subgroups are enabling the discovery of targets for investigational therapies.

The molecular distinction between the DLBCL subgroups is also important, because patients in these subgroups respond differently and have a different prognosis when treated with R-CHOP.7 Patients with GCB–DLBCL have a more favorable outcome than those with ABC-DLBCL, irrespective of the International Prognostic Index (IPI) score.8 However, because of expense, technical constraints and the need for intensive bioinformatic analysis, the use of GEP technology for routine clinical use is challenging. In an attempt to translate GEP classification into a manageable set of measurable proteins, several algorithms have been proposed in recent years based on immunohistochemical stains and tissue microarray (TMA) technique. The original algorithm, proposed by Hans et al.,9 is based on the expression of three proteins: neprilysin or common acute lymphocytic leukemia antigen (CD10), B-cell lymphoma 6 (BCL6) and multiple myeloma oncogene 1/interferon regulatory factor 4 (MUM1/IRF4) and can classify DLBCL patients into two categories (GCB and non-GCB) with differing prognoses. This algorithm, however, was created for use in patients treated with CHOP without rituximab, and in addition, it had low concordance with GEP analysis (71% for GCB, 88% for non-GCB). The prognostic relevance of the Hans algorithm led to inconsistent results in subsequent studies performed in patient groups treated with R-CHOP.7, 10, 11, 12, 13, 14, 15

Three studies reported new staining algorithms obtained from patients treated with R-CHOP,12, 15, 16 two of which combined results with GEP analysis.12, 16 In the first study, Choi et al.12 developed an algorithm based on the expression of five biomarkers, which had a high concordance with GEP (93%). Compared with the Hans algorithm, the Choi algorithm integrated the analysis of two new molecules: forkhead box protein P1 (FOXP1) and serpin A9/germinal center expressed transcript 1 (GCET1), that allowed a better discrimination between GCB and non-GCB–DLBCL. These two markers exhibited reliable staining17, 18, 19 and could further address different steps of B-cell maturation. In a second study, Meyer et al.16 reported an algorithm (called the ‘Tally’ algorithm) that had a high concordance with GEP and was also based on the expression of five markers: CD10, GCET1, FOXP1, MUM1 and rhombotin-2/LIM domain only 2 (LMO2). However, based on a recent study, when these algorithms were tested for concordance with GEP and prognostic power on an independent cohort of patients, they did not correlate well with GEP results and showed poor prognostic power.20

We used five specific markers (CD10, GCET1, FOXP1, MUM1 and BCL6) to describe consecutive stages of the differentiation of mature B cells through the GC. We constructed an effective algorithm, termed the Visco-Young algorithm, based on three of these markers, which distinguishes patients with GCB and ABC gene signatures with high concordance with GEP (92.6%). Our algorithm exhibits strong independent prognostic value that is almost equivalent to that of GEP analysis in a large cohort of DLBCL patients treated with R-CHOP.

Patients and methods


We studied 475 adult patients with de novo DLBCL diagnosed between January 2002 and October 2009, as part of the International DLBCL Rituximab-CHOP Consortium Program Study. Cases were selected on the basis of the available GEP results and clinical data. All cases were reviewed by a group of hematopathologists (all primary center pathologists, SMM, MAP, MBM, AT and KHY), and the diagnoses were confirmed on the basis of WHO classification criteria. The current study was reviewed and approved as being of minimal to no risk or as exempt by each of the participating center Institutional Review Boards, and the comprehensive collaborative study was approved by the Institutional Review Board (IRB) at The University of Texas MD Anderson Cancer Center in Houston, Texas. A Material Transfer Agreement (MTA) was established and approved by each of the participating centers joining this collaborative project for the International DLBCL Consortium Program.

TMA immunohistochemistry

Immunohistochemical staining was performed on all 475 cases. The hematoxylin-eosin stained slides from each tumor were reviewed, and representative areas with the highest percentage of tumor cells were selected for TMA construction. Immunohistochemical analysis was performed on 4-μm TMA sections using a streptavidin–biotin complex technique, and antibodies reactive against the following antigens were utilized: CD3, CD5, CD10, CD20, CD30, CD79a, CD138, ALK-1, BCL2, BCL6, FOXP1, GCET1, GCET2 and MUM1. The samples were analyzed independently by a group of six hematopathologists/pathologists in addition to each of the contributing center hematopathologists, and disagreements were resolved by joint review on a multiheaded microscope.

GEP analysis

RNA was extracted from 475 formalin-fixed, paraffin-embedded (FFPE) tissue samples using the HighPure Paraffin RNA Extraction Kit (Roche Applied Science, Indianapolis, IN, USA). In all cases, 50 ng of RNA was transcribed into cDNA, linearly amplified using the WT-Ovation FFPE System (Nugen) SPIA method,21 and biotin labeled using the FL-Ovation cDNA Biotin Module V2 (Nugen). For GeneChip hybridization, 5 μg of WT-Ovation amplified cDNA was applied to HG-U133 Plus 2.0 GeneChips (Affymetrix, Santa Clara, CA, USA) and hybridized overnight. GeneChips were washed, stained and scanned using the Fluidic Station 450 and GeneChip Scanner 3000 (Affymetrix) according to the manufacturer’s recommendations. For data analysis and classification, the microarray DQN (trimmed mean of differences of perfect match and mismatch intensities with quantile normalization22) signals were generated and normalized to the quantiles of a beta distribution with parameters p=1.2 and q=3. A Bayesian model23 was also utilized to determine the class probability. The classification model was built on the previously generated data set of 47 paired FFPE and fresh frozen samples, with confidence of 90–100% for both fresh frozen tissue and FFPE tissue.24 The methodology developed during this pilot study has been validated and shown to be applicable using the LLMPP data set in the Gene Expression Omnibus database (GSE10846), which contains 181 CHOP-treated and 233 R-CHOP-treated DLBCL patients with fresh frozen samples.25

Receiver-operating characteristic (ROC) curve analysis to assess discriminatory accuracy of each marker

The ROC curves allowed us to visualize the specificity and sensitivity of each marker (CD10, GCET1, FOXP1, MUM1 and BCL6) in assigning cases to GCB or ABC classification before further categorization.26 The performance of each marker could be quantified by the area under the ROC curve (Supplementary Figure 1). All cases were classified separately as GCB or non-GCB based on the cutoff scores from both data sets and the proposed three-marker algorithm. Except for eight cases (1.7%), which were classified as GCB according to the cutoff scores from set 2 but not from set 1 due to the lower cutoff for BCL6, all other 467 cases were completely matched between both groups (κ=0.978), demonstrating the validity and reliability of our model. Of these eight cases, six (all six being CD10-negative) were GCB and two ABC according to GEP, indicating that the lower (30%) cutoff score for BCL6 is more sensitive and useful to identify those, especially CD10-negative GCB cases.

Rationale for the structure of the algorithm

In designing the algorithm, we emphasized the importance of CD10 expression (step 1), which is usually part of the initial diagnostic staining panel for hematopathologists, and whose staining has shown the best concordance between laboratories in different studies.27 We then analyzed GCET1, FOXP1 and MUM1 expression19, 28 in this order (steps 2–4), following a rationale that is discussed below. Finally, we left to BCL6 a minor role in recognizing patients with GCB–DLBCL (step 5) because of the variability in the reliability of its staining.26 The five steps of the global algorithm are shown in Figure 1.

Figure 1

Stratification of 475 diffuse large B-cell lymphoma (DLBCL) patients using TMA immunohistochemistry: initial global algorithm. This algorithm illustrates our rationale for the sequential steps of differentiation of the B cells through the GC and is built on data from 475 patients. Values in parentheses indicate how patients were classified according to GEP analysis; uncl, unclassifiable cases. The first value in the parentheses indicates the number of cases where GEP and TMA coincided.

Cutoff establishment

We avoided cutoff values based on mean and median expression, as our protein marker expression had a non-Gaussian distribution (Table 1). Instead, by calculating the Youden index29 from our ROC curves, we identified the point on the curve corresponding to the maximum sensitivity and specificity for each marker to classify a DLBCL as either GCB or ABC type according to GEP analysis. The Youden index pointed to optimal cutoff scores of 35% for CD10, 33% for BCL6, 45% for GCET1, 75% for FOXP1 and 58% for MUM1. For CD10 and BCL6, the cutoffs were very close to 30%, which is the accepted cutoff for these two molecules.9 In order to avoid too many different cutoffs in the final algorithm, we compared the optimal cutoffs of GCET1 and FOXP1 to 60% and found no change in their sensitivity and specificity. Therefore, we modified the cutoff scores for both GCET1 and FOXP1 to 60%, thus maintaining the optimal cutoff for MUM1.
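As an illustration of the cutoff search described above, the following is a minimal sketch of how a Youden-index cutoff can be derived, assuming that each case has a percentage staining score and a GEP label, and that a case is called marker-positive when its score is at or above the cutoff. The function name `youden_cutoff` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float],
                  is_target: Sequence[bool]) -> Tuple[float, float]:
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    scores    : percentage of stained tumor cells per case (hypothetical input)
    is_target : True when GEP assigns the case to the subtype the marker
                is expected to identify (e.g. GCB for CD10)
    A case counts as marker-positive when score >= cutoff.
    """
    n_pos = sum(1 for t in is_target if t)
    n_neg = len(is_target) - n_pos
    best_cutoff, best_j = float("nan"), -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, t in zip(scores, is_target) if t and s >= cutoff)
        true_neg = sum(1 for s, t in zip(scores, is_target) if not t and s < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # strict '>' keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy data (illustrative only, not study data)
toy_scores = [0, 5, 10, 20, 40, 50, 70, 90]
toy_is_gcb = [False, False, False, True, False, True, True, True]
cutoff, j = youden_cutoff(toy_scores, toy_is_gcb)
```

On a real data set the candidate cutoffs would normally come from the ROC curve itself; the exhaustive scan over observed scores used here is equivalent for small samples.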

Table 1: Concordance between GEP analysis and 4 immunohistochemical algorithms in 431 patients who were classified by GEP either as GCB or ABC (excluding 44 unclassified patients)

Refining the global algorithm

The initial algorithm with the established cutoffs exhibited a straightforward concordance with GEP analysis (Figure 1). This concordance could be further improved by removing unnecessary passages or redundant decisional points. We removed all the subsequent steps for CD10+ patients and we eliminated step 4 (MUM1), obtaining a four-marker algorithm, which is shown in Figure 2a. Furthermore, after removing step 2 (GCET1) for CD10-negative patients, we obtained a three-marker algorithm, shown in Figure 2b. By simplifying the algorithm, we increased the number of concordant patients.

Figure 2

Stratification of 475 DLBCL patients using TMA immunohistochemistry: proposed three-marker and four-marker algorithms compared with the Choi and Hans algorithms. (a) The four-marker algorithm was developed from the initial global algorithm using four markers and correctly characterizes 92.8% of patients as either GCB- or ABC-DLBCL according to GEP analysis. (b) The three-marker algorithm represents a further simplification and correctly characterizes 92.6% of patients compared with GEP. (c) The Choi algorithm was developed with the same five markers as those in our initial global algorithm but with different cutoffs and sequences, and had a predictivity of 90.1% compared with GEP. (d) The Hans algorithm was based on the expression of three markers and had a predictivity of 87.3% compared with GEP. Values in the parentheses indicate how patients were classified according to GEP analysis; uncl, unclassifiable cases.
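The three-marker decision sequence can be sketched as a short function. This is a reading of the text rather than a reproduction of Figure 2b, which is not available here: it assumes that CD10-positive cases are called GCB, that CD10-negative, FOXP1-positive cases are called non-GCB, and that BCL6 decides the remaining cases, with positivity defined as 30% or more for CD10 and BCL6 and 60% or more for FOXP1. The function name and argument names are hypothetical.

```python
def classify_three_marker(cd10_pct: float, foxp1_pct: float, bcl6_pct: float) -> str:
    """Assign GCB / non-GCB from percentages of stained tumor cells.

    Branch outcomes are inferred from the article text (assumption),
    not copied from the published figure.
    """
    CD10_CUTOFF = 30.0   # "30% or more" for CD10
    FOXP1_CUTOFF = 60.0  # "60% or more" for FOXP1
    BCL6_CUTOFF = 30.0   # "30% or more" for BCL6

    if cd10_pct >= CD10_CUTOFF:
        return "GCB"          # step 1: CD10+ cases need no further markers
    if foxp1_pct >= FOXP1_CUTOFF:
        return "non-GCB"      # step 2: CD10-, FOXP1+ cases
    # step 3: BCL6 decides only cases negative for both CD10 and FOXP1
    return "GCB" if bcl6_pct >= BCL6_CUTOFF else "non-GCB"


# Illustrative calls with made-up staining percentages
example_a = classify_three_marker(cd10_pct=80, foxp1_pct=90, bcl6_pct=0)
example_b = classify_three_marker(cd10_pct=0, foxp1_pct=10, bcl6_pct=50)
```

Written this way, BCL6 is consulted only for cases negative for both CD10 and FOXP1, which matches the limited role the text assigns to that marker.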

Validation set

To test the efficacy of the new algorithm in predicting survival in an independent series of cases, we applied the algorithm to a second group of 574 archival DLBCL cases that were studied using TMAs in the same way as the first cohort but for which no GEP analysis was available. Of these, 237 patients had been treated with R-CHOP and 337 with CHOP without rituximab. The same selection criteria as those for the first cohort were applied to these patients. Clinical characteristics at presentation for the validation set were similar to the test set in terms of gender (female in 45%, P=0.37), lactate dehydrogenase (elevated in 34%, P=0.66), Ann Arbor stage (III–IV in 49%, P=0.28), presence of B symptoms (32%, P=0.77) and IPI (0–2 in 64%, P=0.69), but not age. Patients of the validation set were significantly younger than patients of the test set (median age 58 years, P=0.007).

Fluorescence in situ hybridization for MYC gene rearrangement

Fluorescence in situ hybridization was performed on paraffin-embedded tissue sections with a locus-specific identifier IGH/MYC/CEP 8 tri-color, dual fusion probe (05J75-001 from Vysis, Downers Grove, IL, USA) and, owing to the shortcomings of the former in identifying alternative (non-IGH) MYC rearrangement partners, a locus-specific identifier MYC dual-color, break-apart probe (BP, 05J91-001 from Vysis). Fluorescence in situ hybridization signals were scored with a Zeiss fluorescence microscope (Carl Zeiss, Dublin, CA, USA). Cases on the TMA were considered for evaluation if at least 200 tumor cell nuclei per core displayed positive signals. Abnormal fluorescence in situ hybridization signals were recorded as the percentage of cells showing an abnormality.

Response definitions and statistical analysis

Response assessment was standardized among different Institutions following the criteria based on CT-scan and bone marrow biopsy.30 Late deaths not related to the underlying lymphoma or its treatment were not considered treatment failures.30 The actuarial probability of progression-free survival (PFS) and overall survival (OS) was determined using the Kaplan–Meier method,31 and differences were compared using the log-rank test. A Cox proportional-hazards model was used for multivariate analysis.32 All variables with P<0.05 were considered to be statistically significant. The comparison of clinical and laboratory features at presentation was carried out with the χ2-test or the Spearman's rank correlation.
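For readers unfamiliar with the Kaplan–Meier method named above, the following is a minimal sketch of the product-limit estimator, assuming follow-up times in months and an event flag (1 for progression or death, 0 for censoring). It is an illustration only: the study's own analysis software is not described in the text, and the function name and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(time, survival probability)] at each observed event time.

    At every event time t the running estimate is multiplied by
    (1 - d_t / n_t), where d_t is the number of events at t and n_t the
    number of patients still at risk (follow-up >= t). Patients censored
    at t are counted as at risk at t.
    """
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    event_times = sorted({d for d, e in zip(durations, events) if e})
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Toy follow-up data in months (illustrative only, not study data)
toy_durations = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_durations, toy_events)
# Survival steps down to about 0.8, 0.6 and 0.3 at months 5, 8 and 12.
```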


Comparison between the new algorithms and GEP results

The 475 patients were classified into GCB (231, 49%), ABC (200, 42%) or unclassifiable (44, 9%) cases by GEP analysis, as shown in Figure 3. The three-marker algorithm (Figure 2b) exhibited a very similar concordance to GEP analysis compared with the four-marker algorithm (only one additional mismatch; see Table 1). Hence, this simplified version was adopted for subsequent analysis. According to the three-marker algorithm, 252 patients (53%) had a GCB phenotype and 223 (47%) had a non-GCB phenotype (Figure 2b). The 44 cases that were unclassifiable by GEP were assigned to the GCB (21) or the non-GCB (23) subgroups by the new algorithm. Our algorithm had a concordance with GEP results of 92.6% for the 431 patients classified by GEP as having either GCB or ABC disease, compared with 92.8% for the four-marker algorithm. The Choi and Hans algorithms could correctly assign 90.1% and 87.3% of the cases, respectively (Table 1). Concordance of the three-marker algorithm was 93.1% for GCB (16 mismatches out of 231 patients) and 92% for ABC (16 mismatches out of 200 patients), both of which compared favorably with the Hans and Choi algorithms (Table 1). The ‘Tally’ algorithm proposed by Meyer et al.16 was applied to 342 patients whose samples could be classified without the need for LMO2 staining, and its concordance with GEP was 90.1%. The concordance of our algorithm with the recently proposed simplified Hans* and Choi* algorithms by Meyer et al.16 was 86.3% and 81.2%, respectively.
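The concordance percentages quoted for the three-marker algorithm follow directly from the mismatch counts given in the text, as this short check shows. Only the counts (231 GCB and 200 ABC cases by GEP, 16 mismatches in each group) come from the article; the helper name is hypothetical.

```python
def concordance_pct(n_cases: int, n_mismatches: int) -> float:
    """Percentage of cases on which the algorithm and GEP agree."""
    return 100.0 * (n_cases - n_mismatches) / n_cases


gcb_pct = concordance_pct(231, 16)                 # 215 of 231 GCB cases agree
abc_pct = concordance_pct(200, 16)                 # 184 of 200 ABC cases agree
overall_pct = concordance_pct(231 + 200, 16 + 16)  # 399 of 431 classified cases

print(round(gcb_pct, 1), round(abc_pct, 1), round(overall_pct, 1))
# prints: 93.1 92.0 92.6
```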

Figure 3

Heat map of hierarchical clustering of GEP on 475 DLBCL patients. Cases stratified as ABC-DLBCL, on the left, all express the selected markers. Similarly, cases stratified as GCB–DLBCL, on the right, express hierarchically selected markers. Cases in the middle could not be stratified by GEP and were considered unclassifiable cases (UC).

Distribution and prognostic significance of the expression of each marker

With the Youden index, we established the positivity cutoffs of 30% or more for CD10 and BCL6 and 60% or more for GCET1, FOXP1 and MUM1. Expression above these cutoffs was observed for CD10 in 190 patients (40%), BCL6 in 375 (79%), GCET1 in 134 (28%), FOXP1 in 271 (57%) and MUM1 in 179 (38%). The distribution of the expression for each marker is shown in the histograms in Table 2. As the cutoffs determined with the Youden index were meant only to classify patients as having either GCB or non-GCB–DLBCL and were not intended for predicting survival, we divided the percentage of expression of each marker into percentiles in the 475 R-CHOP-treated patients. As shown in Table 2, CD10 expression was significantly associated with PFS when 0, 20 or 30% were used as cutoffs. BCL6 expression did not affect PFS. GCET1 and MUM1 expression were instead significantly associated with PFS at several cutoffs, with the Youden index and our chosen cutoffs being among the most predictive for PFS. The 60% cutoff for FOXP1 was the most predictive for PFS in this study group.

Table 2: Summary of biostatistical features of five immunohistochemical stains (CD10, GCET1, MUM1, FOXP1 and BCL6)

Clinical data and survival

Clinical characteristics at presentation for the 475 R-CHOP-treated patients with de novo DLBCL and stratified according to our proposed three-marker algorithm are shown in Table 3. Clinical variables were well balanced between GCB and non-GCB subgroups except for age, stage and IPI scores. Patients with the non-GCB phenotype were significantly older (median age, 65 vs 60 years) and had higher IPI scores (37% vs 22%; IPI 3–5) than patients with the GCB phenotype, as shown in Table 3.

Table 3: Clinical characteristics and their impact on survival of 475 DLBCL treated with R-CHOP, then stratified according to our three-marker algorithm as GCB or non-GCB.

Median follow-up was 42 months (range, 4–106 months). Overall, the 5-year OS and PFS were 62% and 60%, respectively (Figures 4a and 4b). Outcomes did not differ among patients treated at different institutions. As shown in Figure 4c, 5-year OS was significantly different when patients were stratified according to GEP subgroups (69±3% for GCB vs 53±5% for ABC vs 60±4% for unclassified cases; P=0.02 for GCB vs ABC). Similarly, the 5-year OS was significantly different when patients were stratified according to our three-marker algorithm (71±3% for GCB vs 51±5% for non-GCB; P=0.003, Figure 4d). In terms of PFS, Figures 4e and 4f show that both GEP (64±3% for GCB vs 46±5% for ABC vs 53±5% for unclassified; P=0.003 for GCB vs ABC) and our algorithm (64±4% for GCB vs 48±5% for non-GCB; P=0.002) can stratify patients into groups with significantly different 5-year PFS rates. As there were 44 unclassified cases by GEP and they would not be excluded in the clinical setting, the use of our algorithm allowed us to stratify this subset into two groups with distinct OS and PFS rates. These rates were nonetheless not significantly different, which we believe can be attributed to the small number of cases (Supplementary Figure 2). In terms of OS and PFS, the new algorithm compared favorably with both the Choi and Hans algorithms. According to the Choi algorithm, 5-year OS was 65±4% for GCB vs 54±5% for non-GCB (P=0.04) and 5-year PFS was 66±3% for GCB vs 53±5% for non-GCB (P=0.02). Using the Hans algorithm, 5-year OS was 64±4% for GCB vs 55±4% for non-GCB (P=0.06), and 5-year PFS was 67±4% vs 52±5% (P=0.02). The ‘Tally’ algorithm16 was significantly predictive for OS (P=0.009) and PFS (P=0.01).

Figure 4

OS and PFS analyses of R-CHOP-treated DLBCL patients when stratified by GEP and TMA immunohistochemistry algorithm. (a) OS curve for all patients. (b) PFS curve for all patients. (c, d) OS and PFS curves of 431 patients stratified by GEP results, excluding unclassifiable cases (44 of 475); unclassifiable cases are analyzed separately in Supplementary Figure 1. (e, f) OS and PFS curves of 475 patients divided into GCB and non-GCB according to our TMA algorithm.

In the validation set of 574 patients with available TMA data but without GEP analysis, we confirmed the reliability of our algorithm in predicting survival. In this independent set of patients, who were treated with either CHOP or R-CHOP, our algorithm could divide each treatment group into cohorts with significantly different PFS and OS rates (Figure 5). In the validation set, patients with GCB and non-GCB subtypes according to our algorithm did not differ significantly in terms of clinical characteristics at presentation, except for age, which was significantly higher for patients with the non-GCB subtype.

Figure 5

Validation set of OS and PFS analyses in 574 patients with DLBCL treated with cyclophosphamide, doxorubicin, vincristine, prednisone (CHOP) and R (Rituximab)-CHOP. OS and PFS analysis of 574 patients with available TMA but not GEP analysis, stratified according to our three-marker algorithm. (a) 337 CHOP-treated patients; (b) 237 R-CHOP treated patients.

Univariate and multivariate analyses

As shown in Table 3, univariate analysis for PFS and OS revealed that high IPI score, all the variables that compose the IPI (that is, stage, age, lactate dehydrogenase, performance status and number of extranodal sites) and failure to achieve CR were significantly associated with shorter PFS. In multivariate analysis using the Cox regression model, an IPI score 3–5 (hazard ratio, 0.59; 95% CI, 0.43–0.83; P=0.002), non-GCB origin (hazard ratio, 0.59; 95% CI, 0.43–0.81; P=0.001) and failure to achieve CR (hazard ratio, 0.15; 95% CI, 0.10–0.21; P=0.032) were independent adverse prognostic factors for PFS. Similar results were obtained in terms of OS, with IPI score 3–5 (hazard ratio, 0.53; 95% CI, 0.38–0.74; P=0.0002), non-GCB origin (hazard ratio, 0.56; 95% CI, 0.40–0.77; P=0.0004) and failure to achieve CR (hazard ratio, 0.14; 95% CI, 0.10–0.20; P<0.0001) as independent prognostic factors. Age, stage, lactate dehydrogenase, performance status and number of extranodal sites were not entered in the multivariate analysis because these variables are included in the IPI score. More interestingly, univariate and multivariate analyses were also performed in the 237 patients of the validation set treated with R-CHOP. As in the test set, both IPI score 3–5 (hazard ratio, 0.57; 95% CI, 0.37–0.89; P=0.01) and non-GCB origin (hazard ratio, 0.63; 95% CI, 0.42–0.96; P=0.03) were independent adverse prognostic factors for PFS.

Kaplan–Meier analyses showed that, according to our algorithm, IPI score (0–1 vs 2–3 vs 4–5) could divide patients with GCB and non-GCB subtypes into cohorts with significantly different PFS rates (Supplementary Figure 3). When we combined the IPI score and our algorithm, we identified a group of patients with a very favorable PFS (IPI 0–1 and GCB phenotype, 5-year PFS of 86±1%) and a patient group with an unfavorable PFS (IPI score 4–5 and non-GCB phenotype, 5-year PFS of 28±7%).

Twenty-three of 296 patients (8%) had MYC rearrangements. Patients with MYC breaks had a significantly inferior OS (P=0.03) and PFS (P=0.01) compared with patients without breaks (median and mean OS 24 and 34 months, median and mean PFS 18 and 25 months, respectively). As shown in Table 3, MYC breaks were significantly more frequent among patients with GCB–DLBCL. Even though numbers were low, the adverse impact of MYC breaks on survival reached statistical significance in GCB patients according to GEP (16 patients, P=0.02 for OS and P<0.0001 for PFS) but not in ABC patients (7 patients, P=0.32 for OS and P=0.66 for PFS). Similar results were obtained when we used our algorithm to split patients into GCB (17 patients, P=0.03 for OS and P<0.0001 for PFS) or non-GCB phenotype (6 patients, P=0.43 for OS and P=0.76 for PFS).


We designed a new algorithm based on the expression of CD10, FOXP1 and BCL6 that precisely stratifies the GCB and ABC subtypes of DLBCL. The associations of each marker with GCB or ABC-DLBCL and the cutoffs to determine positivity were assessed using ROC curves, which obviated the need to use any arbitrary cutoffs. Our algorithm had strong prognostic power matching that of GEP in R-CHOP-treated patients and was independent of IPI. In an independent cohort of patients treated with either CHOP or R-CHOP, we confirmed the algorithm’s prognostic predictive value. Finally, the algorithm proposed also allowed us to classify patients with DLBCL whose disease had been unclassifiable according to GEP, although survival analysis in this small group of patients did not reach statistical significance.

Our results confirm the reliability of previous findings,33 demonstrating that GEP can be performed by extracting RNA from formalin-fixed, paraffin-embedded tissue instead of frozen tissue, which is often not obtained at diagnosis and is becoming less available in the current era of small needle biopsy for diagnosis. The immunohistochemical algorithm can be easily performed by most laboratories on paraffin-embedded tissues and allows for the direct visualization of tumor cells.26 Moreover, compared with GEP analysis, the phenotype of the tumor reflects gene expression of the lymphoma cells, revealing which molecules are in fact expressed and functional and could thus be the target of new drugs. Recent studies have shown that some drugs enhance the activity of chemotherapy in ABC-DLBCL but not in GCB-DLBCL, providing a rationale for different therapeutic approaches for distinct DLBCL subtypes.34, 35, 36, 37, 38

Malignant B cells of DLBCL are thought to be ‘frozen’ at particular stages of B-cell development. In the GC microenvironment, specific proteins are up- or downregulated at any one particular stage. It has been shown that B cells in the GC can migrate extensively within their respective compartments.39 In this scenario, GCET1 stains positive in rapidly dividing B cells (that is, Ki-67+ centroblasts) in the dark zone of the GC. Its expression is enhanced when B cells are stimulated by CD40 signaling40 and it is then likely to identify centroblasts that have been rescued from cell death and are prompted to proliferate and undergo somatic hypermutation and class-switch recombination.17, 28, 41 FOXP1 is an essential transcriptional regulator of B-cell development that acts at very early stages,42 and its mRNA expression is also typically elevated in ABC-DLBCL.42 Cell lines that are at an intermediate stage of differentiation between GCB and ABC (that is, LIB) express CD10, BCL6 and MUM1 as well as FOXP1,19 indicating that this marker could represent a bridge from the GC stage to subsequent B-cell activation. Some preliminary data have suggested that smaller FOXP1 isoforms may have a role in activating the transcription factor MUM1, pushing B cells toward plasma cell differentiation.6, 41, 42, 43, 44 Hence, in the construction of our algorithm, we evaluated the expression of CD10, GCET1, FOXP1 and MUM1 in that order to progressively address the steps of B-cell maturation.

As BCL6 is the marker with the largest variability in its staining and interpretation between laboratories, only a minority of patients will need to rely on its staining for subset discrimination. According to our algorithm, the role of BCL6 is confined to patients (less than 20%) who are negative for both CD10 and FOXP1, while the Choi and the Hans algorithms gave strong decisional power to BCL6 (50% and 60% of patients, respectively). We acknowledge that the assignment of these patients to a specific subset might benefit from other GC-specific markers that have been used in other algorithms but were not analyzed in our study. Among these, human GC-associated lymphoma (HGAL or GCET2) expression has been shown to correlate with improved survival in CHOP-treated patients with DLBCL.45 This observation was confirmed in our series of cases as well (data not shown). Similarly, LMO2 mRNA expression was reported as a predictor of superior outcome in a relatively small series of DLBCL cases; however, the finding has not been confirmed and validated by other groups in a large cohort of cases.46 HGAL is an adapter protein involved in prevention of lymphocyte migration, thus constraining lymphocytes to the GC.47, 48 Double-staining studies have demonstrated that most BCL6+ cells co-express HGAL, although several BCL6+ cells of the proliferating pole or dark zone of GCs lack staining for HGAL. Therefore, it is suggested that HGAL, unlike GCET1, may identify resting cells within the GC.49, 50 Other markers that are discriminatory for ABC-DLBCL, such as cyclin D2,9, 51 PRDM1/Blimp1 and XBP1,52 were excluded based on our previous experience and on the absence of data in R-CHOP-treated patients at the time of the approval for the current study. We found that GCET1 and FOXP1 were both predictive of PFS in R-CHOP-treated patients, regardless of the cutoff utilized (Table 1). To the best of our knowledge, this is the first report addressing the prognostic predictive value of GCET1 expression as a single marker for R-CHOP-treated patients with DLBCL.

The use of the algorithm we propose, when applied to patients uniformly treated with R-CHOP, had remarkable prognostic significance and was independent of IPI, as shown in the multivariate analysis. When we combined the IPI score and our algorithm, we could identify cohorts of patients at very low (IPI 0–1 and GCB phenotype, 5-year PFS of 86±1%) or very high risk of relapse (IPI score 4–5 and non-GCB phenotype, 5-year PFS of 28±7%).
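The combination of IPI score and immunophenotypic subtype described above can be written as a small lookup. This is a hedged sketch: only the two extreme strata (IPI 0–1 with GCB, and IPI 4–5 with non-GCB) come from the text; the function name `risk_group` and the label "intermediate" for all remaining combinations are assumptions made for illustration.

```python
def risk_group(ipi_score: int, subtype: str) -> str:
    """Combine IPI score (0-5) with the GCB / non-GCB call.

    'very low' and 'very high' follow the strata named in the text;
    'intermediate' is a hypothetical label for everything else.
    """
    if not 0 <= ipi_score <= 5:
        raise ValueError("IPI score must be between 0 and 5")
    if subtype not in ("GCB", "non-GCB"):
        raise ValueError("subtype must be 'GCB' or 'non-GCB'")

    if ipi_score <= 1 and subtype == "GCB":
        return "very low"
    if ipi_score >= 4 and subtype == "non-GCB":
        return "very high"
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical patients, for illustration only.
    for ipi, subtype in [(0, "GCB"), (5, "non-GCB"), (2, "GCB")]:
        print(ipi, subtype, "->", risk_group(ipi, subtype))
```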

The staining algorithm proposed by Choi et al.12 showed good concordance with GEP analysis but has a complicated structure, using five markers with different cutoffs. The more recent ‘Tally algorithm’16 is also based on the expression of five biomarkers, three of which are not commonly used by pathologists, and relies on arbitrary decisional cutoffs. In both studies, it is not clear whether patients with transformed, primary mediastinal, primary cutaneous or central nervous system DLBCL were excluded from the analysis, despite the peculiar biological features and clinical behavior of these tumor types.

Unlike any other new marker, CD10 is part of the initial immunophenotypic panel used by hematopathologists. Therefore, our use of CD10 as the first discriminating marker in the new algorithm simplifies the categorization of patients. The predictive value of CD10 positivity alone in identifying patients with GCB according to GEP was 95% in our series, similar to that reported by others,9, 12, 20 and could not be improved by the addition of any other marker. Thus, although we have maintained the structure of the Hans algorithm for CD10-positive patients, our new algorithm improves the discrimination of CD10-negative patients, who were correctly assigned in 91.4% of cases, compared with 82.3% when the Hans algorithm was used. However, a very small subset of cases with strong CD10 expression was classified as ABC by GEP. Most of these cases had strong FOXP1 or MUM1 expression, but rarely expressed GCET1. Conversely, rare cases lacking CD10 expression were classified as GCB by GEP. Most of these cases had strong BCL6 and FOXP1 expression, while only a few expressed GCET1. Morphologically, the first group was polymorphic, whereas cases classified as ABC by the IHC algorithm but GCB by GEP analysis showed typical centroblastic morphology. The low number of misclassified patients did not allow conclusions about the clinical behavior of this particular subset.
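For readers who want to reproduce figures of this kind, the two quantities used above (the predictive value of a positive marker and the fraction of correctly assigned cases) reduce to simple ratios against the GEP call. The sketch below is illustrative only: the helper names and the paired calls are hypothetical and do not reproduce the study data.

```python
from typing import Sequence


def positive_predictive_value(marker_positive: Sequence[bool],
                              gep_is_gcb: Sequence[bool]) -> float:
    """Fraction of marker-positive cases that GEP also calls GCB."""
    if len(marker_positive) != len(gep_is_gcb):
        raise ValueError("inputs must have the same length")
    positives = [gcb for pos, gcb in zip(marker_positive, gep_is_gcb) if pos]
    if not positives:
        raise ValueError("no marker-positive cases")
    return sum(positives) / len(positives)


def fraction_correct(ihc_calls: Sequence[str],
                     gep_calls: Sequence[str]) -> float:
    """Fraction of cases where the IHC call matches the GEP call."""
    if len(ihc_calls) != len(gep_calls) or not ihc_calls:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(ihc == gep for ihc, gep in zip(ihc_calls, gep_calls))
    return matches / len(ihc_calls)


if __name__ == "__main__":
    # Hypothetical cases: CD10 status paired with the GEP subtype.
    cd10_pos = [True, True, True, True, False, False]
    gep_gcb = [True, True, True, False, False, True]
    print(positive_predictive_value(cd10_pos, gep_gcb))

    ihc = ["GCB", "GCB", "non-GCB", "non-GCB"]
    gep = ["GCB", "non-GCB", "non-GCB", "non-GCB"]
    print(fraction_correct(ihc, gep))
```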

We analyzed 296 patients with available material for the presence of MYC rearrangements, which were found in 8%. Patients with MYC breaks had a significantly inferior outcome compared with patients without breaks. Although numbers were low, MYC breaks had a significant impact on the survival of GCB patients, but not of ABC patients, whether the subtype was assigned by GEP or by our algorithm.

We reviewed 466 patients with available material for morphological classification, using the 2008 WHO classification as criteria.53, 54, 55 Four hundred and five (87%) had centroblastic morphology. Of them, 251 had cleaved (59) or large noncleaved (192) cell types. Twenty-four patients had anaplastic morphology (5%) and 37 had immunoblastic morphology (IB, 8%). By morphological subtype, the GCB phenotype was significantly more frequent in cases with centroblastic morphology (54%) than in those with IB morphology (27%, P=0.001), while 66% of cases with anaplastic morphology were GCB. The GCB phenotype was more frequent in the large noncleaved cell type (63%) and in medium-sized cells (76%) than in cleaved cells (41%) and the polymorphic cell type (38%). Patients with centroblastic morphology had significantly better OS (P<0.0001) and PFS (P=0.001) than those with IB or anaplastic morphology (P=0.0001 and P=0.004, respectively). By cell type, the large noncleaved cell type had the best 5-year PFS and OS (65% and 70%, respectively). OS and PFS differed significantly between the large noncleaved cell type and the others (P=0.04 and P=0.005, respectively).

In conclusion, we found that the expression of three markers can be combined to divide DLBCL into GCB and non-GCB subgroups with high specificity, and that in R-CHOP-treated patients our method predicts outcome similarly to GEP analysis. Our findings are currently being applied in our new DLBCL clinical trials. We believe the algorithm presented here will substantially improve upon the performance of the former algorithms and allow a better stratification of DLBCLs, both for further characterizing the pathways that identify each of the DLBCL subtypes and for testing the efficacy of new drugs in distinct subgroups.


  1. Classification of lymphoid neoplasms: the microscope as a tool for disease discovery. Blood 2008; 112: 4384–4399.
  2. Long-term outcome of patients in the LNH-98.5 trial, the first randomized study comparing rituximab-CHOP to standard CHOP chemotherapy in DLBCL patients: a study by the Groupe d'Etudes des Lymphomes de l'Adulte. Blood 2010; 116: 2040–2045.
  3. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–511.
  4. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 2002; 346: 1937–1947.
  5. Lymphoid malignancies: the dark side of B-cell differentiation. Nat Rev Immunol 2002; 2: 920–932.
  6. Aggressive lymphomas. N Engl J Med 2010; 362: 1417–1429.
  7. Addition of rituximab to standard chemotherapy improves the survival of both the germinal center B-cell-like and non-germinal center B-cell-like subtypes of diffuse large B-cell lymphoma. J Clin Oncol 2008; 26: 4587–4594.
  8. A predictive model for aggressive non-Hodgkin's lymphoma. The International Non-Hodgkin's Lymphoma Prognostic Factors Project. N Engl J Med 1993; 329: 987–994.
  9. Confirmation of the molecular classification of diffuse large B-cell lymphoma by immunohistochemistry using a tissue microarray. Blood 2004; 103: 275–282.
  10. Prognostic impact of immunohistochemically defined germinal center phenotype in diffuse large B-cell lymphoma patients treated with immunochemotherapy. Blood 2007; 109: 4930–4935.
  11. LMO2 protein expression predicts survival in patients with diffuse large B-cell lymphoma treated with anthracycline-based chemotherapy with and without rituximab. J Clin Oncol 2008; 26: 447–454.
  12. A new immunostain algorithm classifies diffuse large B-cell lymphoma into molecular subtypes with high accuracy. Clin Cancer Res 2009; 15: 5494–5502.
  13. The prognostic value of immunohistochemical subtyping in Chinese patients with de novo diffuse large B-cell lymphoma undergoing CHOP or R-CHOP treatment. Ann Hematol 2010; 89: 171–177.
  14. Prognostic impact of immunohistochemical biomarkers in diffuse large B-cell lymphoma in the rituximab era. Cancer Sci 2009; 100: 1842–1847.
  15. Prognostic impact of activated B-cell focused classification in diffuse large B-cell lymphoma patients treated with R-CHOP. Mod Pathol 2009; 22: 1094–1101.
  16. Immunohistochemical methods for predicting cell of origin and survival in patients with diffuse large B-cell lymphoma treated with rituximab. J Clin Oncol 2011; 29: 200–207.
  17. Gcet1 (centerin), a highly restricted marker for a subset of germinal center-derived lymphomas. Blood 2008; 111: 351–358.
  18. Strong expression of FOXP1 identifies a distinct subset of diffuse large B-cell lymphoma (DLBCL) patients with poor outcome. Blood 2004; 104: 2933–2935.
  19. Potentially oncogenic B-cell activation-induced smaller isoforms of FOXP1 are highly expressed in the activated B cell-like subtype of DLBCL. Blood 2008; 111: 2816–2824.
  20. Gene-expression profiling and not immunophenotypic algorithms predicts prognosis in patients with diffuse large B-cell lymphoma treated with immunochemotherapy. Blood 2011; 117: 4836–4843.
  21. A robust method for the amplification of RNA in the sense orientation. BMC Genomics 2005; 6: 27.
  22. PQN and DQN: algorithms for expression microarrays. J Theoretical Biol 2006; 243: 273–278.
  23. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci USA 2003; 100: 9991–9996.
  24. A novel method of amplification of FFPET-derived RNA enables accurate disease classification with microarrays. J Mol Diagn 2010; 12: 680–686.
  25. Stromal gene signatures in large-B-cell lymphomas. N Engl J Med 2008; 359: 2313–2323.
  26. Prognostic immunophenotypic biomarker studies in diffuse large B cell lymphoma with special emphasis on rational determination of cut-off scores. Leuk Lymphoma 2010; 51: 199–212.
  27. Immunohistochemical prognostic markers in diffuse large B-cell lymphoma: validation of tissue microarray as a prerequisite for broad clinical applications (a study from the Lunenburg Lymphoma Biomarker Consortium). J Clin Pathol 2009; 62: 128–138.
  28. The FOXP1 winged helix transcription factor is a novel candidate tumor suppressor gene on chromosome 3p. Cancer Res 2001; 61: 8820–8829.
  29. Index for rating diagnostic tests. Cancer 1950; 3: 32–35.
  30. Report of an international workshop to standardize response criteria for non-Hodgkin's lymphomas. NCI Sponsored International Working Group. J Clin Oncol 1999; 17: 1244–1253.
  31. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958; 53: 457–481.
  32. Regression models and life-tables. J R Stat Soc 1972; 34: 187–220.
  33. Gene expression predicts overall survival in paraffin-embedded tissues of diffuse large B-cell lymphoma treated with R-CHOP. Blood 2008; 112: 3425–3433.
  34. Differential efficacy of bortezomib plus chemotherapy within molecular subtypes of diffuse large B-cell lymphoma. Blood 2009; 113: 6069–6076.
  35. Higher response to lenalidomide in relapsed/refractory diffuse large B-cell lymphoma in nongerminal center B-cell-like than in germinal center B-cell-like phenotype. Cancer 2011; 117: 5058–5066.
  36. Inhibition of c-MET is a potential therapeutic strategy for treatment of diffuse large B-cell lymphoma. Lab Invest 2010; 90: 1346–1356.
  37. MLN4924, a NEDD8-activating enzyme inhibitor, is active in diffuse large B-cell lymphoma models: rationale for treatment of NF-κB-dependent lymphoma. Blood 2010; 116: 1515–1523.
  38. Essential role of MALT1 protease activity in activated B cell-like diffuse large B-cell lymphoma. Proc Natl Acad Sci USA 2009; 106: 19946–19951.
  39. In vivo imaging of germinal centres reveals a dynamic open structure. Nature 2007; 446: 83–87.
  40. Identification of centerin: a novel human germinal center B cell-restricted serpin. Eur J Immunol 2000; 30: 3039–3048.
  41. Proteins encoded by genes involved in chromosomal alterations in lymphoma and leukemia: clinical value of their detection by immunocytochemistry. Blood 2002; 99: 409–426.
  42. Foxp1 is an essential transcriptional regulator of B cell development. Nat Immunol 2006; 7: 819–826.
  43. Molecular subtypes of diffuse large B-cell lymphoma arise by distinct genetic pathways. Proc Natl Acad Sci USA 2008; 105: 13520–13525.
  44. Transcription factor IRF4 controls plasma cell differentiation and class-switch recombination. Nat Immunol 2006; 7: 773–782.
  45. HGAL is a novel interleukin-4-inducible gene that strongly predicts survival in diffuse large B-cell lymphoma. Blood 2003; 101: 433–440.
  46. Prediction of survival in diffuse large-B-cell lymphoma based on the expression of six genes. N Engl J Med 2004; 350: 1828–1837.
  47. Expression of the human germinal center-associated lymphoma (HGAL) protein, a new marker of germinal center B-cell derivation. Blood 2005; 105: 3979–3986.
  48. HGAL, a lymphoma prognostic biomarker, interacts with the cytoskeleton and mediates the effects of IL-6 on cell migration. Blood 2007; 110: 4268–4277.
  49. The oncoprotein LMO2 is expressed in normal germinal-center B cells and in human B-cell lymphomas. Blood 2007; 109: 1636–1642.
  50. Two newly characterized germinal center B-cell-associated genes, GCET1 and GCET2, have differential expression in normal and neoplastic B cells. Am J Pathol 2003; 163: 135–144.
  51. Absence of cyclin-D2 and Bcl-2 expression within the germinal centre type of diffuse large B-cell lymphoma identifies a very good prognostic subgroup of patients. Histopathology 2007; 51: 70–79.
  52. XBP1, downstream of Blimp-1, expands the secretory apparatus and other organelles, and increases protein synthesis in plasma cell differentiation. Immunity 2004; 21: 81–93.
  53. A revised European-American classification of lymphoid neoplasms: a proposal from the International Lymphoma Study Group. Blood 1994; 84: 1361–1392.
  54. Diffuse large B-cell lymphoma. In: Jaffe ES, Harris NL, Stein H, Vardiman JW (eds). Pathology and Genetics: Tumours of Haematopoietic and Lymphoid Tissues. World Health Organization Classification of Tumours. IARC Press: Lyon, France, 2001, 171–174.
  55. Diffuse large B-cell lymphoma, not otherwise specified. In: Swerdlow SH, Campo E, Harris NL (eds). WHO Classification of Tumours of Haematopoietic and Lymphoid Tissues, 4th edn. IARC: Lyon, France, 2008, 233–237.



We thank our consortium program team of pathologists, hematologists and clinicians, and the principal physicians of each contributing center, for their support in the selection, evaluation and contribution of the cases. We thank our patients, former and current hematopathology and hematology/oncology fellows, and research scientists (Chih-Jian Lih, Paul M. Williams, Lynn Trinh and Yuchaun Tai) for their support. Technical and publication editing support from Maitrayee Goswami and Virginia Mohlere of the Department of Scientific Publications is greatly appreciated. The abstract was presented as an oral communication at the Lugano ICML Conference on June 16, 2011. CV is an honorable visiting hematologist supported by San Bortolo Hospital, Vicenza, Italy and The University of Texas MD Anderson Cancer Center. KHY is supported by The University of Texas MD Anderson Cancer Center Institutional R and D Fund, Institutional Research Grant Award, MD Anderson Cancer Center SPORE Research Development Program Award, Gundersen Lutheran Medical Foundation Award and Forward Lymphoma Fund. This study is also partially supported by the Zurich Stiftung zur Krebsbekaempfung and NCI/NIH (R01CA138688 and 1RC1CA146299). Publicly available data sets: all primary sequencing data will be made publicly available through the GEO archive under accession GSE31312.

Author information


  1. Department of Hematopathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA

    • C Visco, Z Y Xu-Monette, R N Miranda, C E Bueso-Ramos, L J Medeiros & K H Young
  2. San Bortolo Hospital, Vicenza, Italy

    • C Visco & E S G d'Amore
  3. Roche Molecular Systems, Inc., Pleasanton, CA, USA

    • Y Li, W Wen, W-m Liu & L Wu
  4. Odense University Hospital, Odense, Denmark

    • T M Green & M B Møller
  5. University of Louisville School of Medicine, Louisville, KY, USA

    • Y Li
  6. University Hospital, Basel, Switzerland

    • A Tzankov
  7. University of Wisconsin Hospital and Clinic, Madison, WI, USA

    • B S Kahl
  8. Hospital Universitario Marques de Valdecilla, Santander, Spain

    • S Montes-Moreno & M A Piris
  9. Aalborg Hospital, Aarhus University Hospital, Aalborg, Denmark

    • K Dybkær
  10. Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA

    • A Chiu
  11. Weill Medical College of Cornell University, New York, NY, USA

    • W Tam & A Orazi
  12. The Methodist Hospital, Houston, TX, USA

    • Y Zu
  13. Columbia University Medical Center and New York Presbyterian Hospital, New York, NY, USA

    • G Bhagat
  14. Feinberg School of Medicine, Northwestern University, Chicago, IL, USA

    • J N Winter
  15. University of California San Diego School of Medicine, San Diego, CA, USA

    • H-Y Wang
  16. University of North Carolina School of Medicine, Chapel Hill, NC, USA

    • S O'Neill & C H Dunphy
  17. Cleveland Clinic, Cleveland, OH, USA

    • E D Hsi
  18. University of Maryland School of Medicine, Baltimore, MD, USA

    • X F Zhao
  19. Gundersen Lutheran Health System, La Crosse, WI, USA

    • R S Go
  20. University of Hong Kong Li Ka Shing Faculty of Medicine, Hong Kong, China

    • W W L Choi
  21. Southwest Washington Medical Center, Vancouver, WA, USA

    • F Zhou
  22. University of Indiana School of Medicine, Indianapolis, IN, USA

    • M Czader
  23. Zhejiang University School of Medicine, Second University Hospital, Hangzhou, China

    • J Tong & X Zhao
  24. Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands

    • J H van Krieken
  25. City of Hope National Medical Center, Los Angeles, CA, USA

    • Q Huang
  26. University of California San Francisco School of Medicine, San Francisco, CA, USA

    • W Ai & J Etzell
  27. San Raffaele H. Scientific Institute, Milan, Italy

    • M Ponzoni & A J M Ferreri



Competing interests

The authors declare no conflict of interest.

Corresponding author

Correspondence to K H Young.

Author Contributions

Designed research: CV and KHY. Performed research: CV, YL, ZYXM, WL, SMM, MAP, MBM, LW and KHY. Contributed vital new reagents, resource, and analytical tools under approved IRB and MTA: CV, YL, RNM, AT, WW, WL, ESGD, SMM, KD, AC, WT, AO, YZ, GB, JNW, SON, CD, EDH, XFZ, RSG, WWLC, FZ, JT, XYZ, JHVK, QH, MAP, MBM, CEBR, LJM, LW and KHY. Collected data and follow-up under approved IRB and MTA: CV, ZYXM, RNM, TMG, AT, ESGD, SMM, KD, AC, WT, AO, YZ, GB, JNW, HYW, SON, CD, EDH, XFZ, RSG, WWLC, FZ, MC, JT, XYZ, JHVK, QH, WA, JE, MP, AJMF, MAP, MBM, CEBR, LJM and KHY. Contributed vital strategies, participated in discussions and provided scientific input: CV, YL, ZYXM, RNM, TMG, YL, AT, WW, WL, BSK, ESGD, SMM, KD, AC, WT, AO, YZ, GB, JNW, HYW, SON, CD, EDH, XFZ, RSG, WWLC, FZ, MC, JT, XYZ, JHVK, QH, WA, JE, MP, AJMF, MAP, MBM, CEBR, LJM, LW and KHY. Analyzed data: CV and KHY. Performed and supported statistical analysis: CV, AT and KHY. Wrote the paper: CV and KHY.

Supplementary Information accompanies the paper on the Leukemia website (http://www.nature.com/leu)
