Acute Leukemias

Identification of a molecular signature for leukemic promyelocytes and their normal counterparts: focus on DNA repair genes


Acute promyelocytic leukemia (APL) is a clonal expansion of hematopoietic precursors blocked at the promyelocytic stage. Gene expression profiles of APL cells obtained from 16 patients were compared to eight samples of CD34+-derived normal promyelocytes. Malignant promyelocytes showed widespread changes in transcription in comparison to their normal counterpart and 1020 differentially expressed genes were identified. Discriminating genes include transcriptional regulators (FOS, JUN and HOX genes) and genes involved in cell cycle and DNA repair. The strong upregulation in APL of some transcripts (FLT3, CD33, CD44 and HGF) was also confirmed at protein level. Interestingly, a trend toward a transcriptional repression of genes involved in different DNA repair pathways was found in APL and confirmed by real-time polymerase chain reaction (PCR) in a new set of nine APLs. Our results suggest that both inefficient base excision repair and recombinational repair might play a role in APL development. To investigate the expression pathways underlying the development of APL occurring as a second malignancy (sAPL), we included in our study eight cases of sAPL. Although both secondary and de novo APL were characterized by a strong homogeneity in expression profiling, we identified a small set of differentially expressed genes that discriminate sAPL from de novo cases.


Acute promyelocytic leukemia (APL) is a distinct subtype of acute myeloid leukemia (AML) characterized by specific clinical and genetic features. Most notably, these include an invariable balanced chromosomal translocation involving the retinoic acid receptor alpha (RARA) locus. In the vast majority of APL cases, RARA, located on chromosome 17, is fused to the promyelocytic leukemia (PML) gene on chromosome 15, giving the t(15;17) translocation, which is associated with the striking response to differentiating agents such as all-trans retinoic acid (ATRA).

Studies of APL-associated translocations have shown that the aberrant transcriptional regulation by the resultant fusion proteins plays a central role in the pathogenesis of the disease. The recruitment by the PML-RARA fusion protein of a transcriptional repressor complex containing histone deacetylases (HDAC) and DNA methyltransferases leads to gene expression deregulation1, 2 interfering with normal myeloid differentiation.3 Although normal RARA responds to physiologic ATRA levels by shedding the deacetylase complex, followed by association with an acetyltransferase complex, PML-RARA responds only to pharmacological concentrations of ATRA. The ability of leukemic cells to differentiate upon ATRA treatment correlates with the ability of their translocation fusion proteins to displace the repressor complex in response to ATRA. In fact, patients with PML-RARA+ APL typically achieve remission upon treatment with high doses of ATRA. Such treatment provides one example of successful transcription-based therapy.

In recent years, the incidence of AML occurring as a second tumor (sAML) has increased as a consequence of improved antineoplastic treatment and prolonged survival of patients with cancer. sAML occurring in patients previously treated for a primary tumor with chemotherapy, with or without radiotherapy, has been referred to as therapy-related AML (tAML). APL occurring as a second tumor (sAPL) represents an increasing fraction of sAML.4, 5 sAPL and de novo APL share specific clinical and genetic features. Analysis of small series of patients suggested that the frequency of different PML gene breakpoints (bcr1, bcr2 and bcr3) in sAPL is similar to that observed in de novo cases.5 Consistent with this, sAPL shares with de novo APL the striking response to differentiating agents such as ATRA and a favorable outcome after treatment with modern combinatorial regimens including anthracycline-based chemotherapy and retinoids.4, 5 This good response to treatment in sAPL is in stark contrast with other types of sAML, which are frequently refractory to therapy. In fact, the prognosis for sAML is generally dismal and survival rates are much poorer than for de novo AML.6, 7, 8

A distinguishing feature of sAML, including sAPL, is a high frequency of defects in DNA mismatch repair (MMR).9, 10, 11 Loss of this pathway is uncommon in de novo AML cases. We have suggested that this might reflect selection for MMR-defective hemopoietic progenitor cells during exposure to cytotoxic drugs.12 Another intriguing feature of sAPL that distinguishes it from de novo APL is the remarkable predominance of female patients, including in cases that developed after a cancer treated with surgery alone.4, 13 Such a predominance of female patients is not observed in other subtypes of sAML. This gender bias may be related to the high incidence of cancers of the female reproductive system (uterus, ovary or breast) among primary tumors in sAPL patients. A trend toward early onset primary breast cancer among sAML patients has been reported previously, raising the possibility that common genetic factors might predispose to breast cancer and secondary leukemias.11

As inappropriate modulation of chromatin structure by HDACs and subsequent deregulation of gene expression represent a relevant pathogenetic pathway in the development of APL, analysis of gene expression profiles might provide a powerful approach to identify genes whose functions are involved in the pathogenesis of this disease. To shed light on the molecular mechanisms underlying the disease, we used DNA microarray technology to compare gene expression profiles of leukemic cells obtained from de novo and sAPL patients and CD34+-derived normal promyelocytes. Leukemic promyelocytes showed widespread changes in transcription in comparison to their normal counterpart and we were able to identify 1020 differentially modulated genes, together with a robust co-expression component of the entire data set driven by the contrast between normal and leukemic promyelocytes. In addition, although APL patients were characterized by a strong homogeneity in expression profiling, we were able to identify a small set of differentially expressed genes that discriminates sAPL from de novo cases.

Patients and methods


Sixteen Italian patients with APL were analyzed, including eight de novo cases compared to eight sAPL cases (four therapy-related cases and four cases secondary to a tumor treated by surgery alone). Patient details were recorded in the GIMEMA (Gruppo Italiano Malattie Ematologiche Maligne dell’Adulto) Archive. Patients were admitted, treated and followed up at the Department of Cellular Biotechnologies and Hematology of the University La Sapienza of Rome from January 1995 to August 2002. The main biologic and clinical features of patients are summarized in Table 1. In sAPL patients, detailed information was obtained on demographics (race, age, gender), time and date of onset of primary cancer, related treatment (surgery, chemotherapy, radiotherapy), outcome and latency between the primary malignancy and sAPL. Response to therapy was analyzed in sAPL and in de novo APL patients, who were all treated according to the ATRA and idarubicin regimen. The median age at diagnosis of sAPL was 32.5 years (range 23–68 years). The median latency between the primary malignancy and sAPL was 31 months (range 13–72 months). Seven patients had cancer of the breast or uterus as the first malignancy, accounting for the prevalence of female patients. The eighth case had a primary Hodgkin's lymphoma. Treatment of the initial malignancy was surgery alone in four patients, radiotherapy alone in three patients and chemotherapy combined with radiotherapy in one patient. We also examined eight de novo APL cases with similar age and type of breakpoint distribution (Table 1). APL diagnosis was established according to the French-American-British (FAB) classification using conventional cytochemical staining and surface marker analysis. In all cases, cytogenetic analysis was carried out at the time of diagnosis using standard G-banding techniques. All patients showed karyotypic evidence of the t(15;17) translocation. In one case, additional changes, which included an inversion of chromosome 17q together with a der(13;14)(q10;q10), were observed. Genetic diagnosis was confirmed in all cases by demonstration of the PML-RARA fusion gene in bone marrow leukemia cells using reverse transcriptase–polymerase chain reaction (RT-PCR). All patients provided written informed consent for the study, which was performed in accordance with the Helsinki declaration of 1975. In line with National and European Union guidelines, ethical approval was not required for this study.

Table 1 Characteristics of de novo and sAPL patients

Generation of an enriched CD34+-derived promyelocyte population

Peripheral blood mononuclear cells were isolated from buffy coats of eight healthy blood donors by Ficoll–Hypaque density-gradient centrifugation (density 1.077 g/ml; Pharmacia, Uppsala, Sweden). All control subjects provided written informed consent for the study. CD34+ cells were isolated using a magnetic cell-sorting program and a CD34 isolation kit in accordance with the manufacturer's recommendations (Mini-MACS; Miltenyi Biotec, Auburn, CA, USA). The purity of CD34+-selected cells was determined for each selection and the percentage of CD34+ cells ranged from 88 to 98%. To induce granulocytic differentiation, CD34+ cells were resuspended in a serum-free medium in the presence of granulocyte colony-stimulating factor (G-CSF) (500 U/ml), interleukin-3 (IL-3) (1 U/ml) and granulocyte macrophage colony-stimulating factor (GM-CSF) (0.1 ng/ml) (PeproTech Inc., Rocky Hill, NJ, USA). Every 2 days, viable cells were scored by trypan blue dye exclusion and cultures were supplemented with fresh serum-free medium with G-CSF, IL-3 and GM-CSF. Morphological and immunophenotypical analysis of each culture was performed at 0, 2, 4, 6 and 7 days. Briefly, cytospins were made with cells harvested at different days and cellular morphology was evaluated after May-Grunwald-Giemsa staining (Sigma, St Louis, MO, USA). CD34, CD133, CD15, CD11b, CD18 and CD24 monoclonal antibodies were purchased from Pharmingen (San Diego, CA, USA). Cells were incubated with optimal concentrations of the antibodies and, after two washings in cold phosphate-buffered saline, were analyzed for fluorescence by flow cytometry (FACScan, Becton Dickinson, San Diego, CA, USA) using CellQuest software. To validate microarray results, we measured protein expression of FLT3, CD33 and CD44 in fresh blasts from APL patients and in CD34+-derived normal cells induced to differentiate to granulocytes in vitro. Leukemic and normal cells were labeled with anti-CD33, anti-CD44 (Pharmingen) and anti-FLT3 (Serotec, Oxford, UK) and analyzed by flow cytometry.

RNA isolation

Bone marrow cells collected at diagnosis were separated on Ficoll gradients and mononuclear cells were cryopreserved. Total RNA from leukemic and normal promyelocytes was extracted using the RNeasy Mini Kit following the manufacturer's instructions (Qiagen, Valencia, CA, USA).

Hepatocyte growth factor evaluation

Hepatocyte growth factor (HGF) concentration in cell culture supernatants was evaluated by enzyme-linked immunosorbent assay (ELISA) using a commercial kit from R&D Systems (Minneapolis, MN, USA) specific for the detection and quantification of human HGF. The detection limit was 5 pg/ml.

Gene expression profiling and data analysis

Disposable RNA chips (Agilent RNA 6000 Nano LabChip kit) were used to determine the concentration and purity/integrity of RNA samples using the Agilent 2100 bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). Two-cycle target labeling assays, HG-U133 GeneChip (Affymetrix, Santa Clara, CA, USA) array hybridization, staining and scanning were performed according to the standard protocol supplied by Affymetrix. The amount of a transcript mRNA (signal) was determined by the Affymetrix GeneChip Operating Software (GCOS) absolute analysis algorithm as already described.14 All expression values for the genes in the absolute GCOS analyses were determined using the global scaling option. Alternatively, probe level data were converted to expression values using the GC robust multi-array average (GCRMA) procedure (BioConductor package). The GCRMA-generated data were uploaded onto GeneSpring software version 7.2 (Agilent Technologies, Palo Alto, CA, USA) using the log2 transformation procedure. A ‘per gene’ normalization was achieved by dividing each signal by the median of its values in all samples. In order to remove genes that are never expressed or always expressed at the same level, the expression data were low-level filtered using GeneSpring to select genes detected as ‘present’ in at least one sample. Then, genes whose normalized expression levels were always between 0.5 and 2 across all of the samples were filtered out. Unsupervised cluster analysis of all the samples was then performed using the condition tree option included in the GeneSpring package, applying the Pearson correlation equation.
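The ‘per gene’ normalization and the two filtering steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the GeneSpring implementation: the function names, the dictionary layout and the use of linear-scale signals are hypothetical, and a gene is treated as uninformative when all of its normalized values lie within the closed interval 0.5 to 2.

```python
from statistics import median


def per_gene_normalize(signals):
    """Divide each signal of one gene by the median of that gene across samples."""
    gene_median = median(signals)
    return [value / gene_median for value in signals]


def is_informative(normalized, low=0.5, high=2.0):
    """True if at least one normalized value falls outside [low, high]."""
    return any(value < low or value > high for value in normalized)


def filter_genes(expression, present_calls):
    """Keep genes called 'present' in at least one sample and not flat across samples.

    expression:    {gene: [signal per sample]}  (hypothetical layout, positive signals)
    present_calls: {gene: [True/False per sample]}
    """
    kept = {}
    for gene, signals in expression.items():
        if not any(present_calls[gene]):
            continue  # never detected as 'present'
        normalized = per_gene_normalize(signals)
        if is_informative(normalized):
            kept[gene] = normalized
    return kept
```

With such a layout, a gene whose signal is identical in every sample is discarded by the second filter, whereas a gene that changes more than twofold around its median in at least one sample is retained for clustering.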
Supervised analysis was performed using an analysis of variance (ANOVA) test, specifically a Welch t-test (variances not assumed equal, P-value cutoff 0.05) with the Bonferroni method to control the family-wise error rate, and a significance analysis of microarrays (SAM; Bioconductor package) (threefold change, 100 permutations, false discovery rate (FDR) median: 0%, 90th percentile: 0%). Discrimination between de novo APL and sAPL was performed by supervised analysis using an ANOVA test, specifically a Welch t-test (variances not assumed equal, P-value cutoff 0.01), and the dChip Compare Sample procedure.16, 17 The comparison criteria utilized in dChip require the fold change and the absolute difference between the two groups’ means to exceed user-defined thresholds (in the present study, 2 and 100, respectively). The ‘use lower 90% confidence bound’ option was selected to use the lower confidence bound of fold changes for filtering.16, 17 The lower confidence bound is intended as a conservative estimate of the real underlying fold change. FDR, estimated by 100 random permutations of the group labels, was used to adjust P-values for multiple comparisons. The principal component analysis (PCA) of the matrix having as rows the genes (statistical units) and as columns the samples (variables) was performed on the entire set of genes without any filtering. Another PCA was performed on 123 DNA repair genes, selected according to published literature and expert knowledge (Supplementary Table 1). The extracted component loadings were checked for their ability to discriminate different groups of patients according to the method proposed by Landgrebe et al.18 and subsequently validated by Wang and Gehan.19
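For readers unfamiliar with the unequal-variance test named above, the following minimal sketch computes the Welch t statistic, its Welch–Satterthwaite degrees of freedom and a Bonferroni-adjusted per-test cutoff. It is an illustration only: the function names are hypothetical, the P-value lookup from the t distribution is deliberately left out, and nothing here reproduces the GeneSpring or dChip code.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, df) for Welch's t-test, which does not assume equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard errors of the two group means (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


def bonferroni_cutoff(alpha, n_tests):
    """Per-test significance cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests
```

With a family-wise cutoff of 0.05 and, for example, 15 070 tested transcripts, each individual test would have to reach a P-value below 0.05/15 070, which illustrates how conservative the Bonferroni correction is.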

Analysis of DNA repair genes by real-time RT-PCR

Analysis of mRNA expression of the transcripts for BRCA1, RAD51, MSH6 and MLH1 was performed by quantitative RT-PCR (QRT-PCR), using the predesigned Assays-on-Demand Gene Expression Assay (Applied Biosystems, Foster City, CA, USA; Assay ID Hs00173233-m1, Hs00153418-m1, Hs00264721-m1, Hs00179866-m1). Briefly, cDNAs were reverse transcribed from total RNA (1 μg per sample) using the High Capacity cDNA Archive Kit (Applied Biosystems) as described by the manufacturer. TaqMan PCR reactions were performed on cDNA samples using the TaqMan Universal PCR Master Mix (Applied Biosystems) according to the manufacturer's instructions on the ABI Prism 7000 Sequence Detection System. All amplification steps were performed in triplicate. The comparative cycle threshold method (ΔΔCt) for relative quantification of gene expression was used. The ribosomal protein 18S gene (RP18S) was used as an internal control (part number 4319413E). The calibrator sample was a pool of RNA extracted from eight different samples of promyelocytes derived from CD34+ cells obtained from peripheral blood of healthy subjects.
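The arithmetic of the comparative ΔΔCt method can be sketched as below. This is a minimal illustration, assuming roughly 100% amplification efficiency for both target and reference assays (so that one cycle corresponds to a twofold difference) and averaging the triplicate Ct values; the function and argument names are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct,
                        calibrator_target_ct, calibrator_reference_ct):
    """Relative quantity of a target transcript by the comparative Ct method.

    Each argument is a list of replicate Ct values (e.g. triplicates).
    The reference is the internal control (here 18S) and the calibrator is
    the comparison sample (here the pool of normal promyelocytes).
    """
    delta_ct_sample = mean(target_ct) - mean(reference_ct)
    delta_ct_calibrator = mean(calibrator_target_ct) - mean(calibrator_reference_ct)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    # Assumes a doubling of product per cycle for both assays.
    return 2.0 ** (-delta_delta_ct)
```

A returned value of 0.25, for instance, corresponds to a fourfold reduction of the target transcript in the leukemic sample relative to the calibrator.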


Morphological and immunophenotypical characterization of CD34+-derived promyelocytes

Promyelocyte-enriched samples were generated by culturing CD34+ cells from healthy blood donors in serum-free medium supplemented with granulocyte-differentiating factors. Cells progressed from undifferentiated hemopoietic progenitor cells on day 0 to promyelocytes on days 6–7 and typical morphologies were produced under these culture conditions in a reproducible manner (Figure 1a). At day 7, we observed a cell population containing >95% of cells with the typical morphological characteristics of promyelocytes (hypergranular cells with hypertrophic centrosome/Golgi). Flow cytometry analysis of cell surface marker expression, performed daily from day 0 to day 7, demonstrated that at day 7 the CD34 marker had disappeared and that markers of different stages of maturation (CD11b) or of non-granulocytic lineages (CD14, GPA) were absent (data not shown). Granulocytic differentiation is reflected in the continuous evolution of surface markers CD15 versus CD33 (Figure 1b). Promyelocytes at day 7 were then used to perform expression profiling experiments.

Figure 1

Granulocytic maturation in liquid culture. (a) Photomicrographs of May-Grunwald-Giemsa stained cells. At day 0, the cellular population was dominated by myeloblasts. At day 7, the culture contained mostly hypergranular cells with hypertrophic Golgi/centrosome that are typical characteristics of promyelocytes. Original magnification × 400. (b) FACS analysis of cell surface marker expression. Granulocytic differentiation is reflected in the disappearance of the CD34 marker and in the continuous evolution of surface markers CD15 versus CD33. The figure shows a representative experiment from a total of eight granulocytic liquid cultures analyzed at day 7.

Clustering analysis

We analyzed gene expression profiles of eight enriched CD34+-derived promyelocyte samples and 16 samples of APL blasts expressing PML-RARA. Biotinylated cRNA targets were synthesized from each sample and hybridized to HG-U133A Affymetrix GeneChips. Signals were generated using GCOS and GCRMA and further analyzed with the GeneSpring software. Using an unsupervised hierarchical clustering approach, we defined natural subclasses of samples as determined by gene expression profiles. We performed the clustering on a low-level filtered probe list using the Pearson correlation equation (see Patients and methods). The clustering results are shown in Figure 2a. This approach identified two main groups, in which the expression pattern of the eight normal promyelocyte cultures is distinguished from that of the 16 leukemic samples derived from APL patients. In order to select the transcripts with a higher discrimination capacity between the two groups identified, we employed two different approaches: the ANOVA and SAM analyses. We applied them to the low-level filtered list of genes derived from the unsupervised analysis. The transcripts passing both ANOVA and SAM analyses generated a list of 1020 probe sets with 800 downregulated and 220 upregulated genes in APL (Supplementary Table 2). Clustering analysis on this supervised probe list is shown in Figure 2b.

Figure 2

Hierarchical clustering trees of unsupervised and supervised analyses of leukemic and normal promyelocytes. (a) ‘Condition’ tree clustering computed using the Pearson correlation equation of unsupervised analysis (15 070 transcripts) defines two subclasses: APL promyelocytes and normal CD34+-derived promyelocytes from healthy donors; (b) Eisen tree map computed using the GeneSpring ‘gene’ and ‘condition’ trees and the Pearson correlation equation on the 1020 modulated transcripts identified by the supervised analyses. sAPLs, de novo APL and normal promyelocytes are shown in black, red and blue, respectively.

Functional classification of discriminating genes

The 1020 probe sets were then investigated for their functional classification using Ingenuity Pathway Analysis (IPA; Ingenuity Systems Inc.), a web-based software application for modeling and analyzing the complexity of biological systems. The list of genes was explored by screening for the most statistically significant overrepresented molecular and cellular functions, namely lipid metabolism, small molecule biochemistry, cell cycle, RNA post-transcriptional modification, cellular assembly and organization, DNA replication/recombination/repair and gene expression (Figure 3). A large series of genes involved in the regulation of transcription was found to be either upregulated (members of the JUN and FOS family, SAP30, XIST, CEBPD) or downregulated (TAL1, HOXA7, HOXA9, HOXA10, MEIS1) in APL in comparison to normal promyelocytes. In contrast, we noticed that genes involved in cell cycle control and in DNA replication/recombination/repair were consistently downregulated in APL.

Figure 3

Ingenuity Pathways Analysis: molecular and cellular functions represented in the 1020 probe sets passing both ANOVA and SAM analyses. The significance values (P-values) in Ingenuity Pathways Analysis are based on the hypergeometric distribution and are calculated via the computationally efficient Fisher’s exact test for 2 × 2 contingency tables. More precisely, the right-tailed Fisher’s exact test is employed. Right-tailed here refers to the fact that only overrepresented functional/pathway annotations are shown, that is, annotations that have more focus genes than expected by chance.
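The right-tailed test mentioned in the legend can be written directly from the hypergeometric distribution. The sketch below is an illustration of that calculation only and is not the Ingenuity implementation; the function and parameter names are hypothetical.

```python
from math import comb


def fisher_right_tailed(focus_in_category, focus_total, category_total, universe_total):
    """Right-tailed Fisher's exact test P-value for a 2 x 2 table.

    Returns the probability of observing at least `focus_in_category` annotated
    genes among `focus_total` focus genes, when `category_total` of the
    `universe_total` reference genes carry the annotation.
    """
    upper = min(focus_total, category_total)
    all_draws = comb(universe_total, focus_total)
    tail = sum(
        comb(category_total, i) * comb(universe_total - category_total, focus_total - i)
        for i in range(focus_in_category, upper + 1)
    )
    return tail / all_draws
```

A small P-value indicates that the functional category contains more focus genes than expected by chance; categories that are underrepresented are not flagged by this one-sided test.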

Correlation between gene expression and protein levels

We observed a strong upregulation of genes coding for the membrane proteins FLT3, CD33 and CD44 and for the soluble factor HGF in APL samples. To verify these changes at protein level, we measured expression of FLT3, CD33 and CD44 by flow cytometry in fresh blasts from APL patients and in CD34+-derived cells induced to differentiate to granulocytes (Figure 4). The levels of FLT3 protein were found to be greatly increased in fresh APL promyelocytes, whereas both CD34+ cells and cells analyzed at different days of neutrophil maturation showed undetectable levels of protein expression (Figure 4a). A similar pattern of protein expression was found for the other surface markers CD33 and CD44 (Figure 4b and c). In addition, HGF, measured by ELISA, was virtually absent in the supernatant of promyelocytic cultures, whereas high levels were found in supernatants of fresh APL blasts kept in culture for 1 day (Figure 4d). Hence, for all the markers we assessed, the microarray results were fully matched by the protein analyses.

Figure 4

Comparison of FLT3, CD33, CD44 and HGF protein levels in fresh blasts from APL patients and normal CD34+-derived cells induced to differentiate to granulocytes at different days of maturation. The analyses were performed by FACS for FLT3 (a), CD33 (b) and CD44 (c). HGF protein levels (d) were measured by ELISA in the supernatant of granulocytic cultures and in supernatants of fresh APL blasts kept in culture for 1 day. Protein levels of fresh blasts from APL patients are plotted at day 7 to allow a better comparison with the promyelocyte stage of the granulocytic cultures.

Principal component analysis

The structure of our data is highly degenerate, that is, the number of variables (genes) massively exceeds the number of independent statistical units (biological samples). As this structure might be associated with ‘chance’ correlations,20 we inverted the respective roles of genes and normal/leukemic promyelocytes (statistical units and variables, respectively) and the resulting matrix was submitted to PCA. This method uses an unsupervised learning approach to investigate natural co-expression structures (components) of the entire genome.21, 22 Detailed results are reported in the Supplementary material (Supplementary Table 3). A minor portion of the entire gene expression pattern, the second component PC2, which explains 5% of the gene expression variability, contains all the information necessary for the discrimination between normal cells and APLs (all positive loadings in normal promyelocytes and negative ones in leukemic samples). A plot of the PC2 and PC3 loadings against each other allows the clear visualization of the separation between the two groups (Figure 5a). Thus, a second multivariate approach such as PCA confirms the conclusion that normal promyelocytes and APLs show widely different patterns of gene expression.
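The inverted layout described above (genes as rows, samples as variables) can be illustrated with a small, dependency-free sketch. It is not the software used in the study: it assumes PCA on the sample-by-sample covariance matrix (a correlation matrix would be an equally plausible choice) and extracts components by power iteration with deflation, which is adequate only for a handful of leading components. All names are hypothetical.

```python
import math


def center_columns(matrix):
    """Subtract each column (sample) mean so that every variable has mean zero."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in matrix]


def covariance_matrix(centered):
    """Sample-by-sample covariance matrix of column-centered data."""
    n_rows, n_cols = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n_rows - 1)
             for j in range(n_cols)] for i in range(n_cols)]


def mat_vec(matrix, vec):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def principal_components(matrix, n_components=2, iterations=1000):
    """Return [(explained_fraction, loadings_per_sample), ...] for the leading components."""
    cov = covariance_matrix(center_columns(matrix))
    n = len(cov)
    total_variance = sum(cov[i][i] for i in range(n))
    components = []
    for _component in range(n_components):
        start = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start vector
        scale = math.sqrt(sum(x * x for x in start))
        vec = [x / scale for x in start]
        for _step in range(iterations):
            product = mat_vec(cov, vec)
            norm = math.sqrt(sum(x * x for x in product))
            if norm < 1e-12:  # no variance left to explain
                break
            vec = [x / norm for x in product]
        # Rayleigh quotient gives the eigenvalue for the unit vector found.
        eigenvalue = sum(v * p for v, p in zip(vec, mat_vec(cov, vec)))
        if eigenvalue < 1e-12:
            break
        components.append((eigenvalue / total_variance, vec))
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j] for j in range(n)]
               for i in range(n)]
    return components
```

Each returned loading vector has one entry per sample, so a component on which all normal samples have one sign and all leukemic samples the other is a discriminating component in the sense used here.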

Figure 5

Unsupervised PCA analysis of normal and leukemic promyelocytes. (a) Projection of the major components of highest variance (PC2 and PC3) able to discriminate between normal and leukemic cells. Normal promyelocytes are indicated by closed symbols, de novo APLs by open triangles and sAPLs by open diamonds. (b) Projection of the components PC3 and PC5 that are able to discriminate between de novo APL (closed symbols) and sAPL (open symbols).

Discrimination between de novo and sAPLs by supervised and PCA analyses

To identify the most relevant genes involved in the separation between these two groups, we applied a supervised approach. The ANOVA analysis led to the identification of 142 significant probe sets. Moreover, dChip Compare Sample procedure allowed the identification of 159 significant probe sets, with an overall median FDR of 6.3% (90th percentile FDR of 18.5%). Probe sets passing both analyses (56 probe sets, corresponding to 54 transcripts) were selected as a ‘changing’ gene list with 42 genes upregulated and 12 genes downregulated in sAPL (Table 2). Several of the downregulated genes of known function in sAPL (7/12) encode ribosomal proteins or factors involved in the regulation of protein synthesis. Among the upregulated genes, we noticed the overexpression of two members of the FOS family (FOS and FOSB), which were already identified as more highly expressed in the APL in comparison to normal promyelocytes. Clustering analysis of these genes is shown in Figure 6.

Table 2 Discriminating genes between de novo and sAPLs (ANOVA and dChip Compare Sample procedure)
Figure 6

Supervised analysis of de novo and sAPLs. Eisen tree map computed using the GeneSpring ‘gene’ and ‘condition’ trees and the Pearson correlation equation of the transcripts able to discriminate de novo from sAPLs.

When the entire expression data set was submitted to PCA to separate de novo and sAPLs, the space spanned by third and fifth components allows a significant discrimination between the two groups. The PC3/PC5 space relative to the leukemic patients is reported in Figure 5b (Student's t-test, P<0.05 and P<0.01 for PC3 and PC5, respectively; Supplementary Table 3). The separation is remarkable, given that it is based on an unsupervised technique and these two components account for a very minor portion of the total variance.

Thus, two different methods for data analysis allow identifying differences between secondary and de novo APL on the basis of the global gene expression profiling. In addition, both analyses indicate that these differences are limited to a small subset of genes.

PCA analysis of DNA repair genes

A previous report identified genes involved in DNA repair, and in particular in base excision repair, as modulated by the PML–RARA fusion proteins and relevant in the leukemic phenotype.23 In the list of genes able to discriminate normal promyelocytes from APLs, we also noticed several genes belonging to various DNA repair pathways (Supplementary Table 2). We submitted to PCA a matrix formed by 123 DNA repair genes (variables) and normal/leukemic promyelocytes (statistical units). This list was selected according to current knowledge on their function and personal expertise (Supplementary Table 1). The analysis of the component loadings shows that the first component (PC1) explains 28% of gene expression variability and contains all the information necessary for the discrimination between normal cells and APLs (all APLs scores are lower than normal promyelocytes scores) (P<0.0001, t-test) (the eigenvalues profiles and component loadings are shown in Supplementary Table 4). Thus, the multivariate approach is able to discriminate normal promyelocytes from leukemic ones using the pattern of gene expression of this very limited subset of DNA repair genes. We noticed that the majority of discriminating DNA repair genes (a list generated by the selection of the highest/lowest PC1 scores) was downregulated in leukemic cells and only a few were upregulated (Table 3).

Table 3 Discriminating genes between normal and leukemic promyelocytes (PCA restricted to 123 DNA repair genes)

We then used the same approach to try to discriminate between secondary and de novo APLs. In this case, we were able to significantly separate the two groups using the fourth and sixth components. A plot of the PC4 versus PC6 plane allows the clear visualization of the separation between the two groups (Figure 7a). Taken singly, PC4 shows a marginally significant separation at the t-test (P=0.05), whereas PC6 has a marked significance (P<0.02). It is noteworthy that only two patients are misclassified, that is, two de novo APLs are considered as sAPLs. Downregulated and upregulated genes were equally represented in the list of genes able to discriminate the two groups (Figure 7b).

Figure 7

Discrimination between de novo and sAPLs by PCA using 123 DNA repair genes. (a) Spatial discrimination between de novo and sAPLs. The plot of the PC4 versus PC6 plane allows the clear visualization of the parting between the two groups with the exception of two misclassified patients (two de novo APLs, closed symbols, are considered as sAPLs, open symbols). (b) List of differentially expressed DNA repair genes.

Thus, expression profiling of a selection of relevant genes (DNA repair genes) is capable of successfully identifying sAPLs from de novo cases. This minor portion of the entire gene expression pattern contains all the information needed to discriminate between de novo and sAPLs.

Analysis of DNA repair genes by real-time RT-PCR

In order to validate microarray data in a new panel of APL cases, we evaluated the expression levels of selected DNA repair genes by QRT-PCR analysis in leukemic cells from nine de novo APL samples. We selected DNA repair genes identified by PCA (the MSH6 and MLH1 MMR genes and the RAD51 and BRCA1 recombinational repair genes) and which were downregulated in APL versus normal promyelocytes. BRCA1 was also identified as downregulated by the supervised analysis approach. QRT-PCR confirmed the microarray data, showing a strong reduction of MSH6, MLH1, BRCA1 and RAD51 expression in all APL samples with a fold change ranging from 3 to 14.3 (Supplementary Figure 1). Therefore, the analysis by RT-PCR of mRNA levels is consistent with microarrays data.


Gene expression profiling has been successfully used to classify different FAB subtypes of AML and/or to identify AML characterized by different karyotypic alterations. Rarely the comparison with a normal cellular counterpart has been performed and in these cases CD34+ cells have been used.24, 25, 26 In order to take into account both lineage- and stage-dependent changes of leukemic blasts, we compared the expression profiles of APLs with their normal counterpart, that is, CD34+-derived promyelocytes.

To extract the main information from gene expression data, we used two different statistical approaches able to unveil the different levels of complexity of the studied disease. In the first one, the supervised approach, single genes are identified as the determinant factors of differential phenotypes, whereas in the second, based on PCA, the attention is shifted from single genes to co-regulated networks. The discriminant components represent circuits of genes whose cooperative action is needed to obtain the phenotype under analysis. This implies that genes having extreme scores in the discriminating components are not the genes that, taken singly, operate the discrimination but those mostly involved in co-regulation circuits.27 Therefore, the lists of genes identified by ANOVA followed by clustering and by PCA do not overlap completely. The complementary character of the two approaches allows a more thorough analysis of the experimental data. In addition, experimental measurements at protein level (by fluorescence-activated cell sorting (FACS) and ELISA) or at mRNA level (by real-time PCR) fully agreed with changes in several discriminating genes identified by microarray data.

Surprisingly, the number of genes significantly modulated in APLs is extremely large (>1000). A similar observation was reported in studies in which modulated genes were identified after transfection of the leukemia-associated fusion genes PML-RARA, PLZF-RARA and AML1-ETO into the hemopoietic precursor U937 cell line.23, 28 Analogously, in a mouse study in which APLs were compared to normal promyelocytes, changes in the expression of a large number of genes were identified.29 Taken together, these results, obtained with different experimental approaches, indicate a wide modification of the cellular transcriptional program and are in good agreement with the reported ability of PML-RARA to act as a retinoid-inducible transcriptional regulatory factor.

Together with a direct effect of PML-RARA on transcription, the presence of the fusion protein produced changes in the expression of several other transcription factors or regulators of transcription. These include JUN, FOS, FOSB, SAP30, XIST, CEBPD, TAL1, HOXA7, HOXA9, HOXA10 and MEIS1. For example, we noticed that both components of the AP1 transcription factor, JUN and FOS, are upregulated in leukemic promyelocytes. Our data are in agreement with previous observations showing the ability of PML-RARA to enhance the transcriptional activity of FOS. In addition, we observed a clear downmodulation of HOX gene expression, in line with a previous report.30 The functional analysis of the results of the supervised approach (1020 transcripts), that is, of the genes modulated in leukemic versus normal promyelocytes, revealed that PML-RARA influences the expression of genes involved in the control of different cellular functions, including lipid metabolism, DNA repair and cell cycle. Interestingly, a trend toward transcriptional repression of genes involved in diverse mechanisms of DNA repair and cell cycle control was found in leukemic promyelocytes. This list includes genes controlling the progression through mitosis (BUB3, BUB1B, CCNA2, CCNB1, CDC2, CDC20), the G1 and G2 checkpoints (CHEK1, CHK2) and several DNA repair genes (XRCC5, BRCA1, G22P1, FEN1). A more detailed analysis by PCA of 123 selected DNA replication/repair genes present in the HGU133A GeneChip was also capable of discriminating normal promyelocytes from leukemic ones. Again, we noticed that the majority of the discriminating DNA repair genes (a list generated by selecting the highest/lowest PC1 scores) were downregulated in leukemic cells and only a few were upregulated. Several of these expression changes occur in genes involved in DNA replication. Some of these variations might be due to a diminished proliferation rate of malignant promyelocytes, a possibility further supported by the downregulation of key cell cycle control genes.
Probably of greater significance is the downregulation of several base excision repair and recombinational repair genes, which might result in an altered sensitivity to mutagens, accumulation of DNA damage or increased mutagenesis. Thus, our data suggest that defects in base excision repair and recombinational repair might play a role in the development of APL. This is in agreement with the low penetrance of the PML/RARA gene in a mouse model for APL31, 32 and with the proposed hypothesis that the genetic instability provided by defective DNA repair is needed for the full expression of the transformed phenotype of cells containing the t(15;17) translocation.23

A comparison of our results with previous reports of genomic profiling in AML is difficult because the majority of the studies are based on the cumulative analysis of different FAB subtypes and/or the experimental systems are not fully comparable.23, 24, 25, 26, 28, 29, 30 For example, the only report of expression profiling using normal promyelocytes is based on a mouse model of APL.29 Interestingly, Valk et al.25 determined the gene expression profiles in samples from 285 patients with AML, including 19 APL cases, and in normal bone marrow and CD34+ samples derived from healthy control subjects, and identified 16 groups of AML on the basis of molecular signatures. Cluster 12 contained all cases of APL and, in accordance with our results, SAM analyses revealed that genes coding for the two growth factors HGF and FGF were specific for this cluster. The authors suggest that HGF could be among the best predictors of the t(15;17) abnormality. The functional categories that are recurrently identified by these studies include regulators of transcription and genes involved in the control of differentiation, cell cycle and apoptosis.

The most important general feature of the transcriptional profiles of de novo and sAPLs is their striking homogeneity. In particular, ‘unsupervised PCA’, in which no a priori knowledge of the class of the samples is used, indicated that the first principal component explains 88% of the total variability. The remarkable similarity in the pattern of gene expression among the patients implies a common pathological entity, irrespective of its causative agent. These common biological features may also explain the striking clinical similarities between de novo and sAPL, including their equally favorable response to differentiation-based protocols. However, a small number of genes were still capable of separating sAPLs from de novo cases in both the supervised analysis and PCA. The repression of genes involved in protein synthesis is of uncertain significance, although alterations in some of these genes might have a role in human disease, such as the RPS19 gene in Diamond-Blackfan anemia, a pure red cell aplasia of childhood caused by an intrinsic defect in erythropoietic progenitors.33
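The share of variability attributed to the first principal component can be computed from the singular values of the centred expression matrix. The sketch below is a minimal illustration on synthetic data: the matrix dimensions, the noise level and the choice to centre each sample column are assumptions, not details taken from the study.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of the total variance carried by each principal component.

    X is a genes x samples matrix; each sample column is centred first.
    """
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values only
    variances = s ** 2
    return variances / variances.sum()


# Synthetic cohort: 16 samples sharing one expression profile plus small noise.
rng = np.random.default_rng(1)
common_profile = rng.normal(8.0, 2.0, size=(500, 1))
X = common_profile + rng.normal(0.0, 0.3, size=(500, 16))

ratio = explained_variance_ratio(X)
print(f"PC1 explains {ratio[0]:.0%} of the total variability")
```

When all samples share one dominant profile, as in this synthetic cohort, almost all of the variance falls on the first component, which is the sense in which a high first-component share indicates homogeneity.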

Expression profiling of a selection of DNA repair genes was also capable of distinguishing sAPLs from de novo cases. It is well known that loss of MMR is a major step in the development of secondary leukemias, including APLs. Indeed, decreased levels of some MMR genes (MSH2, MSH6, MLH3, MSH3) were found in sAPLs compared to de novo cases. In addition, upregulation of several genes involved in recombinational repair was observed (RECQL5, RAD51C, XRCC4, BRCA2, ERCC1). The OGG1 glycosylase, which removes the oxidized base 8-oxoguanine from DNA, and the AP endonuclease APEX1 are also highly expressed in sAPL, suggesting a possible exposure to oxidative stress.34 These data suggest that at least this step in the leukemogenesis process occurring in sAPLs might differ from that in de novo cases and that an imbalance in these repair pathways may underlie the development of secondary tumors. Sporadic reports describe sAMLs with defects in Fanconi pathways or with polymorphisms in the MSH2 MMR gene35 or in XRCC1.36 Molecular analysis of breakpoint sequences in APL that developed following treatment with topoisomerase II inhibitors suggests that non-homologous end-joining-mediated events might give rise to different translocations in secondary and de novo cases.37

Global expression profiling, together with these observations, suggests that inefficiency of more than one DNA repair pathway might favor the neoplastic transformation process leading to sAPLs. In light of the changes in the expression of recombinational repair genes, the absence of extensive alterations in the karyotype of sAPLs is surprising. Altogether, these observations suggest that changes in the expression of DNA repair genes might produce major consequences at the gene rather than at the genomic level.



AML: acute myeloid leukaemia
APL: acute promyelocytic leukaemia
ATRA: all-trans retinoic acid
FDR: false discovery rate
GCOS: GeneChip operating software
RMA: robust multi-array average
HDAC: histone deacetylase
PCA: principal component analysis
PML-RARA: promyelocytic leukemia-retinoic acid receptor alpha
sAML: AML occurring as a second tumor
sAPL: APL occurring as a second tumor


  1. Di Croce L, Raker VA, Corsaro M, Fazi F, Fanelli M, Faretta M et al. Methyltransferase recruitment and DNA hypermethylation of target promoters by an oncogenic transcription factor. Science 2002; 295: 1079–1082.

  2. Segalla S, Rinaldi L, Kilstrup-Nielsen C, Badaracco G, Minucci S, Pelicci PG et al. Retinoic acid receptor alpha fusion to PML affects its transcriptional and chromatin-remodeling properties. Mol Cell Biol 2003; 23: 8795–8808.

  3. Grignani F, Ferrucci PF, Testa U, Talamo G, Fagioli M, Alcalay M et al. The acute promyelocytic leukemia-specific PML-RAR alpha fusion protein inhibits differentiation and promotes survival of myeloid precursor cells. Cell 1993; 74: 423–431.

  4. Beaumont M, Sanz M, Carli PM, Maloisel F, Thomas X, Detourmignies L et al. Therapy-related acute promyelocytic leukemia. J Clin Oncol 2003; 21: 2123–2137.

  5. Pulsoni A, Pagano L, Lo Coco F, Avvisati G, Mele L, Di Bona E et al. Clinicobiological features and outcome of acute promyelocytic leukemia occurring as a second tumor: the GIMEMA experience. Blood 2002; 100: 1972–1976.

  6. Larson RA, Wernli M, Le Beau MM, Daly KM, Pape LH, Rowley JD et al. Short remission durations in therapy-related leukemia despite cytogenetic complete responses to high-dose cytarabine. Blood 1988; 72: 1333–1339.

  7. Estey EH. Prognosis and therapy of secondary myelodysplastic syndromes. Haematologica 1998; 83: 543–549.

  8. Leone G, Voso MT, Sica S, Morosetti R, Pagano L. Therapy related leukemias: susceptibility, prevention and treatment. Leuk Lymphoma 2001; 41: 255–276.

  9. Ben-Yehuda D, Krichevsky S, Caspi O, Rund D, Polliack A, Abeliovich D et al. Microsatellite instability and p53 mutations in therapy-related leukemia suggest mutator phenotype. Blood 1996; 88: 4296–4303.

  10. Sheikhha MH, Tobal K, Liu Yin JA. High level of microsatellite instability but not hypermethylation of mismatch repair genes in therapy-related and secondary acute myeloid leukaemia and myelodysplastic syndrome. Br J Haematol 2002; 117: 359–365.

  11. Casorelli I, Offman J, Mele L, Pagano L, Sica S, D’Errico M et al. Drug treatment in the development of mismatch repair defective acute leukemia and myelodysplastic syndrome. DNA Repair 2003; 142: 1–13.

  12. Bignami M, Casorelli I, Karran P. Mismatch repair and response to DNA-damaging antitumour therapies. Eur J Cancer 2003; 39: 2142–2149.

  13. Pagano L, Pulsoni A, Tosti ME, Avvisati G, Mele L, Mele M et al. Clinical and biological features of acute myeloid leukaemia occurring as second malignancy: GIMEMA archive of adult acute leukaemia. Br J Haematol 2001; 112: 109–117.

  14. Liu WM, Mei R, Di X, Ryder TB, Hubbell E, Dee S et al. Analysis of high density expression microarrays with signed-rank call algorithms. Bioinformatics 2002; 18: 1593–1599.

  15. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003; 4: 249–264.

  16. Li C, Wong WH. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci USA 2001; 98: 31–36.

  17. Li C, Hung Wong W. Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application. Genome Biol 2001; 2: research0032.1–11.

  18. Landgrebe J, Wurst W, Welzl G. Permutation-validated principal components analysis of microarray data. Genome Biol 2002; 3: research0019.1–11.

  19. Wang A, Gehan EA. Gene selection for microarray data analysis using principal component analysis. Stat Med 2005; 24: 2069–2087.

  20. Topliss JG, Edwards RP. Chance factors in studies of quantitative structure–activity relationships. J Med Chem 1979; 22: 1238–1244.

  21. Alter O, Brown PO, Botstein D. Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA 2000; 97: 10101–10106.

  22. Pittelkow Y, Wilson SR. Use of principal component analysis and the GE-biplot for the graphical exploration of gene expression data. Biometrics 2005; 61: 630–632 (Discussion 632–634).

  23. Alcalay M, Meani N, Gelmetti V, Fantozzi A, Fagioli M, Orleth A et al. Acute myeloid leukemia fusion proteins deregulate genes involved in stem cell maintenance and DNA repair. J Clin Invest 2003; 112: 1751–1761.

  24. Bullinger L, Dohner K, Bair E, Frohling S, Schlenk RF, Tibshirani R et al. Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia. N Engl J Med 2004; 350: 1605–1616.

  25. Valk PJ, Verhaak RG, Beijen MA, Erpelinck CA, Barjesteh van Waalwijk van Doorn-Khosrovani S, Boer JM et al. Prognostically useful gene-expression profiles in acute myeloid leukemia. N Engl J Med 2004; 350: 1617–1628.

  26. Haferlach T, Kohlmann A, Schnittger S, Dugas M, Hiddemann W, Kern W et al. Global approach to the diagnosis of leukemia using gene expression profiling. Blood 2005; 106: 1189–1198.

  27. Alter O, Golub GH. Reconstructing the pathways of a cellular system from genome-scale signals by using matrix and tensor computations. Proc Natl Acad Sci USA 2005; 102: 17559–17564.

  28. Muller-Tidow C, Steffen B, Cauvet T, Tickenbrock L, Ji P, Diederichs S et al. Translocation products in acute myeloid leukemia activate the Wnt signaling pathway in hematopoietic cells. Mol Cell Biol 2004; 24: 2890–2904.

  29. Walter MJ, Park JS, Lau SK, Li X, Lane AA, Nagarajan R et al. Expression profiling of murine acute promyelocytic leukemia cells reveals multiple model-dependent progression signatures. Mol Cell Biol 2004; 24: 10882–10893.

  30. Thompson A, Quinn MF, Grimwade D, O'Neill CM, Ahmed MR, Grimes S et al. Global down-regulation of HOX gene expression in PML-RARalpha+ acute promyelocytic leukemia identified by small-array real-time PCR. Blood 2003; 101: 1558–1565.

  31. He LZ, Tribioli C, Rivi R, Peruzzi D, Pelicci PG, Soares V et al. Acute leukemia with promyelocytic features in PML/RARalpha transgenic mice. Proc Natl Acad Sci USA 1997; 94: 5302–5307.

  32. Brown D, Kogan S, Lagasse E, Weissman I, Alcalay M, Pelicci PG et al. A PMLRARalpha transgene initiates murine acute promyelocytic leukemia. Proc Natl Acad Sci USA 1997; 94: 2551–2556.

  33. Draptchinskaia N, Gustavsson P, Andersson B, Pettersson M, Willig TN, Dianzani I et al. The gene encoding ribosomal protein S19 is mutated in Diamond-Blackfan anaemia. Nat Genet 1999; 21: 169–175.

  34. Rusyn I, Asakura S, Pachkowski B, Bradford BU, Denissenko MF, Peters JM et al. Expression of base excision DNA repair genes is a sensitive biomarker for in vivo detection of chemical-induced chronic oxidative stress: identification of the molecular source of radicals responsible for DNA damage by peroxisome proliferators. Cancer Res 2004; 64: 1050–1057.

  35. Worrillow LJ, Travis LB, Smith AG, Rollinson S, Smith AJ, Wild CP et al. An intron splice acceptor polymorphism in hMSH2 and risk of leukemia after treatment with chemotherapeutic alkylating agents. Clin Cancer Res 2003; 9: 3012–3020.

  36. Seedhouse C, Bainton R, Lewis M, Harding A, Russell N, Das-Gupta E. The genotype distribution of the XRCC1 gene indicates a role for base excision repair in the development of therapy-related acute myeloblastic leukemia. Blood 2002; 100: 3761–3766.

  37. Mistry AR, Felix CA, Whitmarsh RJ, Mason A, Reiter A, Cassinat B et al. DNA topoisomerase II in therapy-related acute promyelocytic leukemia. N Engl J Med 2005; 352: 1529–1538.


We thank Ettore Meccia for valuable and constant support in paper preparation. Research grants: this work was supported by individual grants from AIRC and Ministero della Salute to MB, AIRC and MURST/COFIN to SF.

Author information

Correspondence to M Bignami.

Additional information

Supplementary Information accompanies the paper on the Leukemia website (

Supplementary information

Supplementary Material (DOC 400 kb)

Supplementary Table 1 (DOC 197 kb)

Supplementary Table 2 (DOC 774 kb)

Supplementary Table 3 (DOC 98 kb)

Supplementary Table 4 (DOC 41 kb)

Supplementary Figure 1 (JPG 3577 kb)

About this article


  • APL
  • sAPL
  • normal promyelocytes
  • microarray
  • DNA repair
