Original Paper

Oncogene (2005) 24, 1580–1588. doi:10.1038/sj.onc.1208344 Published online 10 January 2005

Distinct gene expression patterns associated with FLT3- and NRAS-activating mutations in acute myeloid leukemia with normal karyotype

Kai Neben1, Susanne Schnittger2, Benedikt Brors3, Björn Tews1, Felix Kokocinski1, Torsten Haferlach2, Jasmin Müller3, Meinhard Hahn1, Wolfgang Hiddemann2, Roland Eils3, Peter Lichter1 and Claudia Schoch2

  1. Division of Molecular Genetics (B060), Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany
  2. Laboratory for Leukemia Diagnostic, Department of Internal Medicine III, University Hospital Grosshadern, Ludwig-Maximilians-University, D-81366 Munich, Germany
  3. Division of Intelligent Bioinformatics Systems (B080), Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany

Correspondence: Peter Lichter, Division of Molecular Genetics (B060), Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany; E-mail: k.neben@dkfz.de

Received 31 August 2004; Revised 3 November 2004; Accepted 4 November 2004; Published online 10 January 2005.

Abstract



In acute myeloid leukemia (AML), constitutive activation of the FLT3 receptor tyrosine kinase, either by internal tandem duplications (FLT3-ITD) of the juxtamembrane region or by point mutations in the second tyrosine kinase domain (FLT3-TKD), as well as point mutations of the NRAS gene (NRAS-PM) are among the most frequent somatic gene mutations. To elucidate whether these mutations cause aberrant signal transduction in AML, we used gene expression profiling in a series of 110 newly diagnosed AML patients with normal karyotype. The different algorithms used for data analysis revealed highly concordant sets of genes, indicating that the identified gene signatures are specific for each analysed subgroup. Whereas samples with FLT3-ITD and FLT3-TKD could be separated with up to 100% accuracy, this did not apply to NRAS-PM and wild-type samples, suggesting that only FLT3-ITD and FLT3-TKD are associated with an apparent signature in AML. The set of discriminating genes included several known genes that are involved in cell cycle control (CDC14A, WEE1), gene transcription (HOXB5, FOXA1), and signal transduction (SMG1). In conclusion, we showed that unique gene expression patterns can be correlated with FLT3-ITD and FLT3-TKD. This might lead to the identification of further pathogenetically relevant candidate genes, particularly in AML with normal karyotype.


Keywords: acute myeloid leukemia, normal karyotype, gene expression profiling, FLT3-activating mutations, NRAS-activating mutations

Introduction



Acute myeloid leukemia (AML) comprises a group of hematopoietic stem cell malignancies which is heterogeneous with respect to biology and clinical course, affecting approximately 2–3 adults per 100 000 each year in Western countries. Currently, the karyotype is the most important independent prognostic factor in AML (Bloomfield et al., 1998; Grimwade et al., 1998; Buchner et al., 2003). However, in approximately 40–50% of newly diagnosed AML patients, acquired chromosome abnormalities are absent. These AML patients with normal karyotype are pooled together in the prognostically intermediate group, although individual courses may substantially differ. To optimize treatment strategies, a more precise understanding of the molecular mechanism of leukemogenesis, particularly in AML patients with normal karyotype, is clearly necessary.

In AML, signal transduction pathways are affected by frequent somatic mutations, including mutations of the FLT3 and RAS genes. Constitutive activation of the FLT3 receptor tyrosine kinase, either by internal tandem duplications (FLT3-ITD) of the juxtamembrane region or by point mutations in codon 835 of the second tyrosine kinase domain (FLT3-TKD), can be found in 20–30 and 5–15% of patients, respectively (Kottaridis et al., 2001; Yamamoto et al., 2001; Frohling et al., 2002; Schnittger et al., 2002; Thiede et al., 2002). FLT3-ITD was shown to lead to ligand-independent autophosphorylation of the receptor and to promote the growth of AML cells in vitro, apparently by stimulating proliferation and inhibiting apoptosis (Lisovsky et al., 1996; Kiyoi et al., 1998). In addition, it was shown that FLT3-TKD mutations are located in the activation loop of the second tyrosine kinase domain of FLT3 and constitutively activate the protein (Yamamoto et al., 2001). In both cases of FLT3 mutations, activation of the kinase leads to the stimulation of multiple signaling pathways, including the mitogen-activated protein (MAP) kinase and phosphatidylinositol 3 (PI3) kinase pathways (for a review, see Gilliland and Griffin, 2002). Interestingly, both FLT3-ITD and FLT3-TKD are most frequent in patients with normal karyotype, suggesting that these mutations may play a role in leukemia initiation, particularly in the cytogenetically normal group (Kottaridis et al., 2001; Yamamoto et al., 2001; Frohling et al., 2002; Schnittger et al., 2002; Thiede et al., 2002). In addition, these studies showed that the presence of FLT3-ITD in particular confers a poor prognosis in AML patients with normal karyotype, with shorter progression-free survival, overall survival, or both. Stimulated by the success of imatinib mesylate for the treatment of chronic myeloid leukemia, several potential FLT3 inhibitors are being tested by in vitro assays, in vivo murine models, and phase 1 and 2 trials in humans (Kelly et al., 2002; Weisberg et al., 2002; Griswold et al., 2004; Li et al., 2004).

The most commonly observed RAS mutations arise at sites critical for RAS regulation, namely, codons 12, 13, and 61. Each of these mutations results in the abrogation of the normal GTPase activity of RAS. In AML, point mutations in NRAS (NRAS-PM) are most frequently present at codon 61 and account for approximately 20% of de novo AML patients (for a review, see Reuter et al., 2000). Although NRAS is activated by point mutations, most AML studies did not find a correlation with cytogenetic abnormalities or with patient outcome (Neubauer et al., 1994; Kiyoi et al., 1999).

Currently, the expression of thousands of genes can be studied simultaneously by using modern microarray technologies. Previous studies showed that prognostic gene expression signatures are present in leukemic cells at diagnosis and suggest that the use of gene expression profiling will improve molecular classification and outcome prediction in AML (Schoch et al., 2002; Bullinger et al., 2004; Valk et al., 2004). To elucidate whether somatic mutations cause aberrant signal transduction in AML, we performed a gene expression profiling analysis of 110 AML samples and focused particularly on patients with normal karyotype that were characterized for FLT3-ITD, FLT3-TKD, and NRAS-PM at diagnosis. We applied different statistical algorithms to identify highly concordant sets of genes and to discover potential candidate genes that are particularly intriguing with respect to the pathomechanism of AML.

Results



Sample set

Our AML sample set consisted of 110 specimens representing the most frequent somatic mutations of AML with normal karyotype. In detail, our gene expression profiling study included 36, 20, 13, and two samples displaying FLT3-ITD, FLT3-TKD, NRAS-PM, and partial tandem duplications of the MLL (MLL-PTD) gene as a sole aberration, whereas 27 wild-type (wt) samples were negative for all of these mutations analysed (Table 1). In addition, our sample set included 12 specimens with multiple mutations, which showed the following characteristics: FLT3-ITD plus FLT3-TKD (n=3), FLT3-ITD plus NRAS-PM (n=1), FLT3-ITD plus MLL-PTD (n=1), FLT3-ITD plus FLT3-TKD plus NRAS-PM (n=1), FLT3-TKD plus NRAS-PM (n=2), FLT3-TKD plus MLL-PTD (n=2), MLL-PTD plus NRAS-PM (n=2). According to the French–American–British (FAB) classification, the present study included three samples with AML M0, 29 with AML M1, 31 with AML M2, 26 with AML M4, four with AML M5a, 12 with AML M5b, and five with AML M6.

SAM-analysis of differentially expressed genes

Genes characteristic for each of the FLT3-TKD, FLT3-ITD, and NRAS-PM sample sets were obtained by means of supervised analysis in comparison to wt specimens, using the Significance Analysis of Microarrays (SAM) method. Each of the three sample sets displayed FLT3-TKD, FLT3-ITD, or NRAS-PM as a sole aberration and was negative for MLL-PTD. Using a q-value of less than 5%, the SAM method identified 154, 47, and 43 genes as differentially regulated in FLT3-TKD, FLT3-ITD, and NRAS-PM samples, respectively, in comparison to the group of wt specimens. Those top ranking genes, which show differences in expression levels of <0.5 or >2, are listed in Table 2. Functionally, these genes are involved in a variety of biological processes, that is, regulation of transcription (HOXB5, FOXA1, MBD1), signal transduction (SMG1, RAC2), cell cycle control (CCRK, CDC14A, CDK9, KIF23, WEE1), and metabolism (CYP2J2, LDHB), or they mediate protein kinase activity (CDK9, PKMYT1, SMG1). Notably, both FOXA1 and SMG1 are highly expressed in FLT3-TKD and NRAS-PM samples, whereas no overlap of coexpressed genes was found for FLT3-ITD specimens in comparison to the FLT3-TKD and NRAS-PM groups.

Shrunken centroid classification

To analyse whether FLT3-ITD, FLT3-TKD, NRAS-PM, and wt AML samples with normal karyotype are characterized by specific gene expression signatures, shrunken centroid classification was applied, which identifies minimal combinations of genes that provide a signature for the best discrimination between known patient groups (Figure 1). The patient samples in the four analysed groups (FLT3-ITD, FLT3-TKD, NRAS-PM, wt) were classified accurately in 74 of 96 cases (77.1%; Figure 2). Notably, patient samples in the FLT3-ITD, FLT3-TKD, and wt group were classified with 100, 100, and 67% accuracy, respectively, whereas all of the patients carrying NRAS-PM were misclassified.

Figure 1. Shrunken centroid classification of AML expression data according to the mutation status, that is, FLT3-ITD vs FLT3-TKD vs NRAS vs wt. The orientation of the bars in regard to the central line represents down- (left) or upregulation (right) of the gene in the corresponding class. The length of the bars represents the difference in mean expression over samples of one class as compared to the mean over all samples. Thus, it also visualizes the significance of the gene for this classification.

Figure 2. Estimated probabilities for the classification of FLT3-ITD, FLT3-TKD, NRAS-PM, and wt samples according to the shrunken centroid classification. Notably, patient samples in the FLT3-ITD, FLT3-TKD, and wt group were classified with 100, 100, and 67% accuracy, respectively, whereas all of the patients carrying NRAS-PM were misclassified.

Neural network analysis

We applied a neural network analysis as an independent methodological approach to the shrunken centroid classification and developed a multiple-tree classifier to separate the AML samples based on their mutation status (Figure 3). As a result, we found that a set of 10 genes is sufficient to discriminate between the five AML patient groups FLT3-ITD, FLT3-TKD, NRAS-PM, MLL-PTD, and wt with a classification rate of 83.7% after 10-fold crossvalidation (Figure 3a). For each group, the classification rate was as follows: FLT3-TKD (90.0%), FLT3-ITD (86.1%), wt (85.2%), NRAS-PM (76.9%), and MLL-PTD (0.0%). Functionally, the discriminating genes are involved in a variety of biological processes, such as regulation of transcription (ZNF9, FOXA1), cell cycle control (CDC2, CDK9, KIF23), protein phosphatase activity (CDC14A), collagen binding (SERPINH1), as well as structural constituents of the cytoskeleton (INA, MAP1LC3B). With the set of 10 discriminating genes, an unsupervised cluster analysis was performed, which also included the 12 additional samples with multiple mutations (Figure 3b). We identified three major clusters: clusters A and B included all samples with FLT3-TKD and NRAS-PM, cluster C all samples with MLL-PTD, clusters A and C most of the samples with wt (26 of 27) and FLT3-ITD (35 of 36), and cluster B most of the samples with multiple mutations (seven of 12). All seven specimens with multiple mutations included in cluster B had either NRAS-PM or FLT3-TKD.
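The overall classification rates can be cross-checked against the per-group rates and the group sizes given under "Sample set". The following is a minimal sketch of that arithmetic; the function name `overall_accuracy` is hypothetical, and the step of rounding each reported percentage back to a whole number of samples is an assumption made here because the rates are reported rounded.

```python
def overall_accuracy(group_sizes, group_rates):
    """Combine per-group classification rates into one overall rate.

    Each rounded rate is first converted back to a whole number of
    correctly classified samples (an assumption of this sketch).
    """
    correct = sum(round(group_sizes[g] * group_rates[g]) for g in group_sizes)
    total = sum(group_sizes.values())
    return correct, total, correct / total


# Group sizes for samples carrying a single aberration (from "Sample set").
nn_sizes = {"FLT3-ITD": 36, "FLT3-TKD": 20, "wt": 27, "NRAS-PM": 13, "MLL-PTD": 2}
nn_rates = {"FLT3-ITD": 0.861, "FLT3-TKD": 0.900, "wt": 0.852,
            "NRAS-PM": 0.769, "MLL-PTD": 0.0}

# Shrunken centroid classification used four groups (no MLL-PTD).
sc_sizes = {"FLT3-ITD": 36, "FLT3-TKD": 20, "wt": 27, "NRAS-PM": 13}
sc_rates = {"FLT3-ITD": 1.0, "FLT3-TKD": 1.0, "wt": 0.67, "NRAS-PM": 0.0}

nn_correct, nn_total, nn_rate = overall_accuracy(nn_sizes, nn_rates)
sc_correct, sc_total, sc_rate = overall_accuracy(sc_sizes, sc_rates)
print(f"neural network:    {nn_correct}/{nn_total} = {nn_rate:.1%}")
print(f"shrunken centroid: {sc_correct}/{sc_total} = {sc_rate:.1%}")
```

Under these assumptions the per-group figures reproduce the reported totals: 82 of 98 samples (83.7%) for the neural network classifier and 74 of 96 samples (77.1%) for shrunken centroid classification.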

Figure 3. Identification of a 10-gene signature by neural network analysis that allows the correct classification of FLT3-ITD, FLT3-TKD, NRAS-PM, MLL-PTD, and wt samples with an accuracy of 83.7%. Panel a shows a schematic representation of the decision trees that allow a separation of the AML samples according to their mutation status. Samples are color-coded according to their mutation status of FLT3, NRAS, and MLL, determined on the basis of genomic PCR and sequencing. Panel b displays the results of an unsupervised clustering of all 110 samples, which was performed with the 10-gene signature. The dendrogram shows three major clusters. Mean-centered ratios of gene expression are depicted by a pseudocolor scale. Gray areas indicate poorly measured genes that were removed from the data set after biomathematical analysis. Highly expressed genes are shown in red, whereas genes that are expressed at lower levels are displayed in green.

Real-time quantitative reverse transcriptase–polymerase chain reaction (RQ–PCR) analysis

To assess the results of the microarray data in further detail, CDC14A, ZNF9, and FOXA1 were chosen for further quantitative analysis using RQ–PCR (Figure 4). RQ–PCR analysis was performed in six wt, six FLT3-TKD, six FLT3-ITD, and six NRAS-PM randomly chosen AML samples. Consistent with the results obtained by microarray experiments, CDC14A and ZNF9 were found to be expressed 2.3- and 2.5-fold higher in wt than in FLT3-ITD samples (Mann–Whitney U test: P-values 0.18 and 0.18, respectively), whereas FOXA1 was found to be expressed 1.8- and 2.0-fold higher in NRAS-PM and FLT3-TKD than in wt samples (Mann–Whitney U test: P-values 0.25 and 0.20, respectively).

Figure 4. Quantitative expression analysis of CDC14A, ZNF9, and FOXA1 in 24 randomly selected AML samples. Relative expression levels are shown, measured by RQ-PCR and normalized to three housekeeping genes (PGK1, LMNB1, PPIA). The height of the bars represents the relative gene expression for individual patients on a logarithmic scale.

Discussion



To elucidate the molecular events associated with frequent somatic mutations (FLT3-ITD, FLT3-TKD, NRAS-PM), we analysed molecular alterations on the gene expression level in a series of 110 newly diagnosed AML samples with normal karyotype. The different methods used for data analysis, including SAM, shrunken centroid classification, and neural network analysis, revealed highly concordant sets of genes, indicating that the identified genes are specific for each analysed subgroup. For instance, the signature identified by shrunken centroid classification included 35 genes, showing an overlap with the discriminating genes revealed by the SAM method and neural network analysis in 17 of 20 genes and seven of 10 genes, respectively. In addition, the results obtained by gene expression profiling were confirmed by RQ–PCR for CDC14A, ZNF9, and FOXA1, which were among the top ranking genes for the discrimination between FLT3-ITD, FLT3-TKD, NRAS-PM, and wt samples.

We could separate AML samples with FLT3-ITD and FLT3-TKD into subgroups with 100% accuracy, based on distinct patterns of gene expression revealed by shrunken centroid classification, suggesting that both mutations result in different biological entities within the scope of AML. Using neural network analysis, we were able to identify 10 discriminating genes, which allowed a classification of the samples with 83.7% accuracy. Interestingly, the classification rate revealed by shrunken centroid classification was only 77.1%, although the signature included a higher number of genes (n=35). This finding suggests that much of the clustering result obtained by shrunken centroid classification might be based on irrelevant similarities. Using shrunken centroid classification, all AML samples with FLT3-ITD and FLT3-TKD were classified with 100% accuracy, whereas all NRAS-PM specimens were misclassified. Although we cannot exclude that this problem in classification is related to the small sample size of this particular subgroup (n=13), Valk et al. (2004) reported correspondingly that mutations in codons 12, 13, and 61 of the small RAS GTPases (NRAS and KRAS) resulted in no apparent gene expression signature in AML.

Notably, when the set of 10 discriminating genes revealed by neural network analysis was used for unsupervised clustering, the samples characterized by FLT3-ITD, FLT3-TKD, and NRAS-PM were found in two of three clusters each. In line with two recent AML studies (Bullinger et al., 2004; Valk et al., 2004), the unequal distribution of FLT3-ITD between different subgroups identified by gene expression profiling supports the concept that distinct biologic changes underlie the clinical phenotypes. Interestingly, FLT3-ITD and NRAS-PM samples clustered together, which both did not correlate with leukocytosis at diagnosis or outcome of the patients in recent AML studies (Neubauer et al., 1994; Kiyoi et al., 1999; Yamamoto et al., 2001; Thiede et al., 2002). Although all 110 samples included in the present study were also analysed for MLL-PTD, this subgroup was too small and heterogeneous for an independent statistical analysis. Our sample set included two specimens carrying MLL-PTD as a sole aberration, which clustered together with the FLT3-ITD samples. Since FLT3-ITD as well as MLL-PTD adversely affect the clinical outcome of AML patients (Kottaridis et al., 2001; Dohner et al., 2002; Frohling et al., 2002), our finding suggests that our gene expression signature based on 10 discriminating genes is of prognostic significance. Interestingly, all 110 specimens analysed segregated into three distinct groups after unsupervised clustering. This finding suggests that the large group of cytogenetically normal AML patients can be subdivided into groups that share common aberrations in signal transduction pathways, which might explain the heterogeneity of this patient group with respect to clinical outcome. In the future, it will be important to refine and validate our gene expression signature in an independent set of AML samples and to test the prognostic significance of our gene expression signature together with clinically defined variables.

The current gene expression study provides insight into the pathogenesis of AML, including, for example, the role of the transcription factors HOXB5 and FOXA1. We found HOXB5 among the genes most abundantly expressed in FLT3-ITD samples, whereas FOXA1 was highly expressed in FLT3-TKD and NRAS-PM specimens. Bullinger et al. (2004) reported that high expression levels of HOXA4, HOXA10, HOXB2, and HOXB5 are associated with a poor outcome in AML, suggesting that homeobox-gene dysregulation plays a significant role in leukemogenesis. The transcription factor FOXA1 is a member of the forkhead gene family, and a recent gene expression profiling study of lung adenocarcinomas demonstrated that FOXA1 was among the most abundantly expressed genes in comparison to normal lung (Lin et al., 2002). In mice, the signalling factor sonic hedgehog (SHH) has been shown to be regulated by a FOXA1-dependent mechanism (Epstein et al., 1999). Since overexpression of SHH in mice has been associated with the development of basal cell carcinoma (Oro et al., 1997), this finding also suggests a potential oncogenic involvement of FOXA1.

In addition, we found that the PI-3-kinase-related kinase SMG1 is abundantly expressed in NRAS-PM and FLT3-TKD samples. The phosphatidylinositol 3-kinase (PI3K)/AKT protein kinase pathway is involved in cell growth, proliferation, and apoptosis. Since wortmannin and caffeine were found to inhibit the kinase activity of SMG1 (Yamashita et al., 2001) and inhibition of PI3K promotes apoptosis in myeloid leukemias (Zhao et al., 2004), SMG1 might serve as a therapeutic target in AML.

The set of genes discriminating between FLT3-ITD, FLT3-TKD, and NRAS-PM samples includes several known genes that are involved in the regulation of cell cycle and mitosis. Notably, both WEE1 and CDC14A, two negative regulators of G2/M phase transition of the cell cycle, were found to be expressed at lower levels in FLT3-ITD specimens. The human WEE1 tyrosine kinase appears to coordinate the transition between DNA replication and mitosis by protecting the nucleus from cytoplasmically activated CDC2 kinase (Heald et al., 1993). Recently, Mailand et al. (2002) showed that downregulation of endogenous CDC14A by short inhibitory RNA duplexes (siRNA) induces mitotic defects in human cells. Therefore, the unequal expression of WEE1 and CDC14A between FLT3-ITD and wt samples suggests that dysregulation of G2/M transition checkpoints of the cell cycle might contribute to leukemogenesis and disease progression in this particular subtype of AML.

In summary, our results provide new insight into biochemical pathways that are particularly intriguing with respect to the pathomechanism of AML with normal karyotype characterized by FLT3-ITD, FLT3-TKD, and NRAS-PM. The finding that FLT3-ITD and FLT3-TKD can be separated from wt specimens with high accuracy supports the concept that distinct biological changes underlie the clinical phenotype. The current study provides insight into the molecular pathways that sustain the growth and survival of leukemic cells. Functional approaches like RNAi knockdown (e.g. CDC14A in FLT3-ITD samples) or vector-induced overexpression (e.g. FOXA1 in FLT3-TKD samples) followed by phenotypic analysis should be used to analyse the role of the identified genes with respect to leukemogenesis in further detail. Such genes might provide a basis for optimization of FLT3 targeting therapies and eventually lead to new agents for challenging diseases like AML.


Materials and methods

Sample set

Fresh blood or bone marrow samples from 110 AML patients were referred to the Laboratory for Leukemia Diagnostics between 1999 and 2002 for cytomorphologic, cytogenetic, and molecular analyses. All diagnoses indicated AML according to standard FAB criteria (Bennett et al., 1985). All patients were adults and presented with de novo AML. Informed consent according to the Declaration of Helsinki was approved by the local ethics committee of the Ludwig-Maximilians-University of Munich. All patients included in the present study had normal karyotypes as analysed by cytogenetic G-banding with standard methods. The definition of a cytogenetic clone and descriptions of karyotypes followed the International System for Human Cytogenetic Nomenclature (Mitelman, 1995). The detection of FLT3-ITD, FLT3-TKD, and NRAS-PM by genomic PCR and sequencing of all PCR products larger than the wt allele was performed as described previously (Schnittger et al., 2002). In addition, all samples were analysed for partial tandem duplications of the MLL (MLL-PTD) gene (Schnittger et al., 2000).

Microarray experiments

For gene expression experiments, cDNA-microarrays containing replicate spots of 4211 different gene-specific fragments, representing 2600 different genes with relevance to mitosis, cell cycle control, oncogenesis, or apoptosis, were processed as described previously (Korshunov et al., 2003; Neben et al., 2004). Mononucleated bone marrow cells were obtained by Ficoll Hypaque density gradient centrifugation from 110 AML patients. Total RNA was extracted from 10^7 cells with the MagnaPureLC mRNA Kit I (Roche Diagnostics, Mannheim, Germany). The tumor RNA was cohybridized with commercially available Universal Human Reference RNA (Stratagene, La Jolla, USA), derived from 10 human cancer cell lines. Approximately 1 µg of AML and 1 µg of reference mRNA were labeled with Cy3 and Cy5, respectively, using the Omniscript Reverse Transcriptase kit (Qiagen) and hybridized with 10 µg Cot-1 DNA, 30 µg bovine liver tRNA, and 10 µg oligo-dT nucleotides in an automated hybridization chamber (GeneTac; Genomic Solutions, Ann Arbor, USA). For all samples, we performed color switch experiments, where the tumor and reference DNA were labeled via Cy3- and Cy5-dUTP, respectively, and vice versa. Data sets for spots not recognized by the GenePix Pro 4.0 analysis software (Axon Instruments, Union City, USA) were excluded from further consideration. Additionally, all remaining data sets were ranked according to spot homogeneity (as assayed by the ratio of median and mean fluorescence intensities), spot intensity, and the standard deviation of log ratios for replicate spots. Data points ranked among the lowest 20% on the criteria just described were removed from the data set. For each hybridization, fluorescence ratios (Cy5/Cy3) were normalized by variance stabilization (Huber et al., 2002). To combine experiments with switched dye labeling, the ratios of one experiment were inverted and averaged with the corresponding spots on the second array.
The raw data of the microarray experiments will be available at the following URL: http://www.dkfz.de/kompl_genom
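The combination of dye-swapped hybridizations described above can be illustrated with a short sketch. It assumes plain Cy5/Cy3 ratios and averaging on the log2 scale, both of which are assumptions of this example (the study itself worked on variance-stabilized values); the function name `combine_dye_swap` and the toy ratios are hypothetical.

```python
import math


def combine_dye_swap(ratios_forward, ratios_swapped):
    """Return one averaged log2 ratio per spot for a dye-swap pair.

    The ratio from the swapped array is inverted first so that both
    arrays express the same orientation before averaging.
    """
    combined = []
    for forward, swapped in zip(ratios_forward, ratios_swapped):
        log_forward = math.log2(forward)
        log_swapped = math.log2(1.0 / swapped)  # invert the swapped ratio
        combined.append((log_forward + log_swapped) / 2.0)
    return combined


# Toy ratios for three spots (illustrative values only, not study data).
forward = [2.0, 0.5, 1.0]
swapped = [0.5, 2.0, 1.0]
combined = combine_dye_swap(forward, swapped)
print(combined)
```

Averaging the inverted pair cancels dye-specific bias that affects both orientations equally, which is the usual motivation for a color switch design.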

Biomathematical analyses of microarray data

We used the SAM method described by Tusher et al. (2001) to identify genes with statistically significant changes in expression between FLT3-ITD, FLT3-TKD, and NRAS-PM samples, comparing each of the sample sets separately with the group of wt specimens. SAM analysis was performed on logged data with the following parameters: (i) two classes, unpaired data; (ii) 100 permutations; (iii) K-nearest-neighbor imputation of missing values with 10 neighbors. Values were not restricted by setting any fold change. The criterion for identifying differentially expressed genes was a q-value of less than 5%.
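To illustrate the kind of statistic SAM is built on, the following minimal sketch computes the relative difference d for one gene in a two-class unpaired design and a simple permutation-based significance estimate. It is a simplification and not the SAM software: SAM estimates the fudge factor s0 from all genes and derives q-values from the ordered statistics across genes, whereas here s0 is a fixed hypothetical value and the permutation result is a per-gene fraction. Function names and the toy data are assumptions.

```python
import math
import random


def relative_difference(group_a, group_b, s0=0.0):
    """SAM-style statistic d = (mean_a - mean_b) / (s + s0) for one gene."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    sum_sq = (sum((x - mean_a) ** 2 for x in group_a)
              + sum((x - mean_b) ** 2 for x in group_b))
    # Pooled standard error of the difference in means.
    s = math.sqrt((1.0 / n_a + 1.0 / n_b) * sum_sq / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)


def permutation_fraction(group_a, group_b, s0=0.0, n_perm=100, seed=0):
    """Fraction of label permutations whose |d| reaches the observed |d|."""
    rng = random.Random(seed)
    observed = abs(relative_difference(group_a, group_b, s0))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(relative_difference(pooled[:n_a], pooled[n_a:], s0)) >= observed:
            hits += 1
    return hits / n_perm


# Toy log-expression values for one gene (illustrative, not study data).
mutated = [2.0, 2.2, 1.8, 2.1]
wild_type = [1.0, 1.1, 0.9, 1.2]
d_value = relative_difference(mutated, wild_type, s0=0.1)
fraction = permutation_fraction(mutated, wild_type, s0=0.1, n_perm=100)
print(d_value, fraction)
```

The constant s0 in the denominator damps genes with very small variance, so that a tiny but consistent difference does not dominate the ranking.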

The method of shrunken centroids was employed to classify FLT3-ITD, FLT3-TKD, NRAS-PM, and wt samples on the basis of specific expression signatures (Tibshirani et al., 2002). In this analysis, we included only samples that carry one specific mutation (n=96), since the specimens with multiple mutations form a heterogeneous group (n=12) and the group of MLL-PTD samples (n=2) was too small for a separate analysis.

For each gene, a centroid was calculated for each class and standardized by dividing the average gene expression by the within-class standard deviation for this gene, giving higher weight to genes whose expression is stable within samples of the same class. Each centroid was then reduced towards 0 by a value proportional to the standard deviation of gene expression over the samples of the centroid's class. All genes whose mean value was reduced to zero or below in this procedure were removed from the centroid. The shrunken centroid can then be used for classification by determining the distances between each sample and each centroid. The optimal amount of shrinkage was determined by 10-fold crossvalidation, which also allows a judgement of the classification quality.
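The shrinkage step described above can be sketched as follows. This is a simplified illustration, not the published implementation: the class-size scaling factor of the original method is omitted, the shrinkage amount `delta` is fixed by hand instead of being chosen by crossvalidation, and all function names and toy data are hypothetical.

```python
import math


def soft_threshold(value, delta):
    """Shrink value towards zero by delta; small values become exactly zero."""
    magnitude = abs(value) - delta
    return math.copysign(magnitude, value) if magnitude > 0 else 0.0


def fit_shrunken_centroids(samples, labels, delta, s0=1e-6):
    """Return shrunken class centroids, per-gene pooled sd, overall centroid."""
    n, n_genes = len(samples), len(samples[0])
    classes = sorted(set(labels))
    overall = [sum(s[g] for s in samples) / n for g in range(n_genes)]
    means = {}
    for c in classes:
        members = [s for s, lab in zip(samples, labels) if lab == c]
        means[c] = [sum(m[g] for m in members) / len(members)
                    for g in range(n_genes)]
    pooled_sd = []
    for g in range(n_genes):
        sum_sq = sum((s[g] - means[lab][g]) ** 2
                     for s, lab in zip(samples, labels))
        pooled_sd.append(math.sqrt(sum_sq / (n - len(classes))) + s0)
    shrunken = {}
    for c in classes:
        shrunken[c] = []
        for g in range(n_genes):
            # Standardized class difference, shrunk, then mapped back.
            d = (means[c][g] - overall[g]) / pooled_sd[g]
            shrunken[c].append(overall[g] + pooled_sd[g] * soft_threshold(d, delta))
    return shrunken, pooled_sd, overall


def classify(sample, shrunken, pooled_sd):
    """Assign the class whose shrunken centroid is nearest (standardized)."""
    def distance(c):
        return sum(((x - m) / sd) ** 2
                   for x, m, sd in zip(sample, shrunken[c], pooled_sd))
    return min(shrunken, key=distance)


def active_genes(shrunken, overall):
    """Indices of genes that still differ from the overall centroid."""
    return [g for g in range(len(overall))
            if any(shrunken[c][g] != overall[g] for c in shrunken)]


# Toy data: gene 0 separates the classes, gene 1 is mostly noise.
samples = [[2.0, 0.2], [2.2, -0.1], [1.8, 0.0],
           [-2.0, 0.1], [-2.1, -0.2], [-1.9, 0.0]]
labels = ["A", "A", "A", "B", "B", "B"]
shrunken, pooled_sd, overall = fit_shrunken_centroids(samples, labels, delta=1.0)
print(active_genes(shrunken, overall))
print(classify([1.5, 0.3], shrunken, pooled_sd))
```

In this toy example only gene 0 survives the shrinkage, which mirrors how the method reduces a signature to a minimal set of discriminating genes.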

For classification, we used artificial neural networks (Bishop, 1995). To avoid overfitting, we used an implementation in the NETLAB toolbox (Nabney, 2002) for MATLAB (MathWorks Inc., Natick, MA, USA) that was regularized by adding an additional error term to the loss function (Ragg, 2002):

[Equation not reproduced: regularized loss function L]

where L is the loss function, I an indicator function that is 1 if the predicted class, f(x_n), does not match the true class Y_n belonging to the expression profile x_n, and 0 otherwise; λ is a regularization parameter, and the w_i are the weights of the neural network. The regularization parameter is chosen by Bayesian learning as described by Ragg (2002).
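The displayed equation is not available in this version of the text. A form consistent with the definitions above, assuming the common weight-decay penalty on squared weights (an assumption; the original may differ in the exponent or in a constant factor), would be:

```latex
L = \sum_{n} I\bigl(f(x_n) \neq Y_n\bigr) + \lambda \sum_{i} w_i^{2}
```

The first sum counts misclassified expression profiles, and the second term penalizes large network weights, with λ setting the balance between the two.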

To determine a reasonable number of input variables, we used the following algorithm: First, neural networks were trained using only one input node. Among all variables, we chose the one that resulted in maximum Bayesian model evidence (Ragg, 2002). Then, we searched for a second variable that optimized the model evidence in combination with the first variable. This was iterated until the model evidence could not be increased further. Model evidence was also used to find an optimal number of hidden units in the range of 0–5. All neural networks used had maximum evidence without any hidden unit. Neural networks had one output unit for binary classification. In order to solve the multiclass discrimination task, we combined several binary classifiers in a tree-like fashion. At each node of the tree, a classifier was constructed that was able to discriminate between sets of samples, such that all classes could finally be separated. The presented tree topology performed best among several topologies tested. Accuracy estimates were calculated using 10-fold crossvalidation.
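The variable selection loop described above is a greedy forward selection. A minimal sketch follows; the `score` callable stands in for the Bayesian model evidence of a trained network, and the toy scoring function, the gene labels, and the function name `forward_select` are hypothetical.

```python
def forward_select(candidates, score):
    """Greedy forward selection.

    Adds, one at a time, the variable that gives the highest score in
    combination with those already chosen, and stops as soon as no
    addition increases the score.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining:
        trial_scores = {v: score(selected + [v]) for v in remaining}
        best_var = max(trial_scores, key=trial_scores.get)
        if trial_scores[best_var] <= best_score:
            break  # score could not be increased further
        selected.append(best_var)
        best_score = trial_scores[best_var]
        remaining.remove(best_var)
    return selected, best_score


# Toy stand-in for model evidence: each variable adds a gain, and every
# additional input costs a fixed complexity penalty.
gains = {"gene_a": 3.0, "gene_b": 2.0, "gene_c": 0.4, "gene_d": 0.1}


def toy_evidence(variables):
    return sum(gains[v] for v in variables) - 0.5 * len(variables)


chosen, evidence = forward_select(gains, toy_evidence)
print(chosen, evidence)
```

With this toy score the loop stops after two variables, because a third one adds less than the complexity penalty it incurs.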


RQ–PCR analysis

Each cDNA sample was analysed in triplicate (aliquot of 1 µl each) using the ABI PRISM 7700 Sequence Detector (Applied Biosystems, Weiterstadt, Germany) as described previously (Korz et al., 2002). To standardize the amount of sample cDNA, three endogenous control amplicons were used as housekeeping genes (for sequence, see reference above), coding for phosphoglycerate kinase 1 (PGK1), lamin B1 (LMNB1), and cyclophilin A (PPIA). In addition, oligonucleotides used for RQ–PCR were as follows: CDC14A forward, 5'-GCACTTACAATCTCACCATTC-3'; CDC14A reverse, 5'-CATGTTGTAATCCCTTTCTG-3'; ZNF9 forward, 5'-GACGCGGAAGATCTGACTG-3'; ZNF9 reverse, 5'-TCGTCCACACTTGAAGCACTC-3'; FOXA1 forward, 5'-CATTGCCATCGTGTGCTTGT-3'; FOXA1 reverse, 5'-CCCGTCTGGCTATAC-3'. The relative quantification of each target gene in comparison to the reference genes was carried out by using the mathematical model developed by Pfaffl (2001).
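The Pfaffl model expresses relative expression as the efficiency-corrected change of the target gene divided by that of a reference gene. The sketch below applies it with three reference genes combined by their geometric mean; that way of combining several references is an assumption of this example and is not stated in the text. The function name, the efficiencies, and the threshold cycle values are hypothetical.

```python
import math


def pfaffl_ratio(e_target, ct_target_control, ct_target_sample, references):
    """Efficiency-corrected relative expression ratio.

    references is a list of (efficiency, ct_control, ct_sample) tuples,
    one per reference gene. An efficiency of 2.0 means perfect doubling
    per cycle.
    """
    target_term = e_target ** (ct_target_control - ct_target_sample)
    ref_terms = [e ** (ct_control - ct_sample)
                 for e, ct_control, ct_sample in references]
    # Geometric mean over reference genes (assumption of this sketch).
    ref_term = math.prod(ref_terms) ** (1.0 / len(ref_terms))
    return target_term / ref_term


# Hypothetical threshold cycles: target gene appears 2 cycles earlier in
# the sample than in the control; reference genes are unchanged.
stable_refs = [(2.0, 20.0, 20.0), (2.0, 22.0, 22.0), (2.0, 18.0, 18.0)]
ratio = pfaffl_ratio(2.0, 25.0, 23.0, stable_refs)
print(ratio)
```

With unchanged reference genes and perfect efficiency, a shift of two cycles corresponds to a fourfold higher expression of the target gene in the sample.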



  1. Bennett JM, Catovsky D, Daniel MT, Flandrin G, Galton DA, Gralnick HR & Sultan C. (1985) Ann. Intern. Med. 103: 620–625.
  2. Bishop CM. (ed). (1995) Neural Networks for Pattern Recognition. Oxford University Press: Oxford, UK.
  3. Bloomfield CD, Lawrence D, Byrd JC, Carroll A, Pettenati MJ, Tantravahi R, Patil SR, Davey FR, Berg DT, Schiffer CA, Arthur DC & Mayer RJ. (1998) Cancer Res. 58: 4173–4179.
  4. Buchner T, Hiddemann W, Berdel WE, Wormann B, Schoch C, Fonatsch C, Loffler H, Haferlach T, Ludwig WD, Maschmeyer G, Staib P, Aul C, Gruneisen A, Lengfelder E, Frickhofen N, Kern W, Serve HL, Mesters RM, Sauerland MC & Heinecke A. (2003) J. Clin. Oncol. 21: 4496–4504.
  5. Bullinger L, Dohner K, Bair E, Frohling S, Schlenk RF, Tibshirani R, Dohner H & Pollack JR. (2004) N. Engl. J. Med. 350: 1605–1616.
  6. Dohner K, Tobis K, Ulrich R, Frohling S, Benner A, Schlenk RF & Dohner H. (2002) J. Clin. Oncol. 20: 3254–3261.
  7. Epstein DJ, McMahon AP & Joyner AL. (1999) Development 126: 281–292.
  8. Frohling S, Schlenk RF, Breitruck J, Benner A, Kreitmeier S, Tobis K, Dohner H, Dohner K & AML Study Group Ulm. (2002) Blood 100: 4372–4380.
  9. Gilliland DG & Griffin JD. (2002) Blood 100: 1532–1542.
  10. Grimwade D, Walker H, Oliver F, Wheatley K, Harrison C, Harrison G, Rees J, Hann I, Stevens R, Burnett A & Goldstone A. (1998) Blood 92: 2322–2333.
  11. Griswold IJ, Shen LJ, La Rosee P, Demehri S, Heinrich MC, Braziel RM, McGreevey L, Haley AD, Giese N, Druker BJ & Deininger MW. (2004) Blood 104: 2912–2918.
  12. Heald R, McLoughlin M & McKeon F. (1993) Cell 74: 463–474.
  13. Huber W, Von Heydebreck A, Sultmann H, Poustka A & Vingron M. (2002) Bioinformatics 18: 96–104.
  14. Kelly LM, Yu JC, Boulton CL, Apatira M, Li J, Sullivan CM, Williams I, Amaral SM, Curley DP, Duclos N, Neuberg D, Scarborough RM, Pandey A, Hollenbach S, Abe K, Lokker NA, Gilliland DG & Giese NA. (2002) Cancer Cell 5: 421–432.
  15. Kiyoi H, Naoe T, Nakano Y, Yokota S, Minami S, Miyawaki S, Asou N, Kuriyama K, Jinnai I, Shimazaki C, Akiyama H, Saito K, Oh H, Motoji T, Omoto E, Saito H, Ohno R & Ueda R. (1999) Blood 93: 3074–3080.
  16. Kiyoi H, Towatari M, Yokota S, Hamaguchi M, Ohno R, Saito H & Naoe T. (1998) Leukemia 12: 1333–1337.
  17. Korshunov A, Neben K, Wrobel G, Tews B, Benner A, Hahn M, Golanov A & Lichter P. (2003) Am. J. Pathol. 163: 1721–1727.
  18. Korz C, Pscherer A, Benner A, Mertens D, Schaffner C, Leupolt E, Dohner H, Stilgenbauer S & Lichter P. (2002) Blood 99: 4554–4561.
  19. Kottaridis PD, Gale RE, Frew ME, Harrison G, Langabeer SE, Belton AA, Walker H, Wheatley K, Bowen DT, Burnett AK, Goldstone AH & Linch DC. (2001) Blood 98: 1752–1759.
  20. Li Y, Li H, Wang MN, Lu D, Bassi R, Wu Y, Zhang H, Balderes P, Ludwig DL, Pytowski B, Kussie P, Piloto O, Small D, Bohlen P, Witte L, Zhu Z & Hicklin DJ. (2004) Blood 104: 1137–1144.
  21. Lin L, Miller CT, Contreras JI, Prescott MS, Dagenais SL, Wu R, Yee J, Orringer MB, Misek DE, Hanash SM, Glover TW & Beer DG. (2002) Cancer Res. 62: 5273–5279.
  22. Lisovsky M, Estrov Z, Zhang X, Consoli U, Sanchez-Williams G, Snell V, Munker R, Goodacre A, Savchenko V & Andreeff M. (1996) Blood 88: 3987–3997.
  23. Mailand N, Lukas C, Kaiser BK, Jackson PK, Bartek J & Lukas J. (2002) Nat. Cell Biol. 4: 317–322.
  24. Mitelman F. (ed). (1995) An International System for Human Cytogenetic Nomenclature. Karger: Basel, Switzerland.
  25. Nabney IT. (ed). (2002) NETLAB: Algorithms for Pattern Recognition. Springer Publishers: Heidelberg, Germany.
  26. Neben K, Korshunov A, Benner A, Wrobel G, Hahn M, Golanov A & Lichter P. (2004) Cancer Res. 64: 3103–3111.
  27. Neubauer A, Dodge RK, George SL, Davey FR, Silver RT, Schiffer CA, Mayer RJ, Ball ED, Wurster-Hill D & Bloomfield CD. (1994) Blood 83: 1603–1611.
  28. Oro AE, Higgins KM, Hu Z, Bonifas JM, Epstein EH, Jr & Scott MP. (1997) Science 276: 817–821.
  29. Pfaffl MW. (2001) Nucleic Acids Res. 29: e45.
  30. Ragg T. (2002) AI. Commun. 15: 61–74.
  31. Reuter CW, Morgan MA & Bergmann L. (2000) Blood 96: 1655–1669.
  32. Schnittger S, Kinkelin U, Schoch C, Heinecke A, Haase D, Haferlach T, Buchner T, Wormann B, Hiddemann W & Griesinger F. (2000) Leukemia 14: 796–804.
  33. Schnittger S, Schoch C, Dugas M, Kern W, Staib P, Wuchter C, Loffler H, Sauerland CM, Serve H, Buchner T, Haferlach T & Hiddemann W. (2002) Blood 100: 59–66.
  34. Schoch C, Kohlmann A, Schnittger S, Brors B, Dugas M, Mergenthaler S, Kern W, Hiddemann W, Eils R & Haferlach T. (2002) Proc. Natl. Acad. Sci. USA 99: 10008–10013.
  35. Thiede C, Steudel C, Mohr B, Schaich M, Schakel U, Platzbecker U, Wermke M, Bornhauser M, Ritter M, Neubauer A, Ehninger G & Illmer T. (2002) Blood 99: 4326–4335.
  36. Tibshirani R, Hastie T, Narasimhan B & Chu G. (2002) Proc. Natl. Acad. Sci. USA 99: 6567–6572.
  37. Tusher VG, Tibshirani R & Chu G. (2001) Proc. Natl. Acad. Sci. USA 98: 5116–5121.
  38. Valk PJ, Verhaak RG, Beijen MA, Erpelinck CA, Barjesteh van Waalwijk van Doorn-Khosrovani S, Boer JM, Beverloo HB, Moorhouse MJ, van der Spek PJ, Lowenberg B & Delwel R. (2004) N. Engl. J. Med. 350: 1617–1628.
  39. Weisberg E, Boulton C, Kelly LM, Manley P, Fabbro D, Meyer T, Gilliland DG & Griffin JD. (2002) Cancer Cell 5: 433–443.
  40. Yamamoto Y, Kiyoi H, Nakano Y, Suzuki R, Kodera Y, Miyawaki S, Asou N, Kuriyama K, Yagasaki F, Shimazaki C, Akiyama H, Saito K, Nishimura M, Motoji T, Shinagawa K, Takeshita A, Saito H, Ueda R, Ohno R & Naoe T. (2001) Blood 97: 2434–2439.
  41. Yamashita A, Ohnishi T, Kashima I, Taya Y & Ohno S. (2001) Genes Dev. 15: 2215–2228.
  42. Zhao S, Konopleva M, Cabreira-Hansen M, Xie Z, Hu W, Milella M, Estrov Z, Mills GB & Andreeff M. (2004) Leukemia 18: 267–275.


We thank Heidi Kramer for her excellent technical assistance. This study was supported by two grants from the Bundesministerium für Bildung und Forschung (FKZ 01 KW 9937 and NGFN, 01 GR 0101). KN is a scholar of the Deutsche José Carreras Leukämie-Stiftung e.V. (DJCLS 2001/NAT-3).


