A 17-gene stemness score for rapid determination of risk in acute leukaemia


Refractoriness to induction chemotherapy and relapse after achievement of remission are the main obstacles to cure in acute myeloid leukaemia (AML)1. After standard induction chemotherapy, patients are assigned to different post-remission strategies on the basis of cytogenetic and molecular abnormalities that broadly define adverse, intermediate and favourable risk categories2,3. However, some patients do not respond to induction therapy and another subset will eventually relapse despite the lack of adverse risk factors4. There is an urgent need for better biomarkers to identify these high-risk patients before starting induction chemotherapy, to enable testing of alternative induction strategies in clinical trials5. The high rate of relapse in AML has been attributed to the persistence of leukaemia stem cells (LSCs), which possess a number of stem cell properties, including quiescence, that are linked to therapy resistance6,7,8,9,10. Here, to develop predictive and/or prognostic biomarkers related to stemness, we generated a list of genes that are differentially expressed between 138 LSC+ and 89 LSC− cell fractions from 78 AML patients validated by xenotransplantation. To extract the core transcriptional components of stemness relevant to clinical outcomes, we performed sparse regression analysis of LSC gene expression against survival in a large training cohort, generating a 17-gene LSC score (LSC17). The LSC17 score was highly prognostic in five independent cohorts comprising patients of diverse AML subtypes (n = 908) and contributed greatly to accurate prediction of initial therapy resistance. Patients with high LSC17 scores had poor outcomes with current treatments including allogeneic stem cell transplantation. The LSC17 score provides clinicians with a rapid and powerful tool to identify AML patients who do not benefit from standard therapy and who should be enrolled in trials evaluating novel upfront or post-remission strategies.
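
The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes, as is common for signatures derived by sparse regression, that the score is a weighted sum of expression values over the genes retained by the model; the abstract itself does not spell out the formula. The gene names, weights and expression values below are hypothetical placeholders, not the published LSC17 genes or coefficients. The median split follows the Extended Data Figure 2 legend, which groups patients by scores above and below the cohort median.

```python
from statistics import median

# Hypothetical weights for illustration only; these are NOT the published
# LSC17 genes or regression coefficients.
WEIGHTS = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}


def signature_score(expression, weights=WEIGHTS):
    """Return the weighted sum of expression values over the signature genes."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(weight * expression[gene] for gene, weight in weights.items())


def split_at_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Toy cohort with made-up expression values (assumed to be on a log scale).
cohort = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    "P2": {"GENE_A": 6.0, "GENE_B": 2.0, "GENE_C": 3.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 8.0, "GENE_C": 0.5},
    "P4": {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 2.0},
}

scores = {pid: signature_score(expr) for pid, expr in cohort.items()}
groups = split_at_median(scores)
```

With these toy inputs the cohort median is 2.5, so P2 and P4 are labelled high and P1 and P3 low. In practice the weights would come from the fitted regression model, and a cutoff derived in one cohort may not transfer to another measurement platform without recalibration.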

Figure 1: Analysis of LSC-specific GE identifies an optimal 17-gene prognostic signature.
Figure 2: LSC signature scores are associated with OS in multiple independent AML cohorts across different GE measurement platforms.
Figure 3: Impact of aSCT on patient outcome.
Figure 4: LSC17 score predicts therapy response.

Accession codes

Primary accessions

Gene Expression Omnibus


  1. Ferrara, F. & Schiffer, C. A. Acute myeloid leukaemia in adults. Lancet 381, 484–495 (2013)

  2. Grimwade, D. et al. Refinement of cytogenetic classification in acute myeloid leukemia: determination of prognostic significance of rare recurring chromosomal abnormalities among 5876 younger adult patients treated in the United Kingdom Medical Research Council trials. Blood 116, 354–365 (2010)

  3. Döhner, H. et al. Diagnosis and management of acute myeloid leukemia in adults: recommendations from an international expert panel, on behalf of the European LeukemiaNet. Blood 115, 453–474 (2010)

  4. Röllig, C. et al. Long-term prognosis of acute myeloid leukemia according to the new genetic risk classification of the European LeukemiaNet recommendations: evaluation of the proposed reporting system. J. Clin. Oncol. 29, 2758–2765 (2011)

  5. Walter, R. B. et al. Resistance prediction in AML: analysis of 4601 patients from MRC/NCRI, HOVON/SAKK, SWOG and MD Anderson Cancer Center. Leukemia 29, 312–320 (2015)

  6. Kreso, A. & Dick, J. E. Evolution of the cancer stem cell model. Cell Stem Cell 14, 275–291 (2014)

  7. Saito, Y. et al. Identification of therapeutic targets for quiescent, chemotherapy-resistant human leukemia stem cells. Sci. Transl. Med. 2, 17ra9 (2010)

  8. Li, L. et al. SIRT1 activation by a c-MYC oncogenic network promotes the maintenance and drug resistance of human FLT3-ITD acute myeloid leukemia stem cells. Cell Stem Cell 15, 431–446 (2014)

  9. Fong, C. Y. et al. BET inhibitor resistance emerges from leukaemia stem cells. Nature 525, 538–542 (2015)

  10. Lechman, E. R. et al. miR-126 regulates distinct self-renewal outcomes in normal and malignant hematopoietic stem cells. Cancer Cell 29, 214–228 (2016)

  11. Eppert, K. et al. Stem cell gene expression programs influence clinical outcome in human leukemia. Nature Med. 17, 1086–1093 (2011)

  12. Sarry, J. E. et al. Human acute myelogenous leukemia stem cells are rare and heterogeneous when assayed in NOD/SCID/IL2Rγc-deficient mice. J. Clin. Invest. 121, 384–395 (2011)

  13. Laurenti, E. et al. The transcriptional architecture of early human hematopoiesis identifies multilevel control of lymphoid commitment. Nature Immunol. 14, 756–763 (2013)

  14. Novershtern, N. et al. Densely interconnected transcriptional circuits control cell states in human hematopoiesis. Cell 144, 296–309 (2011)

  15. Verhaak, R. G. et al. Prediction of molecular subtypes in acute myeloid leukemia based on gene expression profiling. Haematologica 94, 131–134 (2009)

  16. Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010)

  17. Simon, N., Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J. Stat. Softw. 39, 1–13 (2011)

  18. Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N. Engl. J. Med. 368, 2059–2074 (2013)

  19. Metzeler, K. H. et al. An 86-probe-set gene-expression signature predicts survival in cytogenetically normal acute myeloid leukemia. Blood 112, 4193–4201 (2008)

  20. Grimwade, D., Ivey, A. & Huntly, B. J. Molecular landscape of acute myeloid leukemia in younger adults and its clinical relevance. Blood 127, 29–41 (2016)

  21. Papaemmanuil, E. et al. Genomic classification and prognosis in acute myeloid leukemia. N. Engl. J. Med. 374, 2209–2221 (2016)

  22. Levine, J. H. et al. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197 (2015)

  23. Gentles, A. J., Plevritis, S. K., Majeti, R. & Alizadeh, A. A. Association of a leukemic stem cell gene expression signature with clinical outcomes in acute myeloid leukemia. J. Am. Med. Assoc. 304, 2706–2715 (2010)

  24. Jung, N., Dai, B., Gentles, A. J., Majeti, R. & Feinberg, A. P. An LSC epigenetic signature is largely mutation independent and implicates the HOXA cluster in AML pathogenesis. Nature Commun. 6, 8489 (2015)

  25. Geiss, G. K. et al. Direct multiplexed measurement of gene expression with color-coded probe pairs. Nature Biotechnol. 26, 317–325 (2008)

  26. Cornelissen, J. J. et al. The European LeukemiaNet AML Working Party consensus statement on allogeneic HSCT for patients with AML in remission: an integrated-risk adapted approach. Nature Rev. Clin. Oncol. 9, 579–590 (2012)

  27. Kohlmann, A. et al. Gene expression profiling in AML with normal karyotype can predict mutations for molecular markers and allows novel insights into perturbed biological pathways. Leukemia 24, 1216–1220 (2010)

  28. Castaigne, S. et al. Effect of gemtuzumab ozogamicin on survival of adult patients with de-novo acute myeloid leukaemia (ALFA-0701): a randomised, open-label, phase 3 study. Lancet 379, 1508–1516 (2012)

  29. Hills, R. K. et al. Addition of gemtuzumab ozogamicin to induction chemotherapy in adult patients with acute myeloid leukaemia: a meta-analysis of individual patient data from randomised controlled trials. Lancet Oncol. 15, 986–996 (2014)

  30. Klco, J. M. et al. Association between mutation clearance after induction therapy and outcomes in acute myeloid leukemia. J. Am. Med. Assoc. 314, 811–822 (2015)

  31. Du, P., Kibbe, W. A. & Lin, S. M. lumi: a pipeline for processing Illumina microarray. Bioinformatics 24, 1547–1548 (2008)

  32. Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015)

  33. Qiao, W. et al. PERT: a method for expression deconvolution of human blood samples from varied microenvironmental and developmental conditions. PLOS Comput. Biol. 8, e1002838 (2012)

  34. Gautier, L., Cope, L., Bolstad, B. M. & Irizarry, R. A. affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20, 307–315 (2004)

  35. Wu, J., Irizarry, R., MacDonald, J. & Gentry, J. Gcrma: background adjustment using sequence information. R package version 2.36.0 (2016)

  36. Dai, M. et al. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res. 33, e175 (2005)

  37. Macrae, T. et al. RNA-seq reveals spliceosome and proteasome genes as most consistent transcripts in human cancer cells. PLoS ONE 8, e72884 (2013)

  38. Scott, D. W. et al. Determining cell-of-origin subtypes of diffuse large B-cell lymphoma using gene expression in formalin-fixed paraffin-embedded tissue. Blood 123, 1214–1217 (2014)

  39. Nielsen, T. et al. Analytical validation of the PAM50-based Prosigna Breast Cancer Prognostic Gene Signature Assay and nCounter Analysis System using formalin-fixed paraffin-embedded breast tumor specimens. BMC Cancer 14, 177 (2014)

  40. R Development Core Team. A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2014)

  41. Cheson, B. D. et al. Revised recommendations of the International Working Group for Diagnosis, Standardization of Response Criteria, Treatment Outcomes, and Reporting Standards for Therapeutic Trials in Acute Myeloid Leukemia. J. Clin. Oncol. 21, 4642–4649 (2003)

  42. Gray, R. J. A class of K-sample tests for comparing the cumulative incidence of a competing risk. Ann. Stat. 16, 1141–1154 (1988)

  43. Fine, J. P. & Gray, R. J. A proportional hazards model for the subdistribution of a competing risk. J. Am. Stat. Assoc. 94, 496–509 (1999)

  44. Gray, B. cmprsk: subdistribution analysis of competing risks. R package version 2.2-7 (2014)

  45. Gerds, T. A. & Scheike, T. H. riskRegression: risk regression for survival analysis. R package version 0.0.8 (2016)

  46. Kanda, Y. Investigation of the freely available easy-to-use software ‘EZR’ for medical statistics. Bone Marrow Transplant. 48, 452–458 (2013)

  47. Mantel, N. & Byar, D. Evaluation of response-time data involving transient states: an illustration using heart transplant data. J. Am. Stat. Assoc. 69, 81–86 (1974)

  48. Andersen, P. & Gill, R. D. Cox’s regression model for counting processes: a large sample study. Ann. Stat. 10, 1100–1120 (1982)

  49. Simon, R. & Makuch, R. W. A non-parametric graphical representation of the relationship between survival and the occurrence of an event: application to responder versus non-responder bias. Stat. Med. 3, 35–44 (1984)

  50. Therneau, T. M. & Grambsch, P. M. Modeling Survival Data: Extending the Cox Model (Springer, 2000)

  51. Harrell, F. E. Jr. rms: regression modeling strategies. R package version 4.4-1 (2016)

  52. Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 77 (2011)

  53. Kundu, S., Aulchenko, Y. S., van Duijn, C. M. & Janssens, A. C. PredictABEL: an R package for the assessment of risk prediction models. Eur. J. Epidemiol. 26, 261–264 (2011)


This work was supported by grants from the Ontario Institute for Cancer Research with funds from the province of Ontario, the Cancer Stem Cell Consortium with funding from the Government of Canada through Genome Canada and the Ontario Genomics Institute (OGI-047), and the Canadian Institutes of Health Research (CSC-105367), Canadian Cancer Society, Terry Fox Foundation, a Canada Research Chair to J.E.D., the Philip S. Orsino Chair in Leukemia Research to M.D.M., and a Collaborative Translational Cancer Research Grant from the Princess Margaret Cancer Centre (formerly Ontario Cancer Institute). This research was funded in part by the Leukemia & Lymphoma Society of Canada (493946) and the Stem Cell Network (492019), Ontario Graduate Scholarships, and the Ontario Ministry of Health and Long Term Care (OMOHLTC). The views expressed do not necessarily reflect those of the OMOHLTC. L.B. was supported in part by the Deutsche Forschungsgemeinschaft (Heisenberg-Professur BU 1339/8-1). T.H. was supported by the Wilhelm-Sander-Stiftung (grant 2013.086.1). K.M. and W.H. received grant support from Deutsche Forschungsgemeinschaft (DFG SFB 1243). We thank The Centre for Applied Genomics (Hospital for Sick Children) and the Princess Margaret Genomics Centre for the generation of GE data for the PM sorted cell fractions and validation cohort. We thank M. Pintilie for discussions regarding time-dependent covariates in survival analysis. We thank S. Geffroy for technical support and running the microarrays for the ALFA-0701 trial cohort.

Author information

S.W.K.N. developed the signature derivation workflow, identified, refined and validated prognostic and predictive signatures, designed the custom NanoString assay, processed and analysed GE data, and performed statistical analyses and bioinformatics. A.M., W.C.C., J.M. and A.P. carried out functional xenograft transplantation, RNA extraction for GE analysis, and provided technical support for experiments. J.A.K., N.I., A.A., V.G., A.D.S., A.C.S., K.W.Y. and M.D.M. provided clinical annotations for the PM AML cohort. M.D.M. provided PM AML samples. S.W.K.N., J.C.Y.W., J.E.D. and M.D.M. interpreted the data. W.H., W.E.B., B.W., T.B., D.G., L.B., K.M., T.H. and C.B. provided clinical annotations for the GSE15434 and GSE12417 data sets. M.C., C.P. and H.D. provided GE and clinical data for the ALFA-0701 trial cohort. P.J.M.V. and B.L. provided clinical annotations for the GSE6891 data set. J.C.Y.W. and J.E.D. supervised the study. S.W.K.N. and J.C.Y.W. wrote the paper. A.M., J.A.K., P.W.Z., J.E.D. and M.D.M. revised the paper.

Corresponding author

Correspondence to Jean C. Y. Wang.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reviewer Information Nature thanks F. Holstege, G. Schuurhuis and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Extended data figures and tables

Extended Data Figure 1 Overview of LSC signature training and testing.

a, Clinical characteristics of the 78 patients analysed by xenotransplantation and microarray GE analysis. CMML, chronic myelomonocytic leukaemia; t-AML, therapy-associated AML; CN, cytogenetically normal. b, Schematic of the experimental protocol. c, d, Summary of functionally defined LSC+ and LSC− fractions in each phenotypic cell population as a whole (c) and for each patient (d). Red and blue denote LSC+ and LSC−, respectively. In d, each row represents fractions sorted from one patient sample. White boxes denote fractions that were not included in the analysis due to insufficient cell numbers for xenotransplantation and/or insufficient RNA. e, Strategy used to identify and test the 17 LSC signature genes. f, Key clinical characteristics of the GSE6891 signature training cohort. *P value calculated using the Wilcoxon rank-sum test; †P value calculated using the Student’s t-test; ‡P value calculated using Pearson’s chi-squared test; §P value calculated using log-rank test; ||P value calculated using Fisher’s exact test; ¶cytogenetic risk groups were defined as per GSE6891 investigators15.

Extended Data Figure 2 LSC17 and LSC3 scores are associated with survival in multiple AML cohorts.

a–n, q, Kaplan–Meier estimates of OS, EFS or RFS according to LSC17 scores in various patient cohorts, as indicated. In c, patients were also analysed according to whether or not CR was achieved after initial treatment (no CR, dotted lines; CR, solid lines). i, The subset of patients in the TCGA AML cohort with no clear genomic classification as defined previously21. o, Simon and Makuch estimates of OS, according to LSC17 scores and whether or not patients received aSCT (no aSCT, dotted lines; aSCT, solid lines). p, Kaplan–Meier estimates of OS of CN-LMR patients, according to LSC3 scores. In a–q, patients with scores above and below the median in each cohort are shown by red and blue lines, respectively. r, s, Kaplan–Meier estimates of RFS for patients with high (r) or low (s) LSC17 scores treated with standard chemotherapy with (red lines) or without (blue lines) addition of GO.

Extended Data Table 1 List of 104 DE LSC genes
Extended Data Table 2 Clinical characteristics of the TCGA AML cohort
Extended Data Table 3 Clinical characteristics of the GSE12417 CN-AML cohorts
Extended Data Table 4 Multivariate survival analysis of LSC17 and LSC3 scores
Extended Data Table 5 The LSC17 score refines genomic classifications
Extended Data Table 6 The LSC17 score improves survival association compared to other LSC signatures
Extended Data Table 7 Clinical characteristics of the PM AML and GSE15434 CN-LMR AML cohorts
Extended Data Table 8 Clinical characteristics and multivariate survival analysis of the ALFA-0701 AML cohort

Supplementary information

Supplementary Information

This file contains a Supplementary Discussion of the 17 LSC signature genes. (PDF 84 kb)

Cite this article

Ng, S., Mitchell, A., Kennedy, J. et al. A 17-gene stemness score for rapid determination of risk in acute leukaemia. Nature 540, 433–437 (2016).
