Introduction

Acute myeloid leukemia (AML) is a hematological malignancy characterized by the clonal expansion of myeloid blasts, resulting in impaired hematopoiesis and bone marrow failure1,2. The outcomes of AML patients are highly heterogeneous, and analysis of cytogenetic abnormalities is the backbone of risk stratification in AML3. In recent years, several somatic mutations, such as FLT3 and NPM1, were shown to be strongly prognostic and have been incorporated into risk categories of AML in both NCCN guidelines and ELN recommendations2,3.

Tumor microenvironment (TME) infrastructure, which comprises a variety of immune and stromal cell types (e.g., endothelial cells and fibroblasts) and extracellular components they secrete (e.g., cytokines, growth factors, hormones, and extracellular matrix), represents a chronic inflammatory, immunosuppressive, and proangiogenic intratumoral environment4,5,6,7. The TME not only plays an important role during tumor initiation, progression, and metastasis but also has profound implications for therapeutic efficacy and specificity8,9,10,11,12,13. AML myeloid blasts are able to adapt and grow in bone marrow environments with a significantly lower likelihood of detection and eradication by host immunosurveillance compared with other environments. Recent evidence has highlighted the importance of the bone marrow microenvironment in protecting leukemic stem cells (LSCs) from chemotherapy-induced cell death14. Therefore, efforts to characterize the TME signatures have drawn considerable attention in the field of solid tumors, as well as leukemia.

‘Estimation of STromal and Immune cells in Malignant Tumors using Expression data’ (ESTIMATE) is a method that uses gene expression signatures to infer the fraction of stromal and immune cells in tumor samples15. This algorithm has been employed to investigate the microenvironment of several solid tumors, such as gastric cancer16, breast cancer17 and glioblastoma18, and it has also been applied to estimate immune and stromal scores in AML patients19,20,21. Although gene mutations are important prognostic factors for AML, whether each mutation is associated with unique microenvironment features has not been determined to date.

In the current study, by downloading gene expression profiles for AML cohorts from The Cancer Genome Atlas (TCGA) database and analyzing the immune/stromal scores of patients based on the ESTIMATE algorithm, we characterized the gene mutation-associated microenvironment. Moreover, the identified immune- and stromal-relevant DEGs associated with some mutations were verified using the Gene Expression Omnibus (GEO) database.

Results

OS and impact of immune and stromal scores in AML patients

The complete gene expression profiles and clinical information of 173 AML patients were retrieved from the TCGA database for this study. The median follow-up period was 304 days (range, 28–2861 days) for the entire cohort, and the 2-year OS rate was 44.4% (95% confidence interval (CI), 35.6–52.8%). According to the ESTIMATE algorithm, immune scores ranged from 1329.53 to 3971.97, whereas stromal scores varied from -1888.81 to 435.75. Then, patients were divided into high- and low-score groups according to the median immune and stromal scores, respectively. Patients with high immune scores had significantly lower 2-year OS rates than did those with low immune scores (32.7% [95% CI 21.9–43.8%] vs 58.1% [95% CI 44.6–69.5%], P = 0.026, log-rank test, Fig. 1a). However, patients in the high stromal score group had similar 2-year OS rates to those in the low stromal score group (41.4% [95% CI 29.9–52.4%] vs 48.6% [95% CI 35.1–60.9%], P = 0.58, log-rank test, Fig. 1b).
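The median-based grouping described above can be illustrated with a minimal sketch. The patient identifiers and scores are hypothetical, and the handling of a score exactly equal to the median (assigned here to the low group) is an assumption, since the text does not state how such cases were treated.

```python
from statistics import median

def split_by_median(scores):
    """Split patients into high/low groups around the median score.

    scores: dict mapping a (hypothetical) patient ID to an ESTIMATE score.
    Assumption: a score equal to the median is placed in the low group.
    """
    cutoff = median(scores.values())
    high = sorted(pid for pid, s in scores.items() if s > cutoff)
    low = sorted(pid for pid, s in scores.items() if s <= cutoff)
    return high, low

# Toy scores, not taken from the TCGA cohort.
toy_scores = {"P1": 1500.0, "P2": 2100.0, "P3": 2700.0, "P4": 3300.0, "P5": 3900.0}
high_group, low_group = split_by_median(toy_scores)
```

With an odd number of patients, as in the 173-patient cohort, one patient sits exactly on the median, so the tie rule determines which group is larger by one.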

Figure 1

Immune scores and stromal scores are associated with AML OS and cytogenetic risk. (a) Kaplan–Meier survival analysis of high versus low immune score groups (log-rank test, P = 0.026). (b) Kaplan–Meier survival analysis of high versus low stromal score groups (log-rank test, P = 0.58). (c) Kaplan–Meier survival analysis of cytogenetic risk groups (log-rank test, P = 0.0006). (d) Distribution of immune scores within cytogenetic risk groups (P = 0.035). (e) Distribution of stromal scores within cytogenetic risk groups (P = 0.53). (f) Kaplan–Meier survival analysis of high versus low immune score groups in the intermediate and poor cytogenetic risk patients (log-rank test, P = 0.011). (g) Kaplan–Meier survival analysis of high versus low stromal score groups in the intermediate and poor cytogenetic risk patients (log-rank test, P = 0.14).

Relationship between immune/stromal scores and cytogenetic risk

Of the 173 patients, 32 (18.5%) were in the favorable cytogenetic risk group, 101 (58.4%) were in the intermediate cytogenetic risk group, 37 (21.4%) were in the poor cytogenetic risk group, and the remaining 3 (1.7%) belonged to the unknown group. As shown in Fig. 1c, patients in the intermediate cytogenetic risk group had a similar 2-year OS rate as those in the poor cytogenetic risk group (39.7% [95% CI 28.8–50.4%] vs 24.6% [95% CI 9.7–43.0%], P = 0.16, log-rank test), and they both had significantly lower 2-year OS rates compared with the favorable cytogenetic risk group (82.0% [95% CI 61.7–92.2%], P = 0.0009 and 0.0002, log-rank test).

The median immune scores in the favorable, intermediate and poor cytogenetic risk groups were 2209.75, 2738.40 and 2616.32, respectively (P = 0.035, ANOVA test). As shown in Fig. 1d, the immune score of the favorable cytogenetic risk group was significantly lower than that of the intermediate risk group (P = 0.0084) and tended to be lower than that of the poor risk group (P = 0.067; Mann–Whitney test, two-sided), and the immune score of the intermediate risk group was similar to that of the poor risk group (P = 0.44, Mann–Whitney test, two-sided). The median stromal scores in the favorable, intermediate and poor risk groups (−1011.98, −1050.87 and −1122.03) were similar (P = 0.53, ANOVA test, Fig. 1e).

Overall, patients in the favorable cytogenetic risk group had significantly lower immune scores and a higher 2-year OS rate. However, there were no significant differences in the 2-year OS rate, immune scores or stromal scores between the intermediate and poor cytogenetic risk groups. Therefore, the intermediate and poor cytogenetic risk groups (n = 137) were grouped together for the subsequent analysis. After grouping AML patients with intermediate and poor cytogenetic risk by the median score, those with high immune scores still had significantly lower 2-year OS rates than those with the low immune scores (23.5% [95% CI 13.2–35.5%] vs 48.8% [95% CI 34.0–62.1%], P = 0.011, log-rank test, Fig. 1f). Furthermore, patients in the high stromal score group tended to have lower 2-year OS rates compared with patients in the low stromal score group (29.1% [95% CI 18.0–41.1%] vs 43.8% [95% CI 28.9–57.8%], P = 0.14, log-rank test, Fig. 1g).

Somatic mutation-associated immune/stromal scores in the intermediate and poor cytogenetic risk groups

To further explore the association between immune/stromal scores and mutations, patients in the intermediate and poor cytogenetic risk groups were divided into two subgroups based on whether the individual somatic mutation existed, and their immune and stromal scores are presented in Fig. 2g,h.

Figure 2

Mutation-associated immune and stromal scores in the intermediate and poor cytogenetic risk groups. (a–f) Kaplan–Meier survival analysis of genetic mutation versus WT groups. (g,h) Immune and stromal scores of the genetic mutation versus WT groups.

Poor prognostic mutations (RUNX1, TP53, ASXL1 and FLT3-ITD)

As shown in Fig. 2a–d, patients carrying each of these mutations had, or tended to have, lower 2-year OS rates than the corresponding WT patients (RUNX1: 14.3% [95% CI 2.3–36.6%] vs 39.2% [95% CI 28.9–49.2%], P = 0.11, log-rank test; TP53: 0% [95% CI 0–0%] vs 39.7% [95% CI 29.6–49.6%], P = 0.0001, log-rank test; ASXL1: 33.3% [95% CI 0.9–77.4%] vs 35.5% [95% CI 26.1–45.0%], P = 0.25, log-rank test; FLT3-ITD: 30.2% [95% CI 10.5–52.9%] vs 36.6% [95% CI 26.4–46.8%], P = 0.41, log-rank test).

Patients in the RUNX1 and TP53 mutation groups had or tended to have higher immune scores and stromal scores than did those in the corresponding WT groups (RUNX1: immune score 2947.2 vs 2669.7, P = 0.15, Mann–Whitney test, two-sided; stromal score −794.34 vs −1098.615, P = 0.0008, Mann–Whitney test, two-sided. TP53: immune score 2993.49 vs 2670.13, P = 0.35, Mann–Whitney test, two-sided; stromal score −923.14 vs −1073.74, P = 0.089, Mann–Whitney test, two-sided). ASXL1 mutation patients had similar immune scores compared with WT patients (2642.07 vs 2685.88, P = 0.91, Mann–Whitney test, two-sided), but they tended to have higher stromal scores than WT patients (−806.38 vs −1072.71, P = 0.14, Mann–Whitney test, two-sided). However, both the immune scores and stromal scores of patients with FLT3-ITD mutation were lower or tended to be lower than those of WT patients (immune score 2565.07 vs 2762.73, P = 0.061, Mann–Whitney test, two-sided; stromal score −1255.38 vs −1010.38, P = 0.022, Mann–Whitney test, two-sided) (Fig. 2g,h).

Favorable prognostic mutations (NPM1 and biCEBPA)

Patients in the NPM1/biCEBPA mutation groups individually tended to have higher 2-year OS rates than those in the WT groups (NPM1: 43.3% [95% CI 26.6–59.1%] vs 31.7% [95% CI 20.9–42.9%], P = 0.48, Log-rank test, Fig. 2e; biCEBPA: 50.0% [95% CI 5.8–84.5%] vs 35.0% [95% CI 25.6–44.5%], P = 0.35, Log-rank test, Fig. 2f).

NPM1 mutation patients tended to have lower stromal scores than WT patients (−1172.08 vs −1050.87, P = 0.11, Mann–Whitney test, two-sided), but their immune scores were similar (2716.05 vs 2683.65, P = 0.48, Mann–Whitney test, two-sided). Patients with biCEBPA mutation had similar immune scores and stromal scores compared with WT patients (immune score 2688.10 vs 2683.34, P = 0.88, Mann–Whitney test, two-sided; stromal score −1010.38 vs −1070.96, P = 0.94, Mann–Whitney test, two-sided) (Fig. 2g,h).

The above analysis reflected that there were distinct relationships between somatic mutations and immune/stromal scores. RUNX1 and TP53 mutations were related to both higher immune scores and higher stromal scores, FLT3-ITD was related to both lower immune scores and lower stromal scores, ASXL1 mutation was only related to higher stromal scores, NPM1 was only related to lower stromal scores, and biCEBPA was not related to immune scores and stromal scores.

Identification of differentially expressed genes (DEGs) and functional enrichment analysis

As shown in Supplementary Figure S1, after the gene expression data of patients with intermediate and poor cytogenetic risk was analyzed, a certain number of upregulated and downregulated genes were identified for the individual somatic mutations RUNX1, TP53, ASXL1, FLT3-ITD, NPM1 and biCEBPA.

The results of GO term and KEGG pathway enrichment analysis are shown in Fig. 3. In general, the enrichment analysis results were consistent with the immune/stromal scores. In other words, both immune- and stromal-related GO terms and KEGG pathways were enriched for RUNX1, TP53 and FLT3-ITD mutations, which corresponded to their association with immune and stromal scores. Only stromal-related GO terms were enriched for ASXL1 mutation, which corresponded to its only association with stromal scores. However, discrepancies were observed for NPM1 and biCEBPA mutations: both immune- and stromal-related GO terms and KEGG pathways were enriched for NPM1 mutation, although it was only related to lower stromal scores; stromal-related GO terms were enriched for biCEBPA mutation, despite its lack of a relationship to immune/stromal scores.

Figure 3

GO term and KEGG pathway enrichment analysis of DEGs. (a,b) GO terms and KEGG pathways of RUNX1 upregulated genes. (c,d) GO terms and KEGG pathways of ASXL1 upregulated genes. (e,f) GO terms and KEGG pathways of TP53 upregulated genes. (g,h) GO terms and KEGG pathways of FLT3-ITD downregulated genes. (i,j) GO terms and KEGG pathways of NPM1 downregulated genes. (k) GO terms of biCEBPA downregulated genes. Note: immune-related pathways are marked red, and stromal-related pathways are marked blue.

Characteristics and prognostic significance of mutation-associated immune/stromal cell-relevant DEGs

Mutation-associated immune/stromal score-relevant DEGs were selected according to ESTIMATE algorithm gene lists and are shown in Supplementary Table S1. Overall, the number of immune/stromal cell-relevant DEGs was consistent with immune/stromal scores (Table 1): for RUNX1 and TP53 mutations, the majority of both immune and stromal cell-relevant DEGs were upregulated, which corresponded to their associated higher immune and stromal scores; for the ASXL1 mutation, only several stromal cell-relevant DEGs were upregulated, which corresponded to its associated higher stromal scores; the majority of both immune and stromal cell-relevant DEGs for FLT3-ITD and the majority of stromal cell-relevant DEGs for NPM1 mutation were downregulated, which corresponded to their associated lower scores; for biCEBPA mutation, almost no immune and stromal cell-relevant DEGs were observed, which corresponded to its lack of a relationship with immune and stromal scores. The only exception was that the majority of immune cell-relevant DEGs for NPM1 mutations were downregulated, despite their lack of association with immune scores.

Table 1 Characteristics of somatic mutation-associated immune/stromal cell-relevant DEGs.

There were overlaps among the individual mutation-associated immune/stromal cell-relevant DEGs (Supplementary Fig. S2). The common upregulated stromal cell-relevant DEGs among RUNX1, TP53 and ASXL1 mutations included DDR2 and FRZB. The common downregulated immune cell-relevant DEGs between FLT3-ITD and NPM1 mutations included CD3D, CD48, GBP1, and IL18RAP. The common downregulated stromal cell-relevant DEGs between FLT3-ITD and NPM1 mutations included BGN, CDH5, COL1A2, COL6A3, CXCL12, DCN, FRZB, ISLR, ITIH5, MXRA5, TRAT1, and VCAM1 (n = 12). Furthermore, FRZB was the only DEG associated with all 5 mutations (RUNX1, TP53, ASXL1, FLT3-ITD and NPM1), ITIH5 was associated with RUNX1, ASXL1, FLT3-ITD and NPM1 mutations, and ISLR was associated with TP53, ASXL1, FLT3-ITD and NPM1 mutations.

The intermediate and poor cytogenetic risk patients were grouped by the median transcription levels of the individual mutation-associated immune/stromal cell-relevant DEGs to evaluate their effects on OS. The following genes had prognostic significance consistent with that of the corresponding mutation (Supplementary Fig. S3): high expression of SCUBE2 (upregulated with RUNX1 mutation), SPON1 (upregulated with TP53 mutation) and GREM1 (upregulated with ASXL1 mutation) was related to lower OS; low expression of COL3A1, CXCL12, EMCN, FRZB, ITIH5, KDR, MXRA5, TRAT1 and VCAM1 (downregulated with FLT3-ITD mutation) was related to lower OS; and high expression of ADAMTS5 (upregulated with NPM1 mutation) and low expression of SPON1 (downregulated with NPM1 mutation) were related to higher OS.

Validation in the GEO database and identification of hub genes

To verify whether these immune and stromal cell-relevant DEGs identified from TCGA AML patients are also associated with mutations in an independent AML cohort, we analyzed the gene expression levels of 524 AML cases from GSE14468, for which FLT3-ITD, NPM1 and biCEBPA mutation data were available. For immune score-involved genes, 7/7 of FLT3-ITD-associated, 13/13 of NPM1 mutation-associated and 3/3 biCEBPA mutation-associated DEGs were confirmed to be significantly related to the individual gene mutation by GSE14468. Similarly, for stromal score-involved genes, 7/22 FLT3-ITD-associated, 20/32 NPM1 mutation-associated and 2/2 biCEBPA mutation-associated DEGs were also confirmed. The confirmed mutation-associated immune and stromal cell-relevant DEGs are shown in Table 2.

Table 2 FLT3-ITD, NPM1 and biCEBPA mutations-associated immune and stromal cell-relevant DEGs.

To further investigate the interaction among FLT3-ITD/NPM1-associated immune and stromal cell-relevant DEGs validated by GSE14468, we constructed PPI networks based on the STRING database and Cytoscape software. Then, we identified the top 10 FLT3-ITD/NPM1-associated hub genes by applying cytoHubba (Fig. 4). Node degree of importance is represented by circle color. As a result, the FLT3-ITD-associated hub genes were as follows: LCK, CD48, CD3D, IL2RB, CXCL12, VCAM1, DCN, IL10RA, TRAT1 and IL18RAP. NPM1-associated hub genes included VCAM1, CD3D, ZAP70, HLA-DRA, HLA-DPB1, HLA-DPA1, IRF8, CD48, GZMB and GBP1. LCK and VCAM1 were determined to be the most important hub genes in the networks associated with the FLT3-ITD mutation and the NPM1 mutation, respectively.

Figure 4

Results of degree algorithms from cytoHubba. (a) The PPI network of the top 10 hub genes associated with the FLT3-ITD mutation. (b) The PPI network of the top 10 hub genes associated with NPM1 mutations.

Discussion

A number of genetic mutations have presented immune microenvironment modulatory properties in solid tumors: EGFR mutations correlate with an immunosuppressive TME and may impact the antitumor immune response in NSCLC22,23; TP53 and KRAS mutations in lung adenocarcinoma can regulate the immune microenvironment to affect PD-1 blockade immunotherapy24,25; JAK1 or JAK2 mutations may lead to acquired resistance to PD-1 blockade immunotherapy in patients with melanoma26. Recurrent genetic mutations found in AML have been heavily studied to classify and predict the risk of relapse after treatment. According to the ELN and NCCN guidelines, RUNX1, ASXL1, TP53, FLT3-ITD, NPM1 and biCEBPA mutations have been involved in AML prognostic stratification2,3. Interactions between leukemic stem cells and other cells in the BM microenvironment are known to be vital for the maintenance and progression of chemotherapy-resistant AML14. LSCs can remodel the BM niche into a favorable environment for expansion or even induce leukemic transformation. Nonetheless, the relationships between recurrent genetic mutations and the immune microenvironment in AML have not been comprehensively described27.

In our study, we calculated the immune and stromal scores for AML patients from the TCGA database based on the ESTIMATE algorithm. Our results showed that immune scores were significantly associated with OS and cytogenetic risk; a high immune score was a significantly poor prognostic factor for both the entire cohort and patients in the intermediate and poor cytogenetic risk groups, and a high stromal score tended to be correlated with poor OS in the intermediate and poor cytogenetic risk groups. The prognostic significance of immune and stromal scores was different in solid tumors: high immune and stromal scores correlated with poor survival in glioblastoma18, clear cell renal cell carcinoma (ccRCC)28 and gastric cancer16, whereas high immune and stromal scores correlated with better survival in cervical squamous cell carcinoma (CSCC)29 and pancreatic ductal adenocarcinoma (PDAC)30. These studies demonstrated the varied effects of immune and stromal scores on prognosis, and these effects are related to tumor type.

Because the OS rates, immune scores and stromal scores were similar between the intermediate and poor cytogenetic risk groups, we considered these patients as a single group to explore the characteristics of the somatic mutation-associated immune microenvironment in AML. We compared the immune/stromal scores, identified DEGs between the mutation and WT groups, conducted functional enrichment analysis of DEGs and selected somatic mutation-associated immune/stromal cell-relevant DEGs. We found that, similar to the impact of immune and stromal scores on prognosis, distinct relationships existed between somatic mutations and immune/stromal scores. In other words, RUNX1, ASXL1 and TP53 mutations were related to higher immune or stromal scores, whereas FLT3-ITD mutation was related to lower immune and stromal scores, although they were all poor prognostic mutations. Furthermore, patients with NPM1 mutation had lower stromal scores, while patients with biCEBPA mutation showed immune and stromal scores similar to those of WT patients, despite the favorable prognostic risk of both mutations.

The functional enrichment analysis and immune/stromal cell-relevant DEGs were generally consistent with the immune/stromal scores for individual genetic mutations. Some immune/stromal cell-relevant DEGs were unique to a single mutation, whereas others were shared among mutations. The results obtained in this study demonstrated that RUNX1, TP53 and ASXL1 mutation-associated characteristics of the microenvironment are similar. Reports have revealed the pro-inflammatory impact of RUNX1, TP53 and ASXL1 mutations on the immune microenvironment. RUNX1 mutation has been shown to activate NF-κB signaling and has been proposed to promote inflammatory signaling pathways in the bone marrow microenvironment31. Previous reports have shown that TP53 mutations induce pro-inflammatory effects on epithelial cells through NF-κB-mediated production of inflammatory cytokines. Moreover, TP53 mutations in CAFs are associated with pro-tumor and pro-inflammatory effects through enhanced production of cytokines and chemokines, including CXCL12, SDF-1 and IL-6, which notably affect the immune microenvironment32,33. ASXL1 mutation is one of the most frequently observed mutations leading to clonal hematopoiesis (CH), which is known to be associated with elevated inflammation, impaired tumor suppressor function, and a risk of eventual hematological malignancy (HM)34,35,36. Patients with NPM1 mutation and FLT3-ITD mutation not only had similarly low scores but also shared multiple immune/stromal cell-relevant DEGs, which was not consistent with the opposite prognostic significance of these mutations. Our results indicated that a common mechanism might underlie the impact of NPM1 and FLT3-ITD mutations on the bone marrow microenvironment, which remains to be explored.

Notably, several mutations associated with immune/stromal cell-relevant DEGs were observed to have prognostic significance in intermediate and poor cytogenetic risk patients. The results implied that these specific genes may play an important role in the formation of a mutation-associated microenvironment and may affect the survival of AML patients. Moreover, the majority of immune/stromal cell-relevant DEGs were confirmed to be significantly correlated with FLT3-ITD, NPM1, and biCEBPA mutations in the GSE14468 database. PPI networks were subsequently built based on the verified FLT3-ITD/NPM1-associated immune and stromal cell-relevant DEGs, and the top 10 hub genes were subsequently identified by the degree of interaction.

LCK (lymphocyte-specific protein tyrosine kinase) was the most significant hub gene associated with the FLT3-ITD mutation. LCK plays an essential role in the selection and maturation of developing T cells in the thymus, the activation of mature T-cells and the initiation of T cell antigen receptor (TCR) signal transduction pathways37. Studies have indicated higher expression of LCK in leukemic cells from less differentiated cases of AML (AML-0 and AML-1)38. A recent report found that LCK is overexpressed and mutated in CTV-1 cells (AML-M5 cell lines)39. Nonetheless, the expression of LCK in FLT3-ITD-mutated cells has not been studied to date. In the present study, the downregulation of LCK was correlated with the FLT3-ITD mutation. In a study of a zebrafish model40, FLT3 was found to initiate definitive hematopoietic stem cells, and the knockdown of FLT3 reduced hematopoiesis. The expression of the FLT3-ITD mutation resulted in the expansion of myeloid cells and the reduction of T cells. These results suggest that the FLT3-ITD mutation decreases the expression of LCK and reduces the production of functional T cells.

VCAM1 (vascular cell adhesion molecule-1) was shown to be the most significant hub gene associated with NPM1 mutation. VCAM1 is a cell adhesion molecule primarily expressed on endothelial cells, and its expression is induced by pro-inflammatory cytokines, such as TNFα41,42. VCAM1 has been identified to regulate vascular adhesion and transendothelial migration by binding to VLA-4 (very late antigen-4, an α4β1 integrin) on leukocytes. VCAM1 binding to VLA-4 confers AML blast cell protection from chemotherapy-induced apoptosis43,44. In our study, the downregulation of VCAM1 was confirmed to be correlated with NPM1 mutation. Although no study to date has explored the function of VCAM1 in NPM1-mutated AML patients, we speculated that the downregulation of VCAM1 may reduce the stroma-mediated protection of leukemic cells, which might confer favorable outcomes to AML patients with NPM1 mutations.

In conclusion, we used integrated bioinformatic approaches based on the TCGA database to examine the relationship between recurrent genetic mutations and the immune microenvironment in AML patients. Important immune and stromal cell-relevant DEGs that affected the immune landscape of patients with individual gene mutations were identified and validated. Considering the specific properties of the hematopoietic microenvironment of leukemia15, ESTIMATE may not accurately predict infiltrating stromal and immune cells for the AML microenvironment, and a more suitable and accurate algorithm needs to be developed. Due to the limited patient numbers in mutation subgroups in the TCGA database, further investigation of these mutation-associated stromal and immune signatures in large clinical AML patient cohorts is warranted, which may provide new prognostic biomarkers to achieve precision tumor therapy. Our results may help to elucidate how AML genetic mutations modulate the immune microenvironment to better guide personalized immunotherapy in the era of precision medicine45.

Materials and methods

Database

The transcriptional profiles and clinical and overall survival (OS) data of 173 AML patients were downloaded from the TCGA database (https://portal.gdc.cancer.gov/). The gene expression profile was measured experimentally using the Illumina HiSeq 2000 RNA Sequencing platform. Log2 transformations were performed for all gene expression data. Immune and stromal scores were calculated by applying the ESTIMATE algorithm15 to the mRNA expression data (https://bioinformatics.mdanderson.org/estimate/). The definitions of cytogenetic risk and risk-related somatic mutations were based on NCCN guidelines2.

For validation of the mutation-associated microenvironment signatures obtained from TCGA data, the GSE14468 dataset, based on GPL570 (Affymetrix Human Genome U133 Plus 2.0 Array), was downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/), together with the accompanying cytogenetic risk and FLT3-ITD, NPM1 and CEBPA mutation data.

Identification of differentially expressed genes

AML patients in the intermediate and poor cytogenetic risk categories were divided into mutation and wild-type (WT) groups according to the individual somatic mutation status (RUNX1, TP53, ASXL1, FLT3-ITD, and NPM1). For CEBPA, patients with biallelic CEBPA mutation were classified as the biCEBPA group, and patients with monoallelic mutation or wild-type CEBPA were classified as the WT group. Differentially expressed genes (DEGs) were identified using the limma package in R software (version 3.6.2; https://www.r-project.org/). Genes with |log2FC| > 1.0 and adjusted P values (q values) < 0.05 were selected as DEGs. Volcano plots were generated using the ggplot2 package in R software.
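The DEG selection rule can be sketched as follows. This is an illustration only, not the limma workflow itself: it assumes the P value adjustment is the Benjamini–Hochberg procedure (the adjustment method is not stated in the text), and the gene names, fold changes and P values are placeholders.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_degs(results, lfc_cutoff=1.0, q_cutoff=0.05):
    """Apply |log2FC| > lfc_cutoff and adjusted P < q_cutoff.

    results: list of (gene, log2_fold_change, raw_p_value) tuples.
    Returns a dict mapping each selected gene to "up" or "down".
    """
    q_values = benjamini_hochberg([p for _, _, p in results])
    degs = {}
    for (gene, log2fc, _), q in zip(results, q_values):
        if abs(log2fc) > lfc_cutoff and q < q_cutoff:
            degs[gene] = "up" if log2fc > 0 else "down"
    return degs

# Placeholder input, not real expression results.
toy_results = [
    ("GENE_A", 2.1, 0.0004),
    ("GENE_B", -1.6, 0.003),
    ("GENE_C", 0.4, 0.0001),   # significant but fold change too small
    ("GENE_D", 1.8, 0.40),     # large fold change but not significant
]
toy_degs = select_degs(toy_results)
```

Both thresholds must be met, so a gene with a very small P value but a modest fold change (GENE_C above) is excluded.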

Functional enrichment analysis of DEGs

Functional enrichment analysis of DEGs was performed based on clusterProfiler, enrichplot, org.Hs.eg.db, and ggplot2 packages to identify the Gene Ontology (GO) categories, including biological processes (BP), cellular components (CC), and molecular functions (MF). Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database46,47 was also conducted using these packages. Upregulated and downregulated DEGs were annotated by functional enrichment analyses, and FDR (false discovery rate) < 0.05 was considered to be significant. The top 10 GO terms in each of the BP, CC, MF and top 30 KEGG pathways are presented using bar plots.
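Over-representation analysis of this kind is commonly based on a one-sided hypergeometric test. The sketch below shows that calculation for a single term under that assumption; it is not the clusterProfiler implementation, and the function name and the counts in the example are hypothetical.

```python
from math import comb

def enrichment_p(universe_size, term_size, deg_count, overlap):
    """One-sided hypergeometric P value for over-representation.

    universe_size: number of background genes
    term_size:     background genes annotated to the GO term or pathway
    deg_count:     number of DEGs tested
    overlap:       DEGs annotated to the term
    Returns P(X >= overlap) when deg_count genes are drawn at random.
    """
    upper = min(term_size, deg_count)
    total = comb(universe_size, deg_count)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, deg_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 10 background genes, 4 in the term, 3 DEGs, 2 of them in the term.
p_toy = enrichment_p(10, 4, 3, 2)
```

In a full analysis this P value would be computed for every term and then corrected for multiple testing before applying the FDR < 0.05 threshold.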

Immune/stromal cell-relevant DEGs and overall survival analysis

According to genes selected by the ESTIMATE algorithm, immune/stromal cell-relevant DEGs of each mutation group were identified. To explore the prognostic value of these immune/stromal cell-relevant DEGs in predicting the overall survival of AML patients in the intermediate and poor cytogenetic risk groups, Kaplan–Meier survival curves were generated by the “survival” package in R software using the log-rank test. P values < 0.05 were considered to be significant.
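For readers unfamiliar with the two survival methods named above, the following is a minimal standard-library sketch of the Kaplan–Meier estimator and the two-group log-rank test. It is not the R “survival” package, the function names are illustrative, and the follow-up times are toy values rather than cohort data.

```python
from math import erfc, sqrt

def km_survival_at(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] is 1 for death, 0 for censoring."""
    surv = 1.0
    death_times = sorted({ti for ti, e in zip(times, events) if e == 1 and ti <= t})
    for u in death_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, e in zip(times, events) if ti == u and e == 1)
        surv *= 1.0 - deaths / at_risk
    return surv

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) on 1 degree of freedom.

    Raises ZeroDivisionError if the variance is zero (for example, when
    one group is never at risk alongside the other).
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for u in event_times:
        n_a = sum(1 for t in times_a if t >= u)
        n_b = sum(1 for t in times_b if t >= u)
        d_a = sum(1 for t, e in zip(times_a, events_a) if t == u and e == 1)
        d_b = sum(1 for t, e in zip(times_b, events_b) if t == u and e == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi_square = obs_minus_exp ** 2 / variance
    p_value = erfc(sqrt(chi_square / 2.0))  # upper tail of chi-square with 1 df
    return chi_square, p_value

# Toy follow-up times in days (1 = death observed, 0 = censored).
high_t, high_e = [3, 4, 6, 7], [1, 1, 1, 1]
low_t, low_e = [5, 8, 12, 20], [1, 1, 0, 1]
s_low = km_survival_at(low_t, low_e, 10)
chi2, p = logrank_test(low_t, low_e, high_t, high_e)
```

A 2-year OS rate of the kind reported in the Results corresponds to evaluating the Kaplan–Meier estimate at t = 730 days for each group.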

Protein–protein interaction (PPI) network and hub genes

Protein–protein interaction (PPI) network construction of FLT3-ITD/NPM1-associated immune and stromal cell-relevant DEGs validated by GSE14468 was based on the STRING online database (version 11.0; https://string-db.org/) and Cytoscape software (version 3.6.0; https://cytoscape.org/). We used cytoHubba to identify the top 10 hub genes according to the degree algorithm.
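Ranking by the degree algorithm amounts to counting the distinct interaction partners of each node. A minimal sketch of that idea is given below; it does not reproduce cytoHubba itself, the alphabetical tie-breaking rule is an assumption added for determinism, and the node names and edges are placeholders rather than STRING interactions.

```python
def top_hub_genes(edges, top_n=10):
    """Rank nodes of an undirected network by degree.

    edges: iterable of (node_a, node_b) pairs; duplicates and self-loops are ignored.
    Returns up to top_n (node, degree) tuples, highest degree first.
    """
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # a self-loop adds no interaction partner
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    degree = {node: len(partners) for node, partners in neighbours.items()}
    # Assumption: ties in degree are broken alphabetically.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

# Placeholder network, not derived from the study data.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A")]
toy_hubs = top_hub_genes(toy_edges, top_n=2)
```

Using sets of partners means that an interaction listed twice, such as A-B and B-A above, is counted once.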

Statistical analysis

Comparisons of immune and stromal scores among cytogenetic risk groups were performed using one-way analysis of variance. Comparisons of immune and stromal scores between the mutation and WT groups were performed using the Mann–Whitney U test. Survival functions were estimated using the Kaplan–Meier method and were compared using the log-rank test. SPSS Statistics 22.0 (IBM Corp., Armonk, NY, USA; https://www.ibm.com/) and GraphPad Prism 5.0 (GraphPad Software, La Jolla, CA, USA; www.graphpad.com) were used for the data analysis.
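The two-sided Mann–Whitney U test used for the mutation versus WT comparisons can be sketched as below. This version uses the normal approximation with a tie correction and no continuity correction, which is an assumption: SPSS and GraphPad Prism may report exact P values for small samples, so results can differ slightly. The function name and the sample values are illustrative.

```python
from math import erfc, sqrt

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, p_value).
    Raises ZeroDivisionError if every pooled value is identical.
    """
    pooled = sorted(
        (value, group) for group, sample in enumerate((x, y)) for value in sample
    )
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / sqrt(var_u)
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return u_x, p_value

# Toy scores, not taken from the cohort.
u_stat, p_val = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Because the test is rank-based, it does not assume that the immune or stromal scores are normally distributed within each group.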