Introduction

GATA binding protein 2 (GATA2) belongs to the GATA family of transcription factors, which regulate hematopoietic stem cell proliferation and differentiation1,2. GATA2 mutations have been reported in acute myeloid transformation of chronic myeloid leukemia (CML)3, familial myelodysplastic syndrome-related acute myeloid leukemia (MDS/AML), pediatric MDS4,5, Emberger syndrome6, and monocytopenia and mycobacterial infection (MonoMAC) syndrome7,8. Mutations of GATA2 are also identified in AML patients, with an incidence varying from 3.6% in patients with French-American-British (FAB) M5 subtype4 to 8.1–14.4% in non-selected AML patients9,10,11.

Somatic GATA2 mutations mainly cluster in the two zinc finger (ZF) domains, which can occupy the GATA DNA motif in thousands of genes9. The patterns of somatic GATA2 mutations differ among myeloid diseases. ZF1 mutations predominate in AML, and ZF2 mutations are frequently identified in CML blastic phase3. GATA2 mutations are strongly associated with CEBPA double mutations (CEBPAdouble-mut)9,10,12. However, discrepancies exist among different reports regarding the prognostic impact of GATA2 mutations in AML patients10,13. We hypothesize that mutations in different domains of GATA2 may have distinct impacts on clinico-biological features and outcomes in AML patients, as is the case for IDH2 mutations, in which IDH2 R172 is associated with co-occurring gene mutations and clinical outcomes different from those of other IDH mutations14. However, little is known about this issue to date.

In this study, we investigated the clinical and prognostic relevance of mutations in different GATA2 domains in a large cohort of 693 unselected de novo non-M3 AML patients. To our knowledge, this is the first study to show that GATA2 ZF1 mutations are associated with clinical features, co-occurring gene mutations, and outcomes distinct from those of ZF2 mutations. Longitudinal follow-ups were also performed in 419 samples from 124 patients to evaluate the dynamic changes of the mutations. Furthermore, we analyzed the global gene expression profiles in 328 patients to interrogate the possible molecular pathways associated with mutations in different GATA2 domains.

Methods and materials

Subjects

We consecutively enrolled 693 newly diagnosed de novo non-M3 AML patients at the National Taiwan University Hospital (NTUH) from 1994 to 2011. Diagnosis and classification of AML were made according to the FAB Cooperative Group Criteria and the 2016 WHO classification15. To focus on a more homogeneous group of patients with de novo AML, those with antecedent hematological diseases, history of cytopenia, and family history of myeloid neoplasms or therapy-related AML were excluded16. Survival analyses were performed in 469 (67.7%) patients who received standard chemotherapy. This study was approved by the Institutional Review Board of the NTUH, and written informed consent was obtained from all participants in accordance with the Declaration of Helsinki.

Cytogenetics

Chromosomal analyses were performed as described previously17. Karyotypes were classified using Medical Research Council (MRC) risk groups18.

Mutation analysis

Mutation analysis of GATA2 exons 2–6 [12] and 20 other genes, including FLT3-ITD [19], FLT3-TKD [19], NRAS [19], KRAS [19], KIT [19], PTPN11 [20], CEBPA [21], RUNX1 [22], MLL-PTD [23], ASXL1 [24], IDH1 [25], IDH2 [25], TET2 [26], DNMT3A [16], SF3B1 [27], SRSF2 [27], U2AF1 [27], NPM1 [28], WT1 [29], TP53 [30], and ETV6 [31], was performed by Sanger sequencing as previously described for patients (n = 455) diagnosed from 1994 to 2007. For patients (n = 238) diagnosed after 2008, Ion Torrent next-generation sequencing (NGS) (Thermo Fisher Scientific, MA, USA) was performed [32]. Serial analyses of mutations at diagnosis, complete remission (CR), and relapse were performed in 419 samples from 124 patients by targeted NGS using the TruSight Myeloid Panel (Illumina, San Diego, CA, USA). The HiSeq platform (Illumina) was used for sequencing, with a median read depth of 12,000× [32].

Functional annotation analysis of GATA2 mutation-regulated genes

We analyzed the differentially expressed genes associated with GATA2 mutations using the knowledge-based Ingenuity Pathway Analysis (IPA) software (Qiagen, Redwood City, CA) to identify associated functions. We also used Gene Set Enrichment Analysis (GSEA) software to investigate systematic enrichment of the GATA2 mutation-associated expression profile in biological functions33. Statistical significance of the degree of enrichment was assessed by a random permutation test with 1000 permutations.
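The idea behind a label-permutation significance test can be illustrated with a minimal sketch. It is not the GSEA algorithm: the statistic here is a plain difference in mean expression between two groups, and the function names, the example values, and the add-one correction are assumptions introduced only for illustration.

```python
import random
from statistics import mean
from typing import List, Sequence


def mean_difference(values: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of the group labelled 1 minus mean of the group labelled 0."""
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    return mean(group1) - mean(group0)


def permutation_p_value(values: Sequence[float], labels: Sequence[int],
                        n_permutations: int = 1000, seed: int = 0) -> float:
    """Two-sided P value obtained by shuffling group labels n_permutations times."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean_difference(values, labels))
    shuffled: List[int] = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Count permutations at least as extreme as the observed statistic.
        if abs(mean_difference(values, shuffled)) >= observed:
            hits += 1
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical expression values: 4 "mutated" (1) and 6 "wild-type" (0) samples.
expression = [5.1, 4.8, 5.3, 5.0, 2.1, 1.9, 2.4, 2.0, 2.2, 1.8]
groups = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p_value = permutation_p_value(expression, groups)
```

With 1000 permutations the smallest P value this estimator can return is 1/1001, which is why permutation-based P values are usually reported with limited resolution.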

Statistical analysis

Discrete variables were compared using the χ² test; if any expected value in a contingency table was <5, Fisher’s exact test was used instead. Mann–Whitney U tests were used to compare continuous variables and medians of distributions. Overall survival (OS) was measured from the date of first diagnosis to the date of last follow-up or death from any cause. Disease-free survival (DFS) was measured from the date of diagnosis until treatment failure, relapse from CR, or death from any cause, whichever occurred first. To ameliorate the influence of hematopoietic stem cell transplantation (HSCT) on survival, DFS and OS were censored at the time of HSCT in patients receiving the treatment34. Multivariate Cox proportional hazard regression analysis was used to investigate independent prognostic factors for OS and DFS. A P value <0.05 was considered statistically significant. All statistical analyses were performed with SPSS 18 (SPSS Inc., Chicago, IL, USA) and StatsDirect (Cheshire, England, UK).
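The OS definition with censoring at HSCT can be sketched as follows. This is a minimal illustration rather than the SPSS procedure used in the study: the `Patient` record, its field names, and the four-patient cohort are hypothetical, and the survival curve is a plain product-limit (Kaplan–Meier) estimate.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Patient:
    """Follow-up for one patient, in months from diagnosis (hypothetical schema)."""
    last_followup: float               # time of death or of last contact
    died: bool                         # True if death occurred at last_followup
    hsct_time: Optional[float] = None  # time of HSCT, if transplanted


def os_observation(p: Patient) -> Tuple[float, bool]:
    """Return (time, event) for OS, censoring at HSCT when it precedes last follow-up."""
    if p.hsct_time is not None and p.hsct_time < p.last_followup:
        return p.hsct_time, False      # censored at the time of transplantation
    return p.last_followup, p.died


def kaplan_meier(observations: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (event time, survival probability just after it)."""
    event_times = sorted({t for t, event in observations if event})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for time, _ in observations if time >= t)
        deaths = sum(1 for time, event in observations if event and time == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohort, for illustration only.
cohort = [
    Patient(last_followup=5.0, died=True),
    Patient(last_followup=30.0, died=True, hsct_time=8.0),  # censored at 8 months
    Patient(last_followup=12.0, died=True),
    Patient(last_followup=20.0, died=False),                # alive at last contact
]
curve = kaplan_meier([os_observation(p) for p in cohort])
```

In this toy cohort the transplanted patient leaves the risk set at 8 months without counting as a death, so the later death at 30 months does not enter the estimate.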

Results

GATA2 mutations in patients with AML

After excluding two single-nucleotide polymorphisms (A164T, M400T)35 and eight missense mutations (N114T, M223I, P250A, A256V, L315P, C319F, V369A, S429T) of unknown biologic significance (they had not been reported previously and could not be verified for lack of matched bone marrow samples in CR), we identified 44 distinct GATA2 mutations in 43 (6.2%) of 693 patients (Fig. 1). Forty GATA2 mutations were missense mutations. The other four were in-frame deletion or duplication: p.Ser201*(c.598_599insG) in two, p.Thr387_Gly392del (c.1160_1177delCCATGAAGAAGGAAGGGA) and G210dup (c.631_632insGCG) in one each. With regard to the functional sites, 31 mutations were clustered in the highly conserved N-terminal ZF domain (ZF1 domain), and another 10 mutations were within the C-terminal ZF domain (ZF2 domain). The remaining three mutations were scattered outside the ZF domains. The most common mutation was A318V (n = 4), followed by L321F and A318T (n = 3 each). p.Ser201*(c.598_599insG), N297S, A318G, G320V, L321H, and K324E occurred in two patients each. All other mutations were detected in only one patient each (Table 1). Only one patient had two GATA2 mutations (patient no. 20). All mutations were heterozygous. The mutant burden ranged from 4.89 to 52% with a median of 39.07% in ZF1 mutations, and from 10.74 to 50.26% with a median of 36.16% in ZF2 mutations.

Fig. 1 Patterns and locations of the 44 GATA2 mutations

Table 1 The mutation patterns in 43 patients with GATA2 mutations at diagnosis

Correlation of GATA2 mutations with clinical and laboratory features

Table 2 shows the clinical characteristics of patients with and without GATA2 mutations. ZF1-mutated patients were younger (median, 39 years vs. 55 years, P = 0.004) and had a higher incidence of FAB M1 subtype (56.7% vs. 22.1%, P < 0.0001) but a lower incidence of FAB M4 subtype (3.3% vs. 28.1%, P = 0.003) than GATA2-wild patients. ZF1-mutated patients also had a higher incidence of FAB M1 subtype than ZF2-mutated patients (P = 0.044). The patients with ZF2 mutations showed clinical features similar to those of the GATA2-wild group, including peripheral white blood cell counts (median, 47.3 vs. 18.7 k/µL) and incidences of FAB M1 subtype (20% vs. 22.1%) and M4 subtype (20% vs. 28.1%).

Table 2 Comparison of clinical and laboratory features between AML patients with GATA2 ZF1 domain and ZF2 domain mutations

Association of GATA2 mutations with cytogenetic abnormalities

Chromosome data were available in 669 patients at diagnosis, including 43 GATA2-mutated and 626 GATA2-wild patients (Supplementary Table 1). Overall, GATA2 mutations were closely associated with intermediate-risk cytogenetics. Compared to GATA2-wild patients, ZF1-mutated patients more often had intermediate-risk cytogenetics (100% vs. 70.9%, P < 0.0001), normal karyotype (73.3% vs. 46.5%, P = 0.004), and t(3;3) (6.7% vs. 1.0%, P = 0.048), but less often had favorable-risk (0% vs. 13.6%, P = 0.024) or unfavorable-risk cytogenetics (0% vs. 15.5%, P = 0.014). There was no association of ZF1 mutations with other chromosomal abnormalities, including +8, +11, +13, and +21.

Association of GATA2 mutations with other molecular alterations

To investigate the interaction of GATA2 ZF1 and ZF2 mutations with other genetic alterations in the pathogenesis of adult AML, a complete mutational screening of 20 other genes was performed. ZF1-mutated patients, but not ZF2-mutated patients, had a significantly higher frequency of CEBPAdouble-mut than wild-type patients (66.7% vs. 6.7%, P < 0.0001) (Table 3). ZF1-mutated patients had lower frequencies of NPM1 mutations (0% vs. 22%, P = 0.004) and FLT3-ITD (4% vs. 19.9%, P = 0.024) than wild-type patients. In contrast, ZF2-mutated patients had frequencies of NPM1 mutations (30%) and FLT3-ITD (20%) similar to those of patients with wild-type GATA2. Both ZF1 and ZF2 mutations were mutually exclusive with KRAS, WT1, IDH1, TP53, and ETV6 mutations (Table 3).

Table 3 Comparison of other genetic alterations between AML patients according to GATA2 mutation domain

Impact of mutations in different GATA2 domains on treatment response and clinical outcomes

Of the 469 AML patients, including 27 GATA2 ZF1-mutated and nine GATA2 ZF2-mutated patients, undergoing conventional intensive induction chemotherapy, 352 (75.1%) patients achieved a CR. The CR rate was 85.2% in ZF1-mutated patients and 60% in ZF2-mutated patients (Table 2). The relapse rate was similar between the two groups.

With a median follow-up time of 78.6 months (range, 0.1–236 months), patients with GATA2 mutations as a whole showed a trend toward longer OS (5-year survival rate, 56% vs. 43%, P = 0.078) and DFS (median, 32.9 vs. 8.8 months, P = 0.091) than those without GATA2 mutations (Supplementary Figure 1). Focusing on the prognostic implication of mutation sites, patients with GATA2 ZF1 mutations had a significantly better OS (5-year survival rate, 72% vs. 43%, P = 0.003) and DFS than GATA2-wild patients (median, 91.2 vs. 8.8 months, P = 0.022) (Fig. 2). In contrast, patients with GATA2 ZF2 mutations had OS (5-year survival rate, 31%, P = 0.297) and DFS (median, 4.4 months, P = 0.882) similar to those of the GATA2-wild group. Intriguingly, ZF1 mutations were also associated with better OS compared with ZF2 mutations (P = 0.001) (Fig. 2). In the intermediate-risk cytogenetics group, ZF1-mutated patients had significantly superior OS (5-year survival rate, 72% vs. 39%, P = 0.009) and DFS (median, 91.2 vs. 7.8 months, P = 0.006) compared with GATA2-wild patients, and a longer OS (5-year survival rate, 72% vs. 31%, P = 0.007) and a trend toward longer DFS (median, 91.2 vs. 4.4 months, P = 0.133) compared with ZF2-mutated patients (Fig. 3). The finding also held true in the normal-karyotype subgroup (Supplementary Figure 2). Multivariate analysis demonstrated that ZF1 mutation was an independent favorable prognostic factor for OS (HR 0.207, 95% CI 0.066–0.652, P = 0.007) and DFS (HR 0.529, 95% CI 0.295–0.948, P = 0.032) irrespective of age, white blood cell counts, cytogenetics, NPM1, and FLT3-ITD status. However, the prognostic independence of ZF1 mutation was lost when CEBPAdouble-mut was included as a covariable (Supplementary Table 2). We found no survival difference when patients were stratified by the degree of mutational burden in either the ZF1- or ZF2-mutated group (data not shown). Allo-HSCT in CR1 for ZF1-mutated patients did not offer a survival benefit compared to postremission chemotherapy alone (data not shown).

Fig. 2 Kaplan–Meier survival curves for OS (a) and DFS (b) stratified by the GATA2 mutation status and the sites of mutations in 467 AML patients who received standard intensive chemotherapy. Patients with GATA2 ZF1 mutations had a significantly better OS (5-year survival rate, 72% vs. 43%, P = 0.003) and DFS than GATA2-wild patients (median, 91.2 vs. 8.8 months, P = 0.022). Patients with GATA2 ZF2 mutations had similar OS (5-year survival rate, 31%, P = 0.297) and DFS (median, 4.4 months, P = 0.882) as the wild-type group. ZF1 mutations were also associated with better OS compared with ZF2 mutations (P = 0.001)

Fig. 3 Kaplan–Meier survival curves for OS (a) and DFS (b) stratified by the GATA2 mutation status and the sites of mutations in 328 intermediate-risk cytogenetics patients who received standard intensive chemotherapy. Patients with GATA2 ZF1 mutations had a significantly better OS (5-year survival rate, 72% vs. 39%, P = 0.009) and DFS (median, 91.2 vs. 7.8 months, P = 0.006) than GATA2-wild patients. Patients with GATA2 ZF2 mutations had similar OS and DFS as the wild-type group (P = 0.504, P = 0.989, respectively). ZF1 mutations were also associated with a longer OS (5-year survival rate, 72% vs. 31%, P = 0.007) and a trend toward longer DFS (median, 91.2 vs. 4.4 months, P = 0.133) compared with ZF2 mutations

In the CEBPAdouble-mut subgroup, GATA2 ZF1-mutated patients showed a trend toward longer OS (5-year survival rate, 76% vs. 68%, P = 0.075) and a significantly longer DFS (median, 91.2 vs. 14.0 months, P = 0.034) than GATA2-wild patients (Fig. 4). ZF1 mutations thus allowed further refinement of the clinical outcome of CEBPAdouble-mut patients. The small number of ZF2-mutated patients (n = 3) in this group did not allow statistically meaningful correlations.

Fig. 4 Comparison of OS (a) and DFS (b) among CEBPAdouble-mut/GATA2 ZF1-mutated, CEBPAdouble-mut/GATA2-wild and CEBPA-wild AML patients who received standard intensive chemotherapy. CEBPAdouble-mut patients with GATA2 ZF1 mutations showed a trend toward longer OS (5-year survival rate, 76% vs. 68%, P = 0.075) and a significantly longer DFS (median, 91.2 vs. 14.0 months, P = 0.034) than those with wild-type GATA2. The small number of ZF2-mutated patients among CEBPAdouble-mut patients did not allow statistically meaningful correlations

Sequential studies of GATA2 mutations in AML patients

GATA2 mutations were serially studied in 419 samples from 124 patients who had achieved a CR and had samples available for study, including 19 patients with and 105 patients without GATA2 mutations at diagnosis (Table 4). Among the 19 GATA2-mutated patients who had paired samples, all lost the original GATA2 mutations at remission. Of the six of these patients studied at first relapse, five regained the original GATA2 mutations, but one (no. 27) did not. In the former five patients, the mutation burden, compared to that at diagnosis, was increased in one patient (no. 25), decreased in two (nos. 13 and 16), and stable in the remaining two (nos. 5 and 9). One patient (no. 9) retained the co-occurring ASXL1 mutations at CR. Among the 105 patients who had no GATA2 mutations at diagnosis, four patients (nos. 44, 45, 46, and 47) acquired novel GATA2 mutations at relapse (Table 4).

Table 4 Sequential studies in the AML patients with GATA2 mutations

GATA2 expression and biological functions associated with GATA2 mutations

We analyzed the microarray dataset of 328 patients to assess the impact of GATA2 mutations on gene expression and biological functions. By comparing the mRNA expression profiles between patients with and without GATA2 mutations, we found that GATA2 expression levels were higher in those with GATA2 mutations (P = 0.003). More specifically, both ZF1 and ZF2 mutations correlated with higher GATA2 expression levels compared to wild-type GATA2. GATA2 mutations were associated with significant differential expression of 159 probes (t-test, P < 0.05 and >2-fold change). IPA revealed different molecular networks between the GATA2 ZF1- and ZF2-mutated groups (Supplementary Figure 3). We also performed GSEA to identify biological functions associated with genes significantly enriched in GATA2-mutated AML, compared with GATA2-wild AML. Three hundred and thirteen patients with wild-type GATA2, 12 patients with GATA2 ZF1 mutations, and three patients with GATA2 ZF2 mutations were analyzed. We identified significant underrepresentation of genes hyper-methylated in AML (P = 0.006; normalized enrichment score (NES) = −1.49; Supplementary Figure 4A) and genes related to apoptosis (P = 0.042; NES = −1.33) in the ZF1-mutated patients compared to GATA2 wild-type patients. ZF2 mutations were associated with the Gene Ontology term of myeloid leukocyte differentiation (P = 0.03; NES = −1.46) (Supplementary Figure 4B). Compared with ZF2-mutated AML, ZF1-mutated AML showed significant overrepresentation of genes related to myeloid leukocyte differentiation (P = 0.042; NES = 1.36) and underrepresentation of genes hyper-methylated in AML (P = 0.029; NES = −1.37).

Discussion

To the best of our knowledge, this is the first study to explore differences in clinical and biological implications between the GATA2 ZF1 and ZF2 mutations in AML patients. We found that mutations in different domains were associated with distinct clinical features, co-occurring mutations and outcomes (Supplementary Table 3).

The GATA2 mutation landscape in adult de novo AML differs from that in blastic crisis of CML3, familial MDS/AML4, and pediatric AML5. In adult AML, ZF1 mutations predominate, while ZF2 mutations are reported sporadically10,36,37. In line with these findings, two-thirds of the 44 distinct GATA2 mutations in our study were located in the ZF1 domain. We also identified two missense mutations in the ZF2 domain (L359V and G366R) that had not previously been reported in adult de novo AML patients but have been identified in blastic crisis of CML.

AML with CEBPAdouble-mut has been included as a definite entity in the 2016 WHO Classification of Myeloid Neoplasms15. It is well established that GATA2 mutations frequently co-occur with CEBPAdouble-mut, with an incidence of 18–41%9,10,12, and that the two proteins show direct protein–protein interaction38. A further study revealed that GATA2 ZF1 mutants, but not the ZF2 mutant L359V that is commonly seen at the progression of CML to blast crisis, had reduced capacity to enhance CEBPA-dependent activation of transcription9. Based on this functional study and the frequent co-occurrence of CEBPAdouble-mut with ZF1 mutations, but not ZF2 mutations, in AML patients, it is possible that GATA2 ZF1 mutations and CEBPAdouble-mut cooperate to induce leukemogenesis. In addition, we found that ZF1 mutations were associated with lower incidences of NPM1 mutations and FLT3-ITD than wild-type GATA2, whereas ZF2-mutated patients had incidences of these two mutations similar to those in GATA2-wild patients. GATA2 ZF1 and ZF2 mutations may therefore induce AML through different oncogenic mechanisms and have distinct impacts on clinical outcomes. Indeed, in this study, we demonstrated that patients with GATA2 ZF1 mutations had a significantly longer OS than ZF2-mutated patients in the total cohort, as well as in patients with intermediate-risk cytogenetics and normal karyotype.

Reports on the prognostic impact of GATA2 mutations in CEBPAdouble-mut patients have been conflicting12,13,37,39. Greif et al. and Theis et al. found that GATA2 mutations did not impact clinical outcome in CEBPAdouble-mut patients. In contrast, GATA2 mutations correlated with improved survival among CEBPAdouble-mut patients in other reports12,13. In the study by Theis et al., 31 (74%) of the GATA2 mutations were detected in the ZF1 domain, and 11 (26%) in the ZF2 domain. They did not show different clinical outcomes with respect to GATA2 ZF1 and ZF2 mutations in a cohort with both CEBPAdouble-mut and CEBPAsingle-mut patients39. We were the first to investigate the prognostic implication of GATA2 ZF1 mutations in CEBPAdouble-mut patients and showed their association with a better DFS and a trend toward longer OS than wild-type GATA2 within the CEBPAdouble-mut subgroup.

The poor prognostic impact of GATA2 ZF2 mutations observed in de novo AML patients in this study has also been reported in patients with CML in blast crisis4. The different survival impacts of ZF1 and ZF2 mutations in de novo AML patients might be partially explained by their different associations with CEBPAdouble-mut and by different oncogenic mechanisms. Further studies are warranted to explore the mechanisms underlying these differences.

This study also included the largest number of de novo AML patients for sequential analyses of GATA2 mutations by NGS during clinical follow-up. The original mutations in all 19 GATA2-mutated patients were lost at remission, confirming them to be truly somatic mutations. We showed that GATA2 mutation status was not stable during disease evolution. One (no. 27) of the six patients with GATA2 mutations at diagnosis lost the mutation at relapse. Among the 105 patients who had no GATA2 mutations at diagnosis, four (nos. 44, 45, 46, 47) acquired novel GATA2 mutations at relapse. The four mutations were all ZF1 mutations.

In conclusion, GATA2 ZF1 mutations, but not ZF2 mutations, are closely associated with CEBPAdouble-mut and inversely correlated with NPM1 mutations and FLT3-ITD. Mutations in the two GATA2 ZF domains have different impacts on OS in AML patients. GATA2 ZF1 mutations also affect clinical outcome in CEBPAdouble-mut patients. Incorporation of GATA2 ZF1 mutations, but not ZF2 mutations, allows further refinement of the WHO Classification in the specific entity of AML with CEBPAdouble-mut.