Introduction

Chronic eosinophilic leukemia (CEL; previously known as chronic eosinophilic leukemia, not otherwise specified) is a rare and challenging subtype of BCR::ABL-negative myeloproliferative neoplasm (MPN) that primarily involves eosinophils, a crucial component of the body's immune response and allergy regulation1,2,3. CEL is defined in part by the absence of PDGFRA, PDGFRB, or FGFR1 rearrangements and of PCM1::JAK2, ETV6::JAK2, or BCR::JAK2 fusion genes2,4. The neoplasm is marked by persistent proliferation of eosinophilic precursors, resulting in pronounced eosinophilia (> 1.5 × 10⁹/L) in both peripheral blood and bone marrow2. Differentiating CEL from other eosinophilic disorders, which are often driven by specific genetic alterations or secondary factors, is crucial5. The precise etiology of CEL remains elusive, and its diagnosis hinges on the exclusion of alternative causes of eosinophilia. According to the 5th edition of the WHO classification of hematolymphoid tumors, three factors distinguish CEL from hypereosinophilic syndromes (HES): (a) sustained hypereosinophilia for 4 weeks (a notable change from the 6-month duration required in the 4th edition); (b) the presence of clonality, as indicated by cytogenetic or molecular abnormalities; and (c) abnormal bone marrow morphology, such as megakaryocytic or erythroid dysplasia (not required in the 4th edition). Notably, in the 5th edition, increased blasts (≥ 2% in peripheral blood or 5–19% in bone marrow) are no longer accepted as an alternative means of demonstrating clonality3,5,6,7. Additionally, rigorous exclusion of non-myeloid malignancies and of myeloid neoplasms with eosinophilia, such as acute myeloid leukemia (AML) with inv(16), chronic myeloid leukemia, myeloproliferative neoplasms, and myelodysplastic syndrome, is imperative8.

The clinical presentation of CEL is highly variable and depends on the degree and duration of eosinophilia, the extent of organ involvement, and the risk of progression to AML9. Patients may experience a spectrum of symptoms and signs, including fatigue, weakness, weight loss, fever, night sweats, splenomegaly, hepatomegaly, and lymphadenopathy9,10. Complications of CEL can be life-threatening and require swift recognition and management10. Moreover, some patients with CEL progress to AML, a more aggressive hematological malignancy with a less favorable prognosis9. As a result, patients with CEL require regular follow-up and monitoring to assess their disease status and response to therapy.

Managing CEL is a complex endeavor, typically necessitating a multidisciplinary approach10. Therapeutic strategies aim to reduce eosinophilia and prevent or reverse organ damage2. However, no standardized therapy exists for CEL, and treatment options are often limited in their efficacy2,10. The choice of therapy is influenced by the presence of specific genetic abnormalities that may predict a patient's response to certain drugs9. For instance, patients with FIP1L1::PDGFRA rearrangements or other PDGFRA mutations may benefit from imatinib, a tyrosine kinase inhibitor that targets aberrant signaling pathways11,12. Nevertheless, most CEL patients lack these mutations and may exhibit resistance or intolerance to imatinib12. For such cases, alternative treatment options encompass corticosteroids, hydroxyurea, interferon-alpha, mepolizumab and chemotherapy2,13,14,15. However, these therapies have variable efficacy, significant toxicity profiles, and may not alter the natural progression of the disease, underscoring the critical need for improved therapeutic strategies9. Furthermore, most CEL patients are ineligible for allogeneic stem cell transplantation, a possible curative option, due to their advanced age and significant comorbidities15. Participation in clinical trials may offer hope for novel and more effective therapeutic options for this challenging condition.

Data on the prognosis of CEL are limited. Prognosis may depend on factors such as the degree and duration of eosinophilia, the extent and reversibility of organ damage, the presence of cytogenetic or molecular abnormalities, and the response to therapy16. Some patients experience an aggressive clinical course marked by life-threatening complications or progression to AML9. In a small cohort of 10 patients, survival ranged from 2.2 months to 10 years, with a median of 22.2 months9. Adverse prognostic markers include abnormal karyotypes, increased blast counts, thrombocytopenia, bone marrow fibrosis, atypical megakaryocytes, resistance or intolerance to imatinib, and transformation to AML9,16.

Owing to the difficulty of diagnosis and the rarity of CEL, comprehensive investigations of patients meeting the 2016 World Health Organization (WHO) criteria for CEL have been limited. The purpose of this population-based study is to investigate the incidence of CEL and to identify the factors affecting the survival of CEL patients using the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database. We also construct a nomogram to predict the prognosis of patients with CEL.

Materials and methods

Study population and data acquisition

Data for this study were sourced from the Surveillance, Epidemiology, and End Results (SEER) Program (https://seer.cancer.gov/), maintained by the National Cancer Institute (NCI). The data were collected using SEER*Stat software version 8.4.1 (https://seer.cancer.gov/seerstat/, accessed on August 1, 2023). Incidence data were extracted from the SEER 17 database [Incidence-SEER Research Data, 17 registries, Nov 2022 Sub (2000–2020)] using a “rate session”. Patients diagnosed with CEL between 2001 and 2020 were selected from the “Incidence-SEER Research Plus Data, 17 Registries, Nov 2022 Sub (2000–2020)” database using a “case listing session”. Only cases with known age (censored at age 89 years) and malignant behavior were included. The inclusion criterion was the International Classification of Diseases for Oncology, 3rd edition (ICD-O-3) histologic code 9964/3. The exclusion criteria were as follows: (1) diagnostic confirmation was “unknown”; (2) survival time was 0 or unknown; (3) age was below 20 years. Ultimately, 487 patients with CEL were included in the final cohort. Ethical approval was not required because SEER data are publicly available and anonymized, precluding patient reidentification. A flowchart depicting the selection process is presented in Fig. 2.
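The selection rules described above can be expressed as a simple filter. The sketch below is illustrative only: the record fields (`histology`, `confirmation`, `age`, `survival_months`), the helper name `is_eligible`, and the example records are hypothetical and do not correspond to actual SEER*Stat column names or data.

```python
from typing import Optional, TypedDict


class Case(TypedDict):
    """Simplified case record; field names are hypothetical, not SEER's."""
    histology: str                  # ICD-O-3 histology/behavior code
    confirmation: str               # diagnostic confirmation, or "unknown"
    age: Optional[int]              # age at diagnosis in years, None if unknown
    survival_months: Optional[int]  # None if unknown


def is_eligible(case: Case) -> bool:
    """Apply the inclusion and exclusion criteria listed in the text."""
    if case["histology"] != "9964/3":              # inclusion: CEL, malignant
        return False
    if case["confirmation"].lower() == "unknown":  # exclusion (1)
        return False
    months = case["survival_months"]
    if months is None or months == 0:              # exclusion (2)
        return False
    age = case["age"]
    if age is None or age < 20:                    # unknown age or exclusion (3)
        return False
    return True


# Made-up records, one per rule, to exercise the filter.
cases = [
    {"histology": "9964/3", "confirmation": "positive histology", "age": 64, "survival_months": 31},
    {"histology": "9964/3", "confirmation": "unknown", "age": 58, "survival_months": 12},
    {"histology": "9964/3", "confirmation": "positive histology", "age": 17, "survival_months": 40},
    {"histology": "9964/3", "confirmation": "positive histology", "age": 71, "survival_months": 0},
    {"histology": "9875/3", "confirmation": "positive histology", "age": 50, "survival_months": 25},
]
cohort = [case for case in cases if is_eligible(case)]
print(f"{len(cohort)} of {len(cases)} example records retained")
```

Writing the criteria as a single predicate keeps each exclusion traceable to its numbered rule in the text.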

Definition of variables

The analysis encompassed the following variables: age, sex, race, marital status, year of diagnosis, primary site, vital status, survival months, cause of death (COD) to site recode, cause-specific death classification, cause of death to site, sequence number, first malignant primary indicator, total number of in situ/malignant tumors for the patient, type of reporting source, diagnostic confirmation, surgery of the primary site, median household income inflation adj. to 2021, rural–urban continuum code, chemotherapy recode, and radiation recode. Age at diagnosis of CEL was dichotomized into < 60 years and 60+ years. Race categories included African American, White, and Other (comprising “Asian/Pacific Islander,” “American Indian/Alaska Native,” and “Unknown”). Marital status was classified as married, single, or other (encompassing “divorced,” “separated,” “widowed,” “unmarried or domestic partner,” and “unknown”). COD information was derived from the “COD to site recode” field. In the SEER database, CEL-related death was defined as “dead (attributed to this cancer diagnosis),” while CEL-unrelated death was defined as “dead (attributable to causes other than this cancer diagnosis).” Diagnosis years were categorized into “2001–2005,” “2006–2010,” “2011–2015,” and “2016–2020.” Patients were grouped by county-level annual median household income: < $50,000, $50,000–$75,000, and $75,000+. Residence type was classified according to the Rural–Urban Continuum Codes, which separate metropolitan counties by the population size of their metropolitan areas and nonmetropolitan counties by their level of urbanization and proximity to a metropolitan area. Overall survival (OS) time was calculated from the date of diagnosis to death or last follow-up.
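The recoding rules above can be sketched as small helper functions. The function names are hypothetical, and the handling of boundary values (for example, assigning an income of exactly $75,000 to the top group) is an assumption, since the text does not specify it.

```python
def age_group(age: int) -> str:
    """Dichotomize age at diagnosis as in the text."""
    return "<60" if age < 60 else "60+"


def diagnosis_period(year: int) -> str:
    """Map a diagnosis year to one of the four 5-year periods."""
    if not 2001 <= year <= 2020:
        raise ValueError("year of diagnosis must be between 2001 and 2020")
    start = 2001 + 5 * ((year - 2001) // 5)
    return f"{start}-{start + 4}"


def income_group(income_usd: int) -> str:
    """Group county-level median household income (boundaries assumed)."""
    if income_usd < 50_000:
        return "<$50,000"
    if income_usd < 75_000:
        return "$50,000-$75,000"
    return "$75,000+"


def marital_group(status: str) -> str:
    """Collapse marital status into married, single, or other."""
    normalized = status.strip().lower()
    if normalized in ("married", "single"):
        return normalized
    # divorced, separated, widowed, unmarried or domestic partner, unknown
    return "other"


print(age_group(57), diagnosis_period(2008), income_group(62_000), marital_group("Widowed"))
```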

Statistical analysis

All statistical analyses were conducted using the R programming language (version 4.2.1; R Foundation for Statistical Computing, Vienna, Austria). Patients were randomly allocated to training and validation groups (7:3 ratio), and baseline characteristics were compared using Student’s t-test for continuous data and the chi-square test for categorical data. Candidate prognostic factors for CEL were screened by least absolute shrinkage and selection operator (LASSO) regression, followed by univariate and multivariate Cox proportional hazards regression analyses to determine independent prognostic factors. Hazard ratios (HRs) and 95% confidence intervals (95% CIs) were reported. The proportional hazards assumption of the Cox model was assessed, with no violations detected. Variables with P < 0.05 in the multivariate model were considered significant. A nomogram predicting 3- and 5-year overall survival (OS) probabilities was constructed based on the independent prognostic factors. Discrimination was assessed using time-dependent receiver operating characteristic (ROC) curves and the corresponding area under the curve (AUC) values. Calibration curves and decision curve analysis (DCA) were used to evaluate calibration and clinical net benefit. Patients from both cohorts were classified into high- and low-risk groups using the median nomogram points, with Kaplan–Meier analysis used for OS estimation and log-rank tests for group comparisons. A two-sided P value < 0.05 indicated statistical significance.
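As a minimal illustration of the 7:3 random allocation, the sketch below splits a list of patient identifiers reproducibly. It is written in Python rather than R, and the function name, the seed, and the rounding rule are assumptions; the resulting group sizes need not match those reported in Table 1.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2023):
    """Randomly split identifiers into training and validation groups.

    The seed and the rounding of the training-set size are illustrative
    choices, not taken from the study.
    """
    rng = random.Random(seed)          # local generator, reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


train_ids, valid_ids = split_cohort(range(487))
print(len(train_ids), len(valid_ids))
```

Fixing the seed makes the allocation repeatable, which matters because model selection and validation both depend on the particular split.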

Results

Incidence of CEL

Analysis of data from the SEER database showed that the age-adjusted incidence rate (AIR) of CEL from 2001 to 2020 [age adjusted to the 2000 US Standard Population (19 age groups - Census P25-1130)] was 0.033 per 100,000 person-years (95% CI 0.031–0.036). The annual AIR of CEL is presented in Fig. 1A. The peak AIR was documented in 2008, at 0.054 per 100,000 person-years (95% CI 0.039–0.073). Compared with the AIR for 2001–2010 (0.040/100,000 person-years), the incidence rate ratio (IRR) for 2011–2020 (AIR 0.028/100,000 person-years) was 0.68 (95% CI 0.57–0.81, P < 0.0001). The AIR of CEL also increased with age (Fig. 1B). Compared with patients < 60 years old (AIR 0.024/100,000 person-years), the IRR for the 60+ age group (AIR 0.087/100,000 person-years) was 3.65 (95% CI 3.07–4.34, P < 0.0001). The AIR in males (0.042/100,000 person-years) was significantly higher than that in females (0.025/100,000 person-years; IRR 1.66, 95% CI 1.39–1.98, P < 0.0001; Fig. 1C, D).
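To make the quantities in this paragraph concrete, the sketch below shows direct age standardization and a rate ratio. It is not the SEER*Stat implementation. The two-stratum example uses made-up counts and weights (the actual standard uses 19 age groups), and the rate ratios are recomputed from the rounded rates quoted above, so small differences from the published IRRs are expected and are presumably due to rounding.

```python
def age_adjusted_rate(cases, person_years, std_weights, per=100_000):
    """Directly standardized rate: weighted sum of stratum-specific rates."""
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age stratum")
    total_weight = sum(std_weights)
    return per * sum(
        (w / total_weight) * (c / py)
        for c, py, w in zip(cases, person_years, std_weights)
    )


def rate_ratio(rate, reference_rate):
    """Incidence rate ratio of one group relative to a reference group."""
    if reference_rate <= 0:
        raise ValueError("reference rate must be positive")
    return rate / reference_rate


# Hypothetical two-stratum population, for illustration only.
toy_rate = age_adjusted_rate(
    cases=[2, 8],
    person_years=[10_000_000, 5_000_000],
    std_weights=[0.8, 0.2],
)
print(f"toy age-adjusted rate: {toy_rate:.3f} per 100,000 person-years")

# (rate, reference rate, published IRR), rates per 100,000 person-years.
reported = {
    "2011-2020 vs 2001-2010": (0.028, 0.040, 0.68),
    "60+ vs <60 years": (0.087, 0.024, 3.65),
    "male vs female": (0.042, 0.025, 1.66),
}
for label, (rate, reference, published) in reported.items():
    recomputed = rate_ratio(rate, reference)
    print(f"{label}: recomputed {recomputed:.2f}, published {published}")
```

Confidence intervals for the IRRs cannot be reproduced this way because they depend on the underlying case counts, which are not given in the text.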

Figure 1

Age-adjusted incidence of CEL from 2001 to 2020 in SEER database. (A) Annual age-adjusted incidence of CEL; (B) Age-adjusted incidence of CEL based on age of diagnosis; (C) Annual age-adjusted incidence of CEL in male and female populations, respectively; (D) Age-adjusted incidence of CEL based on age of diagnosis in male and female patients, respectively. CEL, chronic eosinophilic leukemia.

Baseline characteristics of CEL patients

As depicted in Fig. 2, 487 patients diagnosed with CEL between January 2001 and December 2020 were identified in the SEER 17 registries, Nov 2022 Sub (2000–2020) database. The primary site was bone marrow in all cases (n = 487, 100%). Of the study cohort, 61.2% were male, 1.6 times the proportion of females (n = 189, 38.8%; Table 1). The mean age at diagnosis was 57.0 ± 17.0 years (range 20–89), with 53.4% under the age of 60 and 46.6% aged 60 or older. Most CEL patients were White (73.9%); African American and Other (including Asian/Pacific Islander and American Indian/Alaska Native) patients accounted for 15.6% and 10.5%, respectively. At diagnosis, 57.3% of patients were married, 24.6% had other marital statuses (such as divorced, separated, widowed, unmarried or domestic partner), and 18.1% were single (never married). Of all CEL cases in this study, 89.7% were primary CEL and 10.3% were secondary CEL (secondary to other primary malignancies). By median annual household income, 11.9% were < $50,000, 45.4% were $50,000–$75,000, and 42.7% were above $75,000. About 41.3% were treated with chemotherapy. At the time of last follow-up, 284 (58.3%) patients were alive; 42 (8.6%) deaths were attributable to CEL, 45 (9.2%) patients died of heart diseases, and an additional 116 (23.8%) patients died of other causes such as diabetes mellitus, cerebrovascular diseases, and septicemia. None of these variables differed significantly between the training cohort and the validation cohort (P > 0.05). The baseline characteristics are summarized in Table 1.

Figure 2

Flow chart of study cohort selection using the SEER database. A flow diagram of selection of patients with CEL in this study. CEL, chronic eosinophilic leukemia; SEER, Surveillance, Epidemiology, and End Results.

Table 1 Comparison of baseline characteristics of patients with CEL between the training set and validation set from SEER database.

LASSO regression and independent prognostic factors selection

A total of 9 clinical parameters were evaluated in the training set. In the LASSO Cox regression analysis, age, sex, marital status at diagnosis, household income, and sequence were identified as candidate risk factors for OS using the minimum criterion for the tuning parameter (λ) (Fig. 3). Cox regression was then used to further screen these factors. All five variables passed the preliminary proportional hazards assumption test: age (P = 0.366), sex (P = 0.355), marital status (P = 0.535), household income (P = 0.208), and sequence (P = 0.454). Univariate Cox regression analysis revealed that age, marital status at diagnosis, and sequence were significantly associated with OS. In the multivariate Cox analysis, these three variables remained independently and significantly associated with OS. Older age (HR 3.74, 95% CI 2.51–5.60, P < 0.001), marital status of single (HR 2.44, 95% CI 1.49–4.00, P < 0.001), marital status of other (HR 2.08, 95% CI 1.41–3.10, P < 0.001), and secondary CEL (HR 1.98, 95% CI 1.23–3.20, P = 0.005) were significantly associated with worse overall survival. Detailed results are shown in Table 2.
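The reported hazard ratios, confidence intervals, and P values are linked by the usual Wald relationships: the log hazard ratio is the Cox coefficient, the 95% CI spans about 1.96 standard errors on either side of it on the log scale, and the P value follows from the resulting z statistic. The sketch below back-calculates these quantities from the published numbers as a consistency check. It assumes Wald-type intervals, which the text does not state explicitly, and the function name is hypothetical.

```python
import math


def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover coefficient, standard error, z, and two-sided P from HR and CI.

    Assumes a Wald interval: exp(beta +/- z_crit * se).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return beta, se, z, p_value


# Multivariate results as quoted in the text: (HR, lower CI, upper CI).
reported = {
    "older age": (3.74, 2.51, 5.60),
    "marital status: single": (2.44, 1.49, 4.00),
    "marital status: other": (2.08, 1.41, 3.10),
    "secondary CEL": (1.98, 1.23, 3.20),
}
for name, (hr, low, high) in reported.items():
    beta, se, z, p = wald_summary(hr, low, high)
    print(f"{name}: beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Because the inputs are rounded to two decimals, the recovered values are approximate, but they should agree with the reported significance levels.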

Figure 3

LASSO regression model was used to select characteristic impact factors. (A) LASSO coefficients of 7 features; (B) Selection of tuning parameter (λ) for LASSO model.

Table 2 Univariable and multivariable Cox regression analysis for overall survival of patients with CEL.

Construction of prognostic nomogram

By incorporating the three independent prognostic factors (age, marital status at diagnosis, and sequence), a nomogram was constructed to predict the 3- and 5-year OS probability of CEL patients (Fig. 4). The total points were calculated by summing the scores for age, marital status, and sequence, and were projected onto the bottom scale to predict the OS probability at 3 and 5 years.
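The way a nomogram turns Cox coefficients into points can be sketched as follows. This uses a common scaling convention in which the largest effect is assigned 100 points and the others are scaled in proportion to their log hazard ratios. It is only an approximation built from the rounded hazard ratios quoted in the Results: the point values in Fig. 4 may differ, the assumption that the age HR refers to 60+ versus < 60 years follows the dichotomization in the Methods, and the mapping from total points to survival probability requires the baseline survival function, which is not reproduced here.

```python
import math

# Hazard ratios versus the reference levels (<60 years, married, primary CEL).
REPORTED_HR = {
    ("age", "60+"): 3.74,
    ("marital status", "single"): 2.44,
    ("marital status", "other"): 2.08,
    ("sequence", "secondary CEL"): 1.98,
}


def nomogram_points(hazard_ratios, max_points=100.0):
    """Scale log hazard ratios so the largest effect gets max_points."""
    betas = {key: math.log(hr) for key, hr in hazard_ratios.items()}
    largest = max(abs(beta) for beta in betas.values())
    return {key: max_points * beta / largest for key, beta in betas.items()}


def total_points(points, patient):
    """Sum points over a patient's levels; reference levels contribute 0."""
    return sum(points.get((var, level), 0.0) for var, level in patient.items())


points = nomogram_points(REPORTED_HR)
high_risk = {"age": "60+", "marital status": "single", "sequence": "secondary CEL"}
reference = {"age": "<60", "marital status": "married", "sequence": "primary CEL"}
print(f"example high-risk total: {total_points(points, high_risk):.1f}")
print(f"reference patient total: {total_points(points, reference):.1f}")
```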

Figure 4

Construction of the prognostic nomogram of CEL patients based on 3 risk factors. The total points were calculated by summing the scores for age, marital status, and sequence, and were projected onto the bottom scale to predict the overall survival probability at 3 and 5 years. CEL, chronic eosinophilic leukemia.

Evaluation and validation of the nomogram

The calibration curves of the nomogram for the training cohort showed a close match between predicted and observed OS probabilities at 3 and 5 years (Fig. 5A). The calibration plots for the validation cohort at 3 and 5 years also showed good agreement between prediction and observation (Fig. 5B). Time-dependent ROC analyses of the nomogram for 3- and 5-year OS yielded AUC values of 0.702 and 0.736, respectively, in the training set (Fig. 6A) and 0.731 and 0.754, respectively, in the validation set (Fig. 6B). DCA was used to evaluate the clinical net benefit of the predictive model. The nomogram showed a good net benefit in predicting 3- and 5-year OS probability in both the training set (Fig. 7A, B) and the validation set (Fig. 7C, D).
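The net benefit plotted in a decision curve can be illustrated with a short sketch. At a threshold probability p_t, net benefit is the true-positive fraction minus the false-positive fraction weighted by p_t / (1 - p_t). The version below treats the outcome as a fully observed binary event (death by the time horizon) and therefore ignores censoring, which is a simplification relative to a survival-based DCA. The function names and the toy data are hypothetical.

```python
def net_benefit(predicted_risk, event, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(event)
    flagged = [(p >= threshold, bool(e)) for p, e in zip(predicted_risk, event)]
    true_pos = sum(1 for hit, e in flagged if hit and e)
    false_pos = sum(1 for hit, e in flagged if hit and not e)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(event, threshold):
    """Reference strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(bool(e) for e in event) / len(event)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up predicted risks of death by the horizon and observed outcomes.
risks = [0.9, 0.8, 0.3, 0.2, 0.1]
died = [1, 1, 0, 1, 0]
for p_t in (0.25, 0.5, 0.75):
    model = net_benefit(risks, died, p_t)
    treat_all = net_benefit_treat_all(died, p_t)
    print(f"threshold {p_t}: model {model:.3f}, treat-all {treat_all:.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is 0.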

Figure 5

Evaluation of the nomogram by calibration plot. (A) The calibration curve of the training set for the observed and predicted overall survival (OS) probability at 3 and 5 years. (B) The calibration curve of the validation set at 3 and 5 years.

Figure 6

Evaluation of the nomogram by receiver operating characteristic (ROC) plot. (A) Time-dependent ROC curve analyses of the nomogram at 3 and 5 years in the training set. (B) Time-dependent ROC curve analyses in the validation set.

Figure 7

Evaluation of the nomogram by decision curve analysis. (A, B) Decision curve analysis of the nomogram at 3 years (A) and 5 years (B) in the training set. (C, D) Decision curve analysis of the nomogram at 3 years (C) and 5 years (D) in the validation set.

Survival analysis between the stratified risk groups

The score for each variable was obtained from the nomogram, and cumulative scores were calculated for all patients. The entire cohort was stratified into low- and high-risk subgroups according to the median risk score. Kaplan–Meier analysis of OS revealed significant differences between the low- and high-risk groups in both the training set (P < 0.0001, Fig. 8A) and the validation set (P < 0.0001, Fig. 8B), supporting the capacity of the nomogram for risk stratification.
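The two steps described here, a median split on nomogram scores and Kaplan–Meier estimation within each group, can be sketched in a few lines. The data below are made up, the function names are hypothetical, and assigning patients whose score equals the median to the low-risk group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def stratify_by_median(scores):
    """Label each score 'high' if above the cohort median, else 'low'."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at event times.

    times: follow-up in months; events: 1 for death, 0 for censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored both leave the risk set
    return curve


# Hypothetical follow-up data for one risk group.
months = [5, 8, 12, 12, 20, 30]
died = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(months, died)
groups = stratify_by_median([0.0, 55.5, 100.0, 219.4])
print(curve)
print(groups)
```

Comparing the curves of the two groups formally would additionally require a log-rank test, which is omitted from this sketch.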

Figure 8

Kaplan–Meier curve of overall survival of CEL patients stratified by the risk stratification system in the training set (A) and validation set (B). CEL, chronic eosinophilic leukemia.

Discussion

CEL is a rare and complex hematological disorder characterized by uncontrolled eosinophilic proliferation10. Given its rarity, the published literature on CEL is limited and consists mostly of case reports or small case series, and its incidence and clinical characteristics have not yet been comprehensively studied9,16. The present study identified 487 CEL patients from 2001 to 2020 using the SEER database, representing the largest cohort describing the incidence and clinical characteristics of CEL patients to date. We also developed and validated a nomogram to predict the 3- and 5-year overall survival probability of CEL patients based on the screened prognostic factors.

The epidemiology of CEL remains incompletely characterized due to its rarity and clinical heterogeneity1,2. Available studies suggest an estimated incidence rate of approximately 0.036 per 100,000 individuals for all hypereosinophilic syndromes (HES), including CEL17. In this study, analysis of CEL cases in the SEER database between 2001 and 2020 revealed a low age-adjusted incidence rate (AIR) of 0.033 per 100,000 person-years, and the incidence rate was significantly lower in 2011–2020 than in 2001–2010. This decrease may be partially attributed to modifications in the diagnostic criteria. In the 2008 revision of the World Health Organization (WHO) classification of myeloid neoplasms and acute leukemia, patients with eosinophilia and gene rearrangements involving PDGFRA, PDGFRB, or FGFR1 were excluded from the diagnosis of CEL and classified as a separate entity, namely “myeloid and lymphoid neoplasms with eosinophilia and abnormalities of PDGFRA, PDGFRB, or FGFR1 (MLNE)”, which has distinct clinical and molecular features and responds well to tyrosine kinase inhibitors18. This reclassification may therefore have reduced the number of cases that would previously have been diagnosed as CEL under the older criteria. The peak incidence observed in 2008, followed by a decline, coincides with the implementation of the revised WHO classification. This suggests that the earlier, higher rates may have included cases that were later reclassified under the new criteria, and it supports the hypothesis that the redefinition of CEL has had a significant impact on its reported incidence. Furthermore, advances in diagnostic technologies and in the understanding of molecular genetics over the past two decades have allowed more accurate diagnosis of myeloproliferative neoplasms (MPNs). This progress may have resulted in more cases being classified into other specific subtypes of MPNs rather than being broadly categorized as CEL.

CEL has been reported to show a male predominance, with a median age at diagnosis of 62 years9. In this study, the mean age of the patients was 57.0 ± 17.0 years, and the male-to-female IRR was 1.66 (95% CI 1.39–1.98), in agreement with previous research9,19. Furthermore, the incidence rate increased with age, with an IRR of 3.65 (95% CI 3.07–4.34) for the 60+ age group relative to the < 60 age group, a finding that has not been documented in the literature so far. The age distribution of CEL may reflect the accumulation of genetic and epigenetic alterations that lead to clonal expansion of eosinophils over time20. The sex difference may be influenced by hormonal or genetic factors that affect susceptibility or exposure to eosinophilic stimuli21,22.

A previous study of 10 patients showed that the prognosis of CEL is poor, with a median survival of 22.2 months and 5 patients developing acute transformation a median of 20 months after diagnosis9. However, it is difficult to draw firm conclusions from that study because of its small sample size. In the current study, LASSO Cox regression and multivariate Cox regression analyses were performed on a set of clinical variables to identify prognostic factors for CEL. Age, marital status at diagnosis, and sequence were independently associated with overall survival. Older age emerged as a significant adverse prognostic factor, with older patients facing a substantially elevated mortality risk (HR 3.74, 95% CI 2.51–5.60), in line with previous studies on MPNs23. Marital status was also a significant predictor of survival: single patients and those with other marital statuses (divorced, separated, widowed, unmarried or domestic partner) had worse outcomes than married patients. This may reflect the impact of social support and psychological factors on cancer survival24,25,26. Sequence was another important prognostic factor, with secondary CEL associated with a poorer prognosis. This may be due to the presence of other malignancies or comorbidities that affect treatment response and tolerance27. In addition, chemotherapy showed no effect on the OS of CEL patients in this study, in accordance with a previous report that CEL patients are usually unresponsive to conventional chemotherapy9.

Nomograms have been developed and shown to surpass conventional staging systems in prognostic accuracy for certain types of cancer24. Consequently, the use of nomograms in clinical practice as reliable tools for predicting cancer prognosis has become increasingly prevalent28,29. In this study, a prognostic nomogram was constructed to predict the 3- and 5-year overall survival probability of CEL patients based on three independent prognostic factors: age, marital status at diagnosis, and sequence. The nomogram demonstrated good calibration and discriminative performance in both the training and validation cohorts, indicating satisfactory accuracy and consistency. To the best of our knowledge, this study is the first to develop and validate a clinical prognostic model for CEL patients and is the largest study of CEL conducted to date.

Effective risk stratification is integral to tailoring treatment strategies and optimizing patient outcomes30. Utilizing the nomogram, CEL patients were stratified into low- and high-risk groups based on individual risk scores. Kaplan–Meier analysis of OS revealed substantial distinctions between these risk cohorts, underscoring the nomogram's efficacy in risk stratification. This empowers clinicians to identify patients who may benefit from more aggressive therapeutic interventions or intensified surveillance, ultimately contributing to improved patient care and outcomes.

However, this study has some limitations. First, the SEER registry does not document other potential prognostic factors that may have a significant impact on CEL patient outcomes, such as genetic mutations, performance status, LDH level, immunophenotypic features, family history, and alcohol consumption and smoking history. Second, detailed information about therapy is not recorded in the SEER database, making it impossible to analyze the effect of different treatment regimens. Third, this is a retrospective study, so biases such as selection bias may be unavoidable. Finally, the nomogram was constructed and validated using the same database and was not further validated in an independent dataset. Thus, although this study provides important insights into CEL, a rare disease lacking large-scale multicenter prospective studies, the results should be interpreted with caution.

Conclusions

In conclusion, this study provides novel insights into the epidemiology and prognosis of CEL in the US population using the SEER database. CEL is a very rare disease with a variable clinical presentation and outcome. Age, marital status at diagnosis, and sequence were identified as independent prognostic factors for overall survival, culminating in the development of a prognostic nomogram to predict the 3- and 5-year overall survival probability of CEL patients. This nomogram may help clinicians provide personalized treatment and clinical decisions for CEL patients. To our knowledge, this study represents the largest population-based cohort investigating the epidemiology and survival outcome of CEL patients. However, more clinical research is needed to validate our findings.

Data availability

The data analyzed in this study are from the SEER database (https://seer.cancer.gov/), which is publicly available.