Introduction

Esophageal cancer (EC) is the seventh most common type of cancer in the world and the sixth leading cause of cancer-related death, with a 5-year survival rate of 15–20%1,2. Its incidence is expected to rise 140% in the world in a period of 10 years until 20253. In the United States, it is estimated that, in 2021, there will be 19,260 new cases (15,310 in males and 3959 in females; an approximately fourfold higher incidence in males) and 15,530 deaths (12,410 in males and 3120 in females; an approximately fourfold higher death rate in males) from esophageal cancer4. Strikingly, the epidemiology in the western world has changed during the last 4 decades, with a sharp decline in the proportion of squamous cell carcinomas (SCC) and an increase in the proportion of adenocarcinomas5.

Esophageal adenocarcinomas (EACs) and SCCs differ mainly in tumor location and predisposing factors. Barrett’s esophagus (BE) is the only known pre-malignant precursor to EAC, and virtually all EAC is said to arise in a background of BE6. Smoking and alcohol are the main risk factors for SCC, and these two risk factors seem to confer a synergistic risk effect7,8,9. EACs are associated with gastroesophageal reflux disease (GERD), central obesity and smoking but not alcohol9. Smoking is a stronger risk factor for SCC, with approximately sixfold odds compared to twofold odds for EAC10,11. Esophageal SCC can be present throughout the middle esophagus, while EAC can be present throughout the distal esophagus12. Treatment depends on the location and the histological subtype, and may be endoscopic for very early asymptomatic disease, surgical alone for localized disease, multimodal for advanced disease and palliative non-surgical for metastatic disease1,13,14,15.

As in other types of cancer, sex differences in incidence are also seen in esophageal cancer. In the United States, 76% of cases of adenocarcinoma from 1973 to 2012 occurred in white males16,17. It is estimated that the odds for EAC are 7–10 times greater and the odds for SCC are 3–4 times greater in males than in females18. Also, sex has been shown to be an independent prognostic marker in SCC but not in EAC, with females having better survival19,20,21,22. In addition, there is a report of greater regional recurrence and distant metastasis in males when compared to females, indicating that there is greater control of the disease after radiotherapy in females19.

Despite published data showing sex differences for esophageal cancer, the root causes are still poorly understood. To our knowledge, no studies have analyzed sex differences across a large spectrum of variables in both SCC and EAC. We hypothesized that there may be differences in epidemiological criteria, risk factors or treatment patterns that explain the sex differences in incidence and outcomes. Therefore, we carried out a comprehensive analysis of differences between sexes and other covariates in patients diagnosed with primary esophageal cancer, using an institutional database and SEER data.

Methods

Data were obtained from the University Hospitals (UH) Seidman Cancer Center research data repository, consisting of patient records from 2005 to 2020. The data repository is based on CAISIS, an open source web-based cancer data management system that integrates research with patient care and draws on disparate sources (Soarian, NGS Labs, Sunrise Clinical Manager, Tumor Registry, Via Oncology, OnCore, MosiaQ, PRO tools and others) to provide comprehensive data on the UH Seidman cancer patient population23. Patient records were deidentified and all analyses were performed in accordance with relevant guidelines and regulations, respecting the Declaration of Helsinki. The study, with a waiver of informed consent, was approved by the University Hospitals of Cleveland Institutional Review Board (IRB).

The initial cohort included patients ≥ 18 years old who were diagnosed with primary malignant esophageal cancer between 2005 and 2020 (ICD codes C15.XX, C49.A1 and 150.XX)24,25. Patients were excluded from analysis if they had missing sex information, unknown date of diagnosis or a prior history of cancer. The cohort selection for analysis is described in Fig. 1.

Figure 1

Cohort description with inclusion and exclusion criteria for the UH institutional database. The final cohort included 1205 patients aged ≥ 18 years who were diagnosed with esophageal cancer from 2005 to 2020, excluding those with an unknown diagnosis date, missing sex information or without primary esophageal cancer.

Data extracted from the UH platform for each patient included basic demographics (such as age at diagnosis, sex, race, etc.), comorbidities, histology/subtype, staging, laboratory results, vital signs, medications, and cancer treatment information (chemotherapy, hormone therapy, immunotherapy, surgery, radiation). Treatment information was only included if it was related to esophageal cancer or the anatomical location of the esophagus. From the list of medications, we selected the drugs and classes commonly used in the treatment of esophageal cancer or those drugs that can be risk factors26. Patients with a recorded date of death obtained from the EMR and state records were considered deceased.

Our final analysis included 29 categorical variables, grouped into General Characteristics, Cancer Characteristics, Risk Factors and General Treatment. General Characteristics variables included age at diagnosis, sex, median income, race and ethnicity. Cancer Characteristics variables included histological subtype, clinical stage and pathological stage. Risk Factors included Charlson comorbidity score27, smoking status and presence/absence of the following comorbidities: obesity, BE, alcoholism, achalasia, previous gastrectomy, gastritis, gastroesophageal reflux, H. pylori infection and long-term use of NSAIDs. General Treatment variables included whether the patient received the following therapies: chemotherapy, immunotherapy, radiation (of the esophagus or nearby anatomical area), surgery of the esophagus, Cisplatin, Fluorouracil, Paclitaxel, H2 antagonists, proton pump inhibitors (PPIs), NSAIDs and statins. Time to treatment variables (time to chemotherapy, time to radiation, time of radiation and time to surgery) were calculated for patients who received the respective type of treatment, based on the first and last treatment days registered in the EMR, the surgery date and the diagnosis date, and were categorized as < 40 days and ≥ 40 days.
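The time-to-treatment categorization described above can be sketched as follows. This is a minimal illustration, not the authors' code (their analysis was run in R). It assumes that "time to" a treatment is counted from the diagnosis date to the first recorded treatment day, and that "time of radiation" is the span between the first and last recorded radiation days. The function names are hypothetical.

```python
from datetime import date

CUTOFF_DAYS = 40  # threshold stated in the Methods


def categorize_interval(days, cutoff=CUTOFF_DAYS):
    """Map a number of days to the two categories used in the analysis."""
    if days is None:
        return None  # patient did not receive this treatment
    return f"< {cutoff} days" if days < cutoff else f">= {cutoff} days"


def time_to_treatment(diagnosis_date, first_treatment_date):
    """Assumed definition: days from diagnosis to first recorded treatment day."""
    if first_treatment_date is None:
        return None
    return categorize_interval((first_treatment_date - diagnosis_date).days)


def time_of_radiation(first_day, last_day):
    """Assumed definition: days between first and last recorded radiation days."""
    if first_day is None or last_day is None:
        return None
    return categorize_interval((last_day - first_day).days)


# Hypothetical patient, for illustration only
diagnosis = date(2015, 1, 1)
print(time_to_treatment(diagnosis, date(2015, 2, 9)))         # 39 days after diagnosis
print(time_to_treatment(diagnosis, date(2015, 2, 10)))        # 40 days after diagnosis
print(time_of_radiation(date(2015, 3, 2), date(2015, 4, 10)))
```

Returning `None` for untreated patients mirrors the statement that these variables were calculated only for those who received the respective treatment.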

Age was categorized based on epidemiology reports and clinical experience as 18–55 years, 56–70 years and > 70 years38,28,29. Race was categorized as white, black, or other. Ethnicity was categorized as Hispanic, non-Hispanic or other. Estimated median income was determined by the patient’s zip code and categorized as < $43,235, $43,235–$64,446, or > $64,446 (25%, 50% and 75% percentiles). The risk factors were selected from the list of comorbidities of each patient based on those most related to esophageal cancer according to the literature and clinical experience30,31. Histological subtype was categorized as squamous cell carcinoma (SCC; ICD-O-3 8050-8084) or esophageal adenocarcinoma (EAC; ICD-O-3 8140-8384)32. The categorization process is summarized in supplementary Table S1.
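A minimal sketch of these categorization rules is given below, for illustration only. The cut-offs and ICD-O-3 ranges are taken from the text. The function names, the use of whole-year ages, the handling of income values exactly on a boundary, and the "Other/unknown" label are assumptions.

```python
def age_group(age_years):
    """Age at diagnosis in whole years, mapped to the three study categories."""
    if age_years < 18:
        raise ValueError("cohort is restricted to patients aged 18 or older")
    if age_years <= 55:
        return "18-55"
    if age_years <= 70:
        return "56-70"
    return "> 70"


def income_group(median_income_usd):
    """Zip-code median income mapped to the three reported bands."""
    if median_income_usd < 43235:
        return "< $43,235"
    if median_income_usd <= 64446:
        return "$43,235-$64,446"
    return "> $64,446"


def histology_group(icdo3_morphology):
    """ICD-O-3 morphology code (four-digit integer) mapped to SCC or EAC."""
    if 8050 <= icdo3_morphology <= 8084:
        return "SCC"
    if 8140 <= icdo3_morphology <= 8384:
        return "EAC"
    return "Other/unknown"  # assumed label for codes outside both ranges


# Hypothetical records, for illustration only
for age, income, code in [(54, 40000, 8070), (63, 50000, 8140), (78, 90000, 8000)]:
    print(age_group(age), income_group(income), histology_group(code))
```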

SEER data were used to compare and validate our findings with the general population. Data were obtained from SEER*Stat software based on the SEER Research Plus Database for esophageal cancer diagnoses between 2005 and 201833. The variables analyzed were categorized following the methodology applied to the UH Database and included sex, age at diagnosis, race, ethnicity, histology, staging, chemotherapy, radiotherapy, surgery, vital status and median survival.

The sample was divided according to sex as male or female. The Pearson chi-square test was used to compare variables of interest by sex, disregarding patients with missing values, with p < 0.05 being considered significant. The influence of sex on survival was first assessed using Kaplan-Meier analysis, generating median survival by sex with 95% confidence intervals (95% CI) and log-rank tests by sex. After the model assumptions were checked, Cox proportional hazards regression was used to fit univariable and multivariable models of overall survival by sex, and by sex and histological subtype (EAC and SCC). The variables selected for the multivariable model, overall and by histological subtype, were those with p < 0.20 in the univariable model and those with clinical importance. Correlated variables, checked by chi-square test, were not included in the final model. All analyses were performed using RStudio 1.2.1335 software34.
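To make the first survival step concrete, the sketch below shows how a Kaplan-Meier curve and its median survival time can be computed from follow-up times and event indicators. It is an illustration in plain Python, not the authors' R code. The follow-up data are hypothetical and the function names are assumptions. The median is taken here as the first event time at which the estimated survival falls to 0.5 or below, and confidence intervals and the log-rank test are omitted.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times  : follow-up in months
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data (months), for illustration only
times = [6, 12, 12, 20, 27, 30, 40]
events = [1, 1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
print(curve)
print(median_survival(curve))
```

In practice, an established survival package would normally be preferred over hand-written code, since it also provides confidence intervals, the log-rank test and Cox regression.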

Results

All types of esophageal cancer

Using data from 2005 to 2020, we analyzed a total of 1205 patients for all types of esophageal cancer, with 75.8% (913) males and 24.2% (292) females, establishing a male:female ratio of about 3:1. The evolution of cases by year is shown in Fig. 2. For general characteristics (Table 1), sex differences existed only for age at diagnosis (p < 0.001), with a predominance of females > 70 years old (46.9% of females) and males between 56 and 70 years old (48.2% of males). There were no significant differences for median income (p = 0.12), race (p = 0.06) and ethnicity (p = 0.21). For cancer characteristics (Table 2), we found a difference for histology (p < 0.001), with a predominance of EAC in both groups (79.2% in males and 56.5% in females), and no significant differences were found for clinical staging (p = 0.21) and pathological staging (p = 0.08).
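As a worked example of the Pearson chi-square comparison, the sketch below cross-tabulates sex against histological subtype using the EAC and SCC counts reported in the subtype sections of the Results (487 males and 109 females with EAC; 128 males and 84 females with SCC). It is only an illustration. It assumes a 2 × 2 table restricted to patients with known EAC or SCC histology and no continuity correction, which may not be the exact table or settings the authors used. The function name is hypothetical.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2 x 2 table (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: males, females; columns: EAC, SCC (counts from the Results)
counts = [[487, 128],
          [109, 84]]

pct_eac_males = round(100 * 487 / (487 + 128), 1)
pct_eac_females = round(100 * 109 / (109 + 84), 1)
statistic, p_value = pearson_chi_square_2x2(counts)

print(pct_eac_males, pct_eac_females)  # share of EAC among known histology, by sex
print(round(statistic, 2), p_value < 0.001)
```

Under these assumptions, the EAC shares match the 79.2% and 56.5% reported above, and the test gives a p-value far below 0.001.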

Figure 2

Plots of cases by year from UH Seidman Cancer Center (2005–2020) and SEER (2005–2018) for all types of esophageal cancer, esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC). All types of esophageal cancer cases are increasing in the US.

Table 1 General characteristics by sex for all types of esophageal cancer combined, esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC), UH Seidman Cancer Center Database (2005–2020).
Table 2 Cancer characteristics by sex for all types of esophageal cancer combined, esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC), UH Seidman Cancer Center Database (2005–2020).

There was a difference for risk factors (Table 3) in smoking status (p = 0.01) with a predominance of former smokers in males overall and by histological subtype (58.1% in males and 45% in females overall), with no difference for Charlson Score (p = 0.28), obesity (p = 0.11), BE (p = 0.22), alcoholism (p = 0.35), achalasia (p = 0.64), previous gastrectomy (p = 0.17), gastritis (p = 0.28), gastroesophageal reflux (p = 0.42), H.pylori infection (p = 0.80) and long term use of NSAIDs (p = 0.96).

Table 3 Risk factors by sex for all types of esophageal cancer combined, esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC), UH Seidman Cancer Center Database (2005–2020).

For treatment characteristics (Table 4), differences were observed in the prescription of NSAIDs (p = 0.04). No differences were seen for chemotherapy (p = 0.82), immunotherapy (p = 0.13), radiotherapy (p = 0.50), surgery (p = 0.17), time to chemotherapy (p = 0.77), time to radiation (p = 1.00), time of radiotherapy (p = 0.72), time to surgery (p = 0.93), Cisplatin use (p = 0.97), Fluorouracil use (p = 0.23), Paclitaxel use (p = 1.00), H2 antagonists use (p = 0.52), PPIs use (p = 0.96) and Statins use (p = 0.93). The median survival was 27 months for males vs 32 for females (p = 0.50) (Table 5). The univariable model did not show worse survival for males (HR = 1.06, CI 0.89–1.26, p = 0.50). The multivariable model included the statistically significant variables (p < 0.20): age at diagnosis, race, ethnicity, histology, obesity, BE, gastrectomy, gastritis, gastroesophageal reflux, chemotherapy and surgery. Pathological stage, although statistically significant, was not included in the model due to the high number of unknown values. Radiotherapy was included despite p > 0.20 due to reports in the literature of better survival in females after this type of treatment19. The multivariable model also did not show worse survival for males (HR = 1.10, CI 0.88–1.27, p = 1.37). Both models are summarized in Fig. 3. Sex differences for all types of esophageal cancer combined are summarized in Fig. 4.

Table 4 General treatment by sex for all types of esophageal cancer combined, esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC), UH Seidman Cancer Center Database (2005–2020).
Table 5 Median survival by sex for all types of esophageal cancer combined, esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC), UH Seidman Cancer Center Database (2005–2020).
Figure 3

Forest plot of sex differences in survival for esophageal cancer (all types), esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC). Univariable and multivariable Cox models are represented, UH Seidman Cancer Center Database (2005–2020). *Adjusted for: age at diagnosis, race, ethnicity, histology, obesity, Barrett’s, gastrectomy, gastritis, gastroesophageal reflux, chemotherapy, surgery and radiotherapy. **Adjusted for: age at diagnosis, smoking status, obesity, Barrett’s, alcoholism, achalasia, gastrectomy, gastritis, gastroesophageal reflux, H. pylori, chemotherapy, surgery and radiotherapy. ***Adjusted for: race, Barrett’s, gastritis, gastroesophageal reflux, H. pylori, radiotherapy, surgery, smoking status and alcoholism.

Figure 4

Sex differences for esophageal cancer overall. Higher in males (3:1 ratio) and statistical differences for age at diagnosis (p < 0.001), smoking status (p = 0.01), histological subtype (p < 0.001) and NSAIDs use (p = 0.04), UH Seidman Cancer Center Database (2005–2020).

Esophageal adenocarcinoma (EAC)

Of the total cohort of 1205 patients, 596 (49.46%) had a histological classification of EAC, with 487 males (81.7%) and 109 females (18.3%), establishing a male:female ratio of about 4:1. The evolution of EAC cases by year is shown in Fig. 2. In General Characteristics for EAC (Table 1), there was no difference in age at diagnosis (p = 0.16), median income (p = 0.58), race (p = 0.06) and ethnicity (p = 1.00). In males, there was a predominance of the categories 56–70 years (48.3%), $43,235–$64,446 (53.9%), white race (83.9%) and non-Hispanic (99.1%). In females, the predominant categories were > 70 years (44%), $43,235–$64,446 (50.5%), white race (80.2%) and non-Hispanic (99%). The white:black ratio was 30:1. In cancer characteristics for EAC (Table 2), there was no difference in clinical staging (p = 0.81) and in pathological staging (p = 0.63). Clinical stage IV predominated in males (39.3%) and clinical stage III in females (42.6%). Pathological stage I predominated in both groups (29% in males and 42.3% in females).

For risk factors for EAC (Table 3), there were no differences in Charlson Score (p = 0.30), smoking status (p = 0.05), obesity (p = 0.21), BE (p = 0.87), alcoholism (p = 0.16), achalasia (p = 0.80), previous gastrectomy (p = 0.62), gastritis (p = 0.60), gastroesophageal reflux (p = 0.52), H. pylori infection (p = 1.00) and long-term use of NSAIDs (p = 1.00). In treatment characteristics for EAC (Table 4), there was no difference in chemotherapy (p = 0.90), immunotherapy (p = 0.40), radiotherapy (p = 0.96), surgery (p = 0.93), time to chemotherapy (p = 0.82), time to radiation (p = 0.62), time of radiation (p = 1.00), time to surgery (p = 0.40), Cisplatin use (p = 0.78), Fluorouracil use (p = 0.54), Paclitaxel use (p = 0.38), H2 antagonists use (p = 0.22), PPIs use (p = 0.71), NSAIDs use (p = 0.12) and Statins use (p = 0.48). For EAC only, the median survival was 27 months for males and 29 for females (p = 0.40) (Table 5). The univariable model did not show worse survival for males (HR = 1.11, CI 0.84–1.46, p = 0.44). The multivariable model included the statistically significant variables (p < 0.20): age at diagnosis, smoking status, obesity, BE, alcoholism, achalasia, gastrectomy, gastritis, gastroesophageal reflux, H. pylori, chemotherapy and surgery. Pathological stage was not included in the model despite p < 0.20 due to the high number of unknown values. Radiotherapy was included in the model despite p > 0.20 due to reports of greater survival in females after this type of treatment19. This model also did not show worse survival for males (HR = 1.16, CI 0.87–2.10, p = 0.16). The models are summarized in Fig. 3. Sex differences for EAC are summarized in Fig. 5.

Figure 5

Sex differences for esophageal adenocarcinoma (EAC). Higher in males (ratio 4:1), UH Seidman Cancer Center Database (2005–2020).

Squamous cell carcinoma (SCC)

Of the total cohort of 1205 patients, 212 (17.59%) had a histological classification of SCC, with 128 males (60.4%) and 84 females (39.6%), establishing a male:female ratio of about 3:2. The evolution of SCC cases by year is shown in Fig. 2. In General Characteristics for SCC (Table 1), there was a difference for age at diagnosis (p = 0.02), with a predominance of > 70 years in females (42.9%) and 56–70 years in males (57%). There were no differences in median income (p = 0.94), race (p = 0.98) and ethnicity (p = 0.67). For median income, there was a predominance of $43,235–$64,446 for both groups (36.9% in males and 37.5% in females). For race, white patients predominated among both males (52.8%) and females (52.9%), establishing a white:black ratio of 2:1. For ethnicity, there was a predominance of non-Hispanics (98.4% in males and 100% in females). For cancer characteristics for SCC (Table 2), there was no difference in clinical staging (p = 0.57) or in pathological staging (p = 0.17). In males, clinical stage IV (39.4%) and pathological stage II (47.1%) predominated. In females, clinical stage III (38.8%) and pathological stages I and II (31.2% for each) predominated.

Regarding risk factors for SCC (Table 3), there was a difference in alcoholism (p = 0.04), with a higher percentage of alcoholics in males (36.7%). There was no difference for Charlson Score (p = 0.58), smoking status (p = 0.49), obesity (p = 1.00), BE (p = 0.92), achalasia (p = 1.00), previous gastrectomy (p = 0.15), gastritis (p = 0.98), gastroesophageal reflux (p = 0.55), H. pylori infection (p = 1.00) and long-term use of NSAIDs (p = 0.83). For treatment characteristics for SCC (Table 4), there was no difference in chemotherapy (p = 1.00), immunotherapy (p = 0.93), radiotherapy (p = 0.39), surgery (p = 0.43), time to chemotherapy (p = 1.00), time to radiation (p = 0.81), time of radiation (p = 1.00), time to surgery (p = 1.00), cisplatin use (p = 0.28), fluorouracil use (p = 0.80), paclitaxel use (p = 0.35), H2 antagonists use (p = 0.85), PPIs use (p = 1.00), NSAIDs use (p = 1.00) and statins use (p = 1.00). For SCC only, the median survival was 17 months for males and 25 months for females (p = 0.80) (Table 5). The univariable analysis did not show worse survival for males (HR = 1.04, CI 0.74–1.46, p = 0.82). The multivariable model included the statistically significant variables (p < 0.20): race, BE, gastritis, gastroesophageal reflux, H. pylori, radiotherapy and surgery. Smoking status and alcoholism were included despite p > 0.20 because they are recognized risk factors for this histological subtype. This analysis did not demonstrate worse survival for males (HR = 1.70, CI 0.94–3.07, p = 0.07). The models are summarized in Fig. 3. Sex differences for SCC are summarized in Fig. 6.

Figure 6

Sex differences for squamous cell carcinoma (SCC). Higher in males (ratio 3:2) and statistical differences in age at diagnosis (p = 0.02) and alcoholism (p = 0.04), UH Seidman Cancer Center Database (2005–2020).

SEER

Using SEER data from 2005 to 2018 (Table 6 and Fig. 2), we analyzed a total of 55,771 patients for all types of esophageal cancer, with 77.9% (43,441) males and 22.1% (12,330) females, establishing a male:female ratio of about 3.5:1. In this cohort, 31,255 had a diagnosis of EAC and 17,540 of SCC, with a predominance of EAC in males (70.5%) and SCC in females (58.7%).

Table 6 Characteristics of esophageal cancer patients from SEER Database diagnosed between 2005–2018.

For all types of EC, sex differences were found for age at diagnosis (p < 0.001), race (p < 0.001), ethnicity (p < 0.001), histology (p < 0.001), stage (p < 0.001), chemotherapy (p < 0.001), radiotherapy (p < 0.001), surgery (p < 0.001) and median survival (p = 0.03), without differences for vital status (p = 0.80). For EAC, there were no differences for ethnicity (p = 0.09), stage (p = 0.20) and vital status (p = 0.06). For SCC, the only variable without a difference was radiotherapy (p = 0.25).

In the survival analysis, summarized in Fig. 7, the univariable models showed differences for all types, EAC and SCC. In the multivariable models, males had a higher risk of death for all types of EC and for SCC, but not for EAC (HR for males = 1.01, CI = 0.97–1.06, p = 0.35).

Figure 7

Forest plot of sex differences in survival for esophageal cancer (all types), esophageal adenocarcinoma (EAC) and squamous cell carcinoma (SCC). Univariable and multivariable Cox models are represented, SEER Database (2005–2018). *Adjusted for: age at diagnosis, race, ethnicity, histology, stage, chemotherapy, radiotherapy, and surgery. **Adjusted for: age at diagnosis, race, ethnicity, stage, chemotherapy, radiotherapy, and surgery. ***Adjusted for: age at diagnosis, race, histology, stage, chemotherapy, radiotherapy, and surgery.

Discussion

The primary objective of this work was to assess sex differences across a large spectrum of variables and the potential effects of these differences on survival for esophageal cancer and its two main histological subtypes (adenocarcinoma, EAC, and squamous cell carcinoma, SCC). We believe that our main contribution to the field is the solid, qualified and detailed information available in our institutional database, which, through the integration of disparate sources, enabled us to carry out a comprehensive analysis, adding variables and information that help to understand the epidemiology of sex differences in esophageal cancer.

This study showed that, like other cancers, esophageal cancer and its two main histological subtypes (EAC and SCC) occur more often in males than in females, in both our institutional database and SEER, corroborating literature reports of higher incidence in males38,35. The mechanisms behind these sex differences are not fully understood and seem to be multifactorial, mainly involving hormonal and genomic factors38,36. Our study also showed that, regarding risk factors, there are differences only in smoking status and only when analyzing all types of esophageal cancer (probably due to the inclusion of other/unknown histology diagnoses in this group), in line with findings that risk factors do not seem to be associated with the higher incidence in males37,38. Looking at demographic factors, we noted differences in age at diagnosis for all types of esophageal cancer and for SCC, with females tending to be diagnosed at older ages. These findings may support the hypothesis that estrogen inhibits esophageal carcinogenesis and is thus protective for females in the pre-menopausal stage38,39,40.

Regarding cancer characteristics (Table 2), the only difference seen is in the histological subtype. Besides confirming the trend of higher rates of EAC for both sexes, our institutional database also showed that females have approximately twofold higher rates of SCC than males and lower rates of EAC (43.5% of females have an SCC diagnosis, compared with only 20.8% of males), while SEER showed a predominance of SCC in females (58.7%). Since this subtype is diagnosed in females predominantly at post-menopausal ages, this could indicate that the protective effect of estrogen is stronger for the SCC subtype, corroborating some previous studies41.

Excluding the differences already mentioned in age at diagnosis, smoking status and histological subtype, the only other differences between males and females are in alcoholism (only for SCC) and NSAID prescription (only for all types of esophageal cancer). None of the other variables analyzed showed any sex differences. The differences in smoking status and alcoholism seem to be explained by population-level behavioral differences and, together with the other differences found, do not seem to have an impact on the outcomes. Regarding the outcomes, there is no statistical significance, but a tendency toward higher hazard ratios for males can be seen. The literature is conflicting about sex differences in survival: while some studies report worse outcomes in males, others report no differences, as in our study19,20,22,42,43,44. In addition, we observed changes in diagnosis over time (Fig. 2), with an increasing trend in esophageal cancer overall, an increasing number of EAC cases and a downward trend in the number of SCC cases, especially in male patients. These trends corroborate findings in the literature regarding an expected increase in the incidence of esophageal cancer, especially EAC, in the US and a decrease in SCC, probably associated with a reduction in alcohol and tobacco consumption17,45.

Interestingly, SEER data showed different patterns from our population. The differences between our database and SEER reflect differences in quality of care and in the populations treated within the US, for example the underrepresentation of Hispanic patients and the higher rates of treatment in our population. Our institution is located in the state of Ohio, which, according to the Ohio Department of Health, has an average of 781 new esophageal cancer cases per year, with an incidence rate of 5.2 per 100,000 (23% higher than the US rate) and an annual average of 625 cases in males vs 156 cases in females. All providers are required by law to report to the Ohio Cancer Incidence Surveillance System (OCISS) all cancers diagnosed and/or treated in the state.

This study has several limitations. Our institutional database is based on the specific population followed at the University Hospitals Seidman Cancer Center, so our sample is not representative of populations outside this context. Since this service also receives patients already diagnosed and being treated at other services, some of the information in the EMR may be incomplete. In addition, given the retrospective nature of this study, some valuable variables (such as the location of the cancer in the esophagus) are not available, and some variables have a high number of missing or unknown values (such as histological subtype and clinical staging), which can lead to a loss of statistical power. Also, the median income variable was derived from the patient’s zip code and thus could involve some misclassification. On the other hand, we analyzed a large number of variables with detailed information, adding new insights to what is already published in the field. Additional studies with other databases, larger cohorts and prospective designs are needed to corroborate and further investigate the findings reported here.

In summary, we found that males have a higher incidence of esophageal cancer and its two main subtypes (EAC and SCC), but none of the comprehensive set of variables analyzed was strongly or uniquely correlated with this sex difference in incidence, nor were they associated with a sex difference in survival.