Introduction

Countries with universal health coverage should, in theory, have consistently high standards of treatment with limited variations in quality across healthcare institutions1. To achieve this goal, quality assessments have been conducted to identify and minimize any extraneous heterogeneity in clinical practice2,3. Numerous studies have reported wide disparities in clinical care processes for cancer patients at the hospital level4,5,6,7,8, which may lead to variations in survival rates. We therefore hypothesized that survival outcomes after cancer diagnosis vary among hospitals. If this hypothesis is true, there may be considerable scope for improving the survivability of cancer patients9.

Comparisons of hospital performance based on unadjusted outcomes are problematic due to inherent differences in patient case mix. Prognostic factors such as patient age, comorbidities, and disease severity can influence overall survival in cancer populations10, and these factors should be controlled to enable robust comparisons that account for case mix differences11. Although cancer registries provide a wealth of information on tumor incidence, characteristics, and outcomes, their datasets are generally inadequate for outcome comparisons12. This is because cancer registry data frequently lack important clinical information required for risk adjustments. In contrast, administrative claims data contain detailed clinical information, but lack records on tumor characteristics and outcomes. The linkage of these two data types could therefore provide a more comprehensive dataset for analyses. In Japan, most acute care hospitals are reimbursed under the Diagnosis Procedure Combination (DPC) system, and the generated data are widely used for epidemiological research13. Cancer registry data and administrative data have been linked to allow for adjustments in case mix among Japanese cancer patients14. Previous studies have also shown that such linked data can be used to identify prognostic factors of cancer15,16,17,18,19,20.

The current literature on hospital variations in survival in the oncology setting is usually surgery-specific or focused on short-term mortality (e.g., 30-day postoperative mortality)21,22,23,24,25,26,27,28,29,30. Within the increasing culture of accountability in healthcare, short-term mortality will understandably remain one of the more important metrics for the surgical field. However, we consider it reasonable to include both surgical and non-surgical patients in analyses to appraise a hospital as a whole rather than separating patients according to their treatment modalities. This is because the decisions to perform resections are complex, and are based on the needs of each patient and the case selection of each surgeon. In addition, the mortality rates soon after surgery are low, making it unlikely that postoperative mortality would be sufficiently sensitive to detect significant differences among hospitals29,31,32. In contrast, survival metrics spanning a year or more at the hospital level are important for cancer patients and policymakers because survival duration is a major determinant of cancer care quality33.

Using a database that linked cancer registry data and administrative data, this study aimed to investigate between-hospital variations in 3-year survival rates for cancer patients in Japan irrespective of treatment modality, as well as to elucidate patient-level prognostic factors that affect survival.

Methods

Data sources

We performed a multicenter retrospective cohort study of 31 cancer care hospitals in Osaka Prefecture, Japan. These institutions were designated as cancer care hospitals by the national or prefectural government based on certain functional criteria for providing cancer treatment. The study database was produced through the linkage of hospital-based cancer registry data, prefectural cancer registry data, and administrative data. The data were obtained with support from the Council for Coordination of Designated Cancer Care Hospitals in Osaka15,16,17,18,19,20.

Hospital-based cancer registry data were collected from the cancer care hospitals. In addition to patient demographic information (age and sex), these data also included tumor information such as cancer diagnoses with topographical and morphological codes from the International Classification of Diseases for Oncology, Third Edition (ICD-O-3); cancer stage and tumor-node-metastasis classifications at diagnosis according to the Seventh Edition of the Union for International Cancer Control (UICC) staging system; and dates of cancer diagnoses. These hospital-based cancer registry data were linked to DPC administrative data for episodes of hospital care. The DPC data included inpatient and outpatient administrative claims for the provision of care, as well as the following discharge abstracts for inpatient episodes: dates of admission and discharge, patient characteristics (e.g., body height and weight), and the primary diagnosis and pre-existing comorbidities on admission according to International Classification of Diseases, Tenth Edition (ICD-10) codes13. Additionally, data from Osaka Cancer Registry, which collects and curates information on cancer incidence and outcomes in Osaka Prefecture residents, were also linked. These registry data contained the dates of cancer diagnoses and vital status information (verified through death certificates and official resident registries).

The three data sources were linked at the patient level and the completeness of record linkage was estimated to be 98%14. The 31 participating hospitals treated approximately 50% of all patients with newly diagnosed cancer in the study region. An anonymized dataset was extracted from the linked database for this study.

Selection criteria

The study population was selected using the hospital-based cancer registry dataset. The selection process is presented in Fig. 1. We first identified 30,440 patients (a) aged 18–99 years at cancer diagnosis; (b) who received treatment for gastric, colorectal, or lung cancer (designated the index cancer) at any of the 31 hospitals between January 1, 2013 and December 31, 2015; (c) whose registry data could be linked to administrative data from the same hospital; and (d) who were hospitalized for treatment of the index cancer within 3 months before or after the month of diagnosis. Treatment of the index cancer was defined as the first course of cancer-specific treatment, including best supportive care. We assigned each patient and his/her outcome to the hospital that provided this initial treatment even if he/she subsequently visited other hospitals for continued cancer care. We selected the three target cancer types due to their relatively high incidence within the study region, and identified them using ICD-O-3 topographical codes (C16.x for gastric cancer, C18.x–C20.x for colorectal cancer, and C33.x–C34.x for lung cancer). If an individual patient received multiple diagnoses of the same cancer type, we used the information from the earliest diagnosis. Patients were excluded if they had a diagnosis of sarcoma (ICD-O-3 morphological codes: 8800–8936, 8990–8991, 9020, 9040–9044, 9120–9133, 9150, 9170, and 9180–9251; n = 284), hematological tumor (9590–9989; n = 164), melanoma (8720–8790; n = 7), blastoma (8970–8974; n = 1), or carcinoma in situ (n = 2356); or if the follow-up of their vital status was censored within 3 years of cancer diagnosis (n = 64). We also excluded patients from hospitals with fewer than ten patients of each cancer type over the 3-year study period to ensure sufficient caseloads and to avoid extreme values. This led to the exclusion of 14 lung cancer patients from two hospitals. 
The final study population comprised 27,550 patients from 30 hospitals. One of the target hospitals had no eligible patients.
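
The code-based parts of the selection criteria can be illustrated with a minimal sketch. It assumes topography codes arrive as strings such as "C18.7" and morphology codes as four-digit integers; the function names and the data layout are hypothetical and are not taken from the study's own programs. Carcinoma in situ is omitted because the text does not state how it was identified.

```python
from typing import Optional

# ICD-O-3 morphology ranges listed as exclusions in the selection criteria.
EXCLUDED_MORPHOLOGY = {
    "sarcoma": [(8800, 8936), (8990, 8991), (9020, 9020), (9040, 9044),
                (9120, 9133), (9150, 9150), (9170, 9170), (9180, 9251)],
    "hematological tumor": [(9590, 9989)],
    "melanoma": [(8720, 8790)],
    "blastoma": [(8970, 8974)],
}


def index_cancer(topography: str) -> Optional[str]:
    """Map an ICD-O-3 topography code (e.g. 'C18.7') to a target cancer type."""
    site = topography.strip().upper()[:3]
    if site == "C16":
        return "gastric"
    if site in ("C18", "C19", "C20"):
        return "colorectal"
    if site in ("C33", "C34"):
        return "lung"
    return None  # not one of the three index cancers


def excluded_histology(morphology: int) -> Optional[str]:
    """Return the exclusion label for a morphology code, or None if eligible."""
    for label, ranges in EXCLUDED_MORPHOLOGY.items():
        if any(low <= morphology <= high for low, high in ranges):
            return label
    return None
```

Under these assumptions, a record would be retained when `index_cancer` returns a cancer type and `excluded_histology` returns `None`, before the follow-up and caseload exclusions are applied.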

Figure 1. Flow diagram of study population selection.

Potential prognostic factors

From the hospital-based cancer registry data, we obtained the following pretreatment demographic and tumor factors that can potentially influence outcomes: age (18–64, 65–69, 70–74, 75–79, and 80–99 years), sex, and UICC stage (I, II, III, IV, and unknown) at cancer diagnosis. The UICC pathological classification system was primarily used to determine cancer stage, but the clinical classification system was used for cases where surgical resections were not performed. For colorectal cancer, tumor localization was also included as a prognostic factor, and was grouped into right-sided colon cancer (ICD-O-3 topographical codes: C18.0–C18.5) and left-sided colorectal cancer (C18.6–C18.7, C19.9, C20.9) due to differences in tumor biology and prognoses between the anatomical subsites34. Patients with an ICD-O-3 topographical code of C18.8 or C18.9 (unspecified localization of colon cancer) in the hospital-based cancer registry data were manually classified to the appropriate subsite according to their primary diagnoses (ICD-10 codes) in the administrative data. For lung cancer, tumor histology was also included as a prognostic factor, and was classified into small cell lung cancer (ICD-O-3 morphological codes: 8041–8045) and non-small cell lung cancer (all other ICD-O-3 morphological codes)35.
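
The registry-derived groupings described above could be coded as in the following sketch. The function names and return labels are illustrative assumptions; codes C18.8 and C18.9 are only flagged here because the study resolved them manually from the administrative data.

```python
RIGHT_SIDED = {"C18.0", "C18.1", "C18.2", "C18.3", "C18.4", "C18.5"}
LEFT_SIDED = {"C18.6", "C18.7", "C19.9", "C20.9"}
UNSPECIFIED = {"C18.8", "C18.9"}


def age_group(age: int) -> str:
    """Assign the age band used as an explanatory variable."""
    if not 18 <= age <= 99:
        raise ValueError("age is outside the 18-99 range used for selection")
    if age <= 64:
        return "18-64"
    if age <= 69:
        return "65-69"
    if age <= 74:
        return "70-74"
    if age <= 79:
        return "75-79"
    return "80-99"


def colorectal_side(topography: str) -> str:
    """Group a colorectal ICD-O-3 topography code by anatomical subsite."""
    code = topography.strip().upper()
    if code in RIGHT_SIDED:
        return "right-sided colon"
    if code in LEFT_SIDED:
        return "left-sided colorectal"
    if code in UNSPECIFIED:
        return "needs manual review"  # resolved by hand in the study
    raise ValueError(f"not a colorectal topography code: {topography}")


def lung_histology(morphology: int) -> str:
    """Split lung cancer by ICD-O-3 morphology code."""
    return "small cell" if 8041 <= morphology <= 8045 else "non-small cell"
```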

For each patient identified in the hospital-based cancer registry data, we searched all inpatient episodes in the administrative data to identify the first cancer-specific hospitalization within 3 months before or after the month of cancer diagnosis. The following pretreatment demographic factors were derived from the administrative data: pre-existing comorbidities on admission, activities of daily living (ADL) on admission, type of admission (emergency or elective), smoking status (never smoker, current or ex-smoker, or unknown), and body mass index (BMI). The Quan adaptation of the Charlson Comorbidity Index (CCI) based on ICD-10 codes was used to measure patient comorbidities (excluding metastasis), and patients were grouped into three categories: no comorbidity (CCI score: 0), moderate comorbidities (1–2), and severe comorbidities (≥ 3)36,37. This procedure has been previously described in detail14. Higher CCI scores indicate a higher risk of death. The Barthel index was used to measure performance in ADL because of its association with cancer survival15,16. This index uses a scale of 0–100 with higher scores indicating better functional status, and patients were grouped into four categories: no disability (score: 100), moderate disability (60–95), severe disability (0–55), and unknown15. Emergency admission was defined as hospitalization due to critical conditions such as bleeding, perforation, shock, and organ failure; all other admissions were considered elective. Emergency admission was included as it has been reported to be a stage-independent poor prognostic factor38,39. Finally, BMI was calculated as weight (kg) divided by height squared (m²), and patients were grouped into five categories: underweight (BMI: < 18.5), normal weight (18.5–24.9), overweight (25.0–29.9), obese (≥ 30), and unknown16.
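
The category cut-offs for the administrative-data factors can be written out as a short sketch. The function names are hypothetical, and two details are assumptions rather than statements from the study: missing inputs are represented by `None`, and unrounded BMI values are compared against 18.5, 25, and 30.

```python
from typing import Optional


def cci_category(score: int) -> str:
    """Group a Charlson Comorbidity Index score (metastasis excluded)."""
    if score < 0:
        raise ValueError("CCI score cannot be negative")
    if score == 0:
        return "none"
    return "moderate" if score <= 2 else "severe"


def adl_category(barthel: Optional[int]) -> str:
    """Group a Barthel index score (0-100, higher is better)."""
    if barthel is None:
        return "unknown"
    if not 0 <= barthel <= 100:
        raise ValueError("Barthel index must lie between 0 and 100")
    if barthel == 100:
        return "no disability"
    return "moderate disability" if barthel >= 60 else "severe disability"


def bmi_category(weight_kg: Optional[float], height_m: Optional[float]) -> str:
    """Compute BMI = weight (kg) / height (m) squared and group it."""
    if weight_kg is None or height_m is None or height_m <= 0:
        return "unknown"
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"
```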

Statistical analysis

The study outcome was patient survival for 3 years or more. We defined overall survival duration as the period from the date of cancer diagnosis until the date of all-cause mortality. The 3-year survival rate was expressed as a proportion (percentage) of the total caseload. We constructed multilevel logistic regression models with hospital as a random effect to determine the variables associated with the probability of 3-year survival. The outcome variable had a binary value (survival or non-survival). The potential prognostic factors described above were all entered into the statistical models as explanatory variables. Missing data values (i.e., “unknown” categories) for cancer stage, ADL, smoking status, and BMI were imputed using five datasets from multiple imputation models that incorporated all explanatory and outcome variables40. We assumed that the missing data were missing at random. This assumption was plausible because of the wide range of variables included in the multiple imputation models. As a sensitivity analysis, the models used to investigate the associations of the potential prognostic factors with overall survival were built using a dataset restricted to cases with complete data (i.e., exclusion of patients with missing data values for cancer stage, ADL, smoking status, or BMI). The statistical analyses were separately performed for gastric, colorectal, and lung cancer patients using SAS 9.4 (SAS Institute, Cary, NC). Two-tailed P values below 0.05 were considered statistically significant.
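
The text does not state how estimates from the five imputed datasets were combined. A common choice is Rubin's rules, sketched below as an assumption rather than as the procedure used in this study (which was run in SAS). The function name and the input values in the usage note are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient (e.g. a log odds ratio) from each dataset.
    variances: the squared standard error of that coefficient in each dataset.
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 datasets")
    pooled = mean(estimates)          # average of the per-dataset estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

For example, `pool_rubin([0.1, 0.2, 0.3, 0.4, 0.5], [0.04] * 5)` combines five made-up log odds ratios; exponentiating the pooled value would give a pooled odds ratio.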

Funnel plots

Institutional variations in 3-year survival rates were examined using funnel plots, which are designed to detect variations through simple visual inspection41. Because cumulative caseloads during the 3-year study period differed among hospitals, and greater variations in healthcare quality metrics can arise by chance in hospitals with smaller caseloads, funnel plots use control limits that widen as caseload decreases. Individual hospitals were plotted according to the outcome measure (Y-axis) and caseload during the study period (X-axis).

Three statistical models were fitted to create a set of graphs for each cancer type in order to assess the performance of the risk adjustment process: one for unadjusted outcomes and two for risk-adjusted outcomes. The first risk adjustment was performed using multilevel logistic regression models that included explanatory variables extracted only from the hospital-based cancer registry data (i.e., age, sex, stage, tumor localization for colorectal cancer, and histology for lung cancer). The outcomes produced by these models were termed “partially adjusted survival rates”. The second risk adjustment was performed using models that also included explanatory variables extracted from the administrative data (i.e., CCI, ADL, type of admission, smoking status, and BMI) in addition to the explanatory variables from the cancer registry data. The outcomes produced by these models were termed “fully adjusted survival rates”. To obtain the adjusted survival rates for each hospital, the observed number (O) of patients who survived for 3 years or more was divided by the expected number (E) to calculate an O/E ratio that was multiplied by the average survival rate (i.e., the crude number of 3-year survivors divided by the crude number of selected patients) of all hospitals (shown as the horizontal line in each graph)11. Each hospital’s expected number of survivors was calculated by summing its patients’ predicted probabilities of surviving given their individual values in the explanatory variables, and averaging over the random effect. Poisson control limits for a given level of significance were also drawn around the horizontal line. We used two-sided significance levels of 0.05 and 0.002, and the resulting 95% and 99.8% control limits represented approximately 2 (inner) and 3 (outer) standard deviations (SDs), respectively, on either side of the horizontal line. 
Hospitals with survival rates below the lower control limit or above the upper control limit of 2 SDs were regarded as potential outliers23,24,25,26,27,28,29. Adjusted survival rates below the lower control limit of 3 SDs and 2 SDs were regarded as “alarm” and “alert” signals, respectively41.
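
The adjusted survival rate and the outlier signals can be sketched as follows. The O/E calculation follows the description above. The control limits, however, use a normal approximation to a binomial proportion as a simple stand-in for the Poisson limits used in the study, so the numbers would not reproduce the published funnel plots. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence, Tuple


def adjusted_rate(observed: int, predicted: Sequence[float],
                  overall_rate: float) -> float:
    """Adjusted rate = (O / E) x overall crude survival rate.

    predicted holds each patient's model-predicted probability of surviving
    3 years; their sum is the hospital's expected number of survivors (E).
    """
    expected = sum(predicted)
    if expected <= 0:
        raise ValueError("expected number of survivors must be positive")
    return observed / expected * overall_rate


def control_limits(overall_rate: float, caseload: int,
                   alpha: float) -> Tuple[float, float]:
    """Approximate two-sided limits around the overall rate for one caseload."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 0.05, 3.09 for 0.002
    half_width = z * sqrt(overall_rate * (1 - overall_rate) / caseload)
    return max(0.0, overall_rate - half_width), min(1.0, overall_rate + half_width)


def signal(rate: float, overall_rate: float, caseload: int) -> str:
    """Classify a hospital against the 95% and 99.8% limits."""
    low95, high95 = control_limits(overall_rate, caseload, 0.05)
    low998, _ = control_limits(overall_rate, caseload, 0.002)
    if rate < low998:
        return "alarm"
    if rate < low95:
        return "alert"
    if rate > high95:
        return "above upper limit"
    return "within limits"
```

Because the half-width shrinks with the square root of the caseload, the limits trace the funnel shape: a hospital with few patients needs a much more extreme rate to be flagged than a hospital with many.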

The distribution of cancer stage varied substantially among the hospitals. Preliminary analyses showed that the proportion of stage I cancer ranged from 44.4 to 74.4% in gastric cancer, 16.5 to 38.9% in colorectal cancer, and 0.0 to 48.3% in lung cancer. These variations can cause misleading comparisons of survival between hospitals because stage is a strong prognostic factor. Therefore, we conducted subgroup analyses for patients with stage I cancer. In these analyses, we excluded patients from hospitals with fewer than ten patients of each cancer type over the study period to ensure sufficient caseloads. This led to the exclusion of 17 stage I lung cancer patients from six hospitals.

Ethical approval

The study was conducted in accordance with the relevant guidelines and regulations of the ethical principles for medical research involving human subjects, as stated by the Declaration of Helsinki. Ethical approval was granted by the Institutional Review Board of Osaka International Cancer Institute (Approval number: 1707105108). The Institutional Review Board of Osaka International Cancer Institute waived the need for informed consent in accordance with the Japanese government’s Ethical Guidelines for Medical and Health Research Involving Human Subjects, which allow for the opt-out approach for the secondary use of existing data.

Results

The study population comprised 10,296 gastric cancer patients from 30 hospitals, 9276 colorectal cancer patients from 30 hospitals, and 7978 lung cancer patients from 28 hospitals (Fig. 1). Table 1 shows the distribution of pretreatment demographic and tumor characteristics. The unadjusted 3-year survival rates across the study hospitals were 70.2%, 75.2%, and 45.0% for gastric, colorectal, and lung cancer, respectively. Among the age groups, patients aged 18–64 years formed the highest proportion of gastric and colorectal cancer cases, whereas patients aged 70–74 years formed the highest proportion of lung cancer cases. A high proportion of gastric and colorectal cancers were diagnosed at UICC stage I, whereas lung cancer presented more frequently at stage IV. Regardless of cancer type, the majority of patients had no comorbidity or functional disability, and had first received treatment in an elective admission. Current or ex-smokers were more prevalent among the gastric and lung cancer patients than never smokers, but never smokers were more prevalent among the colorectal cancer patients. Approximately two thirds of all patients had a BMI of 18.5–24.9 (normal weight). Proportions of the “unknown” categories for cancer stage, ADL, smoking status, and BMI ranged from 0.1 to 2.9%.

Table 1. Pretreatment demographic and tumor characteristics of patients with gastric, colorectal, and lung cancer.

Factors associated with 3-year survival

Table 2 summarizes the results of the multilevel logistic regression analyses examining the associations of the potential prognostic factors with 3-year overall survival after multiple imputations. The odds of survival were significantly lower for older age, men, and more advanced stages for all three cancer types. The adjusted odds ratios for these three variables in colorectal cancer were closer to 1.00 than those in the other two cancer types. Localization of colorectal cancer was not significantly associated with survival. Small cell lung cancer was associated with significantly lower odds of survival than non-small cell lung cancer. Moderate and severe comorbidities were significantly associated with lower odds than no comorbidity for all cancer types, with the exception of moderate comorbidities in lung cancer. Functional disability in ADL was associated with significantly lower odds when compared with no disability. Emergency admission was also an independent risk factor for poorer survival. Current or ex-smokers with colorectal or lung cancer were significantly less likely to survive than never smokers. The odds of survival were lower for patients with a BMI of < 18.5 than for those with a BMI of 18.5–24.9 in all three cancer types, whereas the odds were higher for patients with a BMI of 25.0–29.9 in gastric and colorectal cancer. A BMI of ≥ 30 was not significantly associated with survival. In the sensitivity analysis, we repeated the multilevel logistic regression analyses using complete-case data (Supplementary Table S1). These results were similar to those of the main analyses using data from the multiple imputation models.

Table 2. Association of prognostic factors with 3-year overall survival in patients with gastric, colorectal, and lung cancer using data from multiple imputation models.

Between-hospital variations in 3-year survival

Funnel plots describing the unadjusted, partially adjusted, and fully adjusted 3-year survival rates of gastric cancer for 30 hospitals are presented in Fig. 2. The unadjusted survival rates ranged from 53.0 to 83.2% among the hospitals. The funnel plot highlighted five hospitals with unadjusted survival rates below the lower control limit or above the upper control limit of 2 SDs (Fig. 2A). The unadjusted survival rates for the remaining 25 hospitals fell between these lower and upper control limits. After risk adjustment, however, the between-hospital variations in survival rates had shrunk. The partially adjusted and fully adjusted survival rates for all 30 hospitals fell between the lower and upper control limits of 2 SDs, with variations shrinking further in the latter according to visual inspection (Fig. 2B,C).

Figure 2. Funnel plots of 3-year survival and the number of gastric cancer patients in each hospital. (A) Unadjusted survival; (B) Partially adjusted survival that controlled for age, sex, and cancer stage; and (C) Fully adjusted survival that controlled for comorbidities, activities of daily living, type of admission, smoking status, and body mass index in addition to the variables in the partially adjusted model. LCL, lower control limit; UCL, upper control limit.

Funnel plots describing the unadjusted, partially adjusted, and fully adjusted 3-year survival rates of colorectal cancer for 30 hospitals are presented in Fig. 3. The unadjusted survival rates ranged from 66.5 to 82.7% among the hospitals. The survival rates for all 30 hospitals lay between the lower and upper control limits of 2 SDs even before risk adjustment. However, the adjustments reduced the variations according to visual inspection.

Figure 3. Funnel plots of 3-year survival and the number of colorectal cancer patients in each hospital. (A) Unadjusted survival; (B) Partially adjusted survival that controlled for age, sex, cancer stage, and tumor localization; and (C) Fully adjusted survival that controlled for comorbidities, activities of daily living, type of admission, smoking status, and body mass index in addition to the variables in the partially adjusted model. LCL, lower control limit; UCL, upper control limit.

Funnel plots describing the unadjusted, partially adjusted, and fully adjusted 3-year survival rates of lung cancer for 28 hospitals are presented in Fig. 4. The unadjusted survival rates ranged from 5.3 to 61.4% among the hospitals. The funnel plot detected 13 hospitals with unadjusted survival rates below the lower control limit or above the upper control limit of 2 SDs (Fig. 4A). The unadjusted survival rates for the remaining 15 hospitals fell between these lower and upper control limits. In Fig. 4B, the partially adjusted survival rates for six hospitals remained below the lower control limit or above the upper control limit of 2 SDs. In Fig. 4C, the funnel plot identified two hospitals with fully adjusted survival rates lying below the lower control limit of 3 SDs (“alarm” line), two hospitals with fully adjusted survival rates lying below the lower control limit of 2 SDs (“alert” line), and two hospitals with fully adjusted survival rates lying above the upper control limit of 2 SDs. The fully adjusted survival rates for the remaining 22 hospitals fell between these lower and upper control limits, with variations shrinking further according to visual inspection when compared with the partially adjusted survival rates.

Figure 4. Funnel plots of 3-year survival and the number of lung cancer patients in each hospital. (A) Unadjusted survival; (B) Partially adjusted survival that controlled for age, sex, cancer stage, and histology; and (C) Fully adjusted survival that controlled for comorbidities, activities of daily living, type of admission, smoking status, and body mass index in addition to the variables in the partially adjusted model. LCL, lower control limit; UCL, upper control limit.

Subgroup analyses

The study population for the subgroup analyses of stage I cancer patients comprised 6400 gastric cancer patients, 2536 colorectal cancer patients, and 2548 lung cancer patients. As shown in Supplementary Table S2, restricting the study population to patients with stage I cancer did not markedly affect the results of the multivariable analyses on the associations of the potential prognostic factors with 3-year survival after multiple imputations. Funnel plots describing the 3-year survival rates of gastric cancer for 30 hospitals, colorectal cancer for 30 hospitals, and lung cancer for 22 hospitals are presented in Supplementary Figs. S1, S2, and S3, respectively. The survival rates of gastric cancer and colorectal cancer for all hospitals lay between the lower and upper control limits of 2 SDs before and after risk adjustment. The funnel plots for lung cancer indicated that although one hospital had an unadjusted rate below the lower control limit of 2 SDs, its partially and fully adjusted survival rates fell between the lower and upper control limits.

Discussion

Using a dataset that linked cancer registry data and administrative data, this study examined the variations in cancer survival rates among Japanese cancer care hospitals over the course of 3 years irrespective of treatment modality. We observed wide between-hospital variations in the unadjusted survival rates of gastric and lung cancer patients, which were reduced after adjusting for case mix. Interestingly, there were no substantial between-hospital variations in survival among colorectal cancer patients even before risk adjustment. A possible explanation for this finding may be that prognostic factors such as age, sex, and cancer stage were weaker in predicting the survival duration for colorectal cancer than for the other two cancer types, which in turn had little impact on the between-hospital differences. Another explanation may be that colorectal cancer patients have modest healthcare seeking behavior, which suggests that the equalization of healthcare services for this cancer type has been virtually attained42. It was notable that only a small proportion of patients had missing data (“unknown” categories) for the potential prognostic factors, and these were included in the risk adjustment models to provide a more valid investigation of institutional variations. A strength of our study was the examination of survival over several years. As postoperative death has become a relatively rare event, the appraisal of cancer care services requires the inclusion of long-term oncological outcomes29,31,32.

Through a visual comparison of the partially and fully adjusted survival rates in the funnel plots, we found that several demographic factors from the administrative data could partly explain the between-hospital variations. These results underscored the importance of record linkage for better risk adjustment because both administrative and registry data contribute unique prognostic information. In fact, numerous cancer-related studies on health services research have similarly used record-linked databases such as the SEER-Medicare Linked Database5,6,7,12,14,15,16,17,18,19,20,22,23,24,32.

The observed variations in survival rates that persisted even after adjusting for case mix may be explained by differences in provided care. While this study identified hospitals with outlying 3-year survival rates, the mechanisms underlying these variations remain unclear. One explanation is that hospitals with a higher survival rate are more adept at providing higher-quality care7,8. If this is true, an effective strategy for reducing variations in outcomes would be to learn best practice from the hospitals with the best outcomes and to spread its adoption2,3. Other potential factors may include staffing, surgeon proficiency, perioperative management, multidisciplinary teams, palliative care, and patient safety culture4. In addition, hospital caseload has been correlated with outcomes in cancer patients43. Investigating and addressing the causes of the observed variations could improve the care and outcomes of the cancer population.

Caution is needed when considering how institutional variations in outcomes should be applied to improve the quality of care. First, the definition of outlying status is complicated. To simplify this definition, our approach to identifying statistical outliers on the basis of control limits using 2 SDs and 3 SDs followed previously published studies23,24,25,26,27,28,29. Second, hospitals may not necessarily be excellent or substandard performers for all process or outcome metrics despite being high or low outliers for the survival outcome. Performance appraisal of service quality in cancer treatment is multifaceted and dependent on the viewpoints of the stakeholders concerned. Third, a “snapshot” of performance cannot be safely interpreted in isolation. It would be erroneous to form any firm conclusions based on a single data point where a hospital happens to be an outlier, and more prudent to identify persistent outliers in time series analyses of quality metrics. Fourth, the institutional variations can also be explained by data quality. Not all prognostic factors could be quantified in this study. Specifically, hospitals that treat more patients with severe diseases are more likely to have a higher premature mortality rate, and this disadvantage was not directly discernible from our data sources.

Nevertheless, the between-hospital variations in survival should not be ignored. These variations suggest that there may be considerable scope for quality improvement9. Efforts should be made to elucidate why outlier hospitals had significantly better or worse values than other hospitals. Our findings provide a basis for future in-depth research and clinical audits that aim to eliminate disparities and improve the quality of clinical practice. Ultimately, the identification of variations is only beneficial if it can contribute to quality improvement initiatives2,3.

The prognostic factors of overall survival identified in this study were generally concordant with those identified in previous reports14,15,16,35,36,37,38,39. The associations of comorbidities and functional disability with premature mortality could be explained by compromised cancer treatment plans and poorer health conditions14,15. Furthermore, the poorer survival associated with emergency admissions may be indicative of higher complication rates and biologically aggressive tumors in these patients38,39. The association between low BMI and poorer survival could be attributed to reduced dietary intake, sarcopenia, and cachexia16.

Limitations

This study has several limitations. First, and most importantly, the risk adjustment models lacked detailed oncological information such as genetic factors. As stated above, case mix factors that could not be accounted for in the dataset may influence outcomes. Nevertheless, risk adjustments were conducted for many of the important factors known to influence survival. Second, we only analyzed data from designated cancer care hospitals. For this reason, our study population may not be representative of the whole population in the study region, and could be vulnerable to selection bias. The inclusion of all hospitals in the region may produce even larger variations than those observed in our analysis.

Conclusions

We identified between-hospital variations in 3-year survival for patients with gastric, colorectal, and lung cancer diagnosed between 2013 and 2015. Benchmarking institutional oncological performance is complex and should not be generalized from a single outcome metric. However, the elucidation of these variations can support future studies that explore the underlying clinical mechanisms for quality improvement initiatives, which will contribute to the improvement of treatment quality and outcomes.