Identifying prognostic factors for clinical outcomes and costs in four high-volume surgical treatments using routinely collected hospital data


Identifying prognostic factors (PFs) is often costly and labor-intensive. Routinely collected hospital data provide opportunities to identify clinically relevant PFs and construct accurate prognostic models without additional data-collection costs. This multicenter study (66 hospitals) reports on the associations of various patient-level variables with outcomes and costs. Outcomes were in-hospital mortality, intensive care unit (ICU) admission, length of stay (LoS), 30-day readmission, 30-day reintervention and in-hospital costs. Candidate PFs were age, sex, Elixhauser Comorbidity Score, prior hospitalizations, prior days spent in hospital, and socio-economic status. Included patients were treated for colorectal carcinoma (CRC, n = 10,254) or urinary bladder carcinoma (UBC, n = 17,385), or underwent acute percutaneous coronary intervention (aPCI, n = 25,818) or total knee arthroplasty (TKA, n = 39,214). Prior hospitalization significantly increased readmission risk in all treatments (OR between 2.15 and 25.50), whereas prior days spent in hospital decreased this risk (OR between 0.55 and 0.95). In CRC patients, women had a lower risk of in-hospital mortality (OR 0.64), ICU admission (OR 0.68) and 30-day reintervention (OR 0.70). Prior hospitalization was the strongest PF for higher costs across all treatments (31–64% cost increase per hospitalization). Prognostic model performance (c-statistic) ranged from 0.67 to 0.92, with Brier scores below 0.08. R-squared ranged from 0.06 to 0.19 for LoS and from 0.19 to 0.38 for costs. Identified PFs should be considered as building blocks for treatment-specific prognostic models and as information for monitoring patients after surgery. Researchers and clinicians might benefit from better insight into the drivers of prognosis and costs.


Predicting the course of disease and outcome of treatment is crucial for both physicians and patients. Prognostic factor (PF) research is a fundamental first step1 in developing accurate prognostic models for that purpose. PFs are defined as measures that are available at the time of diagnosis, and that are associated with a subsequent clinical outcome2. PF research plays a crucial role in many areas that are relevant to clinical practice, including establishing treatment options, identifying targets for intervention, supporting shared decision-making, and providing more affordable methods for prognosis.

A recent review of PF studies has identified several limitations in PF research, including insufficient sample size, inappropriate analyses, and unclear reporting2. Furthermore, PF research often lacks standardized adjustment for comorbidity, even though such adjustment is likely to generate more accurate and generalizable results. Another limitation relates to data availability, the high costs associated with it, and the translation of those data into relevant information. In PF research, data are typically collected and processed in a labor-intensive manner that requires substantial resources. This is particularly true for biomarkers3, which are often unavailable or disproportionately costly to collect; in addition, organic materials often have limited longevity4. Using routinely collected hospital data for PF research might present cost-effective opportunities to contribute to knowledge about which patient factors influence outcomes and costs. In turn, identified PFs might be added to (existing) prognostic models to further improve individualized risk prediction.

The premise of this paper is that routinely collected data in hospital information systems may be of significant value in PF identification and the subsequent construction of prognostic models, which could in turn yield clinically relevant information. Hospital information systems mainly contain electronic health records (EHR) and billing/reimbursement data and are one of the fastest-growing data sources in health care. In addition, prior research has underlined the potential of these data for improving the value (i.e. the outcomes achieved at a given level of costs) of healthcare delivery5,6,7. More specifically, by providing insight into patients’ health status (e.g. survival), recovery process (e.g. complications) and sustainability of health (e.g. readmission), these data form a potentially useful source for reliable cost and outcome measurements, which could in turn be used in various methods for steering healthcare delivery on value8. Furthermore, these data typically allow for the retrieval of secondary diagnoses, which enables standardized comorbidity adjustment.

The primary objective of this study is to investigate to what extent it is possible to identify (common) PF associations with outcomes and costs across four high-volume surgical treatments, using routinely collected data from 66 Dutch hospitals. More specifically, we investigate possible associations of various candidate PFs with five clinical outcomes and in-hospital costs. The secondary objective is to evaluate the discriminative ability and predictive accuracy of prognostic models in which we combine the identified PFs.


Descriptive statistics

In total, 92,671 patients treated in 66 Dutch hospitals over a two-year period (2016–2017) were included (Fig. 1). Patients in this cohort received one of the four abovementioned surgical interventions: CRC (n = 10,254), UBC (n = 17,385), aPCI (n = 25,818), and TKA (n = 39,214). The mean age of the cohort was 68.1 years, and 44.7% of the patients were female (Table 1). Patients with CRC and UBC suffered from more severe comorbidity, translating into higher Elixhauser Comorbidity Scores (ECSs): 4.9 and 5.5 for UBC and CRC patients versus 1.2 and 1.1 for aPCI and TKA patients. In-hospital mortality was higher in aPCI patients (2.2%) than in patients with other conditions (0.1–1.4%). ICU admission rates were the lowest in TKA patients (0.8%) and the highest in CRC patients (10.4%). The median LoS after surgery was the highest in CRC patients (5 days) and the lowest in UBC patients (1 day). By contrast, readmission (7.6%) and reintervention (3.7%) rates were highest in UBC patients. CRC was the most expensive treatment with a median total cost of €11,707, followed by TKA (€9251), aPCI (€4984) and UBC (€4721) (Table 2).

Figure 1 Flowchart describing study population and treatments.

Table 1 Overview of study population and summary statistics of candidate PF variables, by surgical treatment.
Table 2 Summary statistics of outcomes and costs, by surgical treatment.

Main results

Across the four treatments and six outcomes (including costs), we identified numerous statistically significant PF associations (Tables 3, 4, 5, 6). Notable differences were also identified between individual treatments and cohort results (Appendix 1). Below, however, we will limit the presentation to the results expected to have the highest clinical relevance. Results are presented by outcome with reference to corresponding treatments. This section ends with our findings on prognostic model performance.

Table 3 Prognostic factors for outcomes and costs for colorectal carcinoma, where *p ≤ 0.05.
Table 4 Prognostic factors for outcomes and costs for urinary bladder carcinoma, where *p ≤ 0.05.
Table 5 Prognostic factors for outcomes and costs for acute coronary intervention, where *p ≤ 0.05.
Table 6 Prognostic factors for outcomes and costs for knee osteoarthritis, where *p ≤ 0.05.

In-hospital mortality

In CRC patients, age (OR 1.10), ECS (OR 1.05) and prior hospitalizations (OR 1.73) were significantly associated with a higher risk of in-hospital mortality. In addition, women had a significantly lower mortality risk than men (OR 0.64).

In UBC patients, age (OR 1.10), ECS (OR 1.06), female sex (OR 1.52) and prior days spent in hospital (OR 1.16) were significantly associated with in-hospital mortality risk.

In aPCI patients, age (OR 1.04) and ECS (OR 1.02) were found to be statistically significant PFs for higher in-hospital mortality. By contrast, prior days spent in hospital (OR 0.78) were significantly associated with a reduced risk of in-hospital mortality.

Finally, age (OR 1.09) and ECS (OR 1.16) were also found to be positively associated with this outcome for TKA patients.
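To help read the odds ratios reported above, the following minimal sketch shows how a per-unit OR compounds over a larger difference in a continuous PF. It assumes that the reported age OR refers to a one-year increase and that the effect is constant on the log-odds scale, as in a standard logistic model; the function name `scaled_odds_ratio` is illustrative and not taken from the study.

```python
import math


def scaled_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio implied for a `units`-sized increase in a predictor,
    assuming a constant per-unit OR (linear effect on the log-odds scale)."""
    if or_per_unit <= 0:
        raise ValueError("odds ratio must be positive")
    return math.exp(units * math.log(or_per_unit))


# Age OR of 1.10 as reported for CRC in-hospital mortality,
# assumed here (not stated in the text) to be per year of age.
ten_year_or = scaled_odds_ratio(1.10, 10)
print(f"Implied OR for a 10-year age difference: {ten_year_or:.2f}")
```

Under these assumptions, an OR that looks small per year corresponds to a considerably larger OR across a decade, which is worth keeping in mind when comparing the ORs of continuous PFs with those of binary PFs such as sex.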

ICU admission

For CRC patients, statistically significant associations with a higher risk of ICU admission were found for age (OR 1.03), ECS (OR 1.10), prior hospitalizations (OR 1.89) and low SES (OR 1.31, compared to high SES). By contrast, female sex was associated with a significantly lower risk (OR 0.68).

In UBC patients, ECS (OR 1.08), prior hospitalizations (OR 1.42) and prior days spent in hospital (OR 1.07) were positively related to the risk of ICU admission.

Both ECS (OR 1.08) and prior hospitalizations (OR 1.93) significantly increased ICU admission risk in patients undergoing aPCI.

ECS (OR 1.18), medium SES (OR 1.41) and low SES (OR 1.47) were found to be significantly associated with an increased risk of this outcome in TKA patients, whereas a negative association was found for female sex (OR 0.56).

30-day readmission

In CRC patients, prior hospitalizations (OR 25.50) were associated with an increased readmission risk, as were ECS (OR 1.04) and low SES (OR 1.40).

In UBC patients, age (OR 1.01), ECS (OR 1.02), and prior hospitalizations (OR 2.15) were identified as statistically significant PFs for increased readmission risk. Female sex (OR 0.73) and prior days spent in hospital (OR 0.95) were negatively associated with this outcome for this patient group.

Prior hospitalizations were strongly associated (OR 25.44) with increased readmission risk in aPCI patients, while we found the opposite for the variables female sex (OR 0.46) and prior days spent in hospital (OR 0.61).

In TKA patients, age (OR 1.02), ECS (OR 1.08), and prior hospitalizations (OR 23.40) were positively associated with the risk of this outcome. Again, we found an association with the opposite direction for female sex (OR 0.67) and prior days spent in hospital (OR 0.55).

30-day reintervention

In CRC patients, we found ECS (OR 1.07) and prior hospitalizations (OR 1.63) to be significantly associated with an increased reintervention risk, while a negative association was found again for female sex (OR 0.70).

Similar results were found for UBC patients, with a positive association for prior hospitalizations (OR 1.53) and a negative association for female sex (OR 0.74). In addition, for this patient group we also found a (weak) negative association between prior days spent in hospital and this outcome (OR 0.93).

Also, among aPCI patients, prior hospitalizations (OR 7.83) were associated with a higher reintervention risk.

Finally, only age (OR 1.02) was identified as a PF in TKA patients for this outcome.

Length of stay

For length of stay, a significant positive effect was found for prior hospitalizations among aPCI patients (b 0.25), CRC patients (b 0.27), UBC patients (b 0.28) and TKA patients (b 0.49).

In-hospital costs

Prior hospitalizations were most strongly associated with costs for aPCI patients, with an estimated average cost increase of 63% per additional prior hospitalization, all else equal. Positive associations were also found for patients who underwent CRC (31%), UBC (32%), or TKA (33%). Female sex was negatively associated with costs for CRC (− 2.7%) and UBC patients (− 8.0%). Finally, prior days spent in hospital were identified as a PF for higher costs, with the estimated effect ranging from 2.9% (CRC patients) to 7.5% (UBC patients) average costs increase per additional day in hospital prior to treatment.

Prognostic model performance

Subsequently, the discriminative ability, predictive accuracy and model fit statistics of the prognostic models were evaluated (Tables 7, 8).

Table 7 Model fit statistics and Brier score for dichotomous outcomes.
Table 8 Model fit statistics for continuous outcomes.

In CRC patients, c-statistic values were 0.84 (CI 0.84–0.89) for in-hospital mortality, 0.78 (CI 0.78–0.81) for ICU admittance, 0.85 (CI 0.84–0.88) for readmission, and 0.74 (CI 0.74–0.84) for reintervention, suggesting fair to good discriminative ability. The R-squared for LoS was 0.06 and 0.26 for costs.

In the UBC patients, the c-statistic also varied across models: 0.81 (CI 0.81–0.86) for in-hospital mortality, 0.79 (CI 0.79–0.83) for ICU admittance, 0.67 (CI 0.67–0.70) for readmission, and 0.71 (CI 0.71–0.75) for reintervention. The R-squared for LoS was 0.10 and 0.21 for costs.

In patients who underwent aPCI, c-statistics were 0.77 (CI 0.75–0.80) for in-hospital mortality, 0.68 (CI 0.67–0.70) for ICU admittance, 0.82 (CI 0.80–0.88) for readmission, and 0.67 (CI 0.66–0.72) for reintervention. The R-squared for LoS was 0.07 and 0.19 for costs.

In TKA patients, c-statistics were 0.92 (CI 0.92–0.97) for in-hospital mortality, 0.88 (CI 0.88–0.91) for ICU admittance, 0.71 (CI 0.71–0.74) for readmission, and 0.90 (CI 0.90–0.95) for reintervention. The R-squared for LoS was 0.19 and 0.38 for costs.

Finally, across the models for dichotomous outcomes, the Brier score was consistently below 0.08, suggesting good to excellent predictive accuracy.


Using data that are routinely available in hospital information systems, this study has generated clinically relevant knowledge on PFs for five outcomes as well as in-hospital costs in four high-volume surgical treatments. The PFs that influenced clinical outcomes most across all treatments were sex, comorbidity and prior hospitalizations. The latter PF was also most strongly predictive of costs. The constructed prognostic models achieved fair to excellent discriminative ability and had low Brier scores, underlining the potential of using routinely collected data for PF research. Although the proportion of variance in LoS that was explained by our models is limited, clinicians and policy makers might find the explained proportion of cost variance insightful because it highlights targets for cost-reduction strategies through interventions that reduce cost variation9,10.

Across the surgical interventions analyzed, we identified several common PFs for outcomes and costs. Given that these PFs were identified across four distinct treatments, similar associations may well exist for other (surgical) treatments too. Although originally validated as a prognostic tool for in-hospital mortality, the ECS might have wider applicability11. Apart from readmission risk in aPCI patients, the ECS was found to be a PF for increased risk of ICU admission and of 30-day readmission, as well as higher LoS. In addition, prior hospitalizations were identified as a strong PF for increased readmission risk across all treatments. This association was previously identified in a non-surgical setting12. In contrast, prior days spent in hospital was associated with lower readmission risk. Although longer LoS after surgery was associated with decreased readmission risk in other surgical treatments13,14, we did not encounter work that previously identified or described the association between (all-cause) prior days spent in hospital and decreased readmission risk.

Finally, prior hospitalizations were strongly and positively associated with costs across all treatments. Given this, and given the strong association between prior hospitalizations and readmission risk, increased spending on readmission prevention could result in a net cost saving for these treatments15.

Comparison of the results for the cohort to those for the underlying treatment subgroups suggests that PF research could benefit from differentiating between specific (surgical) treatments. To illustrate, there has been debate on whether age should always be considered when determining the risk of ICU admission16. We found age to be a PF for ICU admission risk in some treatments, but not all. A similar argument can be made for age in relation to readmission and reintervention risk. Moreover, we sometimes encountered markedly divergent results across outcomes in terms of statistical significance of PF associations when models were estimated on the cohort instead of separately for the four distinct treatments. In short, PFs should be identified for specific combinations of target condition and (surgical) intervention. Ideally, these models should include standardized comorbidity adjustment, which can be done using routinely collected hospital data, as we have shown.

To our knowledge, this is the first study that identified multiple PFs for five outcomes and costs across four different surgical treatments using routinely collected hospital data. Among the strengths of this study are its large sample size and its multicenter design. Due to its national character and the underlying automation of the routine data collection process, risks often associated with observational studies (selection and attrition bias) are unlikely to have meaningfully distorted our results. Identified PFs both represent new knowledge and confirm or contradict PFs identified in previous work (e.g. female sex was found to be associated with far lower readmission risk for aPCI treatments, in contrast to earlier research focusing on non-acute infarctions17). In addition, we believe that it should be possible to reproduce our approach of repurposing routinely collected data for PF research for many other (surgical) treatments. Future PF research following our approach might further expand clinical knowledge by focusing on different treatments, comparing treatment options, examining intercountry differences, and/or using existing registries more efficiently.

Some limitations intrinsic to the study design should also be mentioned. First, although our data allowed for the measurement and analysis of several clinically relevant candidate PFs, our results may have been influenced by unobserved confounding (e.g., clinical factors such as disease progression and complexity, and lifestyle-related factors like smoking). For example, while SES is known to be associated with smoking18 and might also play a role in obesity19, we were unable to adjust for this due to lack of data. Data on these factors are often of poor quality due to issues such as incomplete registration20,21. Second, the generalizability of our results might be influenced by contextual factors (e.g., treatment country, surgeon performance, hospital characteristics, surgical approach, and hospital/surgeon volume), underlining the importance of future studies in other countries and settings. Third, although it is highly unlikely, due to privacy regulations we cannot preclude the possibility that patients received additional treatment from a different hospital during their initial treatment, which may have resulted in an underestimation of adverse events. Another limitation is that although one-year follow-up often includes the entirety of hospital treatment, we have no record of longer-term outcomes or costs. Finally, although in-hospital mortality, ICU admission and 30-day readmission can be considered proxy outcomes for complications, it would be worth exploring in future work which factors are (also) prognostic for complications.

In conclusion, routinely collected hospital data are potentially useful for PF research. Researchers and clinicians should consider exploiting such data for that purpose. In attempting to identify clinically relevant PFs for a variety of outcomes, PF research should differentiate between distinct treatments. Patients and clinicians could benefit from our findings in various ways, mainly through inclusion of the identified PFs in condition-specific prognostic models and through use of the results for (automated) internal feedback on outcomes and costs. In turn, this might support shared decision-making and may help clinicians determine which patients to monitor more closely after surgery.


Study design, setting and participants

A retrospective multicenter cohort study was performed using prospectively and routinely collected data retrieved from the ‘Benchmark Database’ serviced by LOGEX, a Dutch healthcare data analytics company. The data contain patient-level information on diagnosis, care activities and discharges, complemented by several patient characteristics. These data are primarily generated and used for reimbursement purposes and are considered an accurate source for research into the quality and costs of healthcare5,22,23. Using this database, we extracted data on four treatments for which surgical intervention was performed within a two-year period (2016–2017): laparoscopic resection of colorectal carcinoma (CRC), transurethral resection of urinary bladder carcinoma (UBC), acute percutaneous coronary intervention (aPCI), and total knee arthroplasty for osteoarthritis (TKA). We hypothesized that the inclusion of a diverse set of treatments in terms of disease burden, complexity, and acuteness would allow us to examine potential overlap between the cohort and underlying treatment-specific subgroups. We therefore aimed to capture this medical diversity when selecting treatments: CRC (complex, relatively high disease burden), UBC (medium complexity, high disease burden), aPCI (acute intervention) and TKA (low complexity, low disease burden). Follow-up was possible up to one year after the date of surgery. No ethical approval was required because patient data in the database were already fully anonymized.

Outcomes and candidate prognostic factors

In selecting outcomes, we aimed to best capture all dimensions of treatment24. These dimensions can be divided into three tiers, each often representing different interests for patients. To summarize, tier 1 is achieved/retained health status, tier 2 indicates time to recovery and treatment disutility, and tier 3 indicates the sustainability of health or iatrogenic effects. Based on our data, this resulted in the inclusion of five outcomes in this study: in-hospital mortality (tier 1), intensive care unit (ICU) admission (tier 2), length of stay (post-surgery, tier 2), 30-day readmission (tier 3) and 30-day reintervention (tier 3).

In addition, we included in-hospital costs as an outcome, because of their clear relation to affordable and accessible healthcare8. All costs (i.e., surgical, diagnostic, clinic, and outpatient) incurred in the hospital with respect to the treatment undergone were included. Following the Dutch manual for costing studies, the total cost per treatment was defined as the sum, over all delivered care activities, of the number of times each activity was delivered multiplied by its unit price25.
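The cost definition above can be illustrated with a minimal sketch. The activity names, quantities and unit prices below are hypothetical placeholders rather than values from the study or from the Dutch costing manual, and the `CareActivity` structure is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CareActivity:
    code: str          # hypothetical activity identifier
    quantity: int      # number of times the activity was delivered
    unit_price: float  # price per delivery, in euros


def total_treatment_cost(activities: Iterable[CareActivity]) -> float:
    """Sum of quantity x unit price over all delivered care activities."""
    return sum(a.quantity * a.unit_price for a in activities)


# Made-up example: one operation, five ward days, three lab tests.
example = [
    CareActivity("surgery", 1, 3500.0),
    CareActivity("ward_day", 5, 450.0),
    CareActivity("lab_test", 3, 12.5),
]
print(f"Total cost: EUR {total_treatment_cost(example):.2f}")
```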

Based on previous PF research that identified patient factors as being (potentially) prognostic for our outcome variables7,26 and given data availability, we selected six candidate PFs. Patient age (in years), sex and socio-economic status (from highest (SES1) to lowest (SES3)), based on the average income of the neighborhood in which patients lived, were readily available in the data. The number of hospitalizations in the year prior to treatment (all-cause, so not necessarily related to the conditions in the period of our current study), the total days spent in hospital in the year prior to treatment (again regardless of cause), and the Elixhauser Comorbidity Score (ECS) were computed using patient-specific care activities, diagnoses and disease history. The ECS is a graded point system that takes into account the severity of comorbidity, instead of solely including a collection of binary (comorbidity yes/no) scores27. The ECS was derived as a unique score for each included patient by attributing the corresponding Elixhauser Comorbidity Index Score to all known comorbidities that patients had at the time of treatment.
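As an illustration of how the two utilization-based candidate PFs could be derived from admission records, consider the sketch below. The exact counting rules are not described in the text, so several choices here are assumptions: a look-back window of 365 days, counting a stay when its admission date falls inside that window, and measuring days in hospital as the number of nights between admission and discharge. The function name and input layout are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple


def prior_year_utilization(
    admissions: Iterable[Tuple[date, date]],
    surgery_date: date,
    window_days: int = 365,
) -> Tuple[int, int]:
    """Return (number of prior hospitalizations, prior days in hospital).

    `admissions` holds (admission_date, discharge_date) pairs for any cause.
    A stay counts when it started within `window_days` before surgery
    (assumed rule); days are counted as nights spent in hospital.
    """
    window_start = surgery_date - timedelta(days=window_days)
    n_stays, n_days = 0, 0
    for admitted, discharged in admissions:
        if window_start <= admitted < surgery_date:
            n_stays += 1
            # Do not count days on or after the surgery date itself.
            end = min(discharged, surgery_date)
            n_days += max((end - admitted).days, 0)
    return n_stays, n_days


history = [
    (date(2016, 3, 1), date(2016, 3, 4)),      # inside the window, 3 nights
    (date(2016, 11, 10), date(2016, 11, 12)),  # inside the window, 2 nights
    (date(2015, 1, 5), date(2015, 1, 9)),      # too early, ignored
]
print(prior_year_utilization(history, surgery_date=date(2017, 1, 15)))
```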

Statistical analysis

Multivariable random-effects logistic and linear regression analyses were used to examine the association between our candidate PFs and the six outcomes (including costs). Specifically, separate regression models were developed for each combination of treatment and outcome (e.g., readmission for TKA patients), as well as separate models per outcome for the cohort. The estimated association for a candidate PF was adjusted for the effect of all other (candidate prognostic) factors because these could confound the association for the factor in question. For dichotomous outcomes, Firth logistic regression was used when the number of events was very low (e.g. in-hospital mortality among TKA patients)28. Because between-hospital variation in outcomes may influence study results when based on data from all hospitals pooled together29, we included hospital random effects in all models. The costs variable was log-transformed prior to estimation. Therefore, the estimated coefficients from the models for this variable can be interpreted as the approximate percentage change in costs following a 1-unit increase in the relevant PF (for a coefficient b, the exact change is 100 × (exp(b) − 1)%). Statistical significance was assessed using a significance level of 5%.
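The interpretation of coefficients from a model with a log-transformed outcome can be made concrete with a short sketch. It compares the common approximation (100 × b) with the exact conversion (100 × (exp(b) − 1)) for a few hypothetical coefficient values; none of these values are taken from the study's tables, and the function name is illustrative.

```python
import math


def percent_change_from_log_coef(b: float) -> float:
    """Exact percentage change in the outcome per 1-unit predictor increase
    when the outcome was log-transformed before linear regression."""
    return 100.0 * (math.exp(b) - 1.0)


# Hypothetical coefficients: the approximation 100*b drifts from the
# exact value as the coefficient grows.
for b in (0.05, 0.27, 0.50):
    approx = 100.0 * b
    exact = percent_change_from_log_coef(b)
    print(f"b = {b:.2f}: approximate {approx:.1f}%, exact {exact:.1f}%")
```

For small coefficients the two readings nearly coincide, while for larger coefficients the exact conversion gives a noticeably larger percentage.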

Prognostic models were constructed using tenfold cross-validation. The discriminative ability was evaluated using the concordance (c) statistic for dichotomous outcomes. Corresponding confidence intervals were calculated using bootstrapping. C-statistic values were interpreted as fair (0.7–0.8), good (0.8–0.9) or excellent (≥ 0.9)30. The models’ predictive accuracy was evaluated using the Brier score (ranging from 0 = perfect to 0.25 = non-informative) for dichotomous outcomes31 and R-squared (proportion of explained variance) for continuous outcomes (i.e. LoS and costs). All analyses were conducted in R, version 3.6.3.
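For readers unfamiliar with the two performance measures for dichotomous outcomes, the following minimal sketch computes them from observed outcomes and predicted risks. It is written in plain Python for illustration and is not the R code used in the study; the function names and the toy data are assumptions. For context, it also computes the Brier score of a reference model that assigns every patient the observed event rate, which equals rate × (1 − rate).

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Share of (event, non-event) pairs in which the event case received
    the higher predicted risk; tied predictions count as one half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def baseline_brier(y_true: Sequence[int]) -> float:
    """Brier score obtained by predicting the observed event rate for everyone."""
    rate = sum(y_true) / len(y_true)
    return rate * (1.0 - rate)


# Toy data (made up): eight patients, two of whom had the event.
y = [0, 0, 0, 1, 0, 1, 0, 0]
p = [0.05, 0.10, 0.20, 0.60, 0.15, 0.30, 0.40, 0.05]
print(f"c-statistic: {c_statistic(y, p):.3f}")
print(f"Brier score: {brier_score(y, p):.3f}")
print(f"Baseline Brier score: {baseline_brier(y):.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for a small illustration; a rank-based formulation would be preferable for cohorts of the size analyzed here.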

Ethical statements

There were no experiments involved in this study and therefore approval of experimental protocols did not apply. An anonymous database was built from existing reimbursement data that was accumulated by hospitals under the Dutch Healthcare Law (Nederlandse Gezondheidswet). Since this study was based on legally obtained existing and anonymously processed data, no additional informed consent was required because there was no additional data collection. All methods were carried out in full accordance with privacy regulations and guidelines.

Data availability

This study brought together existing data obtained upon request and subject to license restrictions from several different sources. The database is not publicly available due to the (commercially, politically, ethically) sensitive nature of the data. No source consented to their data being retained or shared. Permission was acquired from a third party for use of the data in this study and following publication of this paper.


  1. Steyerberg, E. W. et al. Prognosis research strategy (PROGRESS) 3: Prognostic model research. PLoS Med. (2013).

  2. Riley, R. D. et al. Prognosis research strategy (PROGRESS) 2: Prognostic factor research. PLoS Med. 10, e1001380 (2013).

  3. Amur, S. Biomarker Qualification Program Educational Module Series-Module 1 Biomarker Terminology: Speaking The Same Language.

  4. Mayeux, R. Biomarkers: Potential uses and limitations. Neurotherapeutics (2004).

  5. Eindhoven, D. C. et al. Nationwide claims data validated for quality assessments in acute myocardial infarction in the Netherlands. Netherlands Hear. J. (2017).

  6. Hekkert, K. et al. How to identify potentially preventable readmissions by classifying them using a national administrative database. Int. J. Qual. Heal. Care 29, 826–832 (2017).

  7. Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. npj Digit. Med. (2018).

  8. Porter, M. E. What is value in health care?. N. Engl. J. Med. 363, 2477–2481 (2010).

  9. Wakeam, E. et al. Variation in the cost of 5 common operations in the United States. Surg. (United States) 162, 592–604 (2017).

  10. Gutacker, N., Bloor, K., Bojke, C. & Walshe, K. Should interventions to reduce variation in care quality target doctors or hospitals?. Health Policy (2018).

  11. Potts, J. et al. The influence of Elixhauser comorbidity index on percutaneous coronary intervention outcomes. Catheter. Cardiovasc. Interv. 94, 195–203 (2019).

  12. McLaren, D. P. et al. Prior hospital admission predicts thirty-day hospital readmission for heart failure patients. Cardiol. J. 23, 155–162 (2016).

  13. Ansari, S. F., Yan, H., Zou, J., Worth, R. M. & Barbaro, N. M. Hospital length of stay and readmission rate for neurosurgical patients. Neurosurgery 82, 173–179 (2018).

  14. Freeman, R. K., Dilts, J. R., Ascioti, A. J., Dake, M. & Mahidhara, R. S. A comparison of length of stay, readmission rate, and facility reimbursement after lobectomy of the lung. Ann. Thorac. Surg. 96, 1740–1746 (2013).

  15. Nuckols, T. K. et al. Economic evaluation of quality improvement interventions designed to prevent hospital readmission: A systematic review and meta-analysis. JAMA Intern. Med. 177, 975–985 (2017).

  16. Daganou, M., Kyriakoudi, A. & Koutsoukou, A. Should age be a criterion for intensive care unit admission in cancer patients?-Still an issue of uncertainty. J. Thorac. Dis. (2017).

  17. Kwok, C. S. et al. Effect of gender on unplanned readmissions after percutaneous coronary intervention (from the Nationwide Readmissions Database). Am. J. Cardiol. 121, 810–817 (2018).

  18. Hiscock, R., Bauld, L., Amos, A., Fidler, J. A. & Munafò, M. Socioeconomic status and smoking: A review. Ann. N. Y. Acad. Sci. 1248, 107–123 (2012).

  19. Basto-Abreu, A. et al. The relationship of socioeconomic status with body mass index depends on the socioeconomic measure used. Obesity 26, 176–184 (2018).

  20. Polubriaginof, F., Salmasian, H., Albert, D. A. & Vawdrey, D. K. Challenges with collecting smoking status in electronic health records. AMIA Annu. Symp. Proc. 2017, 1392–1400 (2018).

  21. Razzaghi, H. et al. Impact of missing data for body mass index in an epidemiologic study. Matern. Child Health J. (2016).

  22. Salet, N. et al. Is Textbook Outcome a valuable composite measure for short-term outcomes of gastrointestinal treatments in the Netherlands using hospital information system data? A retrospective cohort study. BMJ Open 8, e019405 (2018).

  23. Vester, M. P. M. et al. Utilization of diagnostic resources and costs in patients with suspected cardiac chest pain. Eur. Heart J. Qual. Care Clin. Outcomes (2020).

  24. Porter, M. E. Measuring health outcomes: The outcomes hierarchy. N. Engl. J. Med. 363, 2477–2481 (2010).

  25. Kanters, T. A., Bouwmans, C. A. M., Van Der Linden, N., Tan, S. S. & Hakkaart-van Roijen, L. Update of the Dutch manual for costing studies in health care. PLoS ONE (2017).

  26. Kansagara, D. et al. Risk prediction models for hospital readmission: A systematic review. JAMA (2011).

  27. van Walraven, C., Austin, P. C., Jennings, A., Quan, H. & Forster, A. J. A Modification of the elixhauser comorbidity measures into a point system for hospital death using administrative data. Med. Care 47, 626–633 (2009).

  28. Wang, X. Firth logistic regression for rare variant association tests. Front. Genet. (2014).

  29. Hemingway, H. et al. Prognosis research strategy (PROGRESS) 1: A framework for researching clinical outcomes. BMJ 346, e5595 (2013).

  30. Hosmer, D. W. & Lemeshow, S. Applied Logistic Regression 2nd edn. (Wiley, 2000).

  31. Steyerberg, E. W. et al. Assessing the performance of prediction models: A framework for traditional and novel measures. Epidemiology 21, 128–138 (2010).



We gratefully acknowledge the comments by the participants of the Health Systems and Insurance seminar (June 2020) and the participants of the conference of the Erasmus Initiative ‘Smarter Choices for Better Health’ (November 2019).

Author information

N.S. and V.S. designed the study, drafted the manuscript, and had a leading role in all other aspects of the study. F.E., R.B. and A.H. contributed to shaping the analysis. R.B., M.S., E.S., F.E. and A.H. performed critical revision of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to N. Salet.

Ethics declarations

Competing interests

The authors declare no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

Cite this article

Salet, N., Stangenberger, V.A., Eijkenaar, F. et al. Identifying prognostic factors for clinical outcomes and costs in four high-volume surgical treatments using routinely collected hospital data. Sci Rep 12, 5902 (2022).
