Racial disparities in COVID-19 outcomes exist despite comparable Elixhauser comorbidity indices between Blacks, Hispanics, Native Americans, and Whites

Factors contributing to racial inequities in outcomes from coronavirus disease 2019 (COVID-19) remain poorly understood. We compared by race the risk of 4 COVID-19 health outcomes (maximum length of hospital stay [LOS], invasive ventilation, hospitalization exceeding 24 h, and death), stratified by Elixhauser comorbidity index (ECI) ranking. Outcomes and ECI scores were constructed from retrospective data obtained from the Cerner COVID-19 De-Identified Data cohort. We hypothesized that racial disparities in COVID-19 outcomes would exist despite comparable ECI scores among non-Hispanic (NH) Blacks, Hispanics, American Indians/Alaska Natives (AI/ANs), and NH Whites. Compared with NH Whites, NH Blacks had longer hospital LOS, higher rates of ventilator dependence, and a higher mortality rate; AI/ANs had higher odds of hospitalization for ECI = 0 but lower for ECI ≥ 5, longer LOS for ECI = 0, a higher risk of death across all ECI categories except ECI ≥ 5, and higher odds of ventilator dependence; and Hispanics had a lower risk of death across all ECI categories except ECI = 0, lower odds of hospitalization, shorter LOS for ECI ≥ 5, and higher odds of ventilator dependence for ECI = 0 but lower for ECI = 1–4. Our findings contest arguments that higher comorbidity levels explain elevated COVID-19 death rates among NH Blacks and AI/ANs compared with Hispanics and NH Whites.


Methods
Settings. We used data from the Cerner COVID-19 De-Identified Data cohort, a subset of the Cerner Real-World Data cohort. Data in Cerner Real-World Data is extracted from the electronic health records (EHRs) of hospitals with which Cerner has a data use agreement and may include pharmacy, clinical and microbiology laboratory, and admission data, as well as billing information from affiliated patient-care locations. All admissions, medication and dispensing orders, laboratory orders and specimens are date and time stamped, providing a temporal relationship between treatment patterns and clinical information. Cerner Corporation has established Health Insurance Portability and Accountability Act (HIPAA)-compliant operating policies to establish de-identification for Cerner Real-World Data 33, 34 . EHR data are cleaned, standardized, and person-matched before being completely de-identified per HIPAA standards. Records of patients identified as having an encounter associated with a diagnosis of or a recent (up to 2 weeks prior) positive lab test for COVID-19 between January and June 2020 were included in the COVID-19 data set. To assess possible disease histories, all encounters and additional medical information for this patient cohort are collected, extending as far back as January 1, 2015, where available. A total of 62 health systems across the US contributed records to this data set.
The University of Utah Institutional Review Board (IRB #136696) determined that this study did not meet the definition of human subjects research according to federal regulations because (1) the investigators used secondary data and did not collect data through intervention or interaction with an individual, and (2) no personally identifiable information was captured in the data. The IRB also determined that the study did not meet the US Food and Drug Administration's (FDA's) definition of human subjects research because it did not involve a drug, device, or any other FDA-regulated product. Thus, the IRB waived the requirements for ethical approval and informed consent for this study.

Measurements. The outcomes of interest involved 4 indications of clinical complications in patients with COVID-19: hospitalization, maximum hospital LOS, invasive ventilator dependence, and death. These indications were constructed from EHR data to reflect a unique risk profile per patient. Additionally, every outcome had to involve a COVID-19 diagnosis or laboratory indication.
We measured maximum LOS by calculating the difference in days between the start and end dates of each patient encounter and taking the maximum difference per patient. Hospitalization was a binary indicator of whether a patient ever had an LOS of 1 day or more. Invasive ventilator dependence was a binary indicator of whether a patient ever had a diagnosis, procedure, encounter, result, or indication signifying reliance on an invasive ventilator.  Table 1. These codes were kept separate from indications of less-severe ventilator dependence. Death was a binary indicator of whether a patient died at discharge or any time thereafter until the time of data collection. For additional analyses, in-hospital death was obtained and restricted to death at discharge (excluding any later deaths occurring outside of the hospital).
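The LOS and hospitalization definitions above can be illustrated with a minimal Python sketch. The encounter records, patient identifiers, and function names are hypothetical and are not taken from the Cerner data model; the sketch only shows the per-patient maximum and the 1-day threshold described in the text.

```python
from datetime import date

# Hypothetical encounter records: (patient_id, start_date, end_date).
encounters = [
    ("p1", date(2020, 3, 1), date(2020, 3, 1)),    # same-day visit, LOS 0
    ("p1", date(2020, 4, 10), date(2020, 4, 15)),  # 5-day stay
    ("p2", date(2020, 5, 2), date(2020, 5, 2)),    # same-day visit only
]


def max_los_by_patient(rows):
    """Return {patient_id: longest stay in days across that patient's encounters}."""
    longest = {}
    for patient_id, start, end in rows:
        los_days = (end - start).days
        longest[patient_id] = max(longest.get(patient_id, 0), los_days)
    return longest


def hospitalized(max_los):
    """Binary indicator: 1 if the patient ever had an LOS of 1 day or more."""
    return {pid: int(days >= 1) for pid, days in max_los.items()}


max_los = max_los_by_patient(encounters)
print(max_los)                # {'p1': 5, 'p2': 0}
print(hospitalized(max_los))  # {'p1': 1, 'p2': 0}
```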
The predictors of interest were race (AI/AN, Asian/Pacific Islander [API], NH Black/African American, White, other/unknown race); ethnicity (Hispanic or Latino); and a comorbidity score derived from the ECI. Like the Charlson comorbidity index (CCI) 18 , the ECI measures patient comorbidity by calculating a risk-assessment score based on ICD-10 diagnosis codes. However, the ECI considers more chronic disease indications (with some more relevant to COVID-19 complications) than does the CCI (31 vs. 17) 35 . The ECI is weighted using the Agency for Healthcare Research and Quality (AHRQ) methodology 36 , and scores are grouped into categories of less than 0, 0, 1-4, and 5 or higher 24 . A list of the ICD-10 codes is found in Supplemental Table 2 37 . Other demographic characteristics included for analysis were sex, insurance status, and 1-digit zip-code region (categorical variables) and age in years (a continuous variable).
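The grouping of weighted ECI scores into the 4 categories named above can be sketched as follows. The function name and the category labels are illustrative choices, and the example scores are hypothetical; computing the AHRQ-weighted score itself from ICD-10 codes is not shown.

```python
def eci_group(score):
    """Map an AHRQ-weighted ECI score to one of the 4 analysis categories."""
    if score < 0:
        return "<0"
    if score == 0:
        return "0"
    if score <= 4:
        return "1-4"
    return ">=5"


# Hypothetical weighted scores for 5 patients.
scores = [-3, 0, 2, 4, 11]
print([eci_group(s) for s in scores])  # ['<0', '0', '1-4', '1-4', '>=5']
```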

Statistical analysis.
Overall demographic characteristics were presented for patients in the COVID-19 cohort. Categorical variables were expressed by frequencies and percentages. Because continuous variables were not normally distributed, they were expressed as medians and interquartile ranges (IQRs). These characteristics were also stratified by ECI group to assess significant demographic differences across comorbidity groups. Categorical variables were compared using a chi-square test and nonparametric continuous variables by a Kruskal-Wallis rank sum test. Each outcome was presented across the demographic and clinical characteristics of interest: gender, race/ethnicity, insurance status, and ECI group. Medians (IQRs) were presented for maximum LOS and frequencies (percentages) for hospitalization, invasive ventilator dependence, and death.
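As an illustration of the descriptive summaries described above, the sketch below computes a median with IQR and the Pearson chi-square statistic for a contingency table using only the Python standard library. The ages and counts are hypothetical, the quartile interpolation method is an assumption (the text does not state which was used), and no p-value is computed; the study itself performed these steps in R.

```python
from statistics import quantiles


def median_iqr(values):
    """Return (median, Q1, Q3) using linear interpolation between data points."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return med, q1, q3


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical ages and a hypothetical 2 x 2 table (rows: sex, columns: ECI group).
print(median_iqr([30, 40, 50, 60, 70]))  # (50.0, 40.0, 60.0)
stat, dof = chi_square_statistic([[10, 20], [20, 10]])
print(round(stat, 3), dof)               # 6.667 1
```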
To determine the adjusted associations of race/ethnicity and comorbidity with outcomes, multilevel logistic regression models were fit for hospitalization, invasive ventilator dependence, and death. Because LOS followed a continuous, exponential distribution, an exponential regression model was fit for maximum LOS. Adjusted odds ratios with 95% confidence intervals (CIs) were reported for the logistic model predictors. Adjusted exponentiated coefficients relating to the percentage change in expected maximum LOS with 95% CIs were reported for the exponential model predictors. All models were fit with race/ethnicity and ECI score and adjusted for age, sex, and insurance status. Additionally, models included a random effect of 1-digit zip-code to account for clustering of results in similar regions. The predictive ability of the models was assessed for both logistic and exponential models. For logistic regression models, the area under the receiver operating characteristic curve (AUC) was calculated to assess the models' ability to correctly classify outcome categories. For the exponential model, the coefficient of determination (R²) was calculated to estimate the percentage of variation in LOS explained by the model predictors.
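The reporting step for model coefficients can be sketched as follows: a coefficient estimated on the log scale is exponentiated to give an odds ratio (logistic model) or a ratio of expected LOS (exponential model), with a Wald-type 95% CI. The function names and the standard error below are hypothetical, and the Wald interval is an assumption, since the text does not state how the CIs were computed. The ratio of 1.16 mirrors the "male" example in the Table 3 footnote.

```python
import math
from statistics import NormalDist


def exponentiated_ci(beta, se, level=0.95):
    """Exponentiate a log-scale coefficient and its Wald confidence limits."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def percent_change(ratio):
    """Convert a ratio of expected values into a percentage change."""
    return (ratio - 1) * 100


# Hypothetical coefficient: log(1.16) with an assumed standard error of 0.02.
ratio, lower, upper = exponentiated_ci(math.log(1.16), 0.02)
print(round(ratio, 2), round(percent_change(ratio), 1))  # 1.16 16.0
```

A ratio above 1 with a lower confidence limit above 1 corresponds to a statistically significant increase at the 5% level used in this study.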
To assess the adjusted impact of race/ethnicity and comorbidity on the hazard of death, a Cox proportional hazards regression model was fit and adjusted for all variables included in the previous models. The outcome involved both time (from hospital admission to hospital discharge) and indication of in-hospital death (dead or alive at discharge). Adjusted hazard ratios (aHRs) and 95% CIs were reported. For all models, diagnostics were performed to ensure optimal model fit.
To further assess differences across comorbidities, sub-analyses were performed by stratifying the cohort by ECI groups (less than 0, 0, 1-4, 5 or higher) and running the same models within each group. Additionally, scatterplot figures were constructed to show the impact of race/ethnicity and comorbidity on the predicted outcomes of clinical complications. Each figure showed the predicted outcome against the ECI score. Smoothed lines were fit amongst the data by generalized additive regression models with shrinkage cubic-regression splines. This was done by fitting different lines for the different racial/ethnic groups. All hypothesis tests were 2-sided with a significance level of 5%. R version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria) was used for all analyses. In addition, R package "comorbidity" (version 0.5.3) was used to calculate comorbidity scores.
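The idea of stratifying by ECI group and comparing each racial/ethnic group with a reference group inside each stratum can be sketched as below. This is a simplified illustration only: it computes crude (unadjusted) odds ratios from hypothetical records, whereas the analyses described above fit adjusted mixed-effect models within each stratum. All names and data are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records: (eci_group, race_ethnicity, died).
records = (
    [("0", "NH Black", True)] * 2 + [("0", "NH Black", False)] * 2
    + [("0", "NH White", True)] * 1 + [("0", "NH White", False)] * 3
    + [(">=5", "NH Black", True)] * 1 + [(">=5", "NH Black", False)] * 1
    + [(">=5", "NH White", True)] * 1 + [(">=5", "NH White", False)] * 2
)


def stratified_odds_ratios(rows, reference="NH White"):
    """Crude odds ratio of the event vs. the reference group within each stratum."""
    counts = defaultdict(lambda: [0, 0])  # (stratum, group) -> [events, non-events]
    for stratum, group, event in rows:
        counts[(stratum, group)][0 if event else 1] += 1
    result = {}
    for (stratum, group), (a, b) in counts.items():
        if group == reference:
            continue
        c, d = counts.get((stratum, reference), (0, 0))
        if 0 in (a, b, c, d):
            continue  # odds ratio is undefined when any cell is empty
        result[(stratum, group)] = (a * d) / (b * c)
    return result


print(stratified_odds_ratios(records))
# {('0', 'NH Black'): 3.0, ('>=5', 'NH Black'): 2.0}
```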
Sample size calculation. Using 80% power, the stratified race/ethnicity distribution by Elixhauser AHRQ-weighted comorbidity group (Table 1), and the risk of COVID-19 complications by race/ethnicity (Table 2), we needed a sample size of at least 3,591 subjects for each ECI category, assuming the most stringent comparison between AI/AN and NH Whites, to achieve a small effect size 38 of OR = 1.68 in a 2-sided examination. This sample size was attainable in our study given that we had a total of 52,411 subjects (8976; 16,177; 4220; and 23,038 for ECI groups less than 0, 0, 1-4, and 5 or higher, respectively), as shown in the data flow chart (Fig. 1).
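For readers unfamiliar with this kind of calculation, the sketch below shows one standard approach: the normal-approximation sample size for comparing two proportions, with the comparison-group risk derived from a baseline risk and a target odds ratio. It assumes equal group sizes and a hypothetical baseline risk of 10%, so it is not expected to reproduce the figure of 3,591, which was based on the observed, unequal race/ethnicity distribution and on risks from Table 2.

```python
import math
from statistics import NormalDist


def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Risk in the comparison group implied by a baseline risk and an odds ratio."""
    odds = odds_ratio * baseline_risk / (1 - baseline_risk)
    return odds / (1 + odds)


def n_per_group(baseline_risk, odds_ratio, alpha=0.05, power=0.80):
    """Per-group sample size for a 2-sided test of two proportions (equal groups)."""
    p1 = baseline_risk
    p2 = risk_from_odds_ratio(p1, odds_ratio)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical baseline risk of 10% with the target effect size OR = 1.68.
print(n_per_group(0.10, 1.68))
```

Smaller target odds ratios or rarer outcomes increase the required sample size, which is why the most stringent comparison determines the minimum needed per ECI category.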

Results
A total of 52,411 unique patients with a COVID-19 diagnosis or recent positive laboratory result were included in the analysis cohort. The median (IQR) patient age was 53 years (35-68); 50.6% (26,512) were female. The largest racial/ethnic group was Hispanic/Latino (18,425; 35.2%), followed by NH White (15,048; 28.7%), NH Black/African American (10,667; 20.4%), NH other or unknown race (5754; 11.0%), API (1447; 2.8%), and AI/AN (1070; 2.0%). Private insurance was the most common coverage (18,015; 34.4%), followed by Medicare (11,791; 22.5%) and Medicaid (8597; 16.4%). The most common region of residence was the southeastern US (9867; 18.8%). Forty-four percent of patients (23,038) had an ECI score of 5 or higher; 30.9% (16,177) had an ECI score of 0 (Table 1). Table 1 also shows patient demographic characteristics stratified by ECI group. Those with higher comorbidity were older and more likely to be male, NH White, and covered by Medicare. Significant differences were observed between all demographic groups when stratified by ECI group (all p < 0.001). Table 2 shows crude risk results for COVID-19-related clinical complications across patient characteristics. Compared with women, men had higher percentages of hospitalization (55.8% vs. 50.2%), a higher median LOS (2.0 vs. 1.0 days), higher percentages of invasive ventilator dependence (14.2% vs. 9.3%), and higher percentages of death (10.6% vs. 7.4%). NH Whites had the highest crude values for all clinical complications except invasive ventilator dependence (hospitalization, 65.2%; median LOS, 3.0 days; death, 13.3%). AI/ANs had the highest percentage of invasive ventilator dependence (22.1%). Hispanics consistently had the lowest risk of complications across all outcomes. Patients covered by Medicare and those with ECI scores of 5 or higher had the highest risk of complications across all outcomes. Table 3 shows the association of the adjusted predictors with the 4 clinical complications of hospitalization, maximum LOS, invasive ventilator dependence, and death.
(Survival modeling for time to death is presented here; logistic modeling for death is reported in Supplemental Table 3). Older patients and men (compared with women) consistently showed a higher risk of complications for all outcomes. AI/ANs had consistently higher risk of complications for all outcomes than NH Whites, all of which were significant (hospitalization aOR 1.21;

Discussion
This study answers the question of whether racial disparities in COVID-19 outcomes exist despite comparable ECIs among NH Black, Hispanic, AI/AN, and White patients. To our knowledge, it is one of the largest systematic evaluations in the US of racial and ethnic differences in survival outcomes stratified by ECI score for patients with COVID-19. Our analyses revealed significant racial disparities in health outcomes among COVID-19 patients with comparable ECI scores. In particular, compared with NH Whites, most race groups had higher risk for all outcomes (hospitalization, LOS, ventilation, and death), with greater clinical and statistical significance for AI/ANs and NH Blacks. For example, using adjusted estimates, NH Blacks had longer LOS and higher odds of both
Table 3. Adjusted associations with hospitalization, maximum length of hospital stay, dependence on invasive ventilator, and death from COVID-19.
a Adjusted odds ratio from mixed-effect logistic regression model (clustering on one-digit zip-code).
b Adjusted exponentiated coefficients (mixed-effect exponential regression model clustering on one-digit zip-code) relating to change in the ratio of expected maximum length of hospital stay (i.e., "male" coefficient is the ratio of the expected max LOS for males over expected max LOS for females, so max LOS is 16% greater for males than for females).
Non-Hispanic American Indian or Alaska Native 1.21 (1.03, 1.43) 1.32 (1.16, 1.51
Previous studies suggest that racial disparities in COVID-19 incidence and mortality can be explained by the complex interaction of inequities in social determinants of health, including access to health care 2,39,40 , poverty 40,41 , systemic racism 2,40 , socioeconomic status 2 , lack of testing for SARS-CoV-2 infection 39,42 , discrimination 2 , and virus exposure due to employment in essential-worker occupations 43,44 . All of these may be best viewed through a biopsychosocial framework akin to the weathering hypothesis, which posits that cumulative exposure to chronic stress can lead to accelerated aging by inducing physiologic changes that diminish the body's ability to respond appropriately to acute stressors 45 . Preliminary investigations suggest that a higher prevalence of medical comorbidities explains the clinical differences in outcomes among patients with COVID-19 7,17,46-48 . Yet in our analysis of the 4 above-mentioned outcomes stratified by ECI AHRQ-weighted group, we still observed significant racial disparities in COVID-19 complications. Contrary to previous studies 7,17,46,49 , our analysis showed that for all races, the probability of hospitalization due to COVID-19 increased in unison with an increasing ECI. Accordingly, our findings contest arguments that NH Black and AI/AN patients are dying from COVID-19 at higher rates than their NH White counterparts because they have more comorbidities.
After adjustment for predictive association with our chief outcomes, our analysis revealed a higher risk for all 4 outcomes (hospitalization, LOS, ventilation, and death) among older patients, men (compared with women), patients with higher ECI scores, and patients covered by Medicare or Medicaid (compared with those covered by private insurance). These findings align with patterns identified in previous studies of cohorts ranging in size from 191 to 11,210 7,46 . Disaggregation by race and ethnicity of the analysis of all 4 primary outcomes uncovered 3 overarching disparities while controlling for comorbidity. First, we found that APIs, NH Blacks, and patients of NH other or unknown race had a higher risk for all outcomes. This aligns with previous findings on racial disparities for NH Blacks for hospitalization 50 , mortality 19 , and ventilation 7 , and raises questions about the intersection of anti-Asian discrimination and xenophobia with health outcomes for API patients 51 . Second, our findings showed that, compared with NH Whites, AI/AN patients had a higher risk of death and higher odds of ventilator dependence but lower odds of hospitalization and a trend toward lower LOS for ECI of 5 or higher. These disproportionalities may be explained by the transfer of patients from Indian Health Service (IHS) facilities to non-IHS facilities, as IHS facilities are commonly ill-equipped to care for AI/AN patients with COVID-19 (e.g., they may lack invasive ventilation equipment) 52 . Third, our analysis showed that, compared with NH Whites, Hispanics/Latinos had a lower risk for death, hospitalization, and LOS, but higher odds of ventilator dependence for ECI = 0.
Although these findings contradict epidemiological studies that have found a higher risk of COVID-19-related deaths within Hispanic/Latino communities 53,54 , they align with the "Hispanic epidemiological paradox," which suggests that, although the socioeconomic characteristics of Hispanics/Latinos are similar to those of NH Blacks, comorbidity, mortality, and longevity outcomes in this subpopulation mirror or exceed those of NH Whites 55 .
Our data clearly show that a higher percentage of older patients were NH White and a higher percentage of younger patients were Hispanic/Latino (Supplemental Fig. 3). Other studies have found that, compared with NH Whites, Hispanic/Latino patients with COVID-19 tend to be younger 56 and that older Hispanic/Latino patients with COVID-19 may have a higher risk for death 57,58 . Recent reports of higher COVID-19 death rates among older Hispanic/Latino populations 57 and higher COVID-19 hospitalization rates among Hispanic/Latino children 59 may challenge the "Hispanic paradox." To better address the needs of the Hispanic/Latino population, future researchers should employ additional data disaggregation to address this question. Lastly, our results indicate that older patients and individuals with higher ECI scores had an increased risk of death from COVID-19. Likewise, men compared with women, all races (except Hispanics/Latinos) compared with NH Whites, and patients with all other health insurance types compared with those with private insurance had an increased likelihood of death. These results are supported by recent findings of higher COVID-19 fatality rates among men, older persons, and patients with a disproportionate burden of comorbidities 60,61 . Emerging literature also points to an association of minority status and insurance type with poor COVID-19 outcomes 7 . Our logistic regression findings reveal similar associations with minority status and insurance type for hospitalizations, death, ventilator dependence, and hospital LOS.
This study has potential limitations. Some of the outcomes and predictors were identified by medical record codes (i.e., ICD and LOINC) that are known to limit the specificity of a study. However, we additionally applied a variety of alternative methods, such as text matching, to provide an additional net with which to capture all possible indications in the data. Medical histories were only available going back 5 years on qualifying patients included in the cohort. Our study included only patients who sought treatment for COVID-19. It is important to note that medically underserved and minority populations without insurance may not seek testing and treatment for COVID-19 62 , which has implications for both Hispanics/Latinos and NH Blacks, who are 2-3 times more likely to be uninsured compared with their NH White counterparts 63 . In addition, because (1) the data we analyzed included only individuals who had accessed health care services, and (2) post-mortem COVID testing is not routinely done, we may have underestimated the death rate among Hispanics/Latinos. Lastly, social variables that could play a potential confounding role in our study were not captured in the EHR data that we analyzed and thus were not included in the multilevel analyses.

Conclusion
Compared with NH White patients with similar ECI scores, NH Black patients had significantly higher LOS and odds of ventilator dependence and death, while AI/AN patients were more likely to have worse indications across all 4 outcomes analyzed: hospitalization, LOS, ventilation, and death. COVID-19 has laid bare an imperative to investigate its negative health outcomes that may be exacerbated by a complex interplay of social, environmental, and behavioral factors faced by indigenous, Hispanic/Latino, and NH Black communities 31 , indicating a need for upstream intervention at patient, community, and policy levels to close the health equity gap.

Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to restrictions by Cerner, the owner of the data. Data may be accessed by signing a data-sharing agreement with Cerner and covering any costs that may be involved.