Explaining among-country variation in COVID-19 case fatality rate

While the epidemic of SARS-CoV-2 has spread worldwide, there is much concern over the mortality rate that the infection induces. Available data suggest that COVID-19 case fatality rate has varied temporally (as the epidemic has progressed) and spatially (among countries). Here, we attempted to identify key factors possibly explaining the variability in case fatality rate across countries. We used data on the temporal trajectory of case fatality rate provided by the European Center for Disease Prevention and Control, and country-specific data on different metrics describing the incidence of known comorbidity factors associated with an increased risk of COVID-19 mortality at the individual level. We also compiled data on demography, economy and political regimes for each country. We found that temporal trajectories of case fatality rate vary greatly among countries. We found several factors associated with temporal changes in case fatality rate, both among variables describing comorbidity risk and among demographic, economic and political variables. In particular, countries with the highest values of disability-adjusted life years (DALYs) lost to cardiovascular, cancer and chronic respiratory diseases had the highest values of COVID-19 case fatality rate (CFR). CFR was also positively associated with the death rate due to smoking in people over 70 years. Interestingly, CFR was negatively associated with share of deaths due to lower respiratory infections. Among the demographic, economic and political variables, CFR was positively associated with share of the population over 70, GDP per capita, and level of democracy, while it was negatively associated with number of hospital beds per 1000 inhabitants. Overall, these results emphasize the role of comorbidity and socio-economic factors as possible drivers of COVID-19 case fatality rate at the population level.

Scientific Reports | (2020) 10:18909 | https://doi.org/10.1038/s41598-020-75848-2 www.nature.com/scientificreports/

respiratory assistance and intensive care. Finally, given that CFR is defined as the ratio between number of deaths and number of confirmed cases, countries might simply differ in the accuracy with which they detect the infection. Indeed, CFR estimates are prone to error if the actual number of infected people is much higher than the number of PCR-confirmed cases (producing an overestimated CFR), or if mortality occurs with a delay (producing an underestimated CFR). Therefore, variation in CFR might reflect among-country variation (i) in population screening (i.e., the number of tests performed), which affects the denominator of the ratio between number of deaths and number of confirmed cases, and (ii) in counting and communicating the actual number of patients who have succumbed to SARS-CoV-2 infection.
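The two biases described above can be made explicit with a short derivation. The notation is introduced here for illustration and is not the authors': D(t) and C(t) are the cumulative numbers of deaths and confirmed cases at time t, I(t) is the (unobserved) cumulative number of infections, π is an assumed fraction of infections that end up confirmed, and D_∞(t) is the eventual number of deaths among the cases confirmed by time t.

```latex
\begin{gather}
\mathrm{CFR}(t) = \frac{D(t)}{C(t)} \\[4pt]
% Under-ascertainment: only a fraction \pi of infections is confirmed
C(t) = \pi\, I(t),\quad 0 < \pi \le 1
\;\Longrightarrow\;
\mathrm{CFR}(t) = \frac{1}{\pi}\,\frac{D(t)}{I(t)} \;\ge\; \frac{D(t)}{I(t)} \\[4pt]
% Delayed mortality: some confirmed cases have not yet reached their outcome
D(t) \le D_{\infty}(t)
\;\Longrightarrow\;
\frac{D(t)}{C(t)} \;\le\; \frac{D_{\infty}(t)}{C(t)}
\end{gather}
```

Under these assumptions, the first inequality corresponds to the overestimation caused by undetected infections, and the second to the underestimation caused by deaths that have not yet occurred.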
Here, we conducted an analysis of the factors that might account for the variability of COVID-19 CFR among countries. Using data updated to June 11th 2020, we first investigated whether CFR varies significantly between countries, independently of (i) the stage of the epidemic wave, (ii) the testing strategy, and (iii) the social distancing policies adopted by each country. Second, we used country-specific variables assessing the occurrence of comorbidities, as well as demographic, economic and political variables, to uncover any pattern of association with COVID-19 CFR.

Methods
We used data on the daily number of confirmed cases and deaths for each country reported by the European Center for Disease Prevention and Control (ECDC). We computed the case fatality rate as the ratio between deaths and confirmed cases. We restricted the dataset to countries with at least 100 confirmed cases to avoid spurious results due to small numbers. For each country, we also counted the number of days between the 100th case and 11th June 2020, and the number of days since the occurrence of at least one death per 1,000,000 inhabitants, as indicators of the progression of the epidemic.
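The computation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `DailyRecord` type and its field names are hypothetical, and the sketch assumes that CFR is computed on cumulative counts and that the time axis starts on the day cumulative confirmed cases first reach 100.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DailyRecord:
    """One daily report for one country (hypothetical layout)."""
    day: int      # calendar day index
    cases: int    # new confirmed cases reported that day
    deaths: int   # new deaths reported that day


def cfr_trajectory(records: List[DailyRecord],
                   min_cases: int = 100) -> List[Tuple[int, float]]:
    """Return (days since the min_cases-th case, cumulative CFR) pairs.

    Days before the cumulative case count reaches min_cases are skipped,
    so a country that never reaches the threshold yields an empty list.
    """
    trajectory = []
    cum_cases = 0
    cum_deaths = 0
    start_day = None
    for rec in sorted(records, key=lambda r: r.day):
        cum_cases += rec.cases
        cum_deaths += rec.deaths
        if start_day is None and cum_cases >= min_cases:
            start_day = rec.day  # day on which the threshold is first reached
        if start_day is not None:
            trajectory.append((rec.day - start_day, cum_deaths / cum_cases))
    return trajectory


# Made-up numbers, for illustration only.
example = [DailyRecord(0, 50, 0), DailyRecord(1, 60, 2), DailyRecord(2, 90, 3)]
example_trajectory = cfr_trajectory(example)
```

In this toy series the threshold is reached on day 1, so the trajectory has two points, and the last one is 5 deaths over 200 confirmed cases.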
We used the online resource https://ourworldindata.org/ to retrieve data on the incidence of known comorbidity factors for each country, as well as information on demographics, economics and political regimes. In particular, we used different metrics to describe comorbidities: (1) disability-adjusted life years (DALYs); (2) share of total disease burden; (3) age-standardized death rates per 100,000; (4) share of deaths. We focused on known comorbidities such as cardiovascular diseases, cancers, chronic respiratory diseases, diabetes mellitus and chronic kidney diseases. We also included metrics related to factors that might impinge on the severity of respiratory syndromes, such as smoking and air pollution. We also used https://ourworldindata.org/ to retrieve information on demographic, economic and political indicators. Table S1, in the supplementary information, reports the list, description and source of the variables used here, according to the GATHER statement 17.

Statistical analyses.
We first aimed at exploring whether CFR varied among countries while controlling for differences in epidemic progression. To this purpose, we ran a linear mixed model (LMM) in which CFR was the dependent variable, and time since the 100th case (in days), squared time since the 100th case (to model non-linear variation), country and the two-way interactions were the fixed factors. Country was also included as a random effect. For computational reasons, in this model the covariance structure of the R matrix was modeled using variance components, and degrees of freedom were computed using the between-within method. Only countries for which at least 10 days had elapsed between the record of the 100th case and 11th June 2020, and for which the number of deaths was higher than 1 per 1,000,000 inhabitants, were included in this model, to allow a better estimate of the variation of CFR with time. This model included 143 countries and 8441 daily values of CFR. In a second model, we also included the number of tests performed per 1000 inhabitants, to control for differences in testing strategies among countries, and a stringency index describing the severity of the social distancing rules adopted by each country. This reduced the number of countries included in the model to 72 and the total number of observations to 3778.
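One way to write down the first model described above is given here. The notation is ours and is only a sketch of the verbal description: c indexes countries, t_{ct} is the number of days since the 100th case in country c, and the Gaussian assumptions on the random effect and on the residuals are the usual LMM defaults rather than details stated in the text.

```latex
\begin{gather}
\mathrm{CFR}_{ct} = \beta_0 + \beta_1 t_{ct} + \beta_2 t_{ct}^{2}
  + \alpha_c + \gamma_c\, t_{ct} + \delta_c\, t_{ct}^{2}
  + u_c + \varepsilon_{ct} \\[4pt]
u_c \sim \mathcal{N}(0, \sigma_u^{2}), \qquad
\varepsilon_{ct} \sim \mathcal{N}(0, \sigma^{2})
\end{gather}
```

Here α_c is the fixed country effect, γ_c and δ_c are the time × country and squared time × country interactions whose tests are reported in the Results, and u_c is the country random effect. The second model would add the number of tests per 1000 inhabitants and the stringency index to the fixed part.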
To assess the pattern of association between country-specific CFR and comorbidities, we ran four LMMs that included different metrics as fixed factors. The first model included DALYs lost to cardiovascular, cancer and chronic respiratory diseases (due to the strong correlations among these variables, we summed the three DALYs and used the sum in the model), and DALYs lost per 100,000 for people older than 70 years. The second model included share of disease burden (due to cardiovascular, cancer and chronic respiratory diseases). The third model included age-standardized death rates (due to cardiovascular diseases, cancer, air pollution, ambient particulate matter pollution, and smoking for people older than 70 years). The last model included share of deaths (due to cardiovascular, cancer, chronic respiratory, kidney diseases, lower respiratory infections, diabetes mellitus, outdoor air pollution). The four models always included a set of covariates that described demographic, economic and political variables, namely population size, share of the population over 70 years, gross domestic product (GDP) per capita, total health care expenditure as share of GDP, number of hospital beds (per 1000 inhabitants), political regime, stringency index, and the number of tests performed per 1000 inhabitants. Political regime is scored according to the level of democracy, from −10 (full autocracy) to +10 (full democracy) (Polity IV as reported in https://ourworldindata.org/). The stringency index describes the severity of the policies implemented by each country to limit the spread of the virus, according to the following categories: school closure, workplace closure, public events cancelled, restrictions on gatherings, public transport closure, public information campaigns, stay at home, restriction on internal movements, international travel controls, testing policy, contact tracing.
Finally, all models also included time (in days) since the threshold of 1 death per 1,000,000 inhabitants was reached, and squared time, to take into account the progression of the epidemic in each country. In each model, we tested the interaction between the different comorbidity, demographic and economic factors and time. This allowed us to ascertain whether the temporal changes in CFR differed as a function of country-specific comorbidity, demographic, and socio-economic factors. All variables were standardized to mean = 0 and standard deviation = 1 to make parameter estimates directly comparable (variables with an asymmetrical distribution were log-transformed beforehand to reduce skewness). To take into account the covariation between observations at different geographical scales, the LMMs also included three nested effects as random variables: continent, geographic region 18 nested within continent, and country nested within geographic region nested within continent. The covariance structure of the R matrix was modeled using a first-order autoregressive structure to model the temporal autocorrelation between daily CFR values. Degrees of freedom were computed using the Satterthwaite approximation. To limit the risk of false discoveries, we set the value of α to 0.01. Therefore, only p values lower than 0.01 were considered as indicative of statistically significant associations. All the analyses were conducted with SAS 14.3 (PROC MIXED).
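The standardization step described above can be sketched as follows. This is an illustrative helper, not the authors' SAS code: the function name is hypothetical, the choice of which variables to log-transform is left to the caller because the text does not list them, and the use of the sample standard deviation (n − 1 denominator) is an assumption.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def standardize(values: Sequence[float],
                log_transform: bool = False) -> List[float]:
    """Rescale values to mean 0 and standard deviation 1.

    If log_transform is True, the natural log is applied first, as one
    would do for a right-skewed variable; all values must then be > 0.
    """
    data = list(values)
    if log_transform:
        if any(v <= 0 for v in data):
            raise ValueError("log transform requires strictly positive values")
        data = [math.log(v) for v in data]
    centre = mean(data)
    spread = stdev(data)  # sample standard deviation (assumption)
    if spread == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - centre) / spread for v in data]


# Made-up values: a symmetric variable and a right-skewed one.
z_plain = standardize([1.0, 2.0, 3.0])
z_logged = standardize([1.0, 10.0, 100.0], log_transform=True)
```

After this rescaling, a coefficient for any predictor is expressed per standard deviation of that predictor, which is what makes the estimates comparable across variables measured in different units.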

Results
We found strong evidence for among-country differences in COVID-19 CFR. The LMM showed a highly significant interaction between time since the 100th case and country, indicating that CFR trajectories (both linear and quadratic components) differed among countries as the epidemic progressed (Table 1, Fig. 1).

CFR is defined as the ratio between number of deaths and number of confirmed cases. Since many cases of asymptomatic infection (or infections with mild symptoms) might go unnoticed, CFR is certainly an overestimate of the actual risk of dying when infected with SARS-CoV-2. In the light of this argument, a strategy of massive screening of the population (not only patients who are admitted to hospital, but also those with no or mild symptoms) might provide a more realistic estimate of the denominator of the CFR. Although the number of tests performed over the course of the epidemic is available for a smaller number of countries, it nevertheless varies tremendously among them, indicating that countries adopted different screening strategies. Focusing, for instance, on day 80 since the 100th case, and including only countries with more than 1 death per 1,000,000 inhabitants (n = 50 countries), the number of tests performed per 1000 inhabitants varied from 0.9 (Indonesia) to 179 (Iceland). Interestingly, however, countries that better screened their population did not always have the lowest CFR. At day 80 since the 100th case, there was no correlation between number of tests per 1000 inhabitants and CFR (Pearson's r = −0.101, p = 0.4873, n = 50). Using the whole dataset, and including the number of tests per 1000 inhabitants, showed that the relationship between CFR and number of tests was country specific: in some countries screening resulted in a decline in CFR, whereas in others the relationship was positive (interaction between country and number of tests per 1000: F(71,3782) = 23.17, p < 0.0001; Fig. 2). Importantly, the interaction between time and country remained highly significant even when adding both the number of tests per 1000 and the stringency index to the model (time × country: F(69,3429) = 43.50, p < 0.0001; squared time × country: F(69,3429) = 45.88, p < 0.0001). This result, therefore, suggests that variation in CFR among countries does not merely depend on the screening effort of each country, nor on the severity of social distancing and isolation rules.

Table 1. Linear mixed model exploring variation of COVID-19 case fatality rate (CFR) as a function of time since the 100th case for each country. The model also included squared time since 100th case and the interactions between country and time. Country was also declared as a random effect in the model. The model was restricted to countries that had 10 or more days elapsed between the occurrence of the 100th case and 11th June 2020 and for which number of deaths was higher than 1 per 1,000,000 inhabitants. The analysis is based on 143 countries and 8441 observations. Significant p-values are in bold.

In order to uncover the factors that might account for such among-country variation in CFR, we ran four LMMs that included different fixed factors related to comorbidities, demographic, economic and political variables. These models provided evidence for several associations between country-specific risk factors and the temporal dynamics of CFR (Table 2). In particular, countries with high values of DALYs lost to cardiovascular, cancer and chronic respiratory diseases had high CFR (model 1, Table 2, Fig. 3A). Countries with a high share of disease burden due to chronic respiratory diseases had high CFR (model 2, Table 2). Countries with the lowest death rate due to cardiovascular diseases had the lowest CFR, and countries with the highest death rate due to smoking in people over 70 years old had the highest CFR (model 3, Table 2, Fig. 3B).
Finally, model 4 showed that CFR was positively associated with a high share of deaths due to chronic respiratory, cardiovascular, kidney diseases and outdoor air pollution (Table 2, Fig. 3C). Interestingly, this model also showed a negative correlation between CFR and share of deaths due to lower respiratory infections (Table 2, Fig. 3D).
Among the demographic, economic and political variables, the four models consistently provided evidence for positive associations between the temporal dynamics of CFR and population size, GDP per capita, total health expenditure as share of GDP, share of the population over 70 years, and stringency index (Table 3, Fig. 4A-C). As mentioned above, the number of tests performed per 1000 inhabitants showed a more complex pattern of association with CFR (Table 3, Fig. 4D). Two models (out of four) provided evidence for a negative association between number of hospital beds per 1000 inhabitants and CFR (Table 3, Fig. 4E), while only one model showed a positive association between CFR and political regime, with democracies having the highest CFR (Table 3, Fig. 4F).
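The day-80 check reported in the Results (Pearson's r between tests per 1000 inhabitants and CFR across 50 countries) can be reproduced in outline with the sketch below. The function names are ours, the country-level data are not included, and only the correlation coefficient and its t statistic are computed; obtaining the p value would additionally require the t distribution with n − 2 degrees of freedom.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, on n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Plugging in the values reported in the text (r = -0.101, n = 50).
t_day80 = t_statistic(-0.101, 50)
```

With the reported values, the t statistic is about −0.70 on 48 degrees of freedom, far from any conventional significance threshold, which is in line with the reported p = 0.4873.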

Discussion
As the SARS-CoV-2 epidemic continues to spread worldwide, the mortality induced by the disease has been a serious matter of concern, with some countries paying a high toll to the infection. In this light, it is important to understand why some countries seem to experience a lower mortality rate than others and possibly to uncover the associated factors. Here, we showed that CFR greatly differs among countries (even after controlling for the number of tests and the severity of social distancing rules), and that several comorbidity factors (DALYs lost to cardiovascular, cancer and chronic respiratory diseases; death rate due to smoking in people older than 70; share of deaths due to cardiovascular, chronic respiratory, kidney diseases) and socio-economic factors (population size, GDP per capita, share of the population over 70, number of hospital beds per 1000 inhabitants, political regime) were associated with COVID-19 CFR. Assessing case fatality rate during an ongoing outbreak is particularly difficult for at least two reasons 11,19. CFR at a given time might be an underestimate of the final mortality rate because the cumulative number of deaths will keep increasing as long as some patients are still in intensive care units (i.e., right censoring). Conversely, CFR might overestimate actual mortality if a large fraction of infected people do not develop disease symptoms (or develop mild symptoms), go unnoticed and are not included in the population of confirmed cases. Several countries have implemented a strategy of massive screening, aiming at identifying positives and isolating them, to avoid further spreading of the disease. Data on the number of tests performed per 1000 people, therefore, allow investigating whether countries with a massive screening policy were also those with the lowest CFR.
The results showed no simple association between CFR and testing, with some countries showing the expected negative relationship and others showing a positive correlation. This suggests that differences in population screening are not enough to explain the tremendous variation in CFR among countries.

Table 2. Linear mixed models investigating the association between COVID-19 case fatality rate (CFR) and several descriptors of comorbidities, demographics, economics and political regime for each country. Each model included the same demographic, economic and political regime variables (GDP per capita, population size, total health care expenditure as share of GDP, number of hospital beds per 1000 inhabitants, share of the population over 70 years, political regime, stringency index and number of tests performed per 1000). In addition, model 1 included DALYs lost to cardiovascular, cancer and chronic respiratory diseases, and DALYs lost per 100,000 for people older than 70 years. Model 2 included share of disease burden (cardiovascular, cancer and chronic respiratory diseases). Model 3 included age-standardized death rates per 100,000 due to cardiovascular diseases, cancer, air pollution, ambient particulate air pollution, and smoking over 70 years. Model 4 included share of deaths for cardiovascular diseases, cancer, chronic respiratory diseases, lower respiratory diseases, diabetes, and outdoor air pollution. All models included time since number of deaths per 1,000,000 higher than 1 and squared time. Three nested factors were also included as random factors (continent, region within continent and country within region within continent).

As mentioned above, the other possible bias when computing CFR is that some of the patients who are still in intensive care units may eventually die, and therefore, if the total number of confirmed cases remains unchanged, CFR might still increase.
However, visual inspection of the temporal trajectories of CFR shows that values have reached an asymptote for the vast majority of countries (Fig. 1). This suggests that, unless a second wave hits in the following weeks or months, we should not expect CFR to vary substantially as a consequence of residual mortality. Despite the uncertainty associated with the estimation of CFR, its comparison between countries can provide useful insights into the heterogeneity in the burden paid to the disease. We showed that the trajectories of time-dependent variation in CFR greatly differed among countries, while controlling for differences in testing strategies and stringency index. This quantitatively corroborates and statistically validates the intuition that some countries dealt with the disease better than others. The following step was to try to understand whether such heterogeneity arises as the consequence of predictable factors. Previous reports on the clinical outcome of the disease have identified two major factors associated with poor prognosis: age and the presence of comorbidities. Elderly people have been shown to suffer the highest mortality rate following infection with SARS-CoV-2 11,19. Similarly, a previous history of cardiovascular disorders, cancer and diabetes has been reported to substantially increase COVID-19 mortality risk 20,21. We therefore predicted that countries with a higher share of elderly people and a higher incidence of known comorbidity factors might suffer from the highest CFR.
Our integrated modelling approach provided some evidence in support of these predictions. In particular, several metrics of comorbidity factors were positively associated with the temporal dynamics of CFR (DALYs lost to cardiovascular, cancer and chronic respiratory diseases; share of burden due to chronic respiratory diseases; death rate due to cardiovascular diseases; death rate due to smoking in people older than 70; share of deaths due to chronic respiratory and kidney diseases). Interestingly, our model also showed that share of deaths due to lower respiratory infections was negatively associated with COVID-19 CFR. This negative association is intriguing, especially in the light of recent reports of possible cross-immunity that might confer partial protection against SARS-CoV-2 22. All the above-mentioned associations between comorbidities and CFR hold while controlling for several potential confounding factors describing the socio-economic context of different countries. As such, positive associations between comorbidities and CFR at the country level do not merely reflect differences in the structure of the age pyramid, or the amount of resources allocated to the health care system. Actually, focusing on such demographic and socio-economic factors allowed us to identify several other variables that contributed to explaining among-country variation in CFR. As predicted, countries with the highest share of elderly people (over 70) also had the highest CFR.
Economic parameters might equally well contribute to shaping COVID-19 mortality. As the number of severe cases increases during the epidemic, the health care system can get overwhelmed and might be unable to receive and treat all those who need intensive care. Mortality might therefore result from health care systems that are inadequate to deal with large numbers of cases requiring simultaneous admittance to intensive care units. We used several proxies describing the investment of each country into the health care system and found a negative association between the number of hospital beds per 1000 inhabitants and CFR. However, seemingly in contradiction to this view, we also found that CFR was highest in countries with high GDP per capita and high total health expenditure as share of GDP. While odd, this result corroborates the impression that wealthy countries in Europe and North America have paid a severe toll to the infection. Overall, the relationship between investment into the health care system and CFR appears to be more complex than one might expect.

Table 3. Linear mixed models investigating the association between COVID-19 case fatality rate (CFR) and several descriptors of comorbidities, demographics, economics and political regime for each country. Each model included the same demographic, economic and political regime variables (GDP per capita, population size, total health care expenditure as share of GDP, number of hospital beds per 1000 inhabitants, share of the population over 70 years, political regime, stringency index and number of tests performed per 1000). In addition, model 1 included DALYs lost to cardiovascular, cancer and chronic respiratory diseases, and DALYs lost per 100,000 for people older than 70 years. Model 2 included share of disease burden (cardiovascular, cancer and chronic respiratory diseases). Model 3 included age-standardized death rates per 100,000 due to cardiovascular diseases, cancer, air pollution, ambient particulate air pollution, and smoking over 70 years. Model 4 included share of deaths for cardiovascular diseases, cancer, chronic respiratory diseases, lower respiratory diseases, diabetes, and outdoor air pollution. All models included time since number of deaths per 1,000,000 higher than 1 and squared time. Three nested factors were also included as random factors (continent, region within continent and country within region within continent).

A final source of variation accounting for differences in CFR might be differential reporting of the number of deaths and/or confirmed cases between countries. This might reflect different counting/reporting methodology (e.g., testing strategy, deciding whether or not a given patient died because of COVID-19). In addition, most countries have been implementing social distancing protocols that differed in the severity of the restrictions imposed and in the timing of policy execution. Moreover, different populations might follow governmental instructions more or less loosely, depending on the perceived risk/benefit of applying such instructions. For instance, social distancing and isolation might be more easily applied in countries with autocratic regimes that exert a more stringent control over the population. The role of social and cultural traits in the emergence of zoonotic diseases has already been discussed in the past, including the idea that collectivistic societies might have developed as a way to better control epidemic waves 23,24. We explored how political regime and the severity of isolation policies were associated with CFR. We found moderate evidence (1 out of 4 models) suggesting that countries with a democratic regime were those with the highest CFR.
The analysis of the stringency index, describing the severity of the restrictions implemented by each country, showed that highest values of CFR were reached for intermediate values of the stringency index. This might reflect different processes. First, countries where the epidemic wave was relatively low (perhaps because of the factors described above) could have implemented relatively mild restriction policies, compared to countries where the epidemic got out of control and that required imposing more stringent social distancing rules.
Although we report here some associations between comorbidities, demographic and socio-economic variables and COVID-19 CFR, we fully acknowledge that these factors do not perfectly explain the variation in CFR among countries. This might come from the coarse grain of the analyses (country level), the error associated with the metrics used in our study, the role played by other factors not taken into account in our study, or the uncertainties associated with the estimation of CFR while the epidemic is still ongoing. If serological tests are used on a very large scale to assess the proportion of the population that has been infected by the virus, we will have a better estimate of the mortality rate and of the possible factors explaining the among-country heterogeneity. With this in mind, we nevertheless believe that our results stress the role of comorbidities, socio-economic and political factors as potential drivers affecting how a country deals with globally threatening epidemics.

Data availability
The full dataset is available in the online appendix.