Variation in excess all-cause mortality by age, sex, and province during the first wave of the COVID-19 pandemic in Italy

Henry, Nathaniel J.; Elagali, Ahmed; Nguyen, Michele; Chipeta, Michael Give; Moore, Catrin E.

doi:10.1038/s41598-022-04993-7

Article
Open access
Published: 20 January 2022

Scientific Reports volume 12, Article number: 1077 (2022)

Abstract

Although previous evidence suggests that the infection fatality rate from COVID-19 varies by age and sex, and that transmission intensity varies geographically within countries, no study has yet explored the age-sex-space distribution of excess mortality associated with the COVID pandemic. By applying the principles of small-area estimation to existing model formulations for excess mortality, this study develops a novel method for assessing excess mortality across small populations and assesses the pattern of COVID excess mortality by province, year, week, age group, and sex in Italy from March through May 2020. We estimate that 53,200 excess deaths occurred across Italy during this time period, compared to just 35,500 deaths where COVID-19 was registered as the underlying cause of death. Out of the total excess mortality burden, 97% of excess deaths occurred among adults over age 60, and 68% of excess deaths were concentrated among adults over age 80. The burden of excess mortality was unevenly distributed across the country, with just three of Italy’s 107 provinces accounting for 32% of all excess mortality. This method for estimating excess mortality can be adapted to other countries where COVID-19 diagnostic capacity is still insufficient, and could be incorporated into public health rapid response systems.

Introduction

Italy received international attention as one of the first countries outside of China to experience a major COVID-19 outbreak. On March 9, 2020, Italy announced a nationwide lockdown to stem community transmission, and deaths peaked two weeks later during the week of March 25¹. After that peak, deaths declined for the following two months, and lockdown restrictions were gradually eased starting in late May². In total, between February and August 2020, approximately 35,500 COVID-19 deaths were registered across Italy, equivalent to approximately 60 deaths per 100,000 people³.

Aside from the timing and magnitude of the first wave of COVID-19 transmission within Italy, two notable features set its COVID-19 epidemic apart from those in other European countries. First, registered COVID-19 cases and deaths were unevenly distributed across the regions of Italy⁴, and cause-specific mortality from COVID-19 varied by age and sex⁵. Registered COVID-19 deaths were highest in the northern regions of the country, particularly in the region of Lombardy, where the registered COVID-19 death total amounted to over 160 deaths per 100,000 population (Fig. 1). Second, health authorities recognized early that registered COVID-19 deaths under-counted the full mortality burden of the epidemic. In May 2020, the Italian National Institute of Statistics (Istat) reported that while 13,700 COVID-19 deaths had been registered across Italy between 20 February and 31 March, deaths from any cause had increased by 25,300 compared to an expected baseline during the same time period, suggesting that the full mortality burden of the COVID-19 epidemic was nearly double what had been previously reported⁶.

Istat and other research groups drew conclusions about the mortality burden of the COVID-19 pandemic based on a series of excess mortality analyses. Excess mortality analyses attempt to measure the net effect of a discontinuity, such as a COVID-19 outbreak, on all-cause mortality. This is a two-step process: first, the investigator constructs an estimated baseline number of deaths expected during the period in question. While various methods have been used in the past to construct this baseline^7,8,9, previous studies in Italy have averaged the number of deaths recorded by week in the years from 2015 through 2019 to generate a baseline estimate for the same weeks in 2020^2,10. Next, the investigator compares that expected baseline with the count of recorded deaths during the same time period. Excess deaths, a count, are measured as the difference between the observed death count and the expected baseline. Standardized mortality ratios (SMRs) are measured as the ratio between the observed death count and the expected baseline^11,12. Excess mortality analyses of the COVID-19 pandemic are becoming widely used in the media: among others, the Economist, the Financial Times and the New York Times have estimated excess mortality across dozens of countries in Europe, the Americas, and Asia^13,14,15. Previous investigations have also examined how excess mortality analysis can capture deaths caused by COVID-19 but attributed to other causes, as well as the indirect mortality burden of the COVID-19 pandemic^8,16.
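As a concrete illustration of the two quantities defined above, the following sketch computes a five-year-average baseline, excess deaths, and an SMR for a single week. The weekly counts are invented for illustration, and the function name `excess_and_smr` is hypothetical; none of this is taken from the study's code.

```python
import numpy as np

def excess_and_smr(observed, baseline):
    """Return (excess deaths, SMR) for observed counts against an expected baseline."""
    observed = np.asarray(observed, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    return observed - baseline, observed / baseline

# Hypothetical death counts for the same calendar week in 2015-2019 (illustrative only)
past_years = np.array([100.0, 110.0, 90.0, 105.0, 95.0])
baseline = past_years.mean()  # simple five-year average baseline

# Hypothetical observed count for that week in 2020
excess, smr = excess_and_smr(150.0, baseline)
```

With these invented numbers the baseline is 100 deaths, so 150 observed deaths correspond to 50 excess deaths and an SMR of 1.5.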

This study explores another central issue for excess mortality estimation: how can we detect increases in mortality at the local level or across multiple age groups, where the expected number of baseline deaths in each subpopulation of interest is relatively low? This question addresses a tension central to many forms of public health surveillance, where the imperative to identify clustering and high-risk subgroups in health surveillance data must be balanced against reduced study power and possible biases associated with small samples¹⁷. Any suitable approach for estimating small-group excess mortality must quantify uncertainty due to stochastic variation as well as limited data informing the baseline. Even at the national level, weekly estimates of COVID-19 excess mortality presented without uncertainty intervals can leave viewers confused about what constitutes a meaningful departure from the baseline.

To better understand the relationship between registered COVID-19 deaths and excess mortality across Italy, we developed a model to estimate excess mortality by age group, sex, and week across the country’s 107 provinces. Our approach estimated counterfactual baseline mortality rates for each of these groups from March through August 2020, based on mortality rates and predictive covariates observed from January 2015 through February 2020. Our baseline mortality model combined elements from a widely-used Poisson generalized linear modeling (GLM) strategy for mortality estimation⁹; a structured province-year-age random effect that draws power from local correlations in mortality across those three dimensions, based on disease mapping principles¹⁸; and a Fourier curve-fitting method to capture seasonal trends in age-specific all-cause mortality⁷. Once baseline mortality was estimated for each week and subpopulation in the study period, we compared these estimates to observed counts from vital records over the same period while preserving uncertainty in the expected baseline mortality.

This study contributes two methodological advancements to existing literature on excess mortality analysis and mortality mapping. First, this is the first study to our knowledge to apply disease mapping methods to estimate excess mortality. Second, whereas previous studies have applied Fourier curve-fitting methods to estimate seasonality among large populations at the national and state levels, this study combines Fourier seasonality analysis with a space–time smoothing model to estimate seasonality of mortality among small populations. We apply these modeling methods to estimate excess mortality and SMRs across Italy at an unprecedented level of detail. Here, we present major findings on excess deaths (the difference between observed and baseline death counts) as well as SMRs (the ratio between observed and expected deaths) across subpopulations of Italy.

Results

Excess mortality exceeds registered COVID-19 deaths

At the national level, the aggregated results from our model generally agreed with previous national studies on the timing and magnitude of excess deaths associated with the COVID-19 pandemic across Italy. Figure 2 shows the estimated weekly death count (in brown) as well as the observed weekly death count (in black) from 26 February through 31 August 2020 across Italy. In line with other studies, the model results suggest that excess mortality peaked on the week of March 25, on the same week as the peak in registered COVID-19 deaths, and then consistently fell until approximately returning to baseline by the end of May 2020, coinciding with the lifting of most lockdown measures across Italy². At no point between June and August 2020 did the observed death count exceed the upper bound of the 95% uncertainty interval for baseline deaths: as a result, the remainder of this section will focus on excess mortality during the 13-week period from 26 February through 26 May 2020. During these 13 weeks, we estimate that 53,200 excess deaths (95% uncertainty interval 26,500–79,700) were associated with the COVID-19 pandemic, compared with 35,500 deaths registered with COVID-19 as the underlying cause during the same period.

Excess deaths by age and sex

Our findings can be aggregated across sexes and provinces to understand the age structure of excess mortality across Italy. Figure 3 plots weekly excess deaths across the five modeled age categories, while Table 1 lists estimated excess deaths by age and sex grouping at the national level. Excess deaths were overwhelmingly concentrated in older age groups: of the estimated 53,200 excess deaths in Italy from March through May 2020, an estimated 51,600, or 97.0%, of these excess deaths occurred among adults aged 60 and above, while 36,400, or 68%, occurred in adults older than age 80. Among adults aged 90 and above, women experienced 11,000 (5800–16,400) excess deaths, more than double the 4400 (2000–6900) excess deaths among men in the same age group; this reflects the sex composition of the oldest age group, where women made up 72.6% of the over-90 population in January 2020.

Table 1 Estimated excess mortality by age and sex grouping, aggregated to the national level for the weeks of 26 February through 26 May 2020.

From March through May 2020, 18 province-age groupings observed significantly fewer deaths than the expected baseline. Of these significantly lower mortality observations, 11 were among the 0–59 age group. These negative excess deaths were included in calculations of total excess mortality nationwide, but do not significantly change the national totals or the age distribution of excess mortality. Tables summarizing excess mortality by age and province grouping have been provided in Supplementary Appendix S1.

The outsized proportion of excess deaths observed within older age categories reflects two aspects of age-structured mortality across Italy during this period: baseline mortality is highest in older age groups even in normal years, and these age groups also experienced a larger relative increase in mortality during the study period. Figure 4 maps the standardized mortality ratio, or the ratio between observed and baseline mortality rates, by age group and province across the entire 13-week period from 26 February through 26 May 2020. Note that the standardized mortality ratio uniformly increases across the northern provinces of Italy when moving from the 0–59 age group to the 60–69 and 70–79 age groups, and remains heightened in the oldest age groups, with some provinces in Lombardy and Emilia-Romagna experiencing over three times the expected baseline mortality in older age groups. The greater increase in excess mortality among older age groups can also be expressed by comparing each age group’s share of excess mortality with that age group’s share of all-cause mortality in the years 2015 through 2019. Adults over age 60 accounted for 92.6% of all deaths in the years 2015–2019 (2.99 million deaths out of 3.23 million total), but 97.0% of all excess mortality in March through May of 2020 (51,600 of 53,200 excess deaths). Similarly, adults over age 80 experienced 63.4% of all deaths in previous years (2.04 million deaths out of 3.23 million), but 68.4% of excess mortality (36,400 of 53,200 excess deaths).

Spatial concentration

This model is also able to characterize spatial variation in excess mortality during the first wave of the COVID-19 epidemic in Italy. From March through May 2020, just three provinces—Milan, Bergamo, and Brescia, all in the Lombardy region—accounted for 32% of all excess deaths in Italy, with just 9% of the country’s total population. By including just four other provinces across the Piedmont, Emilia-Romagna, and Liguria regions in northern Italy, nearly one-half of all excess deaths are captured in a region that makes up just 16% of Italy’s population. Expanding further, the 22 provinces with the highest number of excess deaths account for three-quarters of all excess mortality, but just 30% of Italy’s national population. Figure 5 shows the locations of these province groupings and each group’s marginal contribution to the total excess mortality curve from March through May 2020.

Diverse experiences of the first wave

Comparing Figs. 2, 4, and 5, both the timing and geographical distribution of excess mortality during March through May 2020 seem to follow the pattern of registered COVID-19 mortality by region in that same period: both peaked during the week of 25 March and were concentrated in the northern regions of the country, particularly Lombardy. Figure 6 complicates this analysis by demonstrating two features of excess mortality that are apparent at the province level, but not the regional or national levels. The first feature concerns the geographic center of peak excess mortality during the COVID-19 first wave. While most news outlets and previous studies have identified the Lombardy region as the center of the COVID-19 outbreak, the top-right panel of Fig. 6 suggests that the mortality rate increased most in provinces on the border between southwest Lombardy and northwest Emilia-Romagna, adjacent to the Piedmont and Liguria regions. This border effect suggests that the 107 provinces of Italy may be a more informative unit of analysis for both registered COVID-19 deaths and excess mortality than the country’s 20 regions.

A province-level analysis also demonstrates that within high-burden regions, neighboring provinces experienced varying patterns of excess mortality during the first wave of COVID-19. The bottom two panels of Fig. 6 show excess mortality curves for Lodi and Varese, two provinces that both share a border with Milan. Excess mortality in Lodi peaked on the week of March 11, with an SMR of over 6 times the baseline mortality rate; meanwhile, excess mortality in Varese peaked a month later, on the week of 8 April, with mortality for that week peaking at over double the baseline rate.

Discussion

In this study, we explored whether an application of small-area methods to excess mortality analysis could identify previously unreported trends in the pattern of excess mortality across Italy during the first wave of the COVID-19 pandemic. Our findings suggest that a small-area model yields estimates of excess mortality that are consistent with alternative calculation strategies at the national level, while offering new insights into the uneven distribution of excess mortality by age group, sex, province, and week across the country. Excess mortality estimates generated by this model suggest that a disproportionate majority of excess deaths occurred in adults age 60 and older, due to both the higher level of baseline mortality in these age groups and higher elevation of mortality above baseline during the first three months of the COVID-19 epidemic. This analysis also revealed a highly uneven spatial distribution of excess mortality: half of excess deaths were contained within just seven of Italy’s 107 provinces, accounting for less than 16% of the population. While the general nationwide pattern of excess mortality reflected the timing and geographical concentration of registered COVID-19 deaths, regional analyses obscure meaningfully different excess mortality trends across neighboring provinces within a region.

This study extends regression-based methods for estimating age-structured baseline mortality by incorporating location, year, and week structured random effects within a Bayesian hierarchical framework^7,19. This method was found to substantially outperform a simpler approach, in which average death counts across past years are used as the baseline, for predicting weekly excess mortality. It appears that a structured space–time approach stabilizes stochastic variation across relatively small death counts by province and week, producing a smoother mortality risk surface while still accounting for meaningful trends captured by covariates. Because this approach structures uncertainty in a way that allows for principled aggregation, the results can indicate high-risk subgroups by age or location, and identify local variation in excess mortality that might be masked at a less detailed level, without overstating the confidence of findings for individual subpopulations.

This method for measuring excess mortality also has several limitations that should be noted. Because the process for estimating baseline deaths is more complicated and requires additional inputs compared to a simpler averaging method, it is less accessible to a wide range of users. The model for estimating baseline mortality assumes the same relationship between each covariate and mortality across age groups. In reality, some covariates may have a differential effect by age—for example, temperature may have more of an impact on mortality in older age groups due to the greater prevalence of risk factors that inhibit the body’s thermoregulatory response^20,21. This limitation is partly addressed by the separate harmonic seasonality fits for each age group. This study is also limited to the set of covariates which can be estimated by province and year: other covariates that may be predictive of all-cause mortality, such as the prevalence of environmental and occupational risk factors, were excluded due to lack of availability at the province level. While the population groupings reported in this study could be divided into even more granular units, any small-area investigation must protect the privacy rights of individuals²². Finally, as described in the Introduction, findings from excess mortality analyses must be carefully interpreted due to the many possible sources for changing mortality which are not accounted for in the modeling strategy.

Additional mechanisms for public health monitoring are needed to catch future resurgences of COVID-19 and future epidemics. In the context of high-income countries such as Italy, where high-quality mortality data has been rapidly prepared and cleaned for public use, this approach to small-area excess mortality analysis could be employed as a routine surveillance tool, allowing health officials to identify high-mortality subgroups in a population and to introduce intervention measures in a timely manner. This approach could also be applied to assess patterns of excess all-cause mortality across countries such as Brazil, Mexico, and Colombia, which maintain nearly complete vital registration systems²³, but where increases in COVID-19 mortality may have outstripped diagnostic capacity early in the pandemic^24,25. Combining this excess mortality data with cause-of-death information by province would also reveal new insights about the local drivers of excess mortality. We hope that this study provides a new avenue to convert excess mortality analysis into a tool for decision-making in public health.

Methods

Overview

We fit a spatially-explicit hierarchical model with fixed effects by age group and for seven covariates, correlated age-province-year structured random effects, and harmonic curves capturing seasonal variation for each age group and province. Separate models were fit for each sex. We fit this model using mortality and population data from 1 January 2015 through 25 February 2020, then generated 1000 predictive samples of the baseline mortality rate for each sex, age group, and province for the weeks of 26 February through 31 August 2020. For each of the 1000 sampled draws, we compared the baseline mortality rate with the observed mortality rate to estimate a Standardized Mortality Ratio, and compared the predicted baseline deaths with observed deaths to estimate excess deaths from all causes.

All methods were carried out in accordance with the relevant guidelines and regulations governing the use of public data sources. The code used to produce this model can be accessed online at https://github.com/njhenry/covidemr.

Data

All-cause mortality and population data were downloaded from Istat, the Italian National Institute of Statistics. As of 22 October 2020, complete mortality data covering all provinces and municipalities of Italy over the time period 1 January 2015 through 31 August 2020 was available for download from Istat²⁶. The number of deaths over this time period was recorded by year, month, day, Italian municipality, sex, and five-year age group. For the purposes of analysis, these observations were aggregated by sex, Italian province, age group, and week of the year. The five age groups used in this analysis were 0–59 years, 60–69 years, 70–79 years, 80–89 years, and 90+ years of age. These age groups were chosen based on the prior knowledge that the large majority of both all-cause mortality and registered COVID-19 deaths occurred among adults aged 60 and above. Weeks of the year were assigned based on the numeric day of the year, where January 1st of each year was assigned as the first day of the first week. The 365th and 366th days of the year were assigned to week 52, with the hierarchical model adjusting for observed weeks with more than seven days.
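The week-assignment rule described above can be sketched in a few lines. The function names `week_of_year` and `days_in_week` are hypothetical, and the text does not state how the model adjusts for the longer final week, so the day count returned here is only one possible input to such an adjustment.

```python
from datetime import date

def week_of_year(d: date) -> int:
    """Week index with 1 January as day 1 of week 1; days 365 and 366 fold into week 52."""
    day_of_year = d.timetuple().tm_yday
    return min((day_of_year - 1) // 7 + 1, 52)

def days_in_week(week: int, year: int) -> int:
    """Number of calendar days assigned to a given week under the rule above."""
    if week < 52:
        return 7
    days_in_year = date(year, 12, 31).timetuple().tm_yday
    return days_in_year - 51 * 7  # 8 days in common years, 9 in leap years
```

Under this rule, 26 February 2020 is the first day of week 9, which matches the start of the prediction period used in this study.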

Population data by sex, age, and province for the years 2015 through 2020 was downloaded from the Istat web data portal²⁷. Population counts were aggregated by sex, province, year, and the five age groups listed above.

We downloaded or extracted data for each of seven covariates, listed below in Table 2. Covariates were selected based on previous evidence of association between the covariate and all-cause mortality in a high-income context. Further information supporting the inclusion of each covariate is included in Supplementary Appendix S1. After extraction, all covariates were normalized and rescaled to have a mean of zero and a standard deviation of 1 across all data observations.
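The rescaling step can be illustrated as follows. This is a minimal sketch with made-up covariate values; it assumes the population standard deviation (`ddof=0`), which the text does not specify.

```python
import numpy as np

def standardize(values):
    """Rescale a covariate to mean 0 and standard deviation 1 across all observations."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()  # assumption: population SD (ddof=0)

# Hypothetical covariate values across observations
z = standardize([12.0, 15.0, 9.0, 20.0])
```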

Table 2 Covariates used to estimate baseline mortality by age group, sex, province, and week across Italy from January 2015 through August 2020. The source and space-time resolution of each covariate is listed.

Space–time model

To construct a mortality baseline for the months of March through August 2020 that incorporated multiple sources of uncertainty, we fit a small area model with age and covariate fixed effects, correlated province-year-age errors, and harmonic terms to capture seasonality within each age grouping and province. Because the age structure of mortality might differ by sex in Italy, separate models were fit for males and females. For a particular sex, the number of deaths in a given province $p$, age group $a$, year $t$, and week of the year $w$ was assumed to follow a Poisson distribution:

$${D}_{p,a,t,w}\sim Poisson({N}_{p,a,t,w}~{r}_{p,a,t,w})$$

In the formulation above, $D$ is the number of observed deaths, $N$ is the population, and $r$ is the underlying mortality rate per person-week. The quantity $r$ is then fit in log space to a space–time surface which varies by province, age, year, and week:

$$log({r}_{p,a,t,w})\sim \sum_{k=1}^{5}[{I}_{\alpha }~{\alpha }_{k}]+\vec{\beta} ~ {X}_{p,a,t,w}+{Z}_{p,a,t}+{f}_{p,a}(w)$$

The first two terms on the right-hand side of this equation capture age and covariate fixed effects, corresponding to a discrete-time proportional hazards model where the baseline hazard varies by age group^28,29. In this specification, ${\alpha }_{k}$ is the weekly baseline hazard for each of the five age groups, while ${I}_{\alpha }$ is an indicator variable that is 1 when the age group index of an observation is equal to $k$ and zero otherwise. Fixed effects for the covariate design matrix ${X}_{p,a,t,w}$ are denoted by $\vec{\beta}$, a vector of length seven. Together, these terms correspond with a multivariate regression approach to estimating baseline mortality¹⁹.
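To make the structure of the linear predictor concrete, the sketch below evaluates the log mortality rate and the resulting Poisson mean for a single observation. All numeric values are invented, the function names are hypothetical, and the random effect $Z$ and seasonal term $f(w)$ are passed in as plain numbers rather than estimated.

```python
import numpy as np

def log_mortality_rate(age_group, alpha, beta, x, z=0.0, f_w=0.0):
    """log(r) = alpha[age_group] + beta . x + Z + f(w) for one observation.

    The indicator sum in the model simply selects one age-group intercept,
    which is written here as zero-based indexing into alpha.
    """
    return alpha[age_group] + float(np.dot(beta, x)) + z + f_w

def expected_deaths(population, log_rate):
    """Poisson mean N * r for a population of size N."""
    return population * np.exp(log_rate)

# Hypothetical values: five age-group intercepts and seven covariate effects
alpha = np.log(np.array([2e-5, 2e-4, 5e-4, 1.5e-3, 5e-3]))
beta = np.zeros(7)
x = np.zeros(7)  # standardized covariates at their mean

# Expected weekly deaths among 10,000 people in the oldest age group
mu = expected_deaths(10_000, log_mortality_rate(4, alpha, beta, x))
```

Because the model is linear on the log scale, a one-unit change in a standardized covariate multiplies the mortality rate by the exponential of its coefficient.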

The term ${Z}_{p,a,t}$ is a structured random effect that accounts for residual variation across provinces, age groups, and years that is not captured by the age or covariate fixed effects. $Z$ is structured as a Gaussian process with mean zero and covariance matrix $K$, where $K$ is a separable process across the dimensions of space, age, and time: $K={\Sigma }_{p}\otimes {\Sigma }_{a}\otimes {\Sigma }_{t}$. The spatial covariance structure ${\Sigma }_{p}$ corresponds to a conditional autoregressive (CAR) process in space³⁰, while the age and temporal covariance structures both correspond to discrete autoregressive processes of order 1. Separable covariance structures have been widely used in the fields of ecology and public health to construct models across space, time, and other dimensions^31,32, and have been found to fit a wide variety of space–time covariance structures³³.
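The separable covariance $K={\Sigma }_{p}\otimes {\Sigma }_{a}\otimes {\Sigma }_{t}$ can be assembled with Kronecker products, as in the sketch below. Several choices here are assumptions made for illustration and are not stated in the text: the CAR process is taken to be a proper CAR with precision $D-\rho W$, the AR(1) processes are stationary with unit innovation variance, and the adjacency matrix and correlation parameters are invented.

```python
import numpy as np

def ar1_cov(n, rho, sigma=1.0):
    """Covariance of a stationary AR(1) process observed at n consecutive steps."""
    idx = np.arange(n)
    lags = np.abs(idx[:, None] - idx[None, :])
    return sigma**2 / (1.0 - rho**2) * rho**lags

def car_cov(adjacency, rho):
    """Covariance of a proper CAR process with precision D - rho * W (assumed form)."""
    W = np.asarray(adjacency, dtype=float)
    D = np.diag(W.sum(axis=1))  # number of neighbors on the diagonal
    return np.linalg.inv(D - rho * W)

# Toy example: three provinces in a line, five age groups, six years
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
sigma_p = car_cov(W, rho=0.9)
sigma_a = ar1_cov(5, rho=0.8)
sigma_t = ar1_cov(6, rho=0.5)

# K = Sigma_p (x) Sigma_a (x) Sigma_t, with year varying fastest
K = np.kron(np.kron(sigma_p, sigma_a), sigma_t)
```

The full matrix has one row per province-age-year combination (90 in this toy case). In practice, separable models are usually fit by working with the much smaller factor matrices rather than forming $K$ explicitly.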

The term ${f}_{p,a}(w)$ refers to a set of harmonic functions that are fit to account for weekly variation in mortality not captured by covariates. A separate function is fit for each age group and province to account for the fact that seasonal variation in mortality may be driven by different factors across space and by age group. Each function is tuned to fit the parameters $A$ and $B$ to the following harmonics:

$${f}_{p,a}(w)=\sum_{j=1}^{2} \left[{A}_{p,a,j}~sin\left(\frac{2\pi jw}{52}\right)+{B}_{p,a,j}~cos\left(\frac{2\pi jw}{52}\right)\right]$$

This harmonic series, which adapts principles from Fourier analysis, is the basis for a classic model for predicting seasonality in flu mortality developed by Robert Serfling⁷. In Serfling’s original formulation as well as more recent excess mortality papers, seasonality was fit using two Fourier terms^8,34. We performed five-fold cross-validation to identify the best grouping variables and number of harmonic terms for seasonal curve fits. Based on the metrics of out-of-sample mean squared error and coverage, we found that the model performed best when seasonal curves were fit separately by province and age group, using two Fourier terms.
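The harmonic terms can be written as a small design matrix with one sine and one cosine column per Fourier term. The sketch below builds that basis and recovers invented coefficients by ordinary least squares; in the study itself the coefficients are estimated jointly with the rest of the hierarchical model, so the least-squares step is only a stand-in, and `harmonic_basis` is a hypothetical name.

```python
import numpy as np

def harmonic_basis(weeks, n_terms=2, period=52):
    """Columns sin(2*pi*j*w/period), cos(2*pi*j*w/period) for j = 1..n_terms."""
    w = np.asarray(weeks, dtype=float)
    columns = []
    for j in range(1, n_terms + 1):
        columns.append(np.sin(2 * np.pi * j * w / period))
        columns.append(np.cos(2 * np.pi * j * w / period))
    return np.column_stack(columns)

weeks = np.arange(1, 53)
basis = harmonic_basis(weeks)

# Hypothetical coefficients in the order (A_1, B_1, A_2, B_2)
true_coef = np.array([0.10, 0.20, -0.05, 0.03])
seasonal_signal = basis @ true_coef

# Stand-in fit: ordinary least squares on a noise-free signal
coef, *_ = np.linalg.lstsq(basis, seasonal_signal, rcond=None)
```

The first pair of columns describes an annual cycle and the second pair a semi-annual cycle, which together allow asymmetric seasonal shapes such as a sharp winter peak.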

We assigned priors to all model parameters and then fit the model using the Laplace approximation for mixed-effect parameter estimation^35,36. The model was fit in R v.4.0.3 using the package Template Model Builder v.1.7.18^35,37.

Compiling and interpreting results

Using the maximum a posteriori predictions and joint precision matrix for all parameters, we generated 1000 samples for all model parameters using a multivariate-normal approximation of the posterior predictive distribution. These parameter samples were then entered into the original model to construct 1000 draws or “candidate maps” estimating the mortality rate across all provinces, age groups, and weeks in the study period³⁸. Although the model was fit to data from 1 January 2015 through 25 February 2020, the fitted parameter fixed effects, random effects, and seasonality terms could all be applied forward to estimate 1000 draws of predicted baseline mortality from 26 February through 31 August 2020. All subsequent calculations were performed across draws to preserve the correlation structure within draws as well as the model uncertainty across draws.
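Drawing parameter samples from a multivariate normal defined by a mean vector and a precision matrix can be sketched as follows. This is a generic dense-matrix illustration with invented values; it is not the study's implementation, and a model with many parameters would typically rely on sparse matrix routines instead.

```python
import numpy as np

def sample_from_precision(mean, precision, n_draws, rng):
    """Draw samples from N(mean, precision^{-1}) using a Cholesky factor of the precision."""
    mean = np.asarray(mean, dtype=float)
    L = np.linalg.cholesky(precision)            # precision = L @ L.T
    z = rng.standard_normal((mean.size, n_draws))
    # Solving L.T x = z gives x with covariance (L @ L.T)^{-1} = precision^{-1}
    x = np.linalg.solve(L.T, z)
    return (mean[:, None] + x).T                 # shape (n_draws, n_parameters)

# Hypothetical two-parameter example
rng = np.random.default_rng(0)
mean = np.array([1.0, -2.0])
precision = np.array([[2.0, 0.5],
                      [0.5, 1.0]])
draws = sample_from_precision(mean, precision, 1000, rng)
```

Each row of `draws` is one joint parameter sample, so pushing a row through the model yields one internally consistent set of predictions.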

We compared the distribution of predicted mortality rates with observed mortality rates, calculated as observed deaths divided by population, to calculate 1000 draws of standardized mortality ratios (SMRs) for each province-age-sex-year-week grouping $g$ using the following formula:

$$\mathrm{SMR}_{g,draw}=\frac{\text{Observed Deaths}_{g,draw}}{\text{Predicted Deaths}_{g,draw}}$$

We also multiplied the predicted mortality rates by the population in each province-age-sex-year grouping to calculate predicted baseline death counts for each draw. We then calculated 1000 draws of excess deaths for each grouping:

$$\text{Excess Deaths}_{g,draw}=\text{Observed Deaths}_{g,draw}-\text{Predicted Deaths}_{g,draw}$$

In the Results section, draws for predicted mortality, SMRs, and excess deaths are summarized using the mean and 95% uncertainty interval bounds. The 95% uncertainty interval is reported as the 2.5th percentile and 97.5th percentile of values across 1000 draws.
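The draw-level calculations described in this section can be sketched as below. The observed counts, populations, and baseline rate draws are all invented, and the baseline draws are generated independently per group purely for illustration; in the fitted model they would come from joint parameter samples and be correlated across groups.

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_draws = 3, 1000

# Hypothetical inputs: observed deaths and population per group, plus
# baseline mortality-rate draws with shape (n_groups, n_draws)
observed = np.array([30.0, 12.0, 55.0])
population = np.array([50_000.0, 20_000.0, 80_000.0])
rate_draws = np.exp(rng.normal(np.log(4e-4), 0.1, size=(n_groups, n_draws)))

predicted = rate_draws * population[:, None]   # baseline deaths per group and draw
smr = observed[:, None] / predicted            # SMR per group and draw
excess = observed[:, None] - predicted         # excess deaths per group and draw

def summarize(draws):
    """Mean and 95% uncertainty interval (2.5th, 97.5th percentiles) over the last axis."""
    lower, upper = np.percentile(draws, [2.5, 97.5], axis=-1)
    return draws.mean(axis=-1), lower, upper

# Aggregate within each draw first, then summarize across draws
total_mean, total_lower, total_upper = summarize(excess.sum(axis=0))
```

Summing within each draw before taking percentiles is what preserves the correlation structure: the interval for an aggregate is not the sum of the group-level interval bounds.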

Model validation

We used five-fold cross validation to compare predictive performance across multiple model specifications and to compare predictive performance with simpler models for calculating excess mortality. Each fold was created by fitting the model without data from the weeks in March through December for each of the years 2015 through 2019, then comparing predicted values for the held out weeks with the observed values. This holdout strategy mirrors the process we hope to capture in the months of March through August 2020 in the counterfactual where COVID-19 did not change the pattern of mortality across Italy.

Because the expected number of deaths in a given province-age-sex-year-week grouping can be very low, particularly in lower age groups, we aggregated all out-of-sample observations across four-week intervals while preserving the other groupings. We then calculated the difference between the out-of-sample recorded deaths and the modeled mortality, and calculated summary metrics: root mean squared error, coverage of the 95% uncertainty intervals, and relative squared error when compared to a simpler model that uses the average mortality rate across all other years.
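The three summary metrics can be computed as in the sketch below. The definition of relative squared error used here (the model's sum of squared errors divided by that of the reference method) is an assumption, since the exact formula is given in the supplementary material rather than in this section; the numbers are invented.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    observed, predicted = np.asarray(observed, float), np.asarray(predicted, float)
    return np.sqrt(np.mean((observed - predicted) ** 2))

def coverage(observed, lower, upper):
    """Share of observations falling inside their uncertainty interval."""
    observed = np.asarray(observed, float)
    return np.mean((observed >= np.asarray(lower)) & (observed <= np.asarray(upper)))

def relative_squared_error(observed, predicted, reference):
    """Model squared error relative to a reference method (assumed definition)."""
    observed = np.asarray(observed, float)
    model_sse = np.sum((observed - np.asarray(predicted, float)) ** 2)
    reference_sse = np.sum((observed - np.asarray(reference, float)) ** 2)
    return model_sse / reference_sse

# Hypothetical held-out mortality rates, model predictions, and averaging baseline
obs = np.array([1.0, 2.0, 3.0])
pred = np.array([1.0, 2.0, 4.0])
ref = np.array([2.0, 2.0, 2.0])
```

Under this definition, a relative squared error below 1 means the model's predictions were closer to the held-out data than the reference method's.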

We found that the out-of-sample root mean squared error for the best-performing model was 2.32E−5, compared to an average weekly mortality rate of 2.05E−4 across all age groups, suggesting a reasonably good fit for the model’s mean estimates. The out-of-sample relative squared error was 0.330 compared to the simple method of averaging weekly values across other years, suggesting that this predictive model substantially outperformed the simpler alternative for the years 2015–2019 even when an entire year of data was held out. The in-sample relative squared error compared to the simpler averaging method was 0.273, a much lower ratio of error, which indicates that the model provides a more flexible fit to the data than the simpler averaging strategy. The out-of-sample coverage of the 95% uncertainty interval was 99.1%, indicating that the predicted uncertainty bounds are conservative. The procedure for out-of-sample validation and results are discussed in more detail in Supplementary Appendix S1.

Visualization

All figures and maps in this study were created using the ggplot2 package in R v.4.0.3^37,39.

Data availability

All data sources used in this analysis are publicly available online and are linked in Supplementary Appendix S1. The code repository accompanying this paper, available online at https://github.com/njhenry/covidemr, contains detailed instructions for downloading and formatting each data source.

References

1. Sebastiani, G., Massa, M. & Riboli, E. Covid-19 epidemic in Italy: Evolution, projections and impact of government measures. Eur. J. Epidemiol. 35, 341–345 (2020).
2. Alicandro, G., Remuzzi, G. & La Vecchia, C. Italy’s first wave of the COVID-19 pandemic has ended: No excess mortality in May, 2020. The Lancet 396, e27–e28 (2020).
3. Institute for Health Metrics and Evaluation (IHME). COVID-19 Mortality, Infection, Testing, Hospital Resource Use, and Social Distancing Projections. (2020).
4. La Maestra, S., Abbondandolo, A. & De Flora, S. Epidemiological trends of COVID-19 epidemic in Italy over March 2020: From 1,000 to 100,000 cases. J. Med. Virol. 92, 1956–1961 (2020).
5. Albitar, O., Ballouze, R., Ooi, J. P. & Sheikh Ghadzi, S. M. Risk factors for mortality among COVID-19 patients. Diabetes Res. Clin. Pract. 166, 108293 (2020).
6. Mannucci, E., Nreu, B. & Monami, M. Factors associated with increased all-cause mortality during the COVID-19 pandemic in Italy. Int. J. Infect. Dis. 98, 121–124 (2020).
7. Serfling, R. E. Methods for current statistical analysis of excess pneumonia-influenza deaths. Public Health Reports (1896–1970) 78, 494 (1963).
8. Weinberger, D. M. et al. Estimation of excess deaths associated with the COVID-19 pandemic in the United States, March to May 2020. JAMA Intern. Med. 06520, E1–E9 (2020).
9. Noufaily, A. et al. An improved algorithm for outbreak detection in multiple surveillance systems. Stat. Med. 32, 1206–1222 (2013).
10. Michelozzi, P. et al. Temporal dynamics in total excess mortality and COVID-19 deaths in Italian cities. BMC Public Health 20, 1238 (2020).
11. Dickman, P. W., Sloggett, A., Hills, M. & Hakulinen, T. Regression models for relative survival. Stat. Med. 23, 51–64 (2004).
12. Lambert, P. C., Smith, L. K., Jones, D. R. & Botha, J. L. Additive and multiplicative covariate regression models for relative survival incorporating fractional polynomials for time-dependent effects. Stat. Med. 24, 3871–3885 (2005).
13. Wu, J., McCann, A., Katz, J., Peltier, E. & Singh, K. D. Tracking the true toll of the coronavirus outbreak. (2020).
14. The Economist. COVID-19 Data: Tracking COVID-19 excess deaths across countries. (2020).
15. FT Visual and Data Journalism team. FT Coronavirus Tracker. (2020).
16. U.S. National Center for Health Statistics. Excess Deaths Associated with COVID-19. (2021).
17. Thacker, S. B. & Berkelman, R. L. Public health surveillance in the United States. Epidemiol. Rev. 10, 164–190 (1988).
18. Banerjee, S., Carlin, B. P. & Gelfand, A. E. Hierarchical modeling and analysis for spatial data. (CRC Press, 2014).
19. Ederer, F., Axtell, L. M. & Cutler, S. J. The relative survival rate: A statistical methodology. Natl. Cancer Inst. Monogr. 6, 101–121 (1961).
20. Yu, W., Vaneckova, P., Mengersen, K., Pan, X. & Tong, S. Is the association between temperature and mortality modified by age, gender and socio-economic status? Sci. Total Environ. 408, 3513–3518 (2010).
21. Stafoggia, M. et al. Factors affecting in-hospital heat-related mortality: A multi-city case-crossover analysis. J. Epidemiol. Community Health 62, 209–215 (2008).
22. Bayer, R. & Fairchild, A. L. Surveillance and privacy. Science 290, 1898–1899 (2000).
23. GBD 2019 Demographics Collaborators. Global age-sex-specific fertility, mortality, healthy life expectancy (HALE), and population estimates in 204 countries and territories, 1950–2019: A comprehensive demographic analysis for the Global Burden of Disease Study 2019. Lancet 396, 1160–1203 (2020).
24. León Cabrera, J. M. & Kurmanaev, A. Ecuador’s death toll during outbreak is among the worst in the World. A8 (2020).
25. Burki, T. COVID-19 in Latin America. Lancet Infect. Dis. 20, 547–548 (2020).
26. Italian National Institute of Statistics (Istat). All-cause mortality by Italian municipality, January 2015 through August 2020 (release: 22 October 2020). (2020).
27. Italian National Institute of Statistics (Istat). Resident municipal population by age, sex and marital status. (2020).
28. Cox, D. R. Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodological) 34, 187–220 (1972).
29. Burstein, R. et al. Mapping 123 million neonatal, infant and child deaths between 2000 and 2017. Nature 574, 353–358 (2019).
30. Riebler, A. et al. An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Stat. Methods Med. Res. 25, 1145–1165 (2016).
31. Thorson, J. T. & Barnett, L. A. K. Comparing estimates of abundance trends and distribution shifts using single- and multispecies models of fishes and biogenic habitat. ICES J. Mar. Sci. 74, 1311–1321 (2017).
32. Wakefield, J. et al. Estimating under-five mortality in space and time in a developing world context. Stat. Methods Med. Res. 28, 2614–2634 (2019).
33. Huang, H. C., Martinez, F., Mateu, J. & Montes, F. Model comparison and selection for stationary space-time models. Comput. Stat. Data Anal. 51, 4577–4596 (2007).
34. Woolf, S. H., Chapman, D. A., Sabo, R. T., Weinberger, D. M. & Hill, L. Excess deaths from COVID-19 and other causes, March-April 2020. JAMA 324, 510 (2020).
35. Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H. & Bell, B. M. TMB: Automatic differentiation and Laplace approximation. J. Stat. Softw. 70, 5 (2016).
36. Thorson, J. T. & Kristensen, K. Implementing a generic method for bias correction in statistical models using random effects, with spatial and population dynamics examples. Fish. Res. 175, 66–74 (2016).
37. R Core Team. R: A Language and Environment for Statistical Computing. (2018).
38. Patil, A. P., Gething, P. W., Piel, F. B. & Hay, S. I. Bayesian geostatistics in health cartography: The perspective of malaria. Trends Parasitol. 27, 246–253 (2011).
39. Wickham, H. ggplot2: Elegant graphics for data analysis (Springer-Verlag, 2016).

Acknowledgements

The authors wish to thank the Italian National Institute of Statistics for their timely compilation and dissemination of complete vital registration data for Italy, 2015–2020, which enabled this investigation. A.E. would like to thank the Bill & Melinda Gates Foundation for funding his research (OPP1152978). C.E.M. is funded by the United Kingdom’s Department of Health and Social Care, the Fleming Fund, the Wellcome Trust (Grant numbers 209142/Z/17/Z and 221579/Z/20/Z), and the Bill & Melinda Gates Foundation (Grant number OPP1176062). The computational aspects of this research were supported by the Wellcome Trust Core Award Grant Number 203141/Z/16/Z and the NIHR Oxford BRC. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

Author information

Authors and Affiliations

Big Data Institute, Li Ka Shing Centre for Information Discovery, University of Oxford, Oxford, UK
Nathaniel J. Henry, Michael Give Chipeta & Catrin E. Moore
Telethon Kids Institute, Perth Children’s Hospital, Perth, WA, Australia
Ahmed Elagali
Asian School of the Environment, Nanyang Technological University, Singapore, 639798, Singapore
Michele Nguyen

Contributions

N.J.H. and C.E.M. conceived and planned the study. N.J.H. identified, extracted, and processed the input data. N.J.H. carried out the statistical analyses with assistance and input from A.E., M.N., and M.C. N.J.H. wrote the first draft of the manuscript with assistance from C.E.M. and A.E., and all authors contributed to subsequent revisions. All authors provided intellectual input into aspects of this study.

Corresponding author

Correspondence to Nathaniel J. Henry.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Henry, N.J., Elagali, A., Nguyen, M. et al. Variation in excess all-cause mortality by age, sex, and province during the first wave of the COVID-19 pandemic in Italy. Sci Rep 12, 1077 (2022). https://doi.org/10.1038/s41598-022-04993-7

Received: 16 August 2021
Accepted: 03 January 2022
Published: 20 January 2022
DOI: https://doi.org/10.1038/s41598-022-04993-7
