Date: 2022-12-14

Process

The process for producing the estimates of excess mortality consisted of three main steps. First, a Technical Advisory Group (TAG) was established to develop the methods used to estimate excess deaths associated with the COVID-19 pandemic in each country. Second, WHO member states were consulted on the estimates, input data sources and methods. Finally, feedback from the countries was incorporated into the modelling to update the estimates. The details of each step are described below.

In February 2021, the WHO, in collaboration with the United Nations Department of Economic and Social Affairs, formed the TAG on COVID-19 Mortality Assessment to advise on the development of analytical methods for estimating excess mortality in all countries. The TAG is composed of leading demographers, epidemiologists, economists, data and social scientists and statisticians from a range of backgrounds and geographies. A complete list of the TAG members is provided at the end of the paper. In addition to determining the levels and the age and sex distributions of the excess deaths associated with the COVID-19 pandemic, the expertise of the TAG has been leveraged to study the impact of the pandemic on broader areas such as inequality in COVID-19 mortality between and within countries, death registration and reporting systems, and how existing surveys and censuses can be used to fill in data gaps to quantify the impact of the pandemic. At the time of writing, this work is still ongoing.

In August 2021, a circular letter was sent to all WHO member states asking them to nominate focal points to take part in the country consultation. Member states were requested to review and provide feedback on the preliminary estimates of COVID-19 excess mortality and to submit additional data that may not have been previously available to WHO. The first round of the country consultation was conducted between October and November 2021 through WHO’s Country Portal, an online platform that facilitates data exchange between member states and WHO, on which the draft estimates and methodology for each country were made available to the designated national focal points. Countries that had not nominated a focal point were approached through their respective WHO country office or permanent mission in Geneva, Switzerland.

Between October 2021 and February 2022, a global technical consultation and two information sessions with member states were held to brief them on the progress and exchange views on the methodology. A series of regional webinars and technical consultations with individual countries were also organized for further discussion on input data, methods and estimates. By the end of March 2022, 140 countries (or 72% of the 194 member states) had participated in the country consultation, 65 had provided some data and 76 had provided feedback, which was then used to generate updated estimates. The revised estimates for a 24-month period from January 2020 to December 2021 were shared with the national focal points in March 2022.

The process of generating the estimates of excess mortality associated with the COVID-19 pandemic has followed the Guidelines for Accurate and Transparent Health Estimates Reporting23. In view of the fast-changing situation surrounding the pandemic, the excess mortality estimates will continue to be refined and revised as more data are identified and the methodology evolves over time.

Data

The specific countries from each region for which data have been gathered are listed in the Supplementary information and are shown in Extended Data Fig. 1. Estimates of the excess mortality associated with the COVID-19 pandemic require historical ACM data that can be used to generate the death numbers under a hypothetical non-COVID-19 scenario, as well as ACM data for the target years against which the counterfactual is contrasted to calculate the excess. In the absence of nationally representative data, subnational data can be used to estimate national totals.

Reported ACM data at the national level on a weekly or monthly basis are available for only a subset of countries. The data used in this study span multiple sources:

Data routinely shared with WHO as part of its standing agreement with member states as well as specifically provided to WHO in response to a data call for this project.

Data that have been reported by European countries to Eurostat according to the European Statistical System24.

Data that have been compiled for the Human Mortality Database as part of the Short-term Mortality Fluctuations project25,26.

Data that have been compiled in the World Mortality dataset13.

Additionally, annual level data for 2020 and/or 2021 were obtained from the national statistics offices of China27,28, Grenada29, Saint Kitts and Nevis30, Saint Vincent and the Grenadines31, Sri Lanka32 and Vietnam33.

The countries with current reported ACM generally have ACM data for the pre-pandemic period as well. For those without such historical data, the WHO GHE34 database was used. Using the annual historic mortality data, we forecast the expected ACM for 2020 and 2021 in these locations. The forecasting method is described in the section on the model for expected numbers.

In addition to the reported ACM data and the GHE estimates, the final component of the input data is a set of variables that can potentially be used as predictors of excess mortality in those countries and time periods without ACM data. The strategy for creating the covariate list was pragmatic: it focused on variables that have been found to be contextually important and that have been measured or estimated in the majority of countries. The predictors comprise both time-varying and time-invariant variables. The time-varying variables were the test positivity rate, temperature, confirmed COVID-19 death rate per 100,000 population (which is reported to the WHO), COVID-19 positive test rate per 100,000 population (from Our World in Data https://ourworldindata.org) and a variable constructed from a number of containment measures35. The COVID-19 death rate and the positive test rate are available for all member states for the entire period (https://covid19.who.int/). The time-invariant variables were a binary measure of the income level (low/middle versus high) and the historic diabetes prevalence and cardiovascular mortality rates as estimated by the Global Burden of Disease project9.

Subnational-level (states, provinces, cities, collections thereof and so on) data were obtained from various sources for Argentina36, India37,38,39,40,41,42, Indonesia43 and Turkey44.

Statistical models

We write the excess in country \(c\) at time \(t\) as

$${\delta }_{c,t}={Y}_{c,t}-{E}_{c,t},$$ (1)

where \({Y}_{c,t}\) is the realized ACM and \({E}_{c,t}\) is the ACM that would be expected in the absence of the pandemic. Even for countries with fully observed ACM during the pandemic the excess is a random quantity, because we do not know the counts \({E}_{c,t}\) that would have occurred in the absence of the pandemic—the latter is the result of a modelling exercise, which produces forecasted ACM, with associated uncertainty.
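To illustrate how uncertainty in \({E}_{c,t}\) carries through to the excess in equation (1), the following is a minimal sketch, not the fitted model. The monthly figures are hypothetical, and the draws of the expected count come from a gamma distribution chosen purely for illustration; the function names are introduced here and are not part of the original analysis.

```python
import random
import statistics

def excess_draws(observed, expected_draws):
    """One excess draw per draw of the expected count: delta = Y - E."""
    return [observed - e for e in expected_draws]

def interval(draws, lower=0.025, upper=0.975):
    """Empirical quantile interval from a list of draws."""
    ordered = sorted(draws)
    n = len(ordered)
    return ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]

rng = random.Random(0)

# Hypothetical month: 12,000 observed deaths, with an expected count that is
# uncertain around 10,000 (standard deviation 300). Both numbers are assumptions.
mean_e, sd_e = 10_000, 300
shape = (mean_e / sd_e) ** 2      # gamma shape giving the chosen mean and sd
scale = sd_e ** 2 / mean_e        # gamma scale
expected = [rng.gammavariate(shape, scale) for _ in range(5_000)]

delta = excess_draws(12_000, expected)
print(statistics.mean(delta), interval(delta))
```

Even though the observed count is fixed, the excess is summarized by a distribution with an interval, because each plausible value of the expected count implies a different excess.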

The major challenges for modelling are to form a coherent approach in the face of disparate data sources of varying degrees of quality and in different spatially and temporally aggregated forms. We constructed a model from first principles within a Bayesian inferential framework, and as a first step developed a framework in which we directly model the raw death counts (as opposed to derived quantities such as rates). Death is binary, and so must follow a Bernoulli distribution, and it is also statistically rare, and so the Bernoulli can be accurately approximated by a Poisson distribution. The advantage of the latter is that it is amenable to manipulation when one considers subsets of availability such as over space (when subnational data only are available) or over time (when annual counts only are available). In the Poisson model the variance equals the mean, which is restrictive as mortality data typically exhibit greater variability than the nominal variance. Hence, we use models that allow for such excess-Poisson variation.
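The quality of the Poisson approximation for rare events can be checked numerically. The sketch below compares the distribution of a sum of independent Bernoulli deaths (a binomial) with the matching Poisson distribution; the population size and death probability are hypothetical values chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k deaths among n people, each dying with probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mean):
    """Poisson probability of k deaths, computed on the log scale for stability."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

n, p = 10_000, 0.001   # hypothetical population and per-period death probability
mean = n * p           # 10 expected deaths

# Total variation distance, summed over the range that carries essentially
# all of the probability mass for a mean of 10.
tv = 0.5 * sum(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean)) for k in range(61))
print(tv)
```

The distance is small when the death probability is small, which is the sense in which the Poisson model is an accurate approximation. The practical gain is that sums of independent Poisson counts are again Poisson, so counts aggregated over regions or months stay within the same family.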

For modelling all countries of the world we need to consider various data situations. Although some countries have full data, others have annual or subnational data only, and for countries with no data we need to build a predictive model based on country-specific variables. Extended Data Fig. 2 shows the relationship between the different models, and how they feed into the excess calculation.

Model for expected numbers

For all countries and time points we model the expected numbers on the basis of historic data (for most countries, the period 2015–2019 was used for this modelling). We use a negative binomial model that allows for excess-Poisson variation. The historic annual trend in ACM is modelled using a spline model, and within-year variation using a seasonal spline model. A spline is a flexible approach to modelling that allows departures from a linear association45. A negative binomial model has two parameters: a mean (obtained from the spline components) and a scale parameter that accounts for excess-Poisson variation. The mean count for country \(c\) in month \(t\) is modelled as:

$${\text{Mean count}}_{c,t}=\exp \left({\text{annual trend}}_{c}+{\text{seasonal component}}_{c,t}\right)$$

where the annual trend uses a thin-plate spline and the seasonal component uses a cyclic cubic spline. After fitting to pre-pandemic data, we project the modelled trend forward to predict expected counts, by month, for 2020 and 2021. There is uncertainty in these predictions, which we incorporate into the excess mortality uncertainty intervals we produce.
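The structure exp(annual trend + seasonal component) can be illustrated with a deliberately simplified stand-in. The sketch below is not the spline model described above: it replaces the thin-plate spline with a straight line fitted by least squares to yearly means of log counts, replaces the cyclic cubic spline with one additive factor per calendar month, and ignores the negative binomial likelihood and all uncertainty. The function name and the synthetic data are assumptions made for illustration.

```python
import math

def fit_and_project(counts, n_future_years):
    """counts: list of years (at least two), each a list of 12 monthly deaths.
    Returns projected monthly mean counts for the following years."""
    n_years = len(counts)
    logs = [[math.log(c) for c in year] for year in counts]
    year_means = [sum(y) / 12 for y in logs]

    # Seasonal component: average deviation of each calendar month
    # from the mean log count of its own year.
    seasonal = [
        sum(logs[i][m] - year_means[i] for i in range(n_years)) / n_years
        for m in range(12)
    ]

    # Annual trend: least-squares line through the yearly mean log counts
    # (a stand-in for the thin-plate spline).
    xs = list(range(n_years))
    x_bar = sum(xs) / n_years
    y_bar = sum(year_means) / n_years
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, year_means)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar

    # Project forward: mean count = exp(trend + seasonal component).
    return [
        [math.exp(intercept + slope * (n_years + k) + seasonal[m]) for m in range(12)]
        for k in range(n_future_years)
    ]

# Synthetic five-year history with a 1% yearly log trend and a winter peak.
history = [
    [1000 * math.exp(0.01 * i + 0.1 * math.cos(2 * math.pi * m / 12)) for m in range(12)]
    for i in range(5)
]
projected = fit_and_project(history, 2)
print(projected[0][0])
```

With five historic years as input, the two projected years play the role of the expected monthly counts for 2020 and 2021.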

For some countries, we only have annual historic ACM data. For such countries we model within-year variation using temperature as a surrogate for seasonality. Full details of all modelling steps are given in Knutson et al.15.

Model for countries without full pandemic data

For almost half of the 194 WHO member states we do not have the ACM counts over the pandemic, and so must predict them using country characteristics. We choose a simple form for this prediction model, with mean

$$\text{E}\left[{Y}_{c,t}\mid {E}_{c,t}\right]={E}_{c,t}{\theta }_{c,t},$$ (2)

where \({\theta }_{c,t} > 0\) is a relative rate parameter. If \({\theta }_{c,t} > 1\), then for country \(c\) in month \(t\) the mortality is greater than expected, whereas if \({\theta }_{c,t} < 1\), the mortality is less than expected. We used \(G\) time-invariant variables, \({Z}_{{gc}}\) (these are annual values from 2019): a binary indicator of income level (high versus low or middle), the cardiovascular mortality rate in 2019 and the diabetes prevalence rate in 2019. In addition, we used \(B\) time-varying variables, \({X}_{bct}\): a containment variable (calculated using all ordinal containment and closure policy indicators and health-system policy indicators; for further details see Hale et al.14), the square root of the reported COVID-19 death rate, temperature and the COVID-19 test positivity rate. We then build a log-linear model for the rate parameter:

$$\log {\theta }_{c,t}=\underbrace{\alpha }_{\text{Intercept}}+\underbrace{\sum _{g=1}^{G}{\gamma }_{g}\,{Z}_{gc}}_{\text{Time-invariant contributions}}+\underbrace{\sum _{b=1}^{B}{\beta }_{bt}\,{X}_{bct}}_{\text{Time-varying contributions}}+\underbrace{{\epsilon }_{c,t}}_{\text{Excess-Poisson variation}}$$ (3)

where \({\rm{\exp }}\left({\gamma }_{g}\right)\) and \({\rm{\exp }}\left({\beta }_{{bt}}\right)\) denote relative rate parameters and \({{\epsilon }}_{c,t}\sim {\rm{N}}(0,{{\sigma }}_{{\epsilon }}^{2})\) are independent error contributions that pick up random variation unexplained by the log-linear regression function. The time-varying coefficients allow the associations to evolve during the pandemic. As we desire the evolution to be smooth in time, for these time-varying coefficients \({\beta }_{{bt}}\) we use a random walk of order 2 (RW2) prior that encourages smooth estimates46. In equation (2) above, we have conditioned on known expected numbers. In reality, and as just described, these are modelled to give a distribution over plausible values. The uncertainty in the expected predictions \({E}_{c,t}\) is well modelled by a gamma distribution, and the advantage of this choice is that it can be conveniently combined with a Poisson model to produce a negative binomial model with the log-linear mean given by equation (3). Full details (including evidence of the accuracy of the gamma model) can be found in Knutson et al.15.
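The combination of a gamma distribution for the expected count with a Poisson count model can be made explicit with a short derivation. The shape \(a\) and rate \(b\) below are notation introduced here for illustration (in practice they would vary with \(c\) and \(t\)) and are not taken from the original description. Suppose

$${Y}_{c,t}\mid {E}_{c,t},{\theta }_{c,t}\sim \text{Poisson}\left({E}_{c,t}{\theta }_{c,t}\right),\qquad {E}_{c,t}\sim \text{Gamma}\left(a,b\right).$$

Integrating over \({E}_{c,t}\) gives

$$\Pr \left({Y}_{c,t}=y\mid {\theta }_{c,t}\right)={\int }_{0}^{\infty }\frac{{\left(E{\theta }_{c,t}\right)}^{y}{e}^{-E{\theta }_{c,t}}}{y!}\,\frac{{b}^{a}{E}^{a-1}{e}^{-bE}}{\Gamma (a)}\,{\rm{d}}E=\frac{\Gamma (y+a)}{y!\,\Gamma (a)}{\left(\frac{b}{b+{\theta }_{c,t}}\right)}^{a}{\left(\frac{{\theta }_{c,t}}{b+{\theta }_{c,t}}\right)}^{y},$$

which is a negative binomial distribution. Its mean is \({\mu }_{c,t}=(a/b)\,{\theta }_{c,t}\), that is, the mean of the expected count multiplied by the relative rate, and its variance is \({\mu }_{c,t}\left(1+{\mu }_{c,t}/a\right)\), which exceeds the Poisson variance \({\mu }_{c,t}\) for any finite \(a\). Under these assumptions, uncertainty in the expected numbers therefore appears as excess-Poisson variation in the count model.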

This model was fitted to all countries with observed monthly ACM data over some portion of 2020–2021, using the integrated nested Laplace approximation method47, to obtain posterior distributions over the unknown parameters. The resultant posterior distribution reflects the uncertainty in the parameters (both in the expected numbers and the log-linear covariate model), and can be used to construct a predictive distribution for the ACM in countries with no data or partial data.

For some countries, only subnational data were available, and so we construct a model for the national ACM data using a proportionality assumption, expanding on previous work48. We describe the model in the context of India. We use ACM data from 17 states and union territories out of 36 (data from different numbers of states are available in different pandemic months) to infer the national total, under the assumption that the proportion of deaths in the states with available data remains approximately constant over time. For example, if a state historically accounts for 10% of deaths in India, one would predict a national death total of 10\(\times \) the observed number of deaths in that state only. Under the Poisson framework, this proportionality assumption yields a multinomial distribution for the fractions of deaths and we can predict the unknown national totals over the course of the pandemic after fitting the multinomial model.
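The proportionality assumption can be sketched as a simple scaling calculation. The code below produces a point estimate only; it does not fit the multinomial model or give uncertainty intervals, and the function name, state labels and counts are hypothetical.

```python
def national_total_from_states(observed, historical_by_state, historical_national):
    """Scale deaths observed in the reporting states up to a national total.

    observed: dict of state -> deaths observed in the pandemic month
    historical_by_state: dict of state -> historic deaths in that state
    historical_national: historic national deaths over the same period
    """
    # Share of national deaths historically contributed by the reporting states.
    share = sum(historical_by_state[s] for s in observed) / historical_national
    return sum(observed.values()) / share

# Hypothetical history: state "A" contributes 10% and state "B" 15% of deaths.
historical = {"A": 10_000, "B": 15_000}
national_history = 100_000

# Only state A reports this month: the observed count is multiplied by 10.
print(national_total_from_states({"A": 5_000}, historical, national_history))

# Both states report: the combined share is 25%, so the multiplier is 4.
print(national_total_from_states({"A": 5_000, "B": 7_500}, historical, national_history))
```

Recomputing the share from whichever states report in a given month is one way to handle the fact that data from different numbers of states are available in different pandemic months.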

Extensive model validation was carried out for both the countries with no data, and those with subnational data only. This included exercises in which we systematically removed all data for each country in turn, or we removed data for all countries for single months. We then predicted these removed data using the retained data and evaluated model performance using metrics such as bias and the coverage of prediction intervals. Results for these exercises can be found in the supplementary materials of Knutson et al.15. We emphasize that the model (3) is not used for countries with subnational data.

For other countries that have annual (but not monthly) national data during the pandemic, we lean on the fact that the distribution of Poisson monthly counts, given the annual count, is multinomial with probabilities that are the normalized rate parameters, that is,

$${p}_{c,t}=\frac{{E}_{c,t}{\theta }_{c,t}}{\sum _{t^{\prime} =1}^{12}{E}_{c,t^{\prime} }{\theta }_{c,t^{\prime} }},$$

where the rates \({\theta }_{c,t}\) are defined via the log-linear covariate model (3). This gives us a way to apportion the annual counts to the constituent months.
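As a minimal sketch of this apportioning step, the code below computes the multinomial probabilities \({p}_{c,t}\) from monthly expected counts and relative rates, then splits an annual count according to the multinomial mean. The input values are hypothetical, and in the full model the rates and expected counts are uncertain rather than fixed.

```python
def monthly_probabilities(expected, theta):
    """p_t = E_t * theta_t / sum over months of E_t' * theta_t'."""
    weights = [e * th for e, th in zip(expected, theta)]
    total = sum(weights)
    return [w / total for w in weights]

def apportion(annual_count, expected, theta):
    """Mean monthly counts implied by the multinomial, given the annual total."""
    return [annual_count * p for p in monthly_probabilities(expected, theta)]

# Hypothetical country: flat expected mortality, with elevated relative
# rates in the last three months of the year.
expected = [1_000.0] * 12
theta = [1.0] * 9 + [1.5, 2.0, 1.5]

probs = monthly_probabilities(expected, theta)
monthly = apportion(14_000, expected, theta)
print(probs[0], probs[10], monthly[10])
```

Months with a higher product of expected count and relative rate receive a larger share of the annual total, and the monthly values always sum back to the reported annual count.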

Our approach differs from those of the other two global endeavours, by the Institute for Health Metrics and Evaluation (IHME)49 and The Economist50. We have used a very conventional statistical modelling approach in which a parametric model is fitted using Bayesian inferential machinery, with the models for different data types being consistent with each other so that the country-by-country results are directly comparable. As an example, if the mortality counts in subnational regions are independent Poisson random variables, then their sum (the mortality in the country) is also Poisson. Further, given the total mortality in a country, the subnational counts follow a multinomial distribution. Our framework exploits these relationships when we formulate models for the situation in which we have subnational data only. Similarly, our annual model (for countries with such data only) is consistent with the monthly models we use for the majority of the countries. The IHME approach is unprincipled and not transparent and corresponds to a number of steps being bolted together, without a coherent model tying them together. Rather than using a direct count model based on a Poisson framework, the IHME approach models the log of the excess rate as a function of covariates, without any weighting, so that the population sizes of the different countries do not feed into the uncertainty calculation. A fundamental problem with the overall approach is that the uncertainty intervals are constructed in a non-standard and ad hoc way, so that they will not be accurate representations of the true uncertainty. The Economist approach models the excess rate with a flexible tree-based machine learning technique, gradient boosting. The approach is clearly described and uses a resampling technique, the bootstrap, to form interval estimates, but there is no theory to support the use of the bootstrap with boosting, and so again, the uncertainty intervals should be viewed sceptically. A full description and critique of the alternative methods are available in Knutson et al.15. In the Supplementary information, we provide a comparison between point and interval country estimates obtained by the methods of the WHO, IHME and The Economist.

P-scores

Recall that the P-score is defined as the ratio of the excess to the expected, expressed as a percentage. Mathematically, this corresponds to,

$${\text{PS}}_{c,t}=100\times \frac{{Y}_{c,t}-{E}_{c,t}}{{E}_{c,t}},$$

and \({\text{PS}}_{c,t}\ge -100,\) with zero deaths corresponding to –100, negative values corresponding to fewer deaths than expected and larger positive values corresponding to increasing levels of relative excess mortality. Under the model (2), we have \(\text{E}\left[{Y}_{c,t}\right]={E}_{c,t}{\theta }_{c,t}\) so that

$$\text{E}\left[{\text{PS}}_{c,t}\right]=100\times \left({\theta }_{c,t}-1\right).$$

For countries whose ACM is unobserved, the rate is modelled via the log-linear form (equation (3)) which gives a specific form to the manner in which we assume the P-score changes as a function of country-specific covariates.
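The P-score definition and its expectation under model (2) can be written as two short functions. This is a sketch with hypothetical counts; the function names are introduced here for illustration.

```python
import math

def p_score(observed, expected):
    """Excess as a percentage of the expected count."""
    return 100.0 * (observed - expected) / expected

def expected_p_score(theta):
    """Expected P-score under E[Y] = E * theta."""
    return 100.0 * (theta - 1.0)

print(p_score(1_200, 1_000))   # 20% more deaths than expected
print(p_score(0, 1_000))       # lower bound, reached with zero deaths
print(expected_p_score(1.2))   # a relative rate of 1.2 corresponds to about 20
```

Because the P-score is scaled by the expected count, it can be compared across countries of very different population sizes, which the raw excess cannot.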

Rankings

A natural, if sometimes unfortunate, inclination is to attempt to rank regions or countries in terms of the various metrics. Statistically, this is fraught with difficulties. The easiest approach, which is often followed in the media, is to simply rank on the basis of a point estimate of the metric, such as the mean or the median. The obvious problem with this approach is that the uncertainty in estimation is not accounted for. Using the Bayesian machinery that we use for inference we can account for the uncertainty probabilistically. In the simplest case of two countries, let \({X}_{1}\) and \({X}_{2}\) represent the excess rates in countries 1 and 2, respectively. We can then evaluate the (posterior) probability that \({X}_{1} > {X}_{2}\), and report this, rather than a binary statement that country 1 has a higher rate than country 2. The extension to multiple countries is immediate, as is the ability to calculate the probability of a higher rate in one country as compared to any collection of other countries.

For illustration of the issues of assessing rankings, we select six European countries that have overlap in their excess rate uncertainty (that is, posterior) distributions. In the left panel of Extended Data Fig. 3 we display posterior distributions for the excess rates of the six countries, ordered from top to bottom by highest to lowest median excess rate. There is clearly overlap in many of the distributions, but quantitative statements on the rankings require more than these plots. In the right panel of Extended Data Fig. 3 we present scatterplot representations of the bivariate probability distributions describing the relationships between pairs of countries. The red lines offer a reference by which we can evaluate the ranking probabilities (by calculating the fractions of points that are either side of the line). For example, the probabilities that the rate for Slovenia is greater than that of each of Italy, Estonia, Spain, the United Kingdom and Portugal are 0.546, 0.749, 0.988, 0.992 and 0.999, respectively. Even these plots do not give the complete picture as they are two-dimensional summaries of a six-dimensional object (the probability distribution over the six rates). We can provide other summaries, for example, the probability that the rate in Slovenia is greater than the rates in all of the other five countries is 0.479.
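The ranking probabilities described here are fractions of joint posterior draws. A minimal sketch follows, assuming the draws are stored as equal-length lists in which position \(i\) in every list belongs to the same joint draw; the country labels and the four draws are invented for illustration and are far too few for real use.

```python
def prob_greater(draws_a, draws_b):
    """Fraction of joint draws in which the rate for a exceeds the rate for b."""
    return sum(a > b for a, b in zip(draws_a, draws_b)) / len(draws_a)

def prob_highest(draws_by_country, country):
    """Fraction of joint draws in which `country` has the highest rate of all."""
    others = [name for name in draws_by_country if name != country]
    own = draws_by_country[country]
    wins = sum(
        all(own[i] > draws_by_country[o][i] for o in others)
        for i in range(len(own))
    )
    return wins / len(own)

# Hypothetical joint posterior draws of the excess rate for three countries.
draws = {
    "A": [3.0, 1.0, 4.0, 5.0],
    "B": [2.0, 2.0, 3.0, 6.0],
    "C": [1.0, 0.0, 5.0, 4.0],
}

print(prob_greater(draws["A"], draws["B"]))  # 0.5
print(prob_highest(draws, "A"))              # 0.25
```

The pairwise probability corresponds to the fraction of scatterplot points on one side of the reference line, whereas the probability of being highest uses all countries at once and so cannot be read from any single two-dimensional panel.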

The rankings just discussed are based on the cumulative excess rate over January 2020–December 2021. Another potentially interesting summary is the relative rankings of countries’ rates over time. In Extended Data Fig. 4 we plot the excess rate over time (top panel) and the ranking probabilities (bottom panels). In each month, we calculate the probabilities that the rate of each country is highest, second highest and so on. In 2020, we see that among the six countries considered, Spain, the United Kingdom and to a lesser extent Italy, have high rates, whereas in 2021, Italy and Slovenia and to a lesser extent Spain have high rates. The rate in Estonia is generally low, apart from the last few months of 2021.
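The probabilities that a country ranks highest, second highest and so on in a given month can be computed by ranking the countries within each joint posterior draw and tabulating the results. The sketch below does this for one month; the draws are hypothetical, ties are not treated specially, and the function would be applied separately to each month's draws.

```python
def rank_probabilities(draws_by_country):
    """Map each country to [P(rank 1), P(rank 2), ...], where rank 1 is the highest rate."""
    names = list(draws_by_country)
    n_draws = len(draws_by_country[names[0]])
    counts = {name: [0] * len(names) for name in names}
    for i in range(n_draws):
        # Order the countries from highest to lowest rate within this draw.
        ordered = sorted(names, key=lambda name: draws_by_country[name][i], reverse=True)
        for rank, name in enumerate(ordered):
            counts[name][rank] += 1
    return {name: [c / n_draws for c in counts[name]] for name in names}

# Hypothetical joint posterior draws of the excess rate for one month.
month_draws = {
    "A": [3.0, 1.0, 4.0, 5.0],
    "B": [2.0, 2.0, 3.0, 6.0],
    "C": [1.0, 0.0, 5.0, 4.0],
}

ranks = rank_probabilities(month_draws)
print(ranks["A"])  # [0.25, 0.75, 0.0]
```

Each country's probabilities sum to one across ranks, and each rank's probabilities sum to one across countries.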

The Supplementary information contains a more substantive example where we consider the rankings of 27 countries of the European Union and the United Kingdom over time, in terms of both the excess rate and the P-score.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.