Mapping 123 million neonatal, infant and child deaths between 2000 and 2017

Since 2000, many countries have achieved considerable success in improving child survival, but localized progress remains unclear. To inform efforts towards United Nations Sustainable Development Goal 3.2—to end preventable child deaths by 2030—we need consistently estimated data at the subnational level regarding child mortality rates and trends. Here we quantified, for the period 2000–2017, the subnational variation in mortality rates and number of deaths of neonates, infants and children under 5 years of age within 99 low- and middle-income countries using a geostatistical survival model. We estimated that 32% of children under 5 in these countries lived in districts that had attained rates of 25 or fewer child deaths per 1,000 live births by 2017, and that 58% of child deaths between 2000 and 2017 in these countries could have been averted in the absence of geographical inequality. This study enables the identification of high-mortality clusters, patterns of progress and geographical inequalities to inform appropriate investments and implementations that will help to improve the health of all populations.


Precision public health and child mortality
Country-level estimates facilitate international comparisons but mask important geographical heterogeneity. Previous assessments of mortality of children under 5 have noted significant within-country heterogeneity, particularly in sub-Saharan Africa 5,7,10-14 , as well as in Brazil 15 , Iran 16 and China 17 . Understanding public health risks at more granular subpopulation levels is central to the emerging concept of precision public health 18 , which uses "the best available data to target more effectively and efficiently interventions…to those most in need" 18 . Efforts to produce high-resolution estimates of mortality of children under 5, determinants at scales that cover the multiple countries are emerging, including for vaccine coverage 19,20 , malaria 21 , diarrhoea 22 and child growth failure 23,24 . In a previous study, we produced comprehensive estimates of African child mortality rates at a 5 × 5-km scale for 5-year intervals 5 . For areas outside of Africa, in which 72% of the world's children live and 46% of global child deaths occurred in 2017 4 , subnational heterogeneity remains mostly undescribed 25 .
Here we produce estimates of death counts and mortality rates of children under 5, infants (under 1 years of age) and neonates (0-28 days) in 99 countries at policy-relevant subnational scales (first and second administrative levels) for each year from 2000 to 2017. We fit a geostatistical discrete hazards model to a large dataset that is composed of 467 geo-referenced household surveys and censuses, representing approximately 15.9 million births and 1.1 million deaths of children from 2000 to 2017. Our model includes socioeconomic, environmental and health-related spatial covariates with known associations to child mortality and uses a Gaussian process random effect to exploit the correlation between data points near each other across dimensions of space, time and age group, which helps to mitigate the limitations associated with data sparsity in our estimations. For this study, we report U5MR as the expected number of deaths per 1,000 live births, reflecting the probability of dying before the age of 5 for a given location and year.

Unequal rates of child mortality
The risk of a newborn dying before their fifth birthday varies tremendously based on where in the world, and within their country, they are born. Across the 99 countries in this study, we estimate that U5MR varied as much as 24-fold at the national level in 2017, with the highest rate in the Central African Republic of 123.9 deaths (95% uncertainty interval, 104.9-148.2) per 1,000 live births, and the lowest rate in Cuba of 5.1 deaths (4.4-6.0) 4 . We observed large subnational variation within countries in which overall U5MR was either high or comparatively low. For example, in Vietnam, rates across second administrative units (henceforth referred to as 'units') varied 5.7-fold, from 6.9 (4.6-9.8) in the Tenth District in Hồ Chí Minh City to 39.7 (28. 1-55.6) in Mường Tè District in the Northwest region (Figs. 1b, 2).
Decreases in U5MR between 2000 and 2017 were evident to some extent throughout all units (Figs. 1a, b, 2). No unit showed a significant increase in U5MR in this period, and in most units U5MR decreased greatly, even in units in which the mortality risk was the highest. Out of 17,554 units, 60.3% (10,585 units) showed a significant (defined as 95% uncertainty intervals that did not overlap) decrease in U5MR between 2000 and 2017. Across units in 2000, U5MR ranged from 7.5 (5.0-10.6) in Santa Clara district, Villa Clara province, Cuba, to 308.4 (274. 9-348.4) in the Sabon Birni Local Government Area of Sokoto State, Nigeria. By 2017, the unit with the highest estimated U5MR across all 99 countries was Garki Local Government Area, Jigawa state, Nigeria, at 195.1 (158.6-230. 9). Overall, the total percentage of units with a U5MR higher than 80 deaths per 1,000 live births decreased from 28.9% (

Distribution of under-5 deaths may not follow rates
The goal of mortality-reduction efforts is ultimately to prevent premature deaths, and not just to reduce mortality rates. Across the countries studied here, there were 3.5 million (41%) fewer deaths of children under 5 in 2017 than in 2000 (5.0 million compared to 8.5 million). At the national level, the largest number of child deaths in 2017 occurred in India (1.04 (0.98-1.10) million), Nigeria (0.79 (0.65-0.96) million), Pakistan (0.34 (0.27-0.41) million) and the Democratic Republic of the Congo (0.25 (0.21-0.31) million) (Fig. 3a). Within these countries, the geographical concentration of the deaths of the children varied. In Pakistan, over 50% of child deaths in 2017 occurred in Punjab province, which had a U5MR of 63.3 (54.1-76.0) deaths per 1,000 live births (Fig. 3b). By contrast, 50% of child deaths in the Democratic Republic of the Congo in 2017 occurred across 9 out of 26 provinces. Such findings are in a large part artefacts of how borders are drawn around various at-risk populations (the provinces above account for 53% and 63%, respectively, of the under-5 population that is at risk in these two countries), but can have a real impact at the level at which planning occurs. Some concentrated areas with apparent high absolute numbers of deaths highlighted by local-level estimates become less noticeable when reporting at aggregated administrative levels; for example, areas across Guatemala, Honduras and El Salvador are visually striking hotspots in Fig. 3d, but less so in Fig. 3b, c.
Our estimates indicate that targeting areas with a 'high' U5MR of 80 will have a lower overall effect than in previous years owing to the reductions in mortality rates. In 2000, 23.7% of child deathsrepresenting 2.0 (1.7-2.4) million deaths-occurred in regions in which U5MR was less than 80 that year (Fig. 4) Despite population growth, child deaths have declined due to the outpaced decline in U5MR. For example, there were a total of 8.5 (7.2-10.0) million deaths of children under 5 in the countries in this study in 2000; had the 2017 under-5 population been exposed to the same U5MRs that were observed in 2000, there would have been 10.6 (9.0-12.5) million deaths in 2017. Instead, we observed 5.0 (3.8-6.6) million deaths in 2017 (Extended Data Fig. 5).

Discussion
This study offers a comprehensive, geospatially resolved resource for national and subnational estimates of child deaths and mortality rates for 99 LMICs, where 93% of the world's child deaths 4 occurred in 2017. Gains in child survival varied substantially within the vast majority of countries from 2000 to 2017. Countries such as Vietnam, for example, showed more than fivefold variation in mortality rates across second administrative-level units. The inconsistency of successes, even at subnational levels, indicates how differences in health policy, financial resources, access to and use of health services, infrastructure, and economic development ultimately contribute to millions of lives cut short 25-27 . By providing detailed maps that show precisely where these deaths are estimated to have occurred, we provide an important evidence base for looking both to the past, for examples of success, and towards the future, in order to identify where precision public-health initiatives could save the most lives.
The epidemiological toll of child mortality should be considered both in terms of total deaths and as rates of mortality. Focusing only on mortality rates can effectively mask areas in which rates are comparatively low but child deaths are high owing to large population sizes. The number of deaths that occur in high-risk areas has declined, and most under-5 deaths in recent years have occurred in lower-risk areas. This 'prevention paradox' 28 could indicate that whole-population interventions could have a larger overall impact than targeting high-risk areas 29 . At the same time, strategies that target resources to those locations that have the highest number of child deaths risk leaving behind some of the world's most marginalized communities: remote, more-sparsely populated places in which, relative to the number of children born each year, a large number of children die before their fifth birthday. Instead, by considering subnational measures of both counts and rates of deaths of children under 5, decision-makers can better tailor child health programs to align with local contexts, norms and needs. Rural communities with high rates but low counts may benefit from 'lastmile' initiatives to provide effective health services to populations who lack adequate access to care. By contrast, locations with low rates but high counts may require programs that focus on alleviating the cost of care, unsafe environmental exposures or health risks that are uniquely associated with urban slums 30 . The SDGs have pointed the global development agenda towards progress in child survival. Our analysis indicates that reaching the SDG 3.2 targets of 25 child deaths per 1,000 live births and 12 neonatal deaths per 1,000 live births will require only modest improvements or have already been achieved by some units; however, these targets are ambitious for other units in which child mortality remains high. It is worth noting that many countries contain areas that fit both of these profiles. For example, 11 countries  Article reSeArcH had at least 1 unit that had already met SDG 3.2 with high certainty, and at least 1 unit that had not. Subnational estimates can empower countries to benchmark gains in child survival against their own subnational exemplars as well as advances that have been achieved by their peers. Through our counterfactual analysis we showed that even if all units had met the SDG 3.2 goal in 2017, there would still have been 2.4 million deaths of children under 5, indicating that 'ending preventable child deaths' is more complex than simply meeting a target threshold. Future research efforts must address the causes of child mortality in local areas and more precisely identify causes of child deaths that are amenable to intervention. To that end, new and innovative data-collection efforts, such as the ongoing Child Health and Mortality Prevention Surveillance network, offer promising prospects by applying high-validity, pathology-based methods alongside verbal autopsies to determine the cause of death 31 .
This study offers a unique platform to support the identification of local success stories that could be replicated elsewhere. In Rwanda, for example, the highest U5MR at the district level in 2017 was 60.2% (52.0-67.8%) lower than the lowest U5MR at the district level in 2000. Such gains have been partially credited to focused investments in the country's poorest populations, expanding the Mutuelles de santé insurance program, and developing a strong workforce of community health workers who provide evidence-based treatment and health promotion 32,33 . Nepal and Cambodia are among the exemplars for considerably decreasing subnational inequalities in child survival since 2000. In an era when narrowing disparities within countries is as important as reducing national-level gaps, these results provide the evidence base to inform best practices and stimulate national conversations about related social determinants.
Neonatal mortality rates have also declined but failed to keep pace with reductions in mortality rates of older children, leading to a higher proportion of deaths of children under 5 occurring within the first four weeks of life: from 37.4% (37.1-37.7) in 2000 to 43.7% (43.1-44.3%) in 2017. This trend is probably related to the increase in scale of routine programs and improved infrastructure (for example, vaccination 34 , and water and sanitation 35 ) and the introduction of effective interventions to target communicable diseases (for example, malaria control 36 and prevention of mother-to-child transmission of HIV 37 ). These interventions have tended to target amenable causes of mortality that are more common in older children under 5 rather than dominant causes of neonatal mortality, such as prematurity and congenital anomalies 38 . Notably, irrespective of income level or location, some causes of neonatal death (for example, chromosomal anomalies and severe preterm birth complications) remain difficult to prevent completely with current medical technologies. Ultimately, large gains in neonatal mortality will require serious investment in health system strengthening 39 . Affordable approaches to preventing the majority of neonatal deaths in LMICs exist and there are success stories with lessons learned to apply 40-44 , but decisions about which approaches to take must be based on the local epidemiological and health system context. In the absence of spatially detailed cause of death data, subnational neonatal mortality estimates can indicate dominant causes and thus serve as a useful proxy to guide prioritization of interventions 45 .
The accuracy and precision of our estimates were primarily determined by the timeliness, quantity and quality of available data. In Sri Lanka, for example, there were no available surveys, and the wide uncertainty intervals surrounding estimates reflect the dearth of available evidence in that country (Extended Data Figs. 3,4). In certain areas, this decreased the confidence that we had in claiming that a specific subnational area met the SDG 3.2 target (Fig. 1c). This issue is most concerning in cases in which estimated mortality rates are high, thus helping to identify locations in which it would be most useful to focus future data-collection efforts. High mortality rates with large uncertainty intervals were estimated across much of Eastern and Central sub-Saharan Africa, and in Cambodia, Laos, Myanmar and Papua New Guinea (Extended Data Figs. 3, 4). Furthermore, ongoing conflict in countries such as Syria, Yemen and Iraq pose substantial challenges to collecting more contemporaneous data, and our estimates may not fully capture the effects of prolonged civil unrest or war 46,47 . Further methodological and data limitations are discussed in the Methods.
The accurate estimation of mortality is also a matter of equity; highly refined health surveillance is common in high-income countries, whereas in LMICs, in which rates of child mortality are the highest, surveillance that helps to guide investments in health towards the areas with the greatest need is less routine 48 . Ideally, all countries would have high-quality, continuous, and complete civil and vital registration systems that capture all of the births, deaths and causes of death at the appropriate geographical resolution 49 . In the meantime, analyses such as this serve to bridge the information gap that exists between low-mortality countries with strong information systems and countries that face a dual challenge of weaker information systems and higher disease burden. By harnessing the unprecedented availability of geo-referenced data and developing robust statistical methods, we provide a high-resolution atlas of child death counts and rates since 2000, covering countries that account for 93% of child deaths. We bring attention to subnational geographical inequalities in the distribution, rates and absolute counts of child deaths by age. These high-resolution estimates can help decision-makers to structure policy and program implementation and facilitate pathways to end preventable child deaths 50 by 2030.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-019-1545-0.  Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http:// creativecommons.org/licenses/by/4.0/.

MEthOds
Overview. We fitted a discrete hazards geostatistical model 51,52 with correlated space-time-age errors and made predictions to generate joint estimates-with uncertainty-of the probability of death (the number of deaths per live births) and the number of deaths for children aged 0-28 days (neonates), children under 1 year old (infants) and children under 5 years old at the subnational level for 99 LMICs for each year from 2000 to 2017. The analytical process is summarized in the flowchart in Extended Data Fig. 6. We made estimates at a grid-cell resolution of approximately 5 × 5-km and then produced spatially aggregated estimates at the first (that is, states or provinces) and second (that is, districts or counties) administrative levels, as well as the country level.
Countries were selected for inclusion in this study based on their sociodemographic index (SDI) published in the Global Burden of Disease study (GBD) 53 . The SDI is a measure of development based on income per capita, educational attainment and fertility rates among women under 25 years old. We primarily aimed to include all countries in the middle, low-middle or low SDI quintiles, with several exceptions. Brazil and Mexico were excluded despite middle SDI status owing to the availability of high-quality vital registration data in these countries, which have served as the basis for existing subnational estimates of child mortality. Because this study did not incorporate vital registration data sources (see 'Limitations'), Brazil and Mexico were not estimated directly; instead, state-level estimates from the GBD 2017 study were directly substituted in figures where appropriate 4 . Albania and Moldova were excluded despite middle SDI status owing to geographical discontinuity with other included countries and lack of available survey data. North Korea was excluded despite low-middle SDI status owing to geographical discontinuity and insufficient data. As countries with high-middle SDI status in 2017, China and Malaysia were excluded from this analysis. Libya was included despite high-middle SDI status to create better geographical continuity. Island nations with populations under 1 million were excluded because they typically lacked sufficient survey data or geographical continuity for a geospatial analytic approach to be advantageous over a national approach. Supplementary Figure 3 Table 3.1 lists the countries. Data. We extracted individual records from 555 household sample survey and census sources. Records came in the form of either summary birth histories (SBHs) or complete birth histories (CBHs). All input data were subject to quality checks, which resulted in the exclusion of 82 surveys and censuses owing to quality concerns (see Supplementary Information section 3.2 for more details). Data on life and mortality experiences from CBH sources can be tabulated directly into discrete period and age bins, thus allowing for period-specific mortality estimations, known as the synthetic cohort method [54][55][56] . For SBH data, we used indirect estimation 57 to estimate age-specific mortality probabilities and sample sizes and assign them to specific time periods. Complete details are available in Supplementary Information section 3.3.2.

.1 shows a map of the countries included in this study and Supplementary
In all cases, after pre-processing, each data point provided a number of deaths and a sample size for an age bin in a specific year and location. We referenced all data points to GPS coordinates (latitude and longitude) wherever possible. In cases in which GPS data were unavailable, we matched data points to the smallest possible areal unit (also referred to as 'polygons'). All polygon data were spatially resampled into multiple GPS coordinates and weighted based on the population distribution following a previously described procedure 5,22,23,58 and described in Supplementary Information section 3 In addition to data on child mortality, we used a number of spatial data sources for this analysis. These included a suite of geospatial covariates, population estimates and administrative boundaries 68 . These sources and processing procedures are described in Supplementary Information section 4. Spatial covariates. We extracted values from each of 10 geospatial covariates at each data point location. Geospatial covariates are spatial data represented at the 5 × 5-km grid-cell resolution. The covariates were travel time to the nearest city, educational attainment of maternal-aged women, the ratio of population of children under 5 to women of reproductive age (ages 15-49 years old), the mass per cubic meter of air of particles with a diameter less than 2.5 μm, total population, a binary indicator of urbanicity, intensity of lights at night, the proportion of children aged 12-23 months who had received the third dose of diphtheria-pertussistetanus vaccine, incidence rate of Plasmodium falciparum-associated malaria in children under 5 and prevalence of stunting in children under 5 (see Supplementary  Information). All covariate values were centred on their means and scaled by their standard deviations. Covariates typically had global spatial coverage and values that vary by year. More details of the spatial covariates can be found in Supplementary  Information section 4. Analysis. Geostatistical model. To synthesize information across various sources, and to make consistent estimates across space and time, we fitted discrete hazards 51,52 geostatistical models 59 to our data. The models were discrete in the sense that ages were represented in seven mutually exclusive bins (0, 1-5, 6-11, 12-23, 24-35, 36-47 and 48-59 months), each with its own assumed constant mortality probability. The model explicitly accounted for variation across age bin, year and space through inclusion of both fixed and random effects. Indicator variables for each age bin were included to form a discrete baseline mortality hazard function, representing the risk of mortality in discrete bins from birth to 59 months of age with covariates set at their means. Baseline hazard functions were allowed to vary in space and time in response to changing covariate values, as well as in response to linear effect on year. To model this relationship, we estimated the effect of each covariate value on the risk of mortality. These estimated effects were then applied to the gridded surface of covariate values to make predictions across the entire study geography. We also included a Gaussian random effect across countries to account for larger-scale variations due to political or institutional effects, as well as a Gaussian random effect for each data source to account for source-specific biases. Finally, we included a Gaussian process random effect with a covariance matrix structured to account for remaining correlation across age, time and physical space. As such, estimates at a specific age, time or place benefitted from drawing predictive strength from data points nearby in all of these dimensions.
For each modelling region, we fitted one such discrete hazards model with a binomial data likelihood. All data were prepared such that we counted or estimated the number of children entering into (n) and dying within (Y) each period-age bin from each GPS-point location (s) in each survey (k) within each country (c).
The number of deaths for children in age band (a) in year (t) at location (s) was assumed to follow a binomial distribution: where P a,s,t is the probability of death in age bin a, conditional on survival to that age bin for a particular space-time location. Using a generalized linear regression modelling framework, a logit link function is used to relate P to a linear combination of effects: The first term, β 0 , is an intercept, representing the mean for the first age band when all covariates are equal to zero, whereas β a 1 are fixed effects for each age band, representing the mean overall hazard deviation for each age band from the intercept, when all other covariates are equal to zero. β 2 are the effects of geospatial covariates (X s,t ), which we describe in detail in Supplementary Information section 4. β 3 is an overall linear temporal effect to account for overall temporal trends within the region. All geospatial covariates were centred and scaled by subtracting their mean and dividing by their standard deviations. Each v term represents uncorrelated Gaussian random effects: ν σ normal(0, ) c s [ ] c 2 is a country-level random effect applied to all locations (s) within a country (c); ν σ normal(0, ) 2 is a data source-level random effect for the survey (k) from which the data at location s were observed. Data source-level random effects were used to account for systematic variation or biases across data sources and were included in model fitting but not in prediction from fitted models. The term Z a,s,t ~ Gaussian process(0, K) is a correlated random effect across age, space and time, and is modelled as a four-dimensional mean zero Gaussian process with covariance matrix K. This term accounts for structured residual correlation across these spatial-age-temporal dimensions that are not accounted for by any of the model's other fixed or random effects. This structure was chosen, because the hazard probability for each age group is expected to vary in space and time, and such spatiotemporal correlations are likely to be similar across ages. K is constructed as a separable process across age, space and time = Σ ⊗ Σ ⊗ Σ K ( ) a t s . The continuous spatial component is modelled with a stationary isotropic Matérn covariance function, and the age and temporal effects were each assumed to be discrete auto-regressive order 1. We provide further details on model fitting and specification in Supplementary Information section 5.1.
We assigned priors to all model parameters and performed maximum a posteriori inference using Template Model Builder 60 software in R version 3.4. We fitted the model separately for each of 11 world regions (see Supplementary Fig.  3.1), owing to memory constraints and to allow model parameters to vary across epidemiologically distinct regions. Post-estimation. Using the joint precision matrix and point estimates, we generated 1,000 draws from all model parameters using a multivariate-normal approximation. These model parameter draws were used to predict corresponding draws of mortality probabilities across all age groups for each grid cell in each year. In other words, for each age bin in each year we estimated 1,000 gridded surfaces of mortality probability estimates, each surface corresponding to one draw from the posterior parameter estimates 61 . All subsequent post-estimation procedures were carried out across draws to propagate model uncertainty. We used these estimated spatiotemporal gridded surfaces of age-specific mortality probabilities to produce various final resulting data products. From the fitted model parameters, we produced posterior mortality probability estimates for each age group for each 5 × 5-km grid cell for each year from 2000 to 2017. We combined gridded age group estimates to obtain infant (under 1) and child (under 5) mortality estimates at each gridded location. Using a conversion from mortality probability to mortality rates, and using a gridded surface of population, we also estimated the number of deaths that occurred in each age group at each location in each year. For both mortality probabilities and counts, we multiplied out corresponding gridded estimates by a constant to ensure that at the national-and in two countries, the first administrative-level unit-aggregated estimates for each age group and year were calibrated such that they equalled estimates in the GBD study 4 . This calibration allowed us to take advantage of national data sources, such as vital registration, that could not be used in this study. We also aggregated grid-cell-level estimates to first and second administrative-level units using gridded population surface to weight estimates. These steps are described in Supplementary Information section 5.2. Model validation. We used fivefold cross-validation to assess and compare model performance with respect to estimating local trends of age-specific mortality. Each fold was created by combining complete surveys into subsets of approximately 20% of data sources from the input data. Holding out entire surveys at a time served as a comparable approximation to the type of missingness in our data, essentially helping us check how well our model estimates of mortality probabilities compared to empirical estimates of mortality probability from an unobserved data source that did not inform the model.
For each posterior draw, we aggregated to administrative units. Using data aggregated to the administrative unit and aggregated estimate pairs, we calculated the difference between out-of-sample empirical data estimates and modelled estimates, and we report the following summary metrics: mean error, which serves as a measure of bias, the square root of mean errors, which serves as a measure of the total variation in the errors, the correlation and 95% coverage. At the second administrative-level unit for under-5 mortality, our out-of-sample 95% coverage was 93%, correlation was 0.78, mean absolute error was 0.015 and mean error was −0.0011. These results indicate a good overall fit, with minimal bias. This procedure and the full validation results are discussed in Supplementary Information section 5.3. Limitations. This work should be assessed in full acknowledgement of several data and methodological limitations. We exclusively used CBH and SBH data from household survey and census data sources. Ideally, estimates of child mortality should incorporate all available data, including data from administrative vital registration systems. Vital registration systems are commonly present in many middle-income and all high-income countries. There are known data-quality issues with vital registration sources in many middle-income countries 48,62 that add complications to their inclusion in our modelling procedure. For example, systems may not capture all deaths, and this level of 'underreporting' probably varies in space, time and age. In addition, underreporting is probably negatively correlated with mortality, and could contribute substantial bias to estimates. Statistical methods must be developed to jointly estimate-and adjust for-underreporting in vital registration data before such data can be used in geospatial models of child mortality. Promising work has begun in this domain in specific countries 63 , but further advancement will be necessary to improve estimates across a time series and across many countries at once.
We assume that SBH and CBH data were retrospectively representative in the locations in which they were collected. As such, we assume that survey respondents did not migrate. High-spatial-resolution migration estimates with which to adjust estimates do not yet exist, and many of the data sources that we use do not collect information on migration. We conducted a focused sensitivity analysis (Supplementary Information section 5.4.4) for migration in six countries, and found that although our results were generally robust, there was variation by country. Furthermore, despite providing high-quality retrospective data from representative samples of households, birth history data can suffer from certain non-sampling issues, such as survival/selection biases 64 and misplacement of births 65 . We did not attempt to make corrections to data, and they were used as-is. Furthermore, retrospective birth history data will-by design-have a changing composition of maternal ages depending on the time since the survey. This was minimized by limiting retrospective trends to up to 17 years.
Although we collated a large geo-referenced database of survey data on child mortality, these data represented about 1% (1.1 million) of total deaths of children under 5 in study areas over the period. Where data do not exist or are not available in certain locations, mean estimates are informed from smoothing to nearby estimates and covariates. As such, there could be additional small-scale heterogeneity that is not picked up by our model. Wider uncertainty intervals in areas with no data account for these potential unknowns, and our 95% coverage estimates in out-of-sample predictive tests appear to be well-calibrated at the second administrative unit level. Furthermore, discrete localized mortality shock events could be missing in our analysis due to the lack of data and selection biases in surveys and censuses, and spatiotemporal smoothing. Fatal discontinuities are explicitly accounted for at the national or province level by calibration to GBD estimates. In all, 0.35% (0.4 million) of the 123 million deaths over this period were attributed to fatal discontinuities.
On the modelling side, we integrated point and areal data into a continuous model by constructing pseudo-points from areal data. Modelling approaches that integrate point and areal data as part of a joint model likelihood function are in development 66 but are currently computationally infeasible at the large geographical scales at which we currently model. Furthermore, we divided our models into 11 regional fits (see Supplementary Fig. 3.1), as a full model that encompasses all 99 countries would be computationally infeasible due to memory constraints. Splitting up modelling in this way had the benefit of enabling parameters to vary across epidemiologically distinct world regions. A preferred model, however, would be fitted to all data simultaneously with parameters that are spatially variable.
The separable model used for age-space-time correlations is a common parsimonious assumption afforded in applying spatiotemporal geostatistical models due to efficient computation and inference; however, it yields the assumption of fully symmetric covariance. The symmetry implicit in the separable model dictates, for example, that (holding age constant for simplicity) the covariance between the observations at (location 1, time 1) and (location 2, time 2) is the same as the covariance between (location 1, time 2) and (location 2, time 1). Given our available data density in space-age-time, we believe that attempting to parameterize a more complex non-separable model would be challenging both computationally and inferentially, and it is not clear whether there would be much to benefit from the extra complications.
There are several limitations to address with respect to the use of covariates in the model. Most of the geospatial covariates that we used in the geostatistical model were themselves estimates produced from various geospatial models. Some of those estimated surfaces used covariates that were also included in our model in their estimation process. As such, we emphasize that our model is meant to be predictive, and that drawing inference from fitted coefficients across these highly correlated covariates is problematic and not recommended. Furthermore, we assumed no measurement error in the covariate values and assumed that the functional form between mortality and all covariates was linear in logit space. In certain locations, we used covariate values for prediction that were outside the observed range of the training data. As we explore in Supplementary Information section 5.4.2, however, these areas represent a relatively small proportion of the population.
Finally, we used a method for indirect estimation of SBHs that was recently developed and validated 57 . As such, indirect estimation was carried out as a pre-processing step before fitting the geostatistical model. We attempted to propagate various forms of uncertainty that could be introduced in this step, which resulted in halving the total effective sample size across all SBH data. In future, we aim to fully integrate such processing into the statistical model; such methods are in development 67 , but are not yet computationally feasible at scale. Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability
The findings of this study are supported by data that are available from public online repositories, data that are publicly available upon request of the data provider and data that are not publicly available due to restrictions by the data provider. Non-publicly available data were used under a license for the current study, but may be available from the authors upon reasonable request and with permission of the data provider. A detailed table of data sources and availability can be found in Supplementary Table 8.1. The full output of the analyses is publicly available in the Global Health Data Exchange (GHDx; http://ghdx.healthdata.org/ record/ihme-data/lmic-under5-mortality-rate-geospatial-estimates-2000-2017) and can be explored using custom data visualization tools (https://vizhub. healthdata.org/lbd/under5).

Corresponding author(s): Simon Hay
Last updated by author(s): Jul 9, 2019 Reporting Summary Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted Give P values as exact values whenever suitable.

For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection
No primary data collection was carried out for this analysis.

Data analysis
This analysis was carried out using R version 3.5.0. The main statistical model used the Template Model Builder (TMB) software version 1.7.14 in R. All code used for these analyses is publicly available online at https://github.com/ihmeuw/lbd/tree/u5mr-2019.
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: -Accession codes, unique identifiers, or web links for publicly available datasets -A list of figures that have associated raw data -A description of any restrictions on data availability The findings of this study are supported by data available in public online repositories, data publicly available upon request of the data provider, and data not publicly available due to restrictions by the data provider. Non-publicly available data were used under license for the current study but may be available from the authors upon reasonable request and with permission of the data provider. A detailed table of data sources and availability can be found in Supplementary