Introduction

Investments and global commitments to population health have facilitated a remarkable improvement in child survival in low- and middle-income countries (LMICs)1. India accounts for one-fifth of the 5.4 million under-five deaths globally2. Therefore, reducing infant and child mortality in India is crucial for efforts to reduce the overall global burden. Much of the literature on child survival focuses on access to preventive and curative antenatal and postnatal care3. Growing evidence finds that analyses of infant and child mortality should focus on subnational variations in antenatal and postnatal intervention coverage to provide a better understanding of the largest gaps in child survival4. However, scant literature examines the effectiveness of large-scale investments, such as in sanitation, to address infant and child mortality.

Prior research documents disproportionately higher growth faltering in early childhood resulting in child stunting (low height for age) rates in India, despite rapid economic progress, relative to other LMICs5. This phenomenon, also referred to as the “Asian Enigma”, presumably arises from the widespread practice of open defecation. Open defecation carries a substantial negative externality with respect to infant and child health6. Proposed mechanisms include repeated infections through the fecal–oral route resulting in dehydration and malabsorption of nutrients following chronic inflammation of the small intestine (tropical enteropathy)7. Historically, lack of access to toilets has served as a major driver of open defecation, particularly in rural India. For this reason, expansion of access to sanitation facilities may reduce open defecation which, in turn, may improve infant and child health6,8,9.

Infant and child mortality due to fecal pathogen-based infections (diarrhea, in particular) and malabsorption of nutrients may decline following improved access to toilets and elimination of open defecation7,10. Scholars view sanitation as one of the most important public health interventions of the past century, with dramatic declines in infant mortality rate (IMR) following improved sanitation in the United States and other western countries in the early 1900s11. Given India’s high burden of infant mortality, we would expect large-scale national sanitation programs in the twenty-first century to precede declines in infant and child mortality.

Contrary to theoretical expectations, however, conventional strategies for improving water and sanitation services do not always correspond with sizable improvements in infant and child health. Basic low-cost interventions insufficiently address the multiple pathways through which infants and children are repeatedly exposed to enteric pathogens, especially in highly contaminated settings12. Many researchers view the lack of evidence from randomized trials in low open defecation prevalence countries as confirmation of the need for more comprehensive interventions13. Given the time and resources needed to implement large-scale interventions experimentally, sole reliance on randomized trials presents feasibility constraints13. Examination of evidence from current, large-scale public health programs could offer evidence to inform the design of new interventions.

In this paper, we study one of the world’s largest sanitation programs ever undertaken, the Swachh Bharat Mission (SBM) or Clean India Mission. From 2014 to 2020, the government of India constructed over 100 million household toilets and, as a result, declared more than 600,000 villages as free of open defecation6. SBM was launched by Prime Minister Narendra Modi on October 2nd, 2014, as a “Jan Andolan” (people's movement) and adopted an intensive, multi-pronged approach to national sanitation14. For the period spanning 2015–2020, SBM averaged an annual budget of approximately 1.25 billion USD and comprised the following key components15,16,17,18:

(a) The SBM aimed to provide subsidized toilets to eliminate open defecation. By 2020, approximately 109 million individual household latrines (IHHLs) were built under the program. Over the same period there was a twofold increase in toilet availability and a decline in open defecation from 60 to 19% in the first 5 years of the campaign15,16,17,18.

(b) SBM’s “people’s movement” approach included Information, Education and Communication (IEC) campaigns to raise awareness about sanitation and hygiene. With a funding of around $375 million, these campaigns reached rural Indians with an average of 50 messages per month over 5 years15,16,17,18.

(c) SBM invested in capacity building and training programs for government officials, frontline workers, volunteers, and communities to enhance sanitation practices15,16,17,18.

(d) SBM established waste segregation, collection, transportation, and disposal systems, along with treatment plants and recycling centers for effective waste management15,16,17,18.

(e) The SBM introduced mobile and web applications for citizen engagement and monitoring. The National Annual Rural Sanitation Survey (NARSS) was conducted bi-annually to assess progress and provided targeted approaches for underserved regions. Findings were used for delivery augmentation15,16,17,18.

SBM’s approach of combining toilet construction with substantial investments in IEC and community engagement differs markedly from prior sanitation efforts in India18,19. However, despite the increase in household toilet availability and government reports of considerable reduction in open defecation following implementation of SBM, concerns remain regarding actual utilization of toilets, sustained behavior change, and overreporting of the Open Defecation Free (ODF) status of Indian regions20. We contend that if the 109 million toilets constructed under the SBM program are being utilized, it may be reasonable to expect an associated marked improvement in health and mortality outcomes, particularly among young children. Indeed, recent studies find a significant influence of the program on improving growth and immune-sensitive responses among children6,9,21,22. We build upon prior work and examine SBM’s relationship with infant and child mortality. In the present study, we exploit spatiotemporal variation from over 600 districts in India across a span of 10 years to estimate the extent to which an increase in district-level household toilet availability post 2014 precedes reductions in infant and child mortality.

Methods

Data

India has two national data sources that provide subnational estimates of infant and under five mortality rates. The vital registration system of the Indian Census Bureau (VRS) provides infant mortality estimates at the state-level (n = 36 states and union territories); whereas the District Level Household Surveys (DLHS) (waves two and three), and the National Family Health Surveys (NFHS) (waves four and five) allow estimation of infant mortality at the district-level (n = 640 districts) between23,24,25,26,27.

Graphical analysis of long-term association between toilet coverage and infant mortality: The DLHS and NFHS are large national household surveys sampled to be representative at the district level. NFHS/DLHS follow a multi-stage design based on village size, with households randomly chosen in each selected village for rural areas. In urban areas, a similar approach was used with Census Enumeration Blocks, randomly selecting households per block. The DLHS corresponds to the years 2002–2004 and 2008–2009, while the NFHS corresponds to 2015–2016 and 2019–2021. We calculate infant and under five mortality rates at the district-level, along with estimates of the coverage of toilets and other covariates for all districts across the surveys. The district-level estimates are subsequently used for analyses.

SBM coverage data: For measuring SBM rollout we use district-level data from the Ministry of Drinking Water and Sanitation, Swachh Bharat Mission Target versus Achievement database28. This government database provides data on the number of toilets built from 2014 to 2021 under the SBM for all districts in India, except for nine urban districts. To standardize this indicator, we divided the number of toilets built by the total number of households in each district, as per the Indian 2011 census, resulting in an annual SBM coverage indicator at the district-level expressed as a percentage29.
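The standardization described above amounts to a simple ratio. The following is a minimal sketch of that calculation; the function name and the example figures are hypothetical and are not taken from the study data.

```python
def sbm_coverage_pct(toilets_built: int, households_2011: int) -> float:
    """Toilets built under SBM in one district-year, as a percentage of
    the district's household count from the 2011 census."""
    if households_2011 <= 0:
        raise ValueError("household count must be positive")
    return 100.0 * toilets_built / households_2011


# Hypothetical district: 12,000 toilets built in one year, 80,000 census households.
example_coverage = sbm_coverage_pct(12_000, 80_000)
print(example_coverage)
```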

Data for primary analysis: We merge SBM coverage data with NFHS data by birth-year of the child and district of residence. We matched birth-years with SBM years to investigate the association between toilets built under SBM each year and the infant deaths in the same year. Thus, we exploit variation from 6 years of SBM data for 640 Indian districts for estimating the main coefficients of interest.

Data for robustness checks: The VRS provides annual estimates of infant mortality for years 2000–2019 which are considered the most reliable subnational estimates of infant mortality in India. We use the VRS panel data for longitudinal analyses at the state-level.

All datasets used in this study have been made publicly available by the Government of India.

Outcome

Our primary outcomes comprise (1) infant mortality rate (IMR), defined as the number of children dying before reaching the age of 1 year out of the total number of live births (per thousand), and (2) under five mortality rate (U5MR), defined as the number of children dying before reaching the age of 5 years out of the total number of children born alive (per thousand). The VRS directly provides state-year estimates of infant mortality. DLHS and NFHS provide complete birth history data, including the month and year of birth and death of the children, reported retrospectively by mothers4. These data allow for the calculation of the IMR and U5MR for each year at the district-level. We validated estimates from DLHS and NFHS using the VRS data, ensuring that the state-level means obtained from DLHS and NFHS correspond closely with estimates from the VRS. This ensured that the district-level estimates concur with patterns at the state and national levels.
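As an illustration of how these rates follow from birth histories, the sketch below counts births and deaths per district and birth year. It is a simplified assumption-laden example: the record layout and names are hypothetical, and it ignores survey weights and children who had not yet reached the age cutoff at interview.

```python
from collections import defaultdict


def mortality_rates_per_1000(records, age_cutoff_months):
    """records: iterable of (district, birth_year, age_at_death_months or None).
    Returns deaths before the cutoff per 1000 live births for each
    (district, birth_year) cell. Unweighted; censoring is not handled."""
    births = defaultdict(int)
    deaths = defaultdict(int)
    for district, birth_year, age_at_death_months in records:
        cell = (district, birth_year)
        births[cell] += 1
        if age_at_death_months is not None and age_at_death_months < age_cutoff_months:
            deaths[cell] += 1
    return {cell: 1000.0 * deaths[cell] / births[cell] for cell in births}


# Hypothetical birth histories: four births, one death at 3 months, one at 30 months.
records = [
    ("D001", 2016, None),
    ("D001", 2016, 3),
    ("D001", 2016, None),
    ("D001", 2016, 30),
]
imr = mortality_rates_per_1000(records, 12)   # deaths before age 1
u5mr = mortality_rates_per_1000(records, 60)  # deaths before age 5
```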

Other outcomes include child height-for-age (HAZ) and weight-for-age (WAZ) z-scores estimated using the WHO growth standards, and episodes of diarrhea and acute respiratory infection (ARI) reported in the 15 days prior to the survey30,31. These outcomes are indicators of child health and nutrition.

Graphical analysis

We employ graphical and regression methods to quantify the relation between SBM and mortality. First, we map the prevalence of IMR and coverage of toilets across Indian districts in 2002–2004, 2007–2008, 2015–2016, and 2019–2021 to gauge spatiotemporal changes. Second, for the same years, we generate scatterplots to visualize the district-level cross-sectional association between IMR and toilet coverage, as well as the association between the annualized change in IMR and toilet coverage. Third, we map the cumulative coverage of toilets built under SBM across districts from 2015 to 2020 to illustrate geographic variation in program exposure over time.

Exposure and covariates for panel data analyses

For district-level econometric analyses, given the ecological, negative externalities of open defecation and poor sanitation practices, we define our exposure as the percentage of households that received a toilet under SBM at the district-level by birth cohort. This served as the continuous treatment variable. We additionally constructed a categorical treatment variable to measure the intensity of coverage. For the categorical variable, we binned the SBM coverage by birth cohort at the district-level into the following categories: 0%, > 0 to < 10%, 10 to < 20%, 20 to < 30%, and 30 to ≤ 60% coverage. In any given birth cohort year, no district had more than 60% of households receiving toilets under SBM.
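The binning rule above can be written out explicitly. This is a sketch only; the function name and label strings are hypothetical, while the cut points are those stated in the text.

```python
def sbm_intensity_bin(coverage_pct: float) -> str:
    """Map a district-by-birth-cohort SBM coverage percentage to the
    five intensity categories described in the text."""
    if coverage_pct < 0 or coverage_pct > 60:
        raise ValueError("coverage expected in the range 0 to 60 percent")
    if coverage_pct == 0:
        return "0%"
    if coverage_pct < 10:
        return ">0 to <10%"
    if coverage_pct < 20:
        return "10 to <20%"
    if coverage_pct < 30:
        return "20 to <30%"
    return "30 to <=60%"
```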

Covariates include the percentage of the population residing in urban areas, caste, religion, and wealth at the district-level. We created a district-wealth variable using a principal component analysis of household ownership and utilization of 33 assets and amenities (including drinking water, cooking fuel)32. Subsequently, the wealth variable was collapsed at the district-level to provide a weighted estimate of district wealth. This wealth variable controls for the possibility of poorer districts being targeted for toilet construction under SBM. We further include health-seeking and behavioural factors, including the mean number of antenatal care visits provided by a skilled health professional during pregnancy (which could independently reduce IMR and U5MR), the percentage of hospital births, BCG vaccination coverage, child birth order, maternal education, and access to health insurance, piped water, and clean cooking fuel. Covariates serve as standard control variables to account for confounding and to explain variation in IMR and U5MR at the district-level over time33.

Two-way fixed effects econometric specification

For econometric analysis we merged district-year specific SBM coverage data with NFHS4 and NFHS5 data by district and birth-year. For example, if a child was born in district x in year y, then they would be assigned an SBM coverage level as the percentage of households that received a toilet under SBM in year y in district x.
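The district-by-birth-year merge can be sketched as a keyed lookup. The district codes, field names, and values below are hypothetical; records without a matching coverage entry are left as `None` rather than filled, since the text does not state how unmatched cells were handled.

```python
def attach_coverage(births, coverage):
    """Attach SBM coverage to each birth record by (district, birth_year).
    Records with no matching coverage entry get None."""
    merged = []
    for birth in births:
        key = (birth["district"], birth["birth_year"])
        merged.append({**birth, "sbm_coverage": coverage.get(key)})
    return merged


# Hypothetical coverage table (percent of households) and birth records.
coverage = {("D001", 2015): 4.0, ("D001", 2016): 12.5, ("D002", 2015): 0.0}
births = [
    {"district": "D001", "birth_year": 2016},
    {"district": "D002", "birth_year": 2015},
    {"district": "D001", "birth_year": 2013},
]
merged = attach_coverage(births, coverage)
```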

In our primary specification we fit a two-way fixed effects (TWFE) model using district-level data with the following form:

$${IMR/U5MR}_{d,t}={SBM}_{d,t}\alpha + {X}_{d,t}\theta +{D}_{d}+{BY}_{t}+{\upvarepsilon }_{d,t} \tag{1}$$

Here \({SBM}_{d,t}\) is the variable of interest that measures SBM toilet coverage or the intensity of SBM coverage by birth cohort at the district-level. \({X}_{d,t}\) is the full set of covariates, \({D}_{d}\) is district fixed-effects that controls for all unobserved district-specific time invariant factors and \({BY}_{t}\) are birth year dummies that account for birth cohort specific differences in exposure to SBM that are common across districts. \(\alpha \) measures the association of SBM with child mortality and is similar to a difference-in-difference (DID) estimate of SBM toilet coverage coefficient comparing the pre-SBM period to the post-SBM period. Put another way, the parameter \(\alpha \) estimates the “net association” between change in SBM-related toilet construction and change in IMR or U5MR. TWFE estimators control for all common time trends and any place-specific characteristics that could otherwise confound results34. Standard errors are robust and were clustered at the district-level.
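To make the role of the district and birth-year fixed effects concrete, the sketch below estimates a TWFE slope by double-demeaning. It is a deliberately reduced illustration, not the study's estimation code: it assumes a balanced panel, a single regressor, no covariates, and it does not compute clustered standard errors. The toy data are fabricated so that the true slope is known.

```python
def twfe_slope(panel):
    """panel: dict mapping (unit, period) -> (x, y) for a balanced panel.
    Removes unit and period means from x and y (the within transformation),
    then returns the OLS slope of demeaned y on demeaned x."""
    units = sorted({u for u, _ in panel})
    periods = sorted({t for _, t in panel})
    if len(panel) != len(units) * len(periods):
        raise ValueError("sketch assumes a balanced panel")

    def double_demean(idx):
        vals = {k: v[idx] for k, v in panel.items()}
        grand = sum(vals.values()) / len(vals)
        unit_mean = {u: sum(vals[(u, t)] for t in periods) / len(periods) for u in units}
        period_mean = {t: sum(vals[(u, t)] for u in units) / len(units) for t in periods}
        return {(u, t): vals[(u, t)] - unit_mean[u] - period_mean[t] + grand
                for (u, t) in vals}

    x = double_demean(0)
    y = double_demean(1)
    sxx = sum(v * v for v in x.values())
    sxy = sum(x[k] * y[k] for k in x)
    return sxy / sxx


# Toy panel: outcome = unit effect + period effect - 0.09 * coverage.
toy = {}
for u in range(3):
    for t in range(3):
        cov = 5.0 * u * t
        toy[(u, t)] = (cov, 40.0 + 3.0 * u - 2.0 * t - 0.09 * cov)
slope = twfe_slope(toy)
```

In this toy example the additive unit and period effects are absorbed by the demeaning, so the recovered slope equals the coefficient used to generate the data.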

Placebo and falsification tests

Our TWFE models assume that negative trends in IMR and U5MR were absent in districts eventually treated by SBM. This hypothesis is an extension of the parallel trends assumption that underlies any TWFE model. Since this assumption cannot be directly tested with our data, we conduct three types of tests to support the primary model.

Firstly, we conducted a randomization inference test for the primary model using the RITEST Stata module. This test utilizes the district-level SBM coverage data and randomly assigns it across districts and birth cohort years in our analysis sample. Thereafter, our primary TWFE model with IMR and U5MR as the outcomes is executed, and estimates are derived. This process is repeated for the number of iterations specified in the command. We set this number to a thousand. The test provides an estimate of the probability that our main result was obtained purely by chance.
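The logic of the randomization inference procedure can be sketched as a permutation test. This is an illustration in Python rather than the RITEST module itself: the test statistic here is a plain bivariate OLS slope (the study re-fits the full TWFE model at each iteration), the p-value is the simple share of permuted statistics at least as extreme as the observed one, and the data are fabricated.

```python
import random


def ols_slope(x, y):
    """Bivariate OLS slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def randomization_p_value(x, y, n_iter=1000, seed=0):
    """Shuffle the exposure across observations n_iter times and return the
    share of shuffles whose absolute slope is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    shuffled = list(x)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        if abs(ols_slope(shuffled, y)) >= observed:
            extreme += 1
    return extreme / n_iter


# Fabricated data with a strong negative relation between coverage and mortality.
coverage = [float(i) for i in range(20)]
imr = [50.0 - 2.0 * c + (1.0 if i % 2 else -1.0) for i, c in enumerate(coverage)]
p_value = randomization_p_value(coverage, imr)
```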

Secondly, we conducted a fake treatment test using the DLHS and NFHS. The DLHS2, DLHS3, and NFHS4 surveys provide us with a cohort of children born between 1998 and 2014, prior to the implementation of SBM. To ensure that there were no differential pre-treatment trends in districts that were later exposed to SBM, we assigned SBM coverage at the district-level by birth year between 2015 and 2018 to the same districts between 2011 and 2014. For instance, 2015 coverage was assigned to 2011. We then re-fit our primary model and estimated coefficients with continuous and categorical treatments. These tests indicate whether sanitation investments made after SBM align with patterns of mortality reduction in the earlier period.
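The placebo assignment is a four-year backward shift of the coverage series. A minimal sketch, with hypothetical district codes and values, follows; only coverage from 2015 to 2018 is shifted, as described above.

```python
def assign_placebo_coverage(coverage, first_year=2015, last_year=2018, shift_years=4):
    """Shift (district, year) coverage values from first_year..last_year
    back by shift_years, so that 2015 coverage is assigned to 2011."""
    return {
        (district, year - shift_years): value
        for (district, year), value in coverage.items()
        if first_year <= year <= last_year
    }


# Hypothetical coverage values (percent of households).
coverage = {("D001", 2015): 4.0, ("D001", 2018): 22.0, ("D001", 2019): 9.0}
placebo = assign_placebo_coverage(coverage)
```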

Thirdly, we applied our primary specification using the NFHS4 and NFHS5 data to a set of placebo outcomes: ≥ 4 antenatal care (ANC) visits during pregnancy, receipt of services from the Integrated Child Development Services Program (ICDS) during pregnancy, lactation, and childhood, maternal education, alcohol consumption, and clean cooking fuel. ANC and ICDS were used as indicators of exposure to national early childhood programs that can influence IMR and U5MR. Maternal education is a crucial underlying determinant of mortality. Alcohol consumption was chosen as a pure placebo outcome which theoretically should have no association with SBM rollout. Finally, clean cooking fuel served as a surrogate for coverage of the Pradhan Mantri Ujjwala Yojana (PMUY), which was introduced in 2016.

Exploring the effects on child health outcomes and interaction effects with contextual factors

To test for potential mechanisms through which SBM can be linked with child mortality and to explore effects on other health and nutrition outcomes, we estimated Eq. 1 with other outcomes including HAZ, WAZ, diarrhea and ARI using the NFHS4 and NFHS5 data at the individual-level.

The mechanism of improved sanitation measures may vary based on availability of healthcare access. We hypothesized that any potential relationship between SBM and mortality might be more muted in districts with high healthcare access. To further explore this hypothesis, we created dummy variables for districts with high (above the median) versus low (below the median) levels of coverage of BCG vaccinations. Districts with BCG coverage above the median indicate areas with lower disease incidence and consequently lower mortality. We then interacted SBM coverage with this dummy to test whether it attenuates SBM effects, estimating Eq. 2 with the NFHS4 and NFHS5 data at the district-birth year level.

$${IMR/U5MR}_{d,t}={SBM}_{d,t}\alpha +{SBM}_{d,t}{BCG}_{d,t}\beta +{X}_{d,t}\theta +{D}_{d}+{BY}_{t}+{\upvarepsilon }_{d,t} \tag{2}$$

Here, \(\alpha \) represents the effects of SBM coverage at low levels of BCG coverage, while \(\beta \) estimates the additional effects of SBM coverage at high levels of BCG coverage in the district.
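As a reading aid, and assuming \({BCG}_{d,t}\) is the 0/1 high-coverage dummy described above, differentiating Eq. 2 with respect to SBM coverage gives the implied association in each group:

$$\frac{\partial \,{IMR/U5MR}_{d,t}}{\partial \,{SBM}_{d,t}}=\alpha +\beta \,{BCG}_{d,t}$$

so the association is \(\alpha \) in low-BCG districts (\({BCG}_{d,t}=0\)) and \(\alpha +\beta \) in high-BCG districts (\({BCG}_{d,t}=1\)). A \(\beta \) of opposite sign to \(\alpha \) would therefore indicate attenuation.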

Similarly, given that household water supply may be a necessity for families to use the newly constructed toilets, the results from our models might depend on the quality of water facilities in the districts. To consider the influence of piped water on SBM, we employed an interaction model of SBM coverage with piped water access using Eq. 3.

$${IMR/U5MR}_{d,t}={SBM}_{d,t}\alpha +{SBM}_{d,t}{Piped}_{d,t}\beta +{X}_{d,t}\theta +{D}_{d}+{BY}_{t}+{\upvarepsilon }_{d,t} \tag{3}$$

Robustness check 1: state-level analysis with VRS data

Lastly, using state-year data from the VRS we visualized the deviation in the IMR trend post-SBM (2015–2019) from the predicted trend using pre-SBM data (2000–2014). The predicted trend is estimated using a mixed effects regression model with linear time and its square as fixed effects; and nested random effects for states (intercepts) and state-years (slopes). We fit an interrupted time series (ITS) model with state-year data from 2000 to 2019 using the following equation for state s in year t:

$${Log IMR}_{s,t}={Trend}_{t}\alpha +{SBM}_{t}\beta +{Trend}_{t}\times {SBM}_{t}\pi + {S}_{s}+{\upvarepsilon }_{s,t} \tag{4}$$

Here Log IMR is the natural logarithm of IMR to account for its non-linear movement over time. \({Trend}_{t}\) is a discrete variable that indexes the years 2000 to 2019. \({SBM}_{t}\) is a dummy variable that takes the value 1 for 2015–2019 and zero otherwise. \({Trend}_{t}\) × \({SBM}_{t}\) is an interaction term of Trend and SBM that tests for change in the trend of log IMR comparing the pre-SBM to the post-SBM period. \(S\) is state fixed-effects that controls for all unobserved state-specific time invariant factors. Broadly, this model produces a within state ITS estimate of the average effect of the SBM on IMR assuming that the pre-SBM trend within states provides a reasonable counterfactual estimate for what would have happened in the absence of the program35. We correct standard error estimates for clustering at the state-level. Centering the trend variable at 2015 allows us to interpret the intercept as the value of log IMR at the start of SBM. We fit an additional ITS model excluding the variable \({SBM}_{t}\) in sensitivity analyses.
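The regressors of the ITS model and the reading of log-scale coefficients can be sketched as follows. The function and key names are hypothetical; the centering year and the post-period definition follow the text. Because the outcome is log IMR, a coefficient c corresponds to a percent change of 100 × (exp(c) − 1).

```python
import math


def its_design_row(year, sbm_start=2015):
    """Build the three ITS regressors for one state-year:
    a trend centered at the SBM start year, a post-SBM dummy,
    and their interaction."""
    trend = year - sbm_start
    sbm = 1 if year >= sbm_start else 0
    return {"trend": trend, "sbm": sbm, "trend_x_sbm": trend * sbm}


def pct_change_from_log_coef(coef):
    """Convert a coefficient on a log outcome into a percent change."""
    return 100.0 * (math.exp(coef) - 1.0)


row_2010 = its_design_row(2010)
row_2017 = its_design_row(2017)
# Illustrative only: a log-scale trend coefficient of -0.03 per year.
annual_pct = pct_change_from_log_coef(-0.03)
```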

Robustness check 2: individual-level analysis

There are three reasons for using districts as the level of analysis in our study. Firstly, mortality rates are typically calculated for specific geographical areas such as countries, states, and local areas. Since the DHS surveys are sampled to be representative at the district level, this is the smallest area for which we can obtain stable estimates of mortality from the data. Secondly, the DHS surveys do not capture direct individual-level covariates for children who died. For instance, we lack information on whether the mother received care during pregnancy for children who died. Therefore, at the individual level, we are unable to directly adjust for such confounders. By aggregating variables to the district level, we can adjust for a range of child-specific confounders at the district level, although this does mean that inference is made at the district level rather than for individual children. Thirdly, sanitation effects are known to have positive externalities that extend beyond individual households. It is common practice to use sanitation as an ecological-level exposure aggregated to a level above the household (such as communities or districts) to account for these positive effects. Therefore, our preferred estimates are at the district level. However as a robustness check, to complement the district level analysis, we also perform child-level analyses using Eq. 1.

Robustness check 3: assessing the relationship between cumulative exposure to SBM and mortality

In Eq. 1, our treatment exposure is the percentage of households who received toilets under SBM in a given birth year. As a robustness check, we re-estimate Eq. 1 using the cumulative percentage of households who received toilets under SBM since the start of the program in a given birth year as the treatment exposure. We hypothesize that, particularly in the case of under five mortality, cumulative exposure to the SBM program over multiple years might have a larger association with mortality reductions compared to exposure in a single year. We categorized the continuous cumulative exposure variable into the following bins for our analysis: no coverage of SBM, < 25%, 25 to < 50%, and 50% or above coverage.
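A minimal sketch of the cumulative exposure and its bins is given below. The function names, label strings, and yearly values are hypothetical; the cut points are those listed in the text.

```python
from itertools import accumulate


def cumulative_coverage(yearly_pct):
    """yearly_pct: dict mapping year -> percent of households receiving a
    toilet that year. Returns the running total by year."""
    years = sorted(yearly_pct)
    totals = accumulate(yearly_pct[y] for y in years)
    return dict(zip(years, totals))


def cumulative_bin(cumulative_pct):
    """Map cumulative coverage to the four categories described in the text."""
    if cumulative_pct < 0:
        raise ValueError("coverage cannot be negative")
    if cumulative_pct == 0:
        return "no coverage"
    if cumulative_pct < 25:
        return "<25%"
    if cumulative_pct < 50:
        return "25 to <50%"
    return "50% or above"


# Hypothetical district with three years of construction.
running = cumulative_coverage({2015: 5.0, 2016: 10.0, 2017: 15.0})
```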

Results

Figure 1 (panels a–d) shows a secular decline in infant mortality in India from 2003 to 2020. Initially (2003), the majority of districts had an infant mortality rate (IMR) exceeding 60 per 1000 live births, with a mean of 48.9. However, by 2020, most districts had achieved an IMR below 30, with a district mean of 23.5. Notably, the central and Indo-Gangetic plains regions had the highest burden of IMR in 2020. Panels e–h of Fig. 1 depict the changes in household access to toilets. In 2003, toilet coverage remained relatively low across districts, with less than 40% coverage on average (mean of 46.7%). This coverage showed minimal improvement from 2003 to 2008, and some districts even witnessed a decline in access. However, by 2015, toilet coverage had substantially improved nationwide. In 2020, the majority of districts had toilet coverage exceeding 60% (mean of 81.2%). Figure 2 (panels a and b) demonstrates an inverse correlation between district-level toilet coverage and IMR both in cross-sectional analysis and after accounting for first differences. Whereas a cross-sectional association may indicate the presence of a link in the long run, the first difference plots show that IMR responds to short term changes in toilet access. Table S1 details the temporal distribution of key district-level characteristics from 2002 to 2021.

Fig. 1 Changes in district-level infant mortality and toilet access in India, 2003–2020.

Fig. 2 Cross-sectional and first difference district-level association between infant mortality and any toilet access in India, 2003–2020. Pp percentage point. Notes: Panel (a), cross-sectional scatter plot where each dot is a district. Panel (b), first difference scatter plot where each dot is a district.

Figure 3 shows the expansion of SBM coverage between 2015 and 2020. The maps show that SBM coverage across district-years varies from less than 5% to > 90%, suggesting substantial yearly variation in toilet construction across India. Table 1 shows the distribution of outcomes and covariates used in the regression models, stratified by periods before and during SBM. Infant and child mortality was lower in the SBM period compared to the pre-SBM period. SBM coverage was 23% across districts, on average. There were improvements in maternal education, ANC, hospital births, health insurance, and clean cooking fuel. Examination of potential key correlates of SBM shows that SBM coverage varies positively with the availability of piped water (Table S2). Higher SBM coverage also co-occurs with higher utilization of health and nutrition programs, including ANC and institutional deliveries. SBM coverage varies inversely with household size. Together, these results suggest that districts scaled up other health services in concurrence with SBM coverage. However, states where the average household size was higher were slower to scale up SBM, likely reflecting a higher population burden and a greater requirement for toilets.

Fig. 3 District-level cumulative Swachh Bharat Mission (SBM) coverage, 2015–2020. Note: Cumulative SBM coverage was estimated as the proportion of households that were covered by SBM in a given year. The cumulative SBM coverage was calculated using yearly district-level SBM coverage data. SBM coverage in a year was defined as proportion of households which received a toilet under SBM.

Table 1 Characteristics of districts before and after the Swachh Bharat Mission (SBM).

Figure 4 shows that IMR and U5MR by birth cohort experienced a secular 10 point decline during SBM. Figure 5 shows results from the two-way fixed effects linear regression analyses using categorical SBM exposures at increasing levels. SBM coverage greater than 30% in a district corresponds with 5.3 fewer infant and 6.7 fewer child deaths per 1000 (p < 0.05). Figure 6 presents the predicted IMR and U5MR with SBM formulated as a continuous exposure (percentage of toilet coverage per district). Fixed effects regression analyses indicate a decline in predicted IMR (− 0.09 per 1000 per 1% increase in SBM) and U5MR (− 0.11 per 1000 per 1% increase in SBM) with increased SBM coverage for both outcomes (Table 2).

Fig. 4 Trends in infant and under-five deaths (per 1000) by birth cohort. Note: The mean mortality rate was estimated for each birth cohort by dividing the number of child deaths among infants and children below 5 years in each cohort by the total number of children born in each cohort. The rate was then multiplied by 1000 to obtain the mean number of infant and under five deaths per 1000 in each cohort. Line placed at 2014 as this was the year SBM was launched.

Fig. 5 Panel data analysis examining the relationship between intensity of exposure to SBM and infant and child mortality. Note: A categorical variable was created using mean SBM coverage among households at the district-level with the following categories: 0%, > 0 to < 10%, 10 to < 20%, 20 to < 30%, and 30 to ≤ 60% coverage between 2015 and 2020. The reference group was districts with no toilets constructed under SBM in a particular year between 2015 and 2020. The model includes district-level controls for child birth order, maternal education, religion, caste, access to health services and insurance, and household wealth, cooking fuel, and piped water access. The model includes district-level fixed effects and standard errors are clustered at the district-level.

Fig. 6 Predicted mortality among infants and children below 5 years obtained from panel data analysis using a continuous SBM exposure and mortality. Note: The model includes district-level controls for child birth order, maternal education, religion, caste, access to health services and insurance, and household wealth, cooking fuel, and piped water access. The model includes district-level fixed effects and standard errors are clustered at the district-level.

Table 2 Results of panel data analysis examining the relationship between continuous SBM exposure and infant and child mortality.

Placebo and falsification tests

In Table S3, randomization inference tests show that the probability of obtaining our main estimates of the reduction in IMR and U5MR in relation to SBM purely by chance ranges from 0.02 to 0.07 (i.e., a low probability) across our four model specifications. Table S4 shows that re-estimation of the association between our exposure and outcome using a placebo treatment variable does not detect a relation between the placebo treatment and IMR or U5MR, indicating that birth cohorts that pre-dated SBM did not experience a reduction in IMR and U5MR in districts eventually exposed to SBM. In Table S5, application of our primary exposure specification to a set of 7 placebo outcomes shows that SBM coverage was not significantly associated with these outcomes except for ANC (0.1 percentage points (pp)), and any ICDS service during pregnancy and lactation (− 0.1 pp).

Exploring the effects on child health outcomes and interaction effects with contextual factors

Table S6 presents results from the examination of the relation between intensity of exposure to SBM and child growth. We observe evidence of a threshold effect at coverage levels exceeding 30%, where the coefficients on height-for-age z-scores are both substantial and statistically significant (coefficient = 0.18, p < 0.05). In Table S7, examination of the relation between SBM and diarrhea, ARI shows an inverse relation between SBM and both outcomes but does not reach conventional levels of statistical significance at high SBM levels. We attribute this pattern of results to the short recall period of incidence measures for diarrhea and ARI, which may not capture the full spectrum of illness experienced by children. A more ideal measure would be the yearly disease burden at the district-level.

In supplement Table S8, results from interaction of BCG vaccine and SBM coverage show that in districts with low BCG coverage, even small increases in toilet access correspond with reductions in IMR and U5MR. However, we observe an attenuation of this association for areas with high BCG coverage and high SBM coverage, indicating that higher vaccination levels (and attendant health services) may play a key role in reducing infant and child mortality.

In Table S9 interaction tests of SBM and piped water coverage show significant effects at lower levels of SBM coverage (< 20%) when piped water access is low. However, as piped water access increases, these effects at lower levels of SBM coverage appear attenuated. Notably, effects at high levels of SBM coverage (> 30%) appear to be independent of piped water access. Under SBM guideline recommendations, toilets constructed under SBM were encouraged to use twin-pit technology which does not require direct connection of the toilet to the sewage system and the fecal sludge from these toilets only requires to be cleared with water every 2–3 years15. Therefore, it is possible that the relation between SBM and child mortality may not rely strongly on access to piped water.

Robustness checks

Figure S1 (panel A) compares the observed trend in IMR from 2015 to 2020 with an estimated counterfactual trend derived from pre-SBM state-level data spanning 2000 to 2015 (ITS model). We find a noticeable deviation in the observed trend, indicating that IMR reductions occurred more rapidly during the post-SBM period. Panel B (Fig. S1) shows the results of the state-level ITS model based on Eq. 1. IMR exhibited an annual decline of three percent between 2000 and 2015. Furthermore, mean IMR in the post-SBM period was ten percent lower than in the pre-SBM period. Remarkably, the rate of IMR reduction in the post-SBM period was eight to nine percent per year higher than the pre-SBM rate of reduction (p < 0.001). In Fig. S2, individual-level examination of the relation between SBM and the probability of infant and under-five death shows a consistent decline in these probabilities with increasing SBM coverage, in alignment with our district-level specification results. Table S10 presents results from the analysis of the relation between intensity of cumulative exposure to SBM and infant and under-five mortality rates. Cumulative SBM coverage of 25 to < 50% in a district was associated with 6.2 fewer infant deaths per 1000 (p < 0.05) relative to districts that had no SBM coverage. Any cumulative SBM coverage above 0 was associated with 4.4–7.6 fewer under-five deaths per 1000 (p < 0.05), with the largest association observed for cumulative coverage between 25 and 50%, indicating that exposure to the SBM program through increased coverage may have persisting benefits for child survival.
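The counterfactual comparison in Fig. S1 (panel A) can be illustrated with a minimal sketch. It is not the Eq. 1 specification, which is not reproduced in this section; it only fits a log-linear trend to the pre-SBM years and projects it forward. The series is synthetic and noise-free: the starting level of 66 and the 8% post-2015 decline are hypothetical, and the 3% pre-2015 decline is chosen only to echo the figure reported above.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic series (NOT study data): IMR falls 3% a year through 2015,
# then 8% a year afterwards.
years = list(range(2000, 2021))
imr = []
level = 66.0  # hypothetical IMR per 1000 live births in 2000
for year in years:
    imr.append(level)
    level *= 0.97 if year < 2015 else 0.92

# Fit the pre-period trend on log(IMR), using years since 2000 as the regressor.
pre = [(y - 2000, math.log(v)) for y, v in zip(years, imr) if y <= 2015]
intercept, slope = fit_line([t for t, _ in pre], [v for _, v in pre])
pre_decline = 1 - math.exp(slope)  # implied annual proportional decline

# Project the pre-period trend into the post-period as the counterfactual.
counterfactual = {
    y: math.exp(intercept + slope * (y - 2000)) for y in years if y > 2015
}

# Proportional shortfall of observed IMR relative to the counterfactual in 2020.
gap_2020 = 1 - imr[-1] / counterfactual[2020]
print(f"pre-period annual decline: {pre_decline:.3f}")
print(f"observed vs counterfactual gap in 2020: {gap_2020:.3f}")
```

A full ITS model would also estimate the level and slope changes at the interruption together with their standard errors; the projection here shows only how the counterfactual line is built.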

Discussion

India's national sanitation campaign, the Swachh Bharat Mission (SBM), aimed to provide toilets to all households by 2019. Because SBM is one of the largest public health interventions globally, we explored its benefits for infant and child health in this study. Analyzing data from multiple large, nationally representative surveys covering 35 states/union territories and over 600 districts over 20 years, we investigated the relation between increased toilet access under SBM and infant and child mortality reduction from 2000 to 2020. Toilet access and child mortality have a historically robust inverse relation in India. Results from our quasi-experimental analyses suggest that every 10-percentage point increase in district-level toilet access following SBM corresponds with a reduction in district-level IMR of 0.9 points and in U5MR of 1.1 points, on average. We further find evidence of a threshold effect wherein district-level toilet coverage of 30% (and above) corresponds with substantial reductions in infant and child mortality. Similar critical thresholds have also been identified in other recent studies on open defecation36,37. In absolute numbers, this coefficient would scale to an estimated 60,000–70,000 infant lives saved annually. Our study provides novel evidence of reductions in infant and child mortality following a comprehensive national sanitation program in India, potentially indicating the transformative role of SBM.
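To make the scaling from a per-1000 coefficient to an annual count concrete, the sketch below shows the arithmetic under stated assumptions. Only the 0.9 coefficient comes from the text above; the 30-percentage point coverage gain and the 24 million annual births are illustrative placeholders, and the function name is hypothetical. The authors' own scaling may use different inputs.

```python
def deaths_averted_per_year(coef_per_10pp, coverage_gain_pp, annual_births):
    """Scale a per-1000 mortality coefficient to a yearly count of deaths averted.

    coef_per_10pp: reduction in deaths per 1000 live births per 10-point coverage gain
    coverage_gain_pp: assumed gain in toilet coverage, in percentage points
    annual_births: assumed number of live births per year
    """
    rate_change_per_1000 = coef_per_10pp * (coverage_gain_pp / 10)
    return rate_change_per_1000 * annual_births / 1000

# 0.9 is the IMR coefficient reported above; the other inputs are assumptions.
example = deaths_averted_per_year(
    coef_per_10pp=0.9,
    coverage_gain_pp=30,       # hypothetical coverage gain
    annual_births=24_000_000,  # hypothetical round figure for annual births
)
print(round(example))
```

With these placeholder inputs the result falls inside the 60,000–70,000 range, which does not by itself confirm that the published estimate was derived this way.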

Our study's strengths lie in its strong internal validity, in that we utilize a quasi-experimental design with SBM as a plausibly exogenous programmatic exposure. Our longitudinal analyses control for birth year and district fixed effects, and a number of important control variables, thereby minimizing confounding bias. Leveraging annual district-level toilet construction, we also avoid the atomistic fallacy and consider unsanitary behaviors' negative externality at a larger geographic scale38. For LMICs struggling with open defecation, our study may also hold external validity, although only replication can establish this.

Our findings on child mortality and anthropometry align with evidence from global and South Asian contexts, where multiple studies using population representative surveys have indicated that enhanced sanitation can potentially reduce child mortality rates by 5–30%39,40,41. Furthermore, observational data strongly support the notion that improved sanitation practices in India lead to reductions in child mortality, growth faltering, and incidence of diarrheal disease8,42,43. However, it is essential to interpret our findings in light of India's existing primary health care infrastructure, which provides a considerable portion of the population with preventive and curative health services that address various diseases stemming from inadequate sanitation and resulting in child mortality44. Consequently, the effectiveness of the SBM may have been influenced by the availability of universal health services. We contend that the benefits of improved sanitation measures may vary based on availability of comprehensive healthcare access and synergistic programming aspects of the SBM campaign, above and beyond toilet construction, in relation to behavior change and fecal–oral exposure to contaminants. Results from our analysis of interactions between vaccine coverage and SBM support these synergistic effects. Districts that had higher vaccine coverage showed smaller SBM benefits, suggesting that the existing disease environment, gauged by coverage of preventive health interventions, plays a significant role in moderating sanitation benefits.

Our study adds to the growing body of evidence linking large-scale national sanitation campaigns to improved child health outcomes6,9. Recent research also highlights the broader benefits of increased toilet access, including women's safety, financial savings from reduced medical expenses, and overall improvements in quality of life45,46,47. However, despite these benefits, disparities in toilet adoption and usage persist due to caste and religion-based discriminatory practices48. Concerningly, some studies indicate that coercive measures and discrimination implemented by local authorities to meet campaign targets have violated individuals' rights, particularly affecting marginalized communities like manual scavengers and lower-caste individuals49. These practices pose challenges to the effective and equitable implementation of the SBM, and raise legitimate concerns about the long-term sustainability of hygiene-related behavior change. While this examination falls outside the scope of our present study, we encourage future research to use subsequent rounds of national survey data to explore long-term changes in toilet use and their relation to child mortality in India.

Limitations

As with most observational studies, ours has limitations. We caution readers regarding potential endogeneity in the coverage of SBM. The rollout of SBM raises legitimate concerns about the non-random distribution of program intensity among districts, which may be correlated with unobserved district features such as political dynamics or pre-existing infrastructure. If these features also affect the outcomes we study, they may bias our estimates. While other researchers have adopted an instrumental variables strategy to address endogeneity in placement of sanitation programs, finding a valid instrument for SBM coverage at the district level is challenging because most instruments (such as distance to political capitals or political affiliation) are directly linked to other determinants of health and sanitation outcomes, violating the exclusion restriction. We therefore used a two-way fixed effects (TWFE) strategy that exploits SBM's district-by-district phased deployment, and conducted falsification and robustness checks to increase confidence in our estimates. With respect to the granularity of program coverage statistics, we also acknowledge that using coverage data at the district level rather than at the village or household level might obscure variations in program implementation and results within a district.
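The logic of a TWFE estimate can be sketched on a small synthetic panel. The example below is a simplified illustration, not the study's specification: it omits control variables and standard errors, the function name `twfe_slope` and all numbers are hypothetical, and the double-demeaning shortcut it uses is exact only for a balanced panel.

```python
from collections import defaultdict

def twfe_slope(rows):
    """Within estimator for y = a_i + g_t + beta * x on a balanced panel.

    rows: iterable of (unit, period, x, y). Unit and period effects are
    removed by double demeaning before a one-regressor least-squares fit.
    """
    rows = list(rows)

    def group_means(key_index, value_index):
        totals, counts = defaultdict(float), defaultdict(int)
        for row in rows:
            totals[row[key_index]] += row[value_index]
            counts[row[key_index]] += 1
        return {k: totals[k] / counts[k] for k in totals}

    # Index 2 is x (coverage), index 3 is y (mortality rate).
    grand = {v: sum(r[v] for r in rows) / len(rows) for v in (2, 3)}
    unit_mean = {v: group_means(0, v) for v in (2, 3)}
    time_mean = {v: group_means(1, v) for v in (2, 3)}

    def demean(row, v):
        return row[v] - unit_mean[v][row[0]] - time_mean[v][row[1]] + grand[v]

    sxy = sum(demean(r, 2) * demean(r, 3) for r in rows)
    sxx = sum(demean(r, 2) ** 2 for r in rows)
    return sxy / sxx

# Synthetic, noise-free panel (NOT study data).
TRUE_BETA = -0.09  # hypothetical: deaths per 1000 per percentage point of coverage
district_effect = {"A": 40.0, "B": 35.0, "C": 50.0, "D": 45.0}
year_effect = {2015: 0.0, 2016: -1.0, 2017: -2.5, 2018: -3.0, 2019: -4.5}

panel = []
for i, (district, a) in enumerate(district_effect.items()):
    for t, (year, g) in enumerate(year_effect.items()):
        coverage = 4.0 * (i + 1) * t  # phased rollout, faster in some districts
        panel.append((district, year, coverage, a + g + TRUE_BETA * coverage))

beta_hat = twfe_slope(panel)
print(f"estimated coefficient per percentage point: {beta_hat:.4f}")
```

Because district and year effects enter additively, double demeaning removes them and the estimate is identified only from coverage changes that differ across districts over time. This is also why time-varying district features correlated with rollout intensity remain a threat to the estimate.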

To pose a significant threat to our results, unmeasured factors causing residual confounding would have to meet the following criteria: (1) correlate positively with SBM but not be caused by it, (2) vary inversely with IMR and U5MR, (3) exhibit the same regional variation as our SBM exposure across districts over time, and (4) not be accounted for by the set of control variables included in our analyses. Although our results are supported by several placebo tests and robustness checks, we acknowledge that the parallel trends assumption cannot be directly tested with our data; therefore, we cannot interpret our findings as causal. Further, in our falsification tests, we detected a simultaneous scale-up of antenatal care and SBM; however, the analysis showed that potential biases stemming from this variable are likely small. Additionally, we lack cause-specific IMR and U5MR data, preventing differentiation of mortality causes like diarrhea from other reasons. Future studies should explore these disaggregations when detailed and consistent data become available.

Conclusion

Our study provides evidence of the benefits of India's national sanitation campaign, the Swachh Bharat Mission or Clean India Mission, for infant and child mortality reduction. Our findings add to the growing body of evidence linking national sanitation campaigns to improved child health outcomes and emphasize the need for similar interventions in other low- and middle-income countries.