Hierarchical disentanglement of contextual from compositional risk factors of diarrhoea among under-five children in low- and middle-income countries

Several studies have documented the burden and risk factors associated with diarrhoea in low- and middle-income countries (LMIC). To the best of our knowledge, the contextual and compositional factors associated with diarrhoea across LMIC were poorly operationalized, explored and understood in these studies. We investigated multilevel risk factors associated with diarrhoea among under-five children in LMIC. We analysed diarrhoea-related information on 796,150 under-five children (Level 1) nested within 63,378 neighbourhoods (Level 2) from 57 LMIC (Level 3), using the latest cross-sectional and nationally representative Demographic and Health Survey data collected between 2010 and 2018. We used multivariable hierarchical Bayesian logistic regression models for data analysis. The overall prevalence of diarrhoea was 14.4% (95% confidence interval 14.2–14.7), ranging from 3.8% in Armenia to 31.4% in Yemen. The odds of diarrhoea were highest among male children, infants, children with small birth weights, children from households in poorer wealth quintiles, children whose mothers had only primary education, and children who had no access to media. Children from neighbourhoods with high illiteracy rates were more likely to have diarrhoea [adjusted odds ratio (aOR) = 1.07, 95% credible interval (CrI) 1.04–1.10]. At the country level, the odds of diarrhoea nearly doubled (aOR = 1.88, 95% CrI 1.23–2.83) and nearly tripled (aOR = 2.66, 95% CrI 1.65–3.89) among children from countries with middle and lowest human development index, respectively. Diarrhoea remains a major health challenge among under-five children in most LMIC. We identified diverse individual-level, community-level and national-level factors associated with the development of diarrhoea among under-five children in these countries and disentangled the associated contextual risk factors from the compositional risk factors.
Our findings underscore the need to revitalize existing policies on child and maternal health and implement interventions to prevent diarrhoea at the individual-, community- and societal-levels. The current study showed how the drive to the attainment of SDGs 1, 2, 4, 6 and 10 will enhance the attainment of SDG 3.

www.nature.com/scientificreports/ individual-specific factors, neighbourhood factors and country-level factors that affect the occurrence of diarrhoea among under-five children in the 57 LMIC using hierarchical Bayesian logistic regression models.

Methods
Study design and data. The cross-sectional and nationally representative Demographic and Health Surveys (DHS) data collected during household surveys across most LMIC were used for this study. We extracted and pooled the latest recoded "children data" from the DHS that collected information on diarrhoea, conducted between 2010 and 2018 and available in the DHS data domain by March 2019. Only 57 LMIC met these criteria and were included in this study. The DHS uses a multi-stage, stratified sampling design with households as the sampling unit 33,34 . However, due to differences in the administrative levels in different countries, the number of sampling stages differed. Country-specific sampling methodologies are available at dhsprogram.com and in the country-specific reports [35][36][37] . Sampling weights were computed and provided alongside the data from each country by DHS and were applied to our analysis. The sampling weights were based on the multi-stage sampling procedures to ensure representation of the general population. All the DHS questionnaires were standardized and implemented across all countries with similar interviewer training, supervision, and implementation protocols.
Data source. The secondary data used for this study is available on request from the owners of the data at https://www.dhsprogram.com/data/dataset_admin/login_main.cfm.

Dependent variable.
Our dependent variable is diarrhoea. Firstly, women were asked to name all births they had within 5 years before the survey dates. They were then asked if any of the children had at least an episode of diarrhoea within 2 weeks preceding the survey date. The response is binary with children who had diarrhoea coded as "1" and "0" otherwise.

Independent variables.
We used three categories of explanatory variables.
Individual-level factors. These were the sex of the child (male versus female), child's age (< 12 months (infants) and 12-59 months), sex of the household head (male or female), mother's age (15-24, 25-34, 35-49 years), mother's highest education (none, primary, secondary or higher), marital status (never, currently or formerly married), employment status (currently employed or not), access to media (yes or no), source of drinking water (improved or unimproved), toilet type (improved or unimproved), house building material (improved or unimproved), cooking fuel (clean or unclean), weight at birth (average+, small or very small birth weight), and birth order (1, 2, 3 and 4+). These variables have been linked with diarrhoea in the literature 10,17,20,31,32. We used the DHS wealth index as a proxy indicator for socioeconomic status. The methods used in computing the DHS wealth index have been described in the literature 38, and the variables are depicted in Fig. 1.
Neighbourhood-level factors. In this study, the terms "neighbourhood" and "community" were used to describe the clustering of children within the same geographical living environment 6,40,41. Neighbours are children who share the same Primary Sampling Unit (PSU) within the DHS data. The PSUs were identified using the most recent census in each country where DHS was carried out 40,42. The neighbourhood-level factors included in the current study are the place of residence (rural or urban) and the neighbourhood poverty, illiteracy and unemployment levels, as illustrated in Fig. 1. The neighbourhood poverty, illiteracy and unemployment levels were computed as the proportion of children from households in the lowest two wealth quintiles, children whose mothers had no formal education and children whose mothers were unemployed, respectively, within each country as of the survey time. We categorized each of these neighbourhood factors into two levels (low and high) using the 50th percentile as the cut-off, to allow for non-linear effects and to offer useful results for policy decisions. Similar procedures have been used in previous studies 40,42.
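The dichotomisation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout (`psu`, `poor`) and the function name `neighbourhood_levels` are hypothetical, and classing a neighbourhood that falls exactly on the median as "low" is an assumption, since the text does not say how ties were handled. The function is meant to be applied to one country's records at a time, because the proportions were computed within each country.

```python
from statistics import median


def neighbourhood_levels(children, flag):
    """Classify each PSU as 'high' or 'low' on one deprivation indicator.

    children: list of dicts for ONE country, each with a 'psu' key and a
              boolean indicator stored under `flag` (hypothetical layout).
    Returns a dict mapping PSU identifier to 'high' or 'low'.
    """
    totals, hits = {}, {}
    for child in children:
        psu = child["psu"]
        totals[psu] = totals.get(psu, 0) + 1
        hits[psu] = hits.get(psu, 0) + (1 if child[flag] else 0)

    # Proportion of children in each neighbourhood with the indicator.
    proportions = {psu: hits[psu] / totals[psu] for psu in totals}

    # 50th percentile cut-off; ties go to 'low' (an assumption).
    cutoff = median(proportions.values())
    return {psu: ("high" if p > cutoff else "low")
            for psu, p in proportions.items()}


example = [
    {"psu": 1, "poor": True},
    {"psu": 1, "poor": True},
    {"psu": 2, "poor": False},
    {"psu": 2, "poor": True},
    {"psu": 3, "poor": False},
]
levels = neighbourhood_levels(example, "poor")
```

The same function could be reused for the illiteracy and unemployment indicators by passing a different `flag`.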
Country-level factors. We retrieved the country-level data from the Human Development Index reports in the United Nations database 43,44. The Human Development Index (HDI) was created by the United Nations to emphasize "that people and their capabilities should be the ultimate criteria for assessing the development of a country, not economic growth alone" 45. The HDI summarizes the average achievement of countries in three key dimensions of human development: "a long and healthy life, access to knowledge and a decent standard of living" 45. We categorized the countries into the lowest, middle and highest HDI as shown in Fig. 1. We also explored other country-level factors such as the country's rural area percentage (the proportion of a country's population that resides in rural areas), the multidimensional poverty index (a measure of acute multidimensional poverty) and the intensity of deprivation (the average percentage of deprivation experienced by people in multidimensional poverty) [43][44][45]. These variables were used for the descriptive statistics but were excluded from the regression models as they correlated with HDI.
Analytical procedures. We used descriptive statistics to show the distribution of the children by country and by the dependent and independent variables in percentages. The Chi-square test of association was used to determine the significance of the association between the independent variables and diarrhoea (Table 1). For the country-level data, we applied the sampling weights (SW) provided by the DHS to adjust for unequal cluster sizes and stratification and to ensure that our findings adequately represent the target population of each country. For the pooled data, however, we computed and applied country-women weights (CWW) to reflect differences in the population sizes of women across countries. The CWW is the product of the SW and the country-specific weight (CSW). We computed the CSW as the number of sampled women aged 15-49 years divided by the population of women aged 15-49 years in each country. While the number of sampled women is available in the dataset, we obtained the population of each country from the United Nations population prospects 46. We checked for multicollinearity among the independent variables using the "collin" command in Stata version 16, which provides the variance inflation factor (VIF). All variables with VIF > 2.5 were removed from the regression analysis, as the literature has raised concerns about VIF > 2.5 47. Statistical significance was set at the 5% level. All analyses were conducted in Stata version 16.
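As a rough illustration of the weighting arithmetic described above, the sketch below computes the CSW and CWW exactly as the text defines them and then uses the result in a weighted prevalence. It is not the authors' Stata code; the function names and the example figures are hypothetical and chosen only to show the calculation.

```python
def country_specific_weight(sampled_women, population_women):
    """CSW as defined in the text: sampled women aged 15-49 divided by
    the population of women aged 15-49 in that country."""
    if population_women <= 0:
        raise ValueError("population_women must be positive")
    return sampled_women / population_women


def country_women_weight(sampling_weight, sampled_women, population_women):
    """CWW = SW x CSW, applied to each record in the pooled data."""
    return sampling_weight * country_specific_weight(
        sampled_women, population_women
    )


def weighted_prevalence(outcomes, weights):
    """Weighted proportion of children with the outcome coded 1."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * y for y, w in zip(outcomes, weights)) / total


# Hypothetical record: SW of 2.0 in a country with 10 sampled women
# out of a population of 1,000 women aged 15-49.
cww = country_women_weight(2.0, 10, 1000)
prevalence = weighted_prevalence([1, 0, 0, 1], [1.0, 1.0, 1.0, 1.0])
```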
Modelling approaches. Multivariable multilevel logistic regression models were used to identify associations between diarrhoea and the individual-level (compositional) factors and the neighbourhood- and country-level (contextual) factors. Using a three-level model for the binary response, with child $i$ (level 1) from neighbourhood $j$ (level 2) living in country $k$ (level 3), as shown in Fig. 1, we constructed and assessed five models to arrive at a robust model for identifying risk factors of diarrhoea while accounting for the multilevel structure of the data. The models are based on a mixed-effects hierarchical logistic regression model consisting of fixed and random parts, as shown in Eq. (1):
$$\operatorname{logit}(\pi_{ijk}) = \beta_0 + \sum_{p} \beta_p X_{pijk} + U_{0jk} + V_{0k} \qquad (1)$$
The probability that child $i$ of neighbourhood $j$ from country $k$ had diarrhoea is denoted by $\pi_{ijk}$. The "logit" is the log-odds function, $\operatorname{logit}(\pi_{ijk}) = \log\left[\pi_{ijk} / (1 - \pi_{ijk})\right]$; $\beta_0$ is the intercept, $\beta_p$ is the regression coefficient of the $p$th covariate, $X_{pijk}$ are the covariates, $U_{0jk}$ is the random component shared by all children from neighbourhood $j$ of country $k$, and $V_{0k}$ is the random component shared by all children from country $k$. The mixed model enables detailed exploration of variation between higher-level units (contextual heterogeneity).
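To make the structure of the three-level random-intercept model concrete, the following Python sketch converts a linear predictor with neighbourhood and country random intercepts into a probability of diarrhoea. It is an illustration only: the function name, argument names and numerical values are hypothetical, and it evaluates the model for given parameter values rather than fitting it.

```python
import math


def diarrhoea_probability(beta0, betas, covariates, u_0jk=0.0, v_0k=0.0):
    """Probability for child i in neighbourhood j of country k.

    beta0      : intercept
    betas      : fixed-effect coefficients (log odds ratios)
    covariates : the child's covariate values, same order as betas
    u_0jk      : neighbourhood-level random intercept
    v_0k       : country-level random intercept
    """
    if len(betas) != len(covariates):
        raise ValueError("betas and covariates must have the same length")
    # Linear predictor on the logit scale: fixed part plus random part.
    eta = beta0 + sum(b * x for b, x in zip(betas, covariates)) + u_0jk + v_0k
    # Inverse logit maps the log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values: two children with identical covariates, one of
# whom lives in a country with a positive random intercept.
p_reference = diarrhoea_probability(-1.8, [0.2], [1.0])
p_high_risk_country = diarrhoea_probability(-1.8, [0.2], [1.0], v_0k=0.5)
```

Two children with the same covariates can therefore have different predicted probabilities purely because of where they live, which is the contextual heterogeneity the random intercepts are meant to capture.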
We developed five distinct models to enable a detailed assessment of different combinations of factors to select the most robust model that could identify the contextual and compositional risk factors of diarrhoea. This was aimed at modelling the compositional factors and contextual factors separately and collectively, with reference to the distinct multi-level structure of the data used for the analysis. The first model was the null model (Model I) to assess the variation due to the neighbourhood and country-specific random effects without any explanatory variable. It decomposed the magnitude of variance that existed between country and neighbourhood levels.

Fixed effects (measures of association).
We reported the results of the fixed effects (measures of association) as the odds ratios (ORs) with their 95% credible intervals (CrIs). Rather than the usual 95% confidence intervals (95% CI) obtained in the frequentist approaches, the Bayesian statistical inference allowed us to summarize probability distributions for measures of association alongside the 95% CrI. The 95% credible interval is simply interpretable as "the 95% probability that the population parameter takes a value in a particular range".
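The sketch below shows one common way of turning posterior draws of a log odds ratio into an odds ratio with a 95% CrI. It assumes an equal-tailed interval and the posterior median as the point estimate; the text does not state which summaries were used, so both choices, along with the function names, are assumptions made for illustration.

```python
import math


def percentile(sorted_values, q):
    """Percentile by linear interpolation, with q between 0 and 1."""
    if not sorted_values:
        raise ValueError("at least one value is required")
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def odds_ratio_summary(log_odds_draws, level=0.95):
    """Posterior median OR and equal-tailed credible interval."""
    draws = sorted(log_odds_draws)
    tail = (1 - level) / 2
    return {
        "or": math.exp(percentile(draws, 0.5)),
        "lower": math.exp(percentile(draws, tail)),
        "upper": math.exp(percentile(draws, 1 - tail)),
    }


# Hypothetical draws: 101 evenly spaced log odds ratios from 0 to 1.
summary = odds_ratio_summary([i / 100 for i in range(101)])
```

Because the interval is read directly off the posterior distribution, it can be interpreted as the range containing the odds ratio with 95% posterior probability.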

Random effects (measures of variation).
In addition to the fixed effects, we also measured the likely effects of the factors considered across the three different levels using the intraclass correlation (ICC) and the median odds ratio (MOR). The ICC measures the similarity among children living in the same neighbourhood and within the same country, that is, the clustering of the odds of having diarrhoea within the same neighbourhood and the same country. We calculated the ICC using the linear threshold (latent variable) method 50. Adopting the methods recommended by Larsen et al. on neighbourhood effects 51, we reported the random effects in terms of odds. The MOR expresses the variance at the higher levels (neighbourhood and country) on the odds ratio scale: it is the median of the odds ratios obtained when comparing two children with identical covariates from two randomly chosen higher-level units, with the higher-risk unit in the numerator. If MOR = 1, there is no neighbourhood or country variance. Conversely, the higher the MOR, the more relevant the contextual effects are for understanding the probability of developing diarrhoea. A similar approach has been used in similar settings in the literature 52.
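The two measures can be sketched in Python using the standard latent-variable formulas, in which the level-1 variance on the logit scale is fixed at π²/3 and the MOR is exp(√(2σ²) × Φ⁻¹(0.75)). These are the textbook expressions; whether the authors' implementation matches them in every detail (for example, which variance components enter the neighbourhood-level MOR) is an assumption, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Variance of the standard logistic distribution (level-1 variance
# under the latent-variable method).
LEVEL1_VARIANCE = math.pi ** 2 / 3


def icc_three_level(var_neighbourhood, var_country):
    """Latent-variable ICCs for a three-level random-intercept logit model."""
    if var_neighbourhood < 0 or var_country < 0:
        raise ValueError("variances must be non-negative")
    total = var_neighbourhood + var_country + LEVEL1_VARIANCE
    return {
        # Correlation between children in the same country.
        "country": var_country / total,
        # Correlation between children in the same neighbourhood
        # (and therefore also the same country).
        "neighbourhood": (var_country + var_neighbourhood) / total,
    }


def median_odds_ratio(variance):
    """MOR for a higher-level variance on the logit scale."""
    if variance < 0:
        raise ValueError("variance must be non-negative")
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1)
    return math.exp(math.sqrt(2 * variance) * z75)


# Hypothetical variance components, for illustration only.
icc = icc_three_level(var_neighbourhood=0.4, var_country=0.3)
mor_country = median_odds_ratio(0.3)
```

With a variance of zero the MOR equals 1, which matches the interpretation given above that MOR = 1 means no neighbourhood or country variance.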

Results
Sample characteristics. In Table 1, we present the distribution of the under-five children studied and the weighted prevalence of diarrhoea by country, region of the world and year of data collection, together with the number of neighbourhoods per country. The median number of neighbourhoods sampled per country was 555, ranging from 252 in Comoros to 28,321 in India.
Measurement of the prevalence of diarrhoea, special and common cause variations. As shown in Table 1 [...] Fig. 3 showed that only 10 (17.5%) countries were within the 99% control limits, indicating common-cause variation. Twenty-two (38.6%) countries were above the upper control limit and 25 (43.9%) countries were below the lower control limit, indicating special-cause variation (Fig. 3).

Children individual-level, neighbourhood-level and country-level characteristics.
The descriptive statistics by selected individual-level, neighbourhood-level and country-level characteristics are listed in Table 2. About a fifth (21%) of the children were infants, about half (51%) were male and most of their mothers were aged 25-34 years (52%). A third (32%) of the mothers had no formal education and 43% had at least secondary education, while only 17% belonged to households in the richest wealth quintile. Most of the mothers were currently employed (59%) and 81% were from male-headed households. Most (79%) of the children had drinking water from improved sources, only 45% had access to improved toilet types, 72% were from households that used unclean (biomass) cooking fuel and only 10% were from households whose floor, roof and wall materials were all improved.
On the neighbourhood-level factors, 66% of the children lived in rural areas, 49% were from communities with a high poverty rate, and 50% and 57% were from communities with high illiteracy and high unemployment rates, respectively. Three-fifths (59%) of the children were from countries with a high intensity of deprivation, and 44%, 42% and 14% were from countries with the lowest, middle and highest HDI, respectively. All the variables considered at the individual, neighbourhood and country levels were significantly associated with diarrhoea in the Chi-square tests and in the bivariate logistic regression models between each explanatory variable and diarrhoea. Hence, all the variables were candidates for the multivariable models. Table 3 presents the outputs of the different models explored in this study. In the fully adjusted model (Model V), in which we controlled for the effects of the individual-, neighbourhood- and country-level factors, child's age, child's sex, mother's educational attainment, mother's age, employment status, media access, source of drinking water, toilet type, marital status, housing material, cooking fuel type, weight at birth, birth order, place of residence (rural or urban), neighbourhood poverty, illiteracy and unemployment rates, and HDI were significantly associated with the odds of diarrhoea.

Discussion
Using the information provided by parents and guardians of 796,150 under-five children from 57 LMIC, we explored the factors associated with the experience of at least one episode of diarrhoea within 2 weeks preceding the survey dates in each of the countries. The proportion of children who experienced diarrhoea varied widely across the 57 countries, from 4% in Armenia to 29% in Afghanistan. Our major finding is that the factors that predispose children to diarrhoea are diverse and complex, comprising individual and household, neighbourhood, and country-level factors. These characteristics formed distinct blocks of compositional and contextual factors associated with diarrhoea. The compositional factors associated with higher odds of diarrhoea include being an infant, being male, living in a female-headed household, having a mother aged < 35 years, having a mother with only primary education, having an unemployed mother, having a never-married mother, belonging to a household in the lower wealth quintiles, and having no media access. Other significant compositional factors include drinking water from unimproved sources, use of unimproved toilet types, small weight at birth and high birth order. The contextual factors are residing in rural areas, living in communities with high poverty, illiteracy and unemployment rates, and living in countries with the lowest and middle HDI.
We found diarrhoea episodes to be more common among infants than among older under-five children. This is consistent with existing findings in the literature 1,17,23,31,32,54 and could be attributed to the more fragile anatomy of infants as well as the exclusiveness of breastfeeding 14. Particular attention should be paid to the prevention of diarrhoea among infants, as the higher number of cases among them has been linked with higher fatalities than among older children 55. We also found higher odds of having diarrhoea among male children compared with their female counterparts.
Similar differences have been identified in the literature 23, but they are at variance with the findings of Tetteh et al. that diarrhoea was higher among female children 56.
The odds of having diarrhoea decreased as mothers' age increased. The odds were higher among children whose mothers were aged 15-24 years and 25-34 years than among those born to women aged 35-49 years. Similar findings have been reported in the literature 57,58. These differences may be connected with the fact that teenage and young adult motherhood comes with challenges including neglect, limited resources and the likelihood of both the young mothers and their children contracting diseases 59. Also, older mothers are likely to be more experienced in preventing diarrhoea among under-five children. Therefore, age-specific interventions could be designed to prioritise younger mothers.
Educational attainment among mothers has been associated with childhood diseases including diarrhoea 60. Our findings generally suggested that children of mothers with limited educational attainment are more likely to have diarrhoea, as corroborated in the literature 10,30,32,57,60,61. The differences were most distinct between children whose mothers had only primary education and those whose mothers had secondary or higher education. This suggests that other factors interact with women's education in the likelihood of children having diarrhoea. Education alone may be insufficient to prevent diarrhoea; factors such as household wealth status, access to media, hygiene and sanitation, good water, rural-urban residence and women's age are also important. For instance, higher educational attainment is associated with better awareness of health education, including knowledge and guidelines on sanitation, hygiene, feeding and weaning practices 60.
The wealth status of the households to which the children belong appeared to play a dominant and consistent role in whether a child experienced diarrhoea across the LMIC studied. Our findings are in agreement with earlier reports 31,60,61. The odds of having diarrhoea rose steadily from households in the richest wealth quintile to those in the poorest. The likelihood of diarrhoea was generally 23% higher among children from households in the poorest wealth quintile than among those in the richest wealth quintile. The role of wealth, or at least purchasing power, in the knowledge and utilization of health care services and, by extension, in health outcomes has been documented 62. Fagbamigbe et al. reported that women from households in higher wealth quintiles have a higher likelihood of health care utilization in Nigeria 62. Wealth is a vital tool in gaining access to media, good sanitation and hygiene, and clean cooking fuel. To prevent diarrhoea in LMIC, there is a need to enhance the means of livelihood and alleviate poverty among mothers generally, since most people in these countries currently live below $2 per day 43. Livelihood enhancement and poverty alleviation strategies could include employment and better education.
In the current study, we found that access to improved sources of drinking water, use of improved toilet types, use of improved housing materials (floor, wall and roof) and use of clean cooking fuel in households were associated with lower odds of diarrhoea in LMIC. As noted by Fagbamigbe et al., poor hygiene and sanitation, including the use of unimproved toilets and water sources, have a direct pathway to diarrhoea 20. We could not assess the effect of "use of soap for hand-washing before meals and meal preparation" in this study because the information was not available for most countries. Nonetheless, our result is corroborated by findings from other studies, in which diarrhoea has been linked with hygiene, water and sanitation 1,10,54,60,63,64. Adequate practice and maintenance of good sanitation and hygiene can reduce the risk of diarrhoea. Efforts should be made to enhance the knowledge and capacity of women, and of households in general, to maintain good sanitation and hygiene, in addition to the use of improved housing materials and access to safe drinking water.
Health promotion and education on the prevention of diarrhoea are often disseminated through media such as radio, television and newspapers. Access to media on diarrhoea prevention has an indirect link to diarrhoea occurrence: media access improves knowledge about diarrhoea, which in turn enhances preventive and management practices 65. We found that children whose mothers had no access to at least one of these media sources had higher odds of developing diarrhoea. This finding is consistent with what has been reported in the literature 22,66. However, access to media could be limited by educational attainment, household wealth status and the availability of social infrastructure such as electricity, which is lacking in most households and communities across the LMIC.
Besides media, there may be a need to reach the mothers directly through local postnatal providers and peer education.
Also, children with low birth weights had higher odds of developing diarrhoea than those with normal birth weights, as reported by Bado et al. 54. Greater attention should be paid to the health needs and challenges of children with low birth weights to reduce their chances of developing diarrhoea and other childhood diseases. Children with low birth weight are more susceptible to morbidity and mortality; low birth weight may therefore lie on a pathway to diarrhoea. Using birth order as a proxy for current family size, we found that the odds of having diarrhoea increased consistently with birth order. Compared with first-born children, the increase was 8%, 18% and 32% among those of 2nd, 3rd and 4th or higher birth order, respectively. Similar findings that diarrhoea is more common among children in large households have been reported 1,61. This is plausible, as larger households can overstretch the limited resources at their disposal. Moreover, larger family size has been reported to be more common among households in lower wealth quintiles 67. This further corroborates our finding on the association between poverty and diarrhoea.
On the contextual factors, we found higher odds of diarrhoea in rural areas than in urban areas, as reported previously 66,68, but at variance with an Ethiopian study which reported higher odds in urban areas 10. These contradictions could be ascribed to the specifics of individual rural and urban areas. For instance, Kenya has a large slum within its capital city, Nairobi. Diarrhoea experience in such slums, with high population density within urban areas, could be higher than in rural areas with better and cleaner natural sources 70,71. Also, children from communities with high deprivation in terms of poverty, illiteracy and unemployment rates had a higher likelihood of experiencing a diarrhoea episode than children from advantaged communities 10,69.
In a similar pattern, children from countries with the lowest and middle HDI had higher odds of having diarrhoea than those from countries with the highest HDI. Of all the factors considered in this study, countries' HDI levels showed the strongest association with diarrhoea. While the odds of diarrhoea nearly doubled among children from countries in the middle HDI category, they nearly tripled among those from countries with the lowest HDI. This suggests that there are country-level contextual factors, in addition to compositional factors, that predispose children to diarrhoea. Our finding aligns with the previous findings of Mokomane et al. and Ahs et al. 17,22.
Our findings provide evidence of wide variations in the development of diarrhoea within and across the LMIC. Disadvantaged communities (those with high rates of unemployment, illiteracy and poverty) and countries (those with the lowest human development index) are the worst hit by diarrhoea. Efforts should be made to increase the overall well-being of every community, as children from more deprived communities, irrespective of differences in their compositional factors, all have higher odds of having diarrhoea than their peers from better-off communities. Enhancing the development of LMIC in all spheres will sustain human progress, reduce vulnerabilities and build resilience. As pointed out in earlier reports, there is a need for efficient and effective interventions to guide strategies that target risk factors unique to communities and countries 14. The implication of these findings for clinical practice is that clinical practice alone may be insufficient to reduce the incidence of diarrhoea. Besides an adequate platform for managing diarrhoea cases clinically, mothers' and community-level characteristics should be considered in designing strategies to reduce diarrhoea episodes among children. The contextual and compositional factors identified in this study are "modifiable" as far as diarrhoea preventive interventions are concerned. Through appropriate interventions, these factors could be targeted as a means of reducing the occurrence of diarrhoea among under-five children in LMIC.

Study limitation.
The data used for this study relied on mothers' and guardians'/caregivers' recall of diarrhoea episodes among their under-five children. This might have introduced recall bias through under-reporting or over-reporting of cases. However, DHS has incorporated check and control mechanisms to ensure the accuracy of data collected across the countries, so recall bias is unlikely to have materially affected the reliability of our estimates. The cross-sectional nature of the data prevented causal inferences. Nonetheless, the associations established with the risk factors are suitable for designing intervention strategies. Also, the secondary nature of the data limited our choice of community-level independent factors, but we were able to generate quality community-level variables to identify the contextual factors. Besides, we used only quantitative data; the availability of qualitative data could have helped to dissect the contextual and compositional factors better. These limitations could be addressed by collecting primary data that include both quantitative and qualitative data. The use of nationally representative data with proven reliability and integrity lends credence to our findings. The strength of our study lies in its ability to pool the diarrhoea experience of 796,150 children from 57 countries to arrive at our estimates and conclusions.

Conclusion
Diarrhoea remains a major problem in most of the LMIC studied. We identified diverse individual-level, community-level and national-level factors associated with the development of diarrhoea among under-five children in these countries. In all, we found the highest odds of diarrhoea among the poorest children from less-advantaged communities within countries with the lowest human development index. Thus, there is a need to reduce the incidence and prevalence of diarrhoea among under-five children to forestall a possible rebound in diarrhoea-associated mortality in the near future.
Recommendations. There is a need to reinforce diarrhoea prevention and control programmes at all levels (community, national and global) across the low- and middle-income countries to reduce the chances of an under-five child developing diarrhoea. In particular, interventions should include community-level health education and promotion on ways to avert diarrhoea, as these are the best measures to reduce its occurrence. Poverty alleviation through gainful employment and better education among women remains the gateway to necessary information on strategies to guard against diarrhoea. To achieve a meaningful reduction in the prevalence of diarrhoea, there may be a need to involve community and religious leaders to influence communal behaviour and practices that could enhance overall community sanitation.

Data availability
The data supporting this article is available at http://dhsprogram.com on request from the owners of the data.