Approximation of the Cox survival regression model by MCMC Bayesian Hierarchical Poisson modelling of factors associated with childhood mortality in Nigeria

The need for more pragmatic approaches to achieve sustainable development goal on childhood mortality reduction necessitated this study. Simultaneous study of the influence of where the children live and the censoring nature of children survival data is scarce. We identified the compositional and contextual factors associated with under-five (U5M) and infant (INM) mortality in Nigeria from 5 MCMC Bayesian hierarchical Poisson regression models as approximations of the Cox survival regression model. The 2018 DHS data of 33,924 under-five children were used. Life table techniques and the Mlwin 3.05 module for the analysis of hierarchical data were implemented in Stata Version 16. The overall INM rate (INMR) was 70 per 1000 livebirths compared with U5M rate (U5MR) of 131 per 1000 livebirth. The INMR was lowest in Ogun (17 per 1000 live births) and highest in Kaduna (106), Gombe (112) and Kebbi (116) while the lowest U5MR was found in Ogun (29) and highest in Jigawa (212) and Kebbi (248). The risks of INM and U5M were highest among children with none/low maternal education, multiple births, low birthweight, short birth interval, poorer households, when spouses decide on healthcare access, having a big problem getting to a healthcare facility, high community illiteracy level, and from states with a high proportion of the rural population in the fully adjusted model. Compared with the null model, 81% vs 13% and 59% vs 35% of the total variation in INM and U5M were explained by the state- and neighbourhood-level factors respectively. Infant- and under-five mortality in Nigeria is influenced by compositional and contextual factors. The Bayesian hierarchical Poisson regression model used in estimating the factors associated with childhood deaths in Nigeria fitted the survival data.

Individual-level factors. The following individual-level factors were included in the models: sex of the children (male versus female), maternal age in completed years (15-19, 20-24, 25-29, 30-39, 40-49 years), maternal education (no education, primary, secondary or higher); marital status (never married, living together/ married and widowed/divorced) and occupational status (currently working or not working), religious affiliation (Islam, Other Christians, Catholic and others); Ethnicity (Hausa/Fulani, Yoruba, Igbo/Ibobio and others); decision on mothers healthcare-seeking (respondent alone, both respondent and spouse, spouse alone); problem in accessing health care (big problem, not a big problem). Information on household income and expenditure www.nature.com/scientificreports/ was not collected in the 2018 NDHS. We, therefore, used DHS wealth index scores as a proxy indicator for households' socioeconomic position. The scores were aggregated from the household's assets ownership. We classified the scores into three tertiles (poorest, middle, and richest). Other variables were sources of drinking water (unimproved source versus improved source); toilet type (improved source or unimproved source), house material was aggregated from floor, wall and roofing materials (poor or good); type of birth (singleton or multiple); birthweight (average/higher range, small, very small); birth orders (1, 2-4, 4 +), birth intervals; (1st birth, < 36 months, 36 months +), postnatal care (no, yes); delivery mode (normal or caesarean); received tetanus injection (No, Yes).
Neighbourhood-level factors. We operationalized the term neighbourhood to describe clustering within the same geographical living environment. Neighbourhoods were based on sharing a common primary sample unit within the DHS data. The sampling frame for identifying the primary sample unit in the DHS-7 is usually based on two reasons. First, the primary sample unit is the most consistent measure of the neighbourhood across all the surveys 30,31 and thus the most appropriate identifier of the neighbourhood for this cross-state comparison. Secondly, the sample size per cluster in the 2018 NDHS meets the optimum size with a tolerable precision loss.
The following neighbourhood-level factors were included in the models: the place of residence (rural or urban area), neighbourhood poverty-, illiteracy-and unemployment rates. We categorised these rates into two categories: low and high, to allow for non-linear effects.
State-level factor. The 36 + 1 state-level data were collected from the reports published by the Nigeria National Population Commission 14 . We used the "percentage of rural population" in each state to categorise the states into three groups: 0% to 33.3% as low rural proportion; 33.4% to 66.7% as middle rural proportion and 66.8% to 100% as high rural proportion.

Collinearity
We diagnosed collinearity among the explanatory variables using a correlation matrix in an attempt to exclude highly correlated variables. As used in earlier studies, we set a cut off of r = 0.6. This cut-off has been described as having collinearity concern among highly correlated variables 27,32 . We found collinearity between household wealth status and housing material (r = 0.649), birth order and birth interval (r = 0.612) and between maternal age and birth order (r = 0.639). Housing material and birth order were removed from the multivariate analysis, as "household wealth index" and "maternal age" were adjudged more vital to investigate U5M and INM. Also, questions on who take decisions about healthcare utilization were asked from currently married women and those living with spouses which constitute 95% of all respondents. We considered the decision taking more important to U5M than marital status and therefore dropped marital status from the multivariable analysis.
Statistical analyses. Besides  The Poisson and the Cox proportional hazard (CPH) models. The Poisson model is an approximate model for Cox proportional hazard (CPH). The likelihood function of the CPH models with normal random effects is proportional to the likelihood of the random effects in the Poisson models 34,35 . Studies have reported that CPH models with normal random effects can be estimated as generalized linear models with a binary Poisson count response and a specific offset parameter 36,37 . The approximation of the CPH to the Poisson model requires that each observation in the data should be split into multiple records based on the complete set of failure times in the data set to have a counting process format and that the offset should be the logarithm of the length of each time interval. The baseline hazard is modelled as a smooth function of time, in our case a 4th order polynomial 38 . Typically, in the analysis of time-to-event data wherein interest is the investigation of the effect of p treatments (regarded as covariate effects on child mortality in this study). For the i th child, let x ip be the covariates. A standard CPH model can be applied using appropriate mathematical maximization procedures 39,40 in Eq. (1).
where β j is a vector of the coefficients of the explanatory variables, h 0 (t) is the baseline hazard function, h(t) h 0( t ) , the hazard ratio (HR). It is possible to split the follow-up time into k = 1, . . . , K i intervals. Assuming a constant hazard within each of these intervals, the Poisson model can be applied as shown in Eqs. (2) and (3) as noted by Crowther et al. 37 .
where d ik is the censoring indicator: 0 or 1 (child survived or died). This can be presented as a Poisson process for each child during each of the K intervals to count the numbers of occurrences within each interval of time. Ordinarily, d ik does not follow a Poisson distribution, but the above computational process ensured the correct www.nature.com/scientificreports/ form of the likelihood for a piecewise exponential model 37 . The definition of β j remain the same, k is the baseline hazard rate in k th time interval, γ ik is the time at risk, and forms part of the log(offset) in the linear predictor. By splitting the follow-up time at each unique event time and applying the Poisson model, an identical estimate of the treatment effect, β j , to that obtained from the CPH model could be obtained [34][35][36][37][38]41  MCMC Bayesian Hierarchical analysis. By extension, the models which allow for cluster-heterogeneity in the treatment effect can be applied to the survival data. The hazard function for the i th child, in the j th community nested in l . th state, can be formulated as shown in Eq. (4): where h 0 (t) remains as defined, β 01 is constrained to be zero, β 0j is the proportional effect on the baseline hazard function due to the j th community, β 1 is the mean log hazard ratio for the effects of the covariates and b 1j is the deviation of the log hazard ratio in the j th community from the population mean. It is on theasis that the model can be fitted using the Poisson-based Generalized Linear Model (GLMs) models. The 3-level Poisson model follows Eqs. (6) and (7).
Other numerical details have been described earlier 37,[43][44][45] . The following options were specified in the Markov Chain Monte Carlo (MCMC) analysis: Distribution: Poisson; link: log, thinning: 50, burning: 6000, chain: 50,000 and refresh: 500. We specified a 3-level model for binary response reporting infants mortality and under-five mortality, for a child i (at level 1), in a neighbourhood j (at level 2) living in a state k (at level 3). For each of INM and U5M, five Each of the models was based on the hierarchical logistic regression model with mixed outcomes consisting of the fixed and random parts as shown in Eq. (7).
The risk that child i of neighbourhood j from state k will die (INM/U5M) is denoted by γ ijk , U ojk is the random effect of daughters neighbourhood j in state k and V ok is the random effect of state k , e ijk is the noise such that e ijk ∼ 0, σ 2 e , U ojk ∼ 0, σ 2 U and V ok ∼ 0, σ 2 V in a model containing t covariates. We reported the measures of association as incidence rate ratios (IRRs) with their 95% credible intervals (CrI). Measures of variations were explored using the intraclass correlation (ICC) and median incidence rate ratios (MIRR) 46,47 . The ICCs represents the percentage of the total variance in the risk of child mortality that is related to the neighbourhood and state levels (i.e. a measure of clustering of risk of child mortality in the same neighbourhood and state) and is the equivalent of the variance partition coefficient (VPC) which measures the proportion of total variance which are accounted for at the neighbourhood σ 2 The MIRR is the estimate of the probability of child mortality attributable to neighbourhood and state context.

Results
Distribution of participating children, infant mortality and under-five mortality. As shown in Table 2, a total of 33,924 children data was available for analysis in the 2018 NDHS. Nearly two-fifths (39%) of their mothers were aged 30-39 years, 46% had no formal education, 39% had no access to media. About 51% of the children were males, 4% were of multiple births, 66% had drinking water from improved sources, 86% had average or higher birthweights while only 9% of the mothers could single-handedly make decisions on their healthcare access.
As shown in Table 1, for both the INM and U5M, maternal age, maternal education, media access, sex of child, multiple births, household wealth index, sources of drinking water, housing material, ethnicity, religion, weight at birth, birth order, birth interval, postnatal care, and tetanus injection were associated with underfive mortality. Also, the person who decides healthcare access, having problems accessing healthcare, mother

Infant mortality-measures of associations (fixed effects).
In the fully adjusted model, while controlling for the effects of individual-, neighbourhood-and state-level associated factors; maternal age, maternal education, multiple births, weight at birth, birth interval, who decides on healthcare access, problems accessing healthcare facilities, community illiteracy level, and proportion of the rural population within each state were associated with risk of infant mortality. The risk of infant mortality increased by 29% (IRR (incidence risk ratio): 1.29, 95% Credible Interval (CrI): 1.01 to 1.58) among mothers aged 40-49 years compared with those aged 25-29 years. The children from multiple births were nearly thrice (IRR = 2.73, 95% CrI: 2.07 to 3.52) more likely to have infant mortality. The children from mothers with no education or with primary education were 89% (IRR = 1.89, 95% CrI: 1.22 to 2.78) and 80% (IRR = 1.89, 95% CrI: 1.19 to 2.77) respectively more likely to experience infant mortality than those whose mothers had higher education. The risks of INM was 25% and 49% higher among those with very small and small birthweights compared with those with average or higher birth weight. The risk of INM increases by 18 among children whose healthcare seeking decisions were made by their fathers alone compared with when mothers made such decisions. Community illiteracy increases risks of INM by 20% while children from the states with a high percentage of the rural population had a higher risk (IRR = 1.31, 95% CrI, 1.01 to 1.89) of INM compared with those from states with a low rural population (Table 3).

Infant mortality-measures of variations (random effects).
The full model is the best of all the models as it had the lowest Bayesian Information Criterion (BIC). In Model V, there was a variation in the risks of INM across the states (σ 2 = 0.06, 95% CrI: 0.01 to 0.14) and across the neighbourhoods (σ 2 = 0.18, 95% CrI: 0.06 to 0.31). Going by the intra-state and intra-neighbourhood correlation coefficient, 1.82% and 7.00%, the variance in risk of INM could be attributed to state-and neighbourhood-level factors, respectively. The median incidence rate ratio (MIRR) estimates also confirmed evidence of societal contextual (MIRR = 1.43, 95% CrI:

Under-five mortality-measures of associations (fixed effects).
In the fully adjusted model while controlling for the effects of individual-, neighbourhood-and state-level factors; maternal education, multiple births, household wealth status, birth interval, who decides on healthcare access, having a big problem getting    www.nature.com/scientificreports/ to healthcare facility, community illiteracy level, and proportion of the rural population with each state were associated with the risk of U5M. The risks of under-five mortality doubled (IRR = 2.14, 95% CrI: 1.51 to 3.03) among mothers with no education compared with those that had higher education. The children from multiple births were over 100% (IRR = 2.30, 95% CrI: 1.82 to 2.78) at the risk of U5M compared with the singletons. The risk of U5M increased in households in the poorest (60%) and middle (44%) wealth tertiles compared with those from the households in the richest tertiles. The risk of U5M increased by 8% among children whose healthcare seeking decisions were made by their fathers alone compared with when mothers make such decisions. Community illiteracy increases the risk of U5M by 19% while children from the states with a high rural population had a higher risk (IRR = 1.32, 95% CrI: 1.01 to 1.89) of U5M compared with those from states with a low rural population.

Under-five mortality-measures of variations (random effects).
In null model (Model I), there was a distinct variation in the risk of U5M across the states (σ 2 = 0.34, 95% CrI: 0.20 to 0.58) and across the neighbourhoods (σ 2 = 0.21, 95% CrI: 0.15 to 0.28). The estimated intra-state and intra-neighbourhood variance partition coefficient was 8.9% and 14.5% respectively, indicating that the variance in risks of U5M could be attributed to state-and neighbourhood-level factors. However, the full Model was the best of all the Models as it had the lowest Bayesian DIC. The MIRR estimates also confirmed evidence of societal (state) (MIRR = 1.43, 95% CrI: 1.25 to 1.64) and contextual (neighbourhood) (MIRR = 1.42, 95% CrI: 1.30 to 1.55) phenomena driving of U5M in Nigeria (Table 4). From the full model (Model V), it was estimated that if a child moved to another state or neighbourhood with a higher probability of U5M, the increase in their risk of U5M would be 3.90% (95% CrI: 1.58% to 7.11%) and 7.74% (95% CrI: 3.83% to 12.6%) respectively. Compared with Model I, Model V showed that the total variation in the risk of U5M explained by the state-and neighbourhood-level factors were 59.6% and 35.4% respectively. The deviance and parameter chains for the full model is shown in Supplementary Figure   www.nature.com/scientificreports/ hazard model using the Bayesian MCMC procedure. The procedure was carried out by (i) splitting the follow-up time into intervals, (ii) obtained the number of events within each interval and (iii) estimated the random and fixed effects of childhood mortalities. The Bayesian hierarchical Poisson regression model used in estimating the factors associated with childhood deaths in Nigeria fitted the survival data.The estimates were robust and computation time reduced, similar to the conclusions of Crowther et al. 37 . However, the MCMC graphical diagnostics, in some cases, showed correlations between successive simulated chains and low convergence rates. Particularly, the convergence of the model at the neighbourhood level for the infant mortality parameter estimates was low although with large lags but the auto-correlation function (ACF) plots of the neighbourhood estimates of the U5M and the ACF plot for both the infant mortality and U5M parameter estimates at the state level had large lag and achieved convergence. The outstanding case of the low convergence for the infant mortality estimates at the neighbourhood level is a limitation in this study. The low convergence could be attributed to low sample sizes within some clusters (neighbourhoods).
Overall, our analysis revealed abysmally high infant and under-five mortality rates nationally with the associated individual-, neighbourhood-and state-level factors. On controlling for these factors, INM and U5M was higher among children with first-order birth, less than three years birth interval; smaller birth weights, multiple births, fathers' sole decision making on healthcare seeking, community illiteracy, living in states with average to higher rural proportion. Additionally, lack of and low maternal educational attainment and accessing healthcare being highly problematic care were predictors of INM and U5M. Older maternal age (40-49 years) was associated with an increased incidence of INM. Moreover, having secondary level education, poor and middleincome wealth tertiles were associated with an increased incidence of U5M. Community illiteracy and accessing healthcare being highly problematic had a marginal effect on both INM and U5M unlike fathers' sole decision making on healthcare-seeking which had only a marginal effect on the increase in INM. Notably, child's sex, rural residence, ethnicity and media access did not influence the incidence rate ratio of INM and U5M.
Proportionately, low maternal educational attainment and higher rurality of a state had twice influence on the occurrence of INM 29,48,49 . A study conducted by Yaya et al. reported a higher risk of childhood and U5MR with low maternal education, poor household wealth index and rural-urban disparity in Nigeria 20 .
Prior studies corroborated the relationship between older maternal age with both IMR and U5M 20,50 . The risk of increased INM and U5M were twice and thrice likely in birth plurality in this study. While the risk of INM was higher among children with small or very small birth weights it was not associated with U5M. The relationship between small birth size, a known feature of multiple births and INM has been established 51 . Prior studies corroborated the association between first-order birth and short birth intervals and increased risk of INM and U5M 11,48,50,51 .
Studies have indicated associations between both the INM and U5M and composite factors such as maternal age, mothers' education, place of residence, child's sex, birth interval and weight at birth 11,21 . Though prior study reported female infants are more likely than males to survive; child sex had no influence on these indices in this study 50 . Biological and genetic factors have been hypothesized as probable underlying factors for the association between male gender and higher U5M 48,52,53 .
The spousal sole healthcare decision making and its influence on INM and U5M was established in this study. This has been a long time challenge in northern Nigeria. Similar findings have reported. For instance, Adhikari et al. reported infants whose mothers were involved in healthcare decision-making had 25% lower odds of dying in Nepal 54 . Maternal lack of decision making power on child healthcare without prior consent from the spouse or a representative household head, for example, the mother-in-law is rooted in socio-cultural and religious norms in northern Nigeria, a sensitive issue but needs to be addressed. Obasahon et al., in their analysis of 2013 Nigeria DHS reported that odds of utilizing antenatal care services increased about four-folds among women with higher decision-making autonomy 55 . A parallel can be drawn with child healthcare. There is a need to expand and accelerate male involvement in child healthcare. Women Influencing Health, Education, and Rule of Law (WIHER) in Bauchi state, Nigeria which engages men in their prime on gender equality is a step in the right direction and this could be adopted 56 .
Media provides information including healthcare-related ones. Maternal use of traditional media such as newspaper/magazine, radio and television) is associated with a reduced risk of U5M 27  Infant and under-five mortality rates exhibit high variability across the country 21 . Ogun state had the lowest INMR and U5MR while Kebbi state had the highest INMR and U5MR. Additionally, Anambra (South-East), Benue (North-Central), Bayelsa and Delta (South-South) had low INMR. Kaduna (North-West) and Gombe (North-East) had very high INMR. It is of extreme concern why states in the south (Ekiti and Imo) are still within the high INMR bracket and this brings to fore the need to mitigate the identified risk factors. Currently, there is no respite with INM, as no geopolitical region in Nigeria is exempted from high INM. Thus, these findings could drive initiatives for and access to optimal and skilled prenatal and natal care and other child survival strategies to reduce INM nationally unlike the prior perception that these indices are worst in northern Nigeria. Moreover, most states in the northern region still harbour the majority of high INMR, 14 out of 17 in this study. This may be due to a higher proportion of rurality in northern states. Adewuyi et al. reported in the context of rural residence that states in the north-eastern and north-western geopolitical regions had higher INMR 51     www.nature.com/scientificreports/ care seeking, low acceptability of family planning practices and poor perception of child spacing, resistance to childhood immunization resulting in its low uptake, and insurgency 16,17,20 . Nationally, U5MR is on the increase in the last five years, unlike the abating trend earlier reported 11,16 . A difference exists between rural and urban setting based on access to social amenities such as health infrastructure and level of available healthcare, good roads and water supply 57 . States with a higher rural population had higher INMR and U5MR. Access to healthcare utilization remains a predictor of INM and U5M. This reiterates the need for structural and manpower development as important factors in strengthening and improving health service delivery which is a building block in achieving SDGs 3.
There is a need to continue ongoing efforts to address high INM and U5M in Nigeria, especially in the northern states, to achieve child health-related sustainable development goals 58 . Moreover, it is equally important to have a better understanding of ongoing pregnancy and child health initiatives that are being implemented in Anambra and Ogun for others states in the South-East and South-West regions to leverage and implement to reduce current INM and U5M.
Study limitations and strengths. The study was based on a cross-sectional analysis and thus causality can not be ascertained. It should also be noted that this study was unable to cover neonatal mortality based on a time constraint and the complexity involve in its computation. The authors accepted that neonatal mortality, especially early neonatal mortality as one of the critical area that has not seen any improvement since 2008 in Nigeria 16 .
This analysis has, however, offered an in-depth view of the variability of incidence rates of INMR and U5MR across states and provides a vital opportunity for monitoring progress with the implementation of ongoing child survival strategies. The study will serve as a baseline for further research aiming at understanding the contextual factors associated with child mortality in Nigeria at a different level in the society. The results also provide baseline information for interventional research aiming at meeting the global agenda in the nearer future. Notably, this study identified differences in INMR and U5MR across states and thus, provides an opportunity for comparative informed decision making. Other states within the same geopolitical region could leverage effective interventions in a high performing state which resulted in low child mortality, to improve on their current child survival strategies and mortality indices. For instance, Ogun state has the lowest INMR and U5MR but Ekiti state within the same region had high indices. Ekiti state could learn and implement what worked and is working in Ogun state.

Conclusions
This study identified variability of INM and UM5 across states and regions in Nigeria, the highest being in the northern region based on the 2018 NDHS. The lack of and low maternal educational attainment and experience of problems accessing healthcare, first birth order and short birth interval; smaller birth weights, multiple births, fathers' sole decision on healthcare seeking, community illiteracy, and living in states with average to higher rural population proportion were determinants of increased risk of high INM and UM5. Older maternal age-predicted INM while the increased U5M was linked to secondary level education, poor and middle-income wealth tertiles. The pervading high infant and under-five mortality rates call for urgent attention from the federal and state governments in Nigeria and developmental partners to address the identified drivers leveraging on lessons from other states with improved indices. Rural-urban disparity across the states calls for development, equity and optimal access across healthcare and social sectors to attain child health-focused SDG 3.2.