Prediction of incomplete immunization among under-five children in East Africa from recent demographic and health surveys: a machine learning approach

Tadese, Zinabu Bekele; Nigatu, Araya Mesfin; Yehuala, Tirualem Zeleke; Sebastian, Yakub

doi:10.1038/s41598-024-62641-8

Download PDF

Article
Open access
Published: 21 May 2024

Prediction of incomplete immunization among under-five children in East Africa from recent demographic and health surveys: a machine learning approach

Zinabu Bekele Tadese¹,
Araya Mesfin Nigatu²,
Tirualem Zeleke Yehuala² &
…
Yakub Sebastian³

Scientific Reports volume 14, Article number: 11529 (2024) Cite this article

316 Accesses
Metrics details

Subjects

Abstract

The World Health Organization as part of the goal of universal vaccination coverage by 2030 for all individuals. The global under-five mortality rate declined from 59% in 1990 to 38% in 2019, due to high immunization coverage. Despite the significant improvements in immunization coverage, about 20 million children were either unvaccinated or had incomplete immunization, making them more susceptible to mortality and morbidity. This study aimed to identify predictors of incomplete vaccination among children under-5 years in East Africa. An analysis of secondary data from six east African countries using Demographic and Health Survey dataset from 2016 to the recent 2021 was performed. A total weighted sample of 27,806 children aged (12–35) months was included in this study. Data were extracted using STATA version 17 statistical software and imported to a Jupyter notebook for further analysis. A supervised machine learning algorithm was implemented using different classification models. All analysis and calculations were performed using Python 3 programming language in Jupyter Notebook using imblearn, sklearn, XGBoost, and shap packages. XGBoost classifier demonstrated the best performance with accuracy (79.01%), recall (89.88%), F1-score (81.10%), precision (73.89%), and AUC 86%. Predictors of incomplete immunization are identified using XGBoost models with help of Shapely additive eXplanation. This study revealed that the number of living children during birth, antenatal care follow-up, maternal age, place of delivery, birth order, preceding birth interval and mothers’ occupation were the top predicting factors of incomplete immunization. Thus, family planning programs should prioritize the number of living children during birth and the preceding birth interval by enhancing maternal education. In conclusion promoting institutional delivery and increasing the number of antenatal care follow-ups by more than fourfold is encouraged.

Safety outcomes following COVID-19 vaccination and infection in 5.1 million children in England

Article Open access 27 May 2024

Post-COVID conditions following COVID-19 vaccination: a retrospective matched cohort study of patients with SARS-CoV-2 infection

Article Open access 22 May 2024

A guide to vaccinology: from basic principles to new developments

Article 22 December 2020

Introduction

The national Immunization program is one of the most economically advantageous health treatments, with tested methods for reaching the most vulnerable and difficult-to-reach groups in both developing and developed countries^1,2,3,4. World Health Organization (WHO) launched the Expanded Program Immunization (EPI) in 1974 to ensure universal access to all vaccines for all targeted groups, including children, adolescents, and adults⁵. According to the WHO guidelines, the national EPI now aims to immunize infants between the ages of 0 and 23 months (about 2 years) against eight vaccine-preventable childhood illnesses, such as one dose of measles, three doses of polio, one dose of Bacillus Calmette-Guerin (BCG), and three doses of pentavalent⁶. Hence, a child is fully vaccinated if he/she has received all eight doses of vaccination listed above.

According to the United Nations Children’s Fund (UNICEF) and WHO report in 2019, almost 20 million children were either unvaccinated or had incomplete immunization, making them more susceptible to mortality and morbidity⁷. Of the 20 million children worldwide, who missed vaccination in 2019, over 60% were from 10 countries, many of whom live in countries with weak health systems^7,8. The global under-five mortality rate declined from 59%, which was 93 deaths per 1000 live births in 1990, to 38% in 2021, due to huge portion of immunization⁹. COVID-19 pandemic and related disruptions have put a burden on health systems, resulting in 25 million children missing out vaccinations in 2021, a number that is 5.9 million higher than in 2019 and the largest amount since 2009¹⁰.

Although WHO's goal is to make vaccination services accessible to everyone worldwide by 2030, about 13.5 million children did not receive the first dose of a vaccine due lack of access to vaccination services¹¹. Ethiopia had over 10.9 million children under the age of one who missed the first dose of measles between 2010 and 2018, which was the highest amount¹². Through effective vaccination programs, COVID-19 demonstrates the vital role that vaccines play in illness prevention, lifesaving, and promoting a wealthier future¹³.

Even though that Africa has made remarkable progress in immunization services, according to the 2013 immunization data report, vaccine coverage was 75 percent, and Ethiopia has the second largest number of unvaccinated children in the region, next to Nigeria¹⁴. More children in Africa have lost their immunizations in recent years as the number of births has increased and immunization programs have stagnated¹². In Ethiopia, the prevalence of complete childhood vaccination status among children aged 12–23 months increased from 24.6 to 39% between 2011 and 2016, respectively¹¹. Despite this, according to a recent systematic review and meta-analysis report, one in two children was not vaccinated or four out of ten children had incomplete vaccine in Ethiopia¹⁵.

Several researches have been conducted to investigate the potential factors associated with incomplete immunization through the application of classical statistical analysis techniques^{16,17,18,19,20} based on prior assumptions that could limit the potential to discover hidden knowledge. In contrast, machine learning algorithms are designed to make the most accurate predictions possible, enabling systems to learn from data rather than making prior assumptions²¹. There are still high rates of incomplete childhood immunization, which require further investigation to prioritize and promote childhood vaccination to ensure the health and well-being of all children in east Africa. Therefore, this research was aimed to predict incomplete immunization among under-five children in East Africa using machine learning algorithms.

Methods

Study design and setting

Demographic and Health Survey (DHS) used population based cross-sectional survey study design to collect data and this study employed predictive modeling approach. Secondary data of six east African countries namely Burundi, Ethiopia, Madagascar, Uganda, Rwanda, and Zambia DHS dataset from 2016 to the recent 2021 were considered for this analysis.

Source and study population

Source population includes all mothers aged 15–49 years who had children under the age of five while all mothers aged 15–49 years who had children under the age of five and started immunization for their children were considered as source population.

Inclusion criteria

Mothers with children aged 12–35 months who had begun immunization were included in the study.

Data source, Sample size and sampling procedure

Data source

Data was obtained from the MEASURE of DHS program²². The DHS is a nationally representative survey that collects data on basic health indicators such as mortality, morbidity, family planning service utilization, fertility, maternal and child health services (vaccination). Each country’s survey consisted of different datasets including men, women, children, birth, and household datasets.

Sample size determination and sampling procedure

A total of 27,806 weighted sample and 27,691 actual sample were considered from six east African countries (Burundi, Ethiopia, Madagascar, Uganda, Rwanda, and Zambia) as shown in Table 1.

Table 1 Sample size determination for incomplete immunization in east Africa DHS 2016–2021.

Full size table

DHS used two stages of stratified sampling technique to select study participants. In the first stage, Enumeration Areas (EAs) were randomly selected whereas in the second stage households were selected. The survey datasets were accessed through the web page of the International DHS Program after subscription and appropriate letter is acknowledged.

Study variables

Incomplete immunization in children under the age of five were outcome variable categorized as 1 = Yes (children who had not completed the full dose of vaccination) and 0 = No (those who had received the full dose of vaccination). Baseline explanatory variables were selected from previous studies^{14,16,18,23,24,25,26,27,28,29,30}. Thus, sociodemographic factors include mothers age, marital status, mothers’ occupation, mothers’ educational level, husband education, place of residence and sex of household head. socioeconomic factors include wealth index and media exposure while reproductive(obstetrics) history factors include mothers’ history of ANC follow-up, place of delivery, sex of child, number of living children, birth order, child size at birth, PNC visit and preceding birth interval were independent variables.

Operational definition

Incomplete immunization: “children who started vaccination and missed at least one dose from eight recommended vaccination at any time instance between 1 and 12 months”^31,32.

Complete immunization: “when children had been vaccinated for all recommended vaccination (one dose of BCG, three doses of polio, three doses of pentavalent, and one dose of measles)”³².

Data management and analysis

Data extraction was carried out using Stata version 17, and then imported to Jupyter Notebook for further analysis. Sample size weighting was used to draw valid inference. Data were thoroughly cleaned, and missing values were imputed to ensure completeness. Outlier detection was performed to identify and remove extreme values that could have skewed the analysis. Python 3 programming language in Jupyter Notebook using imblearn, sklearn³³, XGBoost³⁴ and SHAP³⁵ packages were utilized to perform the necessary calculations and analysis.

Machine learning framework for prediction of incomplete immunization

A general framework utilized in earlier research³⁶ was created (Fig. 1) based on Yufeng Guo's seven machine learning processes³⁷, to predict incomplete vaccination. All machine learning algorithms and techniques were implemented using Python version 3.10.11 programming language in Jupyter Notebook.

Data collection and preprocessing method

The dataset for this study was extracted from Demographic and Health Survey website and obtained upon a formal request after subscription and registration on their system. A total actual sample of 27,691 under-five children who started vaccination then appropriate data preparation was performed to make data suitable for ML task.

Missing data were managed using various imputation procedures to fill incomplete fields with statistically relevant substitutes. The k-nearest neighbor (KNN) technique has proven to be typically effective for missing value imputation³⁸. In this study a simple imputer class of scikit-learn module mode for categorical data and KNN for numerical data were used for imputing missing values in the dataset. Outliers were identified using a boxplot and replaced using the Interquartile Range (IQR) scores for the next step.

Before fitting the ML model, feature engineering was applied. Among various data transformation techniques, we used One Hot Encoder and label Encoder to encode categorical variables into numeric values and min–max normalization technique was used for scaling. Standard balancing strategies including random under-sampling, random over-sampling, and the Synthetic Minority Oversampling Technique (SMOTE) were tested to address the unbalanced categories of the outcome variable. As a result, SMOTE outperformed the other resampling techniques on baseline model.

Following feature engineering dimensionality reduction was applied. High-dimensional data may contain a lot of redundant and useless information, which might seriously reduce how well learning algorithms work³⁹. The mutual information and variance threshold from filter method, Recursive Feature Elimination (RFE) from wrapper method and Boruta feature selection method were tested and compared their performance on baseline model for feature selection technique.

Since every ML need training and test dataset, data split was allocated as 80% for training and 20% for testing. The popular k-fold cross validation approach was utilized to ensure the performance of the model because the train-test split function method has disadvantages that it might result in the data being over-fitted or under-fitted on splitted data. In K-fold method, the dataset is split into ‘k’ sub-samples, in which one sample is used for testing and the rest of the k − 1 data set is used for training purpose³³.

Model development methods

The dataset used in the analysis falls under the category of binary classification since incomplete immunization is categorized into two mutually exclusive categories. Accordingly, eight classification algorithms (Logistic Regression, Random Forest, K-nearest neighbor (KNN), Artificial Neural Network, Support Vector Machine, Naïve Bayes, eXtreme gradient boosting (XGBoost), and Decision tree) were fitted for this study. These methods were chosen based on prior research that used machine learning techniques for classification tasks using DHS data, with each country's performance taken into account^{40,41,42,43,44,45,46,47}.

To verify the algorithm’s performance in terms of classifications, a confusion matrix (also known as an error matrix) and Jaccard score is used. It summarizes the actual and predicted classifications of a dataset and shows the number of correct and incorrect predictions, which are further categorized into true negatives, false negatives, true positives, and false positives. Additionally, the importance and effect of each variable's contribution on the outcome were identified using SHapley Additive exPlanations (SHAP). SHAP is a game theoretic approach to explain the output of any machine learning model that connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions³⁵. Furthermore, receiver operating characteristic curve AUC was used for visualizing summary of performance ML models. Detail of confusion matrix were adapted from⁴⁸ and presented as follows:

	Predicted positive	Predicted negative
Actual positive	True positive (TP)	False negative (FN)
Actual negative	False positive (FP)	True negative (TN)

Building an efficient ML model is thought to depend heavily on tuning hyper-parameters, especially for tree-based ML models that include a lot of hyper-parameters⁴⁹. It might be challenging to determine what values to use for a particular algorithm's hyperparameters on a specific dataset while the process of searching terminates when predefined criteria are satisfied. For this study, hyper-parameter tunning was done using grid search methods. Finally, supervised ML uses the highest-performing classifier with a defined performance to predict incomplete immunization based on identified independent factors.

It is not over yet because we are still ignorant of the precise feature categories connected to incomplete immunization. For this purpose, rule generation was done using best performed model. Association rules are IF–THEN rules that are particularly significant since they are simple to understand and limit the attributes chosen for the model during rule generation to those that are pertinent⁵⁰.

Ethics considerations and consent to participate

Since the study was a secondary data analysis, participant consent was not necessary. Permission for data access has been granted from the Demographic and Health Survey (DHS) measure through an online platform by filling all requirements needed to access data from http://www.dhsprogram.com. The IRB-approved procedures for DHS public-use datasets do not allow respondents, households, or sample communities to be identified. There are no names of individuals or household addresses in the data files.

Results

Sociodemographic characteristics

This study included a total weighted sample of 27,806 children under five. According to the data about 79.54% of participant mothers reside in rural areas. Sex of household head: Males accounted for 79.22% of the household heads. About 20% were female, approximately 85% were married, and only 6.23% and 9% single and widowed or divorced, respectively. Of all mothers in the total country, only 3.62% were professional workers, and most of them (71.78%) were not professional workers. Above half, (55.35%) of husbands took primary education, and still, 21.37% have no regular education. Of all, only 23.28% completed secondary and above level education. Surprisingly, half (49.32% of mothers) took primary education, and 27.25% and 21.37% had no regular education and completed secondary or above by level education, respectively. A summary of sociodemographic characteristics is shown in Table 2 and (Fig. 2) for mothers' ages.

Table 2 Sociodemographic characteristics of incomplete immunization among under-five children in east Africa DHS 2016–2021.

Full size table

Reproductive (obstetrics) history characteristics

The data reveals that the majority of children born to the participants were male, accounting for 51.02% of the total number. Furthermore, most participants did not receive PNC checkup, which is concerning, as it is an important aspect of postnatal care. However, it is reassuring to note that 21.38% of the participants did receive PNC checkup. In terms of place of delivery, most deliveries took place in a health institution, which is a positive indicator of access to healthcare services. However, it is important to note that home deliveries still accounted for a considerable proportion of births, at 32.86%. Finally, the majority of children were of average size at birth accounts 68.93%, while 21.14% were small and 9.93% were large. A summary of reproductive history characteristics is shown in Table 3.

Table 3 Reproductive (obstetrics) history characteristics of incomplete immunization in east Africa DHS 2016–2021.

Full size table

Socioeconomic characteristics

According to the data, the wealth index of the participants was distributed as follows: 48.35% were poor, 34.65% were middle class, and 16.99% were rich. It is important to note that many participants were classified as poor, which could have implications for their access to healthcare and other essential services. Regarding media exposure, most participants had access to media, accounting for 51.62% of the total number. However, it is concerning that almost half (48.38%) of the participants did not have access to media, which could limit their access to important health information and education.

Machine learning analysis of incomplete immunization

This study tried to do feature selection using different techniques to reduce the number of features, as shown in (Fig. 3). Mutual information and variance threshold from the filter method, Recursive Feature Elimination (RFE) from the wrapper method, and Boruta feature selection method were tested and compared their accuracy on baseline model. Despite testing various methods, the highest accuracy was achieved when all features were included in the model development process. This may be attributed to the fact that the original features were already extensive and informative, thus including all of them resulted in the best performance.

Model development and evaluation

Data were splitted into training and test data after being cleaned and balanced. We allocated 80% of the data for training and 20% for testing. Then we developed eight ML models to predict incomplete immunization. All models were fitted on both unbalanced data and balanced data. Finally, each model’s performance was evaluated and compared in the test set before and after balancing in order to select the best predictive model. Accordingly, high performance was achieved after balancing the target variable shown in Table 4.

Table 4 Model performance comparison.

Full size table

After applying the SMOTE balancing technique, the results showed that the random forest and XGBoost models were the best predictive models, having the same performance with an accuracy of 78.34%, f1-scores 76.76%, and Jaccard scores 62.29% for random forest and accuracy 78.78%, f1-score 76.24%, and Jaccard scores 61.16% for XGBoost.

The model that performs best on balanced data was exposed to hyperparameter tuning, which is random forest and XGBoost classifier models. Since both models have roughly the same performance in this study, hyperparameter tuning was applied using the grid search approach to both the random forest classifier and the XGBoost classifier in order to ensure the best model. A Grid Search method with ten-fold cross validation was used to optimize the hyper-parameters of ML models. Since it is not straight forward to select best parameter, ‘criterion’: ‘entropy’, ‘n_estimator’:100, 200, 500, ‘max_depth’: None, 5, 10, ‘max_features’:‘sqrt’, ‘log2’, None were searched and 'max_depth': None, 'max_features': 'log2', 'n_estimators': 500 ‘random state = 0’ were pulled for random forest model. While 'n_estimators': [100, 200, 500], 'max_depth': [3, 5, 10], 'learning_rate': [0.1, 0.01, 0.001], 'subsample': [0.8, 1.0], 'colsample_bytree': [0.8, 1.0], 'random_state': [0, 42] were searched and 'colsample_bytree': 1.0, 'learning_rate': 0.1, 'max_depth': 5, 'n_estimators': 500, 'random_state': 42, 'subsample': 0.8 were pulled for XGBoost model. After implementing this, XGBoost was still able to outperform random forest, therefore it was employed as a prediction model.

Visualization of feature importance

While classical analysis is more structured and relies on pre-defined rules and formulas like p-value cut point to select significant features, machine learning algorithms are designed to adapt and learn from data. Although ML models are often considered as black boxes because it is difficult to interpret why an algorithm provides accurate predictions on particular problem⁵¹; therefore, we introduced the SHAP value in this study. SHAP is a unified framework proposed by Lundberg and Lee⁵² to interpret ML predictions, and it is a new approach to explain various black-box ML models. We leveraged SHAP to explain our predictive model, which includes related predicting factors that lead to incomplete immunization. The importance of predictors is evaluated by the mean SHAP value, as shown on (Fig. 4). Features with a long bar located at the top are highly related to incomplete immunization. Results from feature importance showed, the number of living children during birth, ANC follow-up history, maternal age, place of delivery, birth order, and preceding birth interval were associated with a higher predicted probability of incomplete immunization among under-five children.

From Fig. 5, the feature ranking (y-axis) indicates the importance of the predictive model. The SHAP value (x-axis) is a unified index that responds to the influence of a certain feature in the model. In each feature important row, the attributions of all variables to the outcome were drawn with dots of distinct colors, where the red dots represent the high-risk value, and the blue dots represent the low-risk value. Features pushing the prediction higher are shown in red, those pushing the prediction lower are in blue. Furthermore, the rest of the other variables had slightly significant effect to low effect on incomplete immunization.

Predicting incomplete immunization

After training, 5698 test samples were used to evaluate the XGBoost model's performance. Out of 2856 incomplete immunization status, the model predicted 2567 of them correctly as incomplete (true positive). And out of 2842 complete, the model predicted 1935 of the as complete (true negative). But the model misclassified 907 true complete immunization status as incomplete (false positive) and 289 true incomplete as complete immunization status (false negative) as shown on (Fig. 6). Overall, the model predicted with an accuracy of 79.01%, recall of 89.88%, F1-score of 81.10%, and 73.89% precision on test data.

$${\mathbf{Accuracy}} = {\text{ TP }} + {\text{ TN }}/ \, \left( {{\text{TP }} + {\text{ FP}} + {\text{FN}} + {\text{TN}}} \right) \, = > { 2567} + {1935 / 2567 } + {289} + {9}0{7} + {1935 } = {\mathbf{79}}.{\mathbf{01}}\%$$

$${\mathbf{Precision}} = {\text{ TP }}/ \, \left( {{\text{TP }} + {\text{ FP}}} \right) = > {2567/ 2567 } + {9}0{7 } = {\mathbf{73}}.{\mathbf{89}}\%$$

$${\mathbf{Recall}} = {\text{ TP }}/ \, \left( {{\text{TP }} + {\text{ FN}}} \right) = > {2567/ 2567 } + {289 } = {\mathbf{89}}.{\mathbf{88}}\%$$

$${\mathbf{F1}}\,{\mathbf{score}} = \, \left( {{2 } \times {\text{ Precision }} \times {\text{ Recall}}} \right) \, / \, \left( {{\text{Precision }} + {\text{ Recall}}} \right) \, \left( {{2} \times 0.{8988} \times 0.{7389}} \right) \, / \, 0.{8988 } + \, 0.{7389 } = {\mathbf{81}}.{\mathbf{10}}\%$$

Area under the receiver operating characteristic curve (AUC) (Fig. 7) was used to summarize model performance overall thresholds and thus misclassification error weightings. XGBoost model produced an area under the curve of 66% on unbalanced data, whereas after balancing and hyperparameter tuning, the prediction on test data produced an area under the curve of 86% which indicates a good predicting model. Below the figure green line shows the model after balancing and tuning while the orange line shows AUC on unbalanced data.

Association rule mining

Association rule mining is a technique used to discover interesting relationships between variables in large datasets⁵³. For this study association rule mining was done using Apriori algorithm to identify the precise category that is linked with incomplete immunization. Before applying association rule mining data discretization was performed for the variables that were not categorical at all. Thus, mothers’ age was categorized as (15–24, 25–34, 35–49). Number of living children categorized as (1–3, 4–6, and > 6). Preceding birth interval categorized as (< 25, 25–48, > 48). ANC follow up categorized as (no visit, 1–4 and > 4). Birth order is categorized as (1st, 2nd and 3rd and above 3rd). Apriori algorithm produces 13 rules connected to target category 1 that replace incomplete immunization status with more than 70% confidence level, but only four rules with more than 85% confidence level were generated.

Rule1 IF ('mothers_age_15-24’, ‘delivery_place_home', ‘ANC_follow_1-4' THEN target_1 confidence 89.9% lift 1.696.

Rule2 IF ('mothers_age_15-24', ‘ANC_follow_1-4', 'mothers_edu_Nedu’, ‘delivery_place_home') THEN target_1 confidence 87.4% lift 1.695.

Rule3 IF 'mothers_age_15-24', 'delivery_place_home', 'mothers_edu_Nedu') THEN target_1 confidence 85.58% lift 1.678.

Rule4 IF ‘ANC_follow_1-4','delivery_place_home', 'mothers_edu_Nedu') THEN target_1 confidence 85.5 lift 1.

Discussion

Recently COVID-19 pandemic and related disruptions have put a burden on the health systems. This results in 25 million children missing out vaccinations in 2021, a number that is 5.9 million higher than in 2019 and the largest amount since 2009¹⁰. This study was conducted to predict top risk factors of incomplete immunization among children under five. Eight supervised machine learning algorithms were trained on both balanced and imbalanced data for prediction purposes. The performance of those eight ML models was compared by their classification accuracy, f1-score and Jaccard score. SMOTE's data balancing approach outperformed models developed using unbalanced data in terms of accuracy, f1 score, Jaccard score and Area under curve score. In this study XGBoost and random forest performed best same result on balanced data. But after applying hyperparameter XGBoost model improved performance over the random forest with an accuracy of 79.01%, recall of 89.88%, F1-score of 81.10%, precision 73.89%, and AUC 86% while random forest was chosen on study conducted in Sindh province, Pakistan⁵⁴ on predicting elevated risk of defaulting from immunization. This may be due to the fact that they did not test XGBoost model on their research. Final prediction was made on test data after optimizing hyperparameters of XGBoost classifier in turn improved AUC. The model predicted 2567 true positive (true case of incomplete immunization) 1935 true negative (true complete immunization) and misclassified 289 as complete and 907 as incomplete.

Accordingly, top features were identified by SHAP mean value based on their importance in predicting incomplete immunization after model is tuned on XGBoost. In addition, the contribution of each feature to the prediction for incomplete immunization and model accuracy, were identified using SHAP impact (on model output). Those with red dots have high predictive probability or pushing the prediction higher, in contrast feature located at the bottom of tree explainer with blue color were low predictive probability or pushing the prediction lower. This research found number of living children during birth, ANC follow-up history, maternal age, place of delivery, birth order, preceding birth interval were the top associated with a higher predicted probability or pushing factor to incomplete immunization among under five children in east Africa. This result is supported by previous studies done in east Africa using multilevel analysis have shown that factors such as birth order, ANC follow up, place of delivery, preceding birth interval, maternal age have profound influence on mother’s health-seeking behavior and child immunization status¹⁶.

Another aim of this research was to identify specific categories that are associated with incomplete immunization. Association rule mining was employed to identify which category is more associated with incomplete immunization among children under five in east Africa. The analysis revealed that children whose mothers had no education, were delivered at home instead of a health institution and ANC follow (1–4 times) were highly associated with predictive probability to have incomplete immunization. Additionally, the study found that younger mothers (15–24) were also associated with incomplete immunization.

In our research findings, younger mothers (15–24) were associated with incomplete immunization among children under five in east Africa. This may be due the fact that older mothers had childcare experience which the young mothers are yet to acquire. Additionally, older women may be more willing to continue immunizing their children since they may have previously had children who received vaccinations and had no negative side effects. Similarly, a study conducted in Nigeria¹⁸ Ethiopia¹⁴ and Kenya⁵⁵ agreed that Children of young women (15–24 years) are more likely to be incompletely immunized when compared with children of older women. Possible explanation could be this study attributed to large samples and included more areas beyond Ethiopia.

Results from rule generation revealed that children whose mothers had no education were associated with incomplete immunization. A similar association between maternal education and child immunization has been reported in several other studies, including Togo⁵⁶ Nigeria¹⁸ Athens Greece⁵⁷ Hadiya zone, Ethiopia⁵⁸ systematic review across the globe⁵⁹. Indeed, a woman's education has a demonstrable impact on her ability to acquire information about the usage of health services in general and vaccination services in particular, as well as her level of living. According to research, education has a significant impact on mothers' health-seeking habits, including child vaccination¹⁸. Education also makes it simpler for women to communicate with medical experts, leading to a better understanding of and ability to absorb knowledge about actions that enhance children's welfare. In contrast study conducted in rural of Mozambique showed Mothers' educational levels had no influence on the child's vaccination status⁶⁰. This may be due study conducted only on rural area since residence variation have impact on mothers’ education related factor like media exposure in addition small sample size is not representative which led to bias.

Home delivery was associated with incomplete child immunization in East Africa. This finding is in line with the studies conducted in Nepal⁶¹ India⁶² Tigray northern Ethiopia⁶³ Madagascar⁶⁴ Kenya⁵⁵. Similarly home delivery was reported to be a risk factor in case–control studies⁶⁵ and systematic review⁶⁶ conducted in Ethiopia. The explanation could be women who give birth at a home are less likely to be aware of their own and their children’s health status than institutional delivery. According to a systematic review and meta-analysis conducted in Ethiopia, women who gave birth at home were 3 times more likely to have incompletely immunized children than women who delivered at health facilities⁶⁷.

Our study also shows that the utilization of health services such as ANC can be an important factor for the incompleteness of children’s vaccination status. This is consistent with the study conducted in Tanzania⁶⁸ Senegal⁶⁹ and a previous study in East Africa by multilevel analysis¹⁶. This might be explained by the fact that mothers obtaining sufficient helpful information about kid immunizations at ANC visits, giving them confidence in their children's preventive health. This result, also supported by systematic review and meta-analysis from Ethiopia, revealed that ANC follow-up services were found to be significantly associated with incomplete vaccination⁶⁷. Indeed, ANC follow-up is important for child immunization as it allows healthcare providers to monitor the mother's health during pregnancy, ensuring that the child receives the necessary vaccinations.

Strength and limitation of the study

This study has the following limitations. Frist recall and social desirability biases. Although the DHS program is typically regarded as one of the most trustworthy sources of quantitative data, particularly maternal and child health, it may be that the responses were affected by recall and social desirability biases. While acknowledging these and other limitations inherent in national demographic surveys of this kind, the surveys still offer the greatest population-based data currently available, encompassing all the nation's provinces and regions and guaranteeing external validity or generalizability.

Nevertheless, this research has several strengths, one of which is the utilization of machine learning techniques that learn from data rather than relying on prior assumptions as in classical analysis methods. Furthermore, this study provides an invaluable contribution to immunization status literature in context of machine learning.

Conclusion

This study was conducted with aim of predicting and identifying predicting factors of incomplete immunization in east Africa. Using SHAP mean values and SHAP plots, we proved that the ML method can illustrate the influence of key features and establish a high-accuracy incomplete immunization prediction model. The illustration of cumulative domain-specific feature importance and visualized interpretation of feature importance can allow policy makers and immunization program manager on respective study area to intuitively understand the decision-making process for incomplete immunization among under-five children. Prior to this, number of living children during birth, ANC follow-up history, maternal age, place of delivery, birth order, preceding birth interval must all be taken into consideration while implementing health policies intended to reduce the incomplete immunization. Family planning programs should focus on the number of living children during births and preceding birth interval, by enhancing mothers’ education for respective country. We highly recommend promoting institutional delivery and increasing the number of ANC follow-ups by more than four times. It is essential that all stakeholders like Eastern Africa regional coordination center (RCC) take appropriate measurements to ensure that the immunization process is accessible to all children in the country.

Data availability

The dataset used and analyzed in this study is available from the DHS program official database (http://dhsprogram.com) upon formal request. Approval for dataset access is typically confirmed via email.

Abbreviations

AUC:: Area under the curve
COVID-19:: Coronavirus disease of 2019
DHS:: Demographic and health survey
EA:: Enumeration area
EPI:: Expanded immunization program
FN:: False negative
FP:: False positive
IA:: Immunization agenda
KNN:: K-nearest neighbour
ML:: Machine learning
ROC:: Receiver operating characteristic
SHAP:: SHapley Additive exPlanations
SMOTE:: Synthetic minority oversampling technique
TN:: True negative
TP:: True positive
WHO:: World health organization (WHO)
XGBoost:: EXtreme gradient boosting

References

Miller, M. A. & Hinman, A. R. In Vaccines, 6th edn (eds Plotkin, S. A., Orenstein, W. A., & Offit, P. A.) 1413–1426 (W.B. Saunders, 2013).
Ozawa, S. et al. Return on investment from childhood immunization in low- and middle-income countries, 2011–20. Health Aff. (Project Hope) 35, 199–207. https://doi.org/10.1377/hlthaff.2015.1086 (2016).
Article Google Scholar
Bloom, D. E. In Hot Topics in Infection and Immunity in Children VII (eds Curtis, N., Finn, A., & Pollard, A. J.) 1–8 (Springer, 2011).
Sim, S. Y., Watts, E., Constenla, D., Brenzel, L. & Patenaude, B. N. Return on investment from immunization against 10 pathogens in 94 low- and middle-income countries, 2011–30. Health Aff. (Project Hope) 39, 1343–1353. https://doi.org/10.1377/hlthaff.2020.00103 (2020).
Article Google Scholar
Machingaidze, S., Wiysonge, C. S. & Hussey, G. D. Strengthening the expanded programme on immunization in Africa: Looking beyond 2015. PLoS Med. 10, e1001405 (2013).
Article PubMed PubMed Central Google Scholar
Masud, T. & Navaratne, K. V. The expanded program on immunization in Pakistan: Recommendations for improving performance. (2012).
WHO/UNICEF. Progress and challenges with achieving universal immunization coverage. (2020).
WHO and UNICEF: Progress and Challenges with Achieving Universal Immunization Coverage. (WHO/UNICEF Estimates of National Immunization Coverage, J., 2019).
UNICEF. Under Five Mortality. https://data.unicef.org/topic/child-survival/under-five-mortality/ (2023).
WHO/UNICEF. Estimates of National Immunization Coverage. http://www.who.int/news-room/fact-sheets/detail/immunization-coverage (2021).
Debie, A., Lakew, A. M., Tamirat, K. S., Amare, G. & Tesema, G. A. Complete vaccination service utilization inequalities among children aged 12–23 months in Ethiopia: A multivariate decomposition analyses. Int. J. Equity Health 19, 65. https://doi.org/10.1186/s12939-020-01166-8 (2020).
Article PubMed PubMed Central Google Scholar
UNICEF. (2020).
Faisal, S. et al. Modeling the factors associated with incomplete immunization among children. Math. Probl. Eng. 2022 (2022).
Negussie, A., Kassahun, W., Assegid, S. & Hagan, A. K. Factors associated with incomplete childhood immunization in Arbegona district, southern Ethiopia: A case-control study. BMC Public Health 16, 27. https://doi.org/10.1186/s12889-015-2678-1 (2016).
Article PubMed PubMed Central Google Scholar
Nour, T. Y. et al. Predictors of immunization coverage among 12–23 month old children in Ethiopia: Systematic review and meta-analysis. BMC Public Health 20, 1803. https://doi.org/10.1186/s12889-020-09890-0 (2020).
Article PubMed PubMed Central Google Scholar
Tesema, G. A., Tessema, Z. T., Tamirat, K. S. & Teshale, A. B. Complete basic childhood vaccination and associated factors among children aged 12–23 months in East Africa: A multilevel analysis of recent demographic and health surveys. BMC Public Health 20, 1837. https://doi.org/10.1186/s12889-020-09965-y (2020).
Article PubMed PubMed Central Google Scholar
Skull, S. A., Ngeow, J. Y. Y., Hogg, G. & Biggs, B.-A. Incomplete immunity and missed vaccination opportunities in East African immigrants settling in Australia. J. Immigr. Minor. Health 10, 263–268. https://doi.org/10.1007/s10903-007-9071-9 (2008).
Article PubMed Google Scholar
Adedokun, S. T., Uthman, O. A., Adekanmbi, V. T. & Wiysonge, C. S. Incomplete childhood immunization in Nigeria: A multilevel analysis of individual and contextual factors. BMC Public Health 17, 236. https://doi.org/10.1186/s12889-017-4137-7 (2017).
Article PubMed PubMed Central Google Scholar
Russo, G. et al. Vaccine coverage and determinants of incomplete vaccination in children aged 12–23 months in Dschang, West Region, Cameroon: A cross-sectional survey during a polio outbreak. BMC Public Health 15, 630. https://doi.org/10.1186/s12889-015-2000-2 (2015).
Article PubMed PubMed Central Google Scholar
Mohamud Hayir, T. M., Magan, M. A., Mohamed, L. M., Mohamud, M. A. & Muse, A. A. Barriers for full immunization coverage among under 5 years children in Mogadishu, Somalia. J. Fam. Med. Prim. Care 9, 2664–2669. https://doi.org/10.4103/jfmpc.jfmpc_119_20 (2020).
Article Google Scholar
Kebede Kassaw, A. A. et al. Spatial distribution and machine learning prediction of sexually transmitted infections and associated factors among sexually active men and women in Ethiopia, evidence from EDHS 2016. BMC Infect. Dis. 23, 49. https://doi.org/10.1186/s12879-023-07987-6 (2023).
Article PubMed PubMed Central Google Scholar
DHS. Data Collection. https://www.dhsprogram.com/Data/.
Etana, B. & Deressa, W. Factors associated with complete immunization coverage in children aged 12–23 months in Ambo Woreda, Central Ethiopia. BMC Public Health 12, 566. https://doi.org/10.1186/1471-2458-12-566 (2012).
Article PubMed PubMed Central Google Scholar
Kassahun, M. B., Biks, G. A. & Teferra, A. S. Level of immunization coverage and associated factors among children aged 12–23 months in Lay Armachiho District, North Gondar Zone, Northwest Ethiopia: A community based cross sectional study. BMC. Res. Notes 8, 239. https://doi.org/10.1186/s13104-015-1192-y (2015).
Article PubMed PubMed Central Google Scholar
Sheikh, N. et al. Coverage, timelines, and determinants of incomplete immunization in Bangladesh. Trop. Med. Infect. Dis. 3, 72 (2018).
Article PubMed PubMed Central Google Scholar
Bugvi, A. S. et al. Factors associated with non-utilization of child immunization in Pakistan: Evidence from the Demographic and Health Survey 2006–07. BMC Public Health 14, 232. https://doi.org/10.1186/1471-2458-14-232 (2014).
Article PubMed PubMed Central Google Scholar
Tadesse, H., Deribew, A. & Woldie, M. Predictors of defaulting from completion of child immunization in south Ethiopia, May 2008—A case control study. BMC Public Health 9, 150. https://doi.org/10.1186/1471-2458-9-150 (2009).
Article PubMed PubMed Central Google Scholar
Jani, J. V., De Schacht, C., Jani, I. V. & Bjune, G. Risk factors for incomplete vaccination and missed opportunity for immunization in rural Mozambique. BMC Public Health 8, 161. https://doi.org/10.1186/1471-2458-8-161 (2008).
Article PubMed PubMed Central Google Scholar
De, P. & Bhattacharya, B. N. Determinants of child immunization in fourless-developed states of North India. J. Child Health Care 6, 34–50 (2002).
Article Google Scholar
Rahman, M. & Obaida-Nasrin, S. Factors affecting acceptance of complete immunization coverage of children under five years in rural Bangladesh. Salud pública de méxico 52, 134–140 (2010).
Article PubMed Google Scholar
Atnafu, A. et al. Prevalence and determinants of incomplete or not at all vaccination among children aged 12–36 months in Dabat and Gondar districts, northwest of Ethiopia: Findings from the primary health care project. BMJ Open 10, e041163. https://doi.org/10.1136/bmjopen-2020-041163 (2020).
Article PubMed PubMed Central Google Scholar
Melaku, M. S., Nigatu, A. M. & Mewosha, W. Z. Spatial distribution of incomplete immunization among under-five children in Ethiopia: Evidence from 2005, 2011, and 2016 Ethiopian Demographic and health survey data. BMC Public Health 20, 1362. https://doi.org/10.1186/s12889-020-09461-3 (2020).
Article PubMed PubMed Central Google Scholar
Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
MathSciNet Google Scholar
Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System. (2016).
Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67. https://doi.org/10.1038/s42256-019-0138-9 (2020).
Article PubMed PubMed Central Google Scholar
Rawat, S., Rawat, A., Kumar, D. & Sabitha, A. S. Application of machine learning and data visualization techniques for decision support in the insurance sector. Int. J. Inf. Manag. Data Insights 1, 100012 (2021).
Google Scholar
Guo, Y. The 7 steps of machine learning (2017). towardsdatascience.com (2017).
Brownlee, J. Data Preparation for Machine Learning: Data Cleaning, Feature Selection, and Data Transforms in Python (Machine Learning Mastery, 2020).
Yu, L. & Liu, H. In Proceedings of the 20th International Conference on Machine Learning (ICML-03). 856–863.
Bekele, W. T. Machine learning algorithms for predicting low birth weight in Ethiopia. BMC Med. Inform. Decis. Mak. 22, 232. https://doi.org/10.1186/s12911-022-01981-9 (2022).
Article PubMed PubMed Central Google Scholar
Bitew, F. H., Sparks, C. S. & Nyarko, S. H. Machine learning algorithms for predicting undernutrition among under-five children in Ethiopia. Public Health Nutr. 1–12 (2021).
Chilyabanyama, O. N. et al. Performance of machine learning classifiers in classifying stunting among under-five children in Zambia. Children (Basel, Switzerland). https://doi.org/10.3390/children9071082 (2022).
Article PubMed PubMed Central Google Scholar
Emmanuel, M. Application of Machine Learning Methods in Analysis of Infant Mortality in Rwanda: Analysis of Rwanda Demographic Health Survey 2014–15 Dataset (University of Rwanda, 2021).
Fenta, H. M., Zewotir, T. & Muluneh, E. K. A machine learning classifier approach for identifying the determinants of under-five child undernutrition in Ethiopian administrative zones. BMC Med. Inform. Decis. Mak. 21, 1–12 (2021).
Article Google Scholar
Kananura, R. M. Machine learning predictive modelling for identification of predictors of acute respiratory infection and diarrhoea in Uganda’s rural and urban settings. PLoS Glob. Public Health 2, e0000430. https://doi.org/10.1371/journal.pgph.0000430 (2022).
Article PubMed PubMed Central Google Scholar
Saroj, R. K., Yadav, P. K., Singh, R. & Chilyabanyama, O. N. Machine learning algorithms for understanding the determinants of under-five mortality. BioData Min. 15, 20. https://doi.org/10.1186/s13040-022-00308-8 (2022).
Article PubMed PubMed Central Google Scholar
Tesfaye, B., Atique, S., Azim, T. & Kebede, M. M. Predicting skilled delivery service use in Ethiopia: Dual application of logistic regression and machine learning algorithms. BMC Med. Inform. Decis. Mak. 19, 209. https://doi.org/10.1186/s12911-019-0942-5 (2019).
Article PubMed PubMed Central Google Scholar
Bekkar, M., Djemaa, H. K. & Alitouche, T. A. Evaluation measures for models assessment over imbalanced data sets. J. Inf. Eng. Appl. 3, 15–33 (2013).
Google Scholar
Yang, L. & Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 415, 295–316. https://doi.org/10.1016/j.neucom.2020.07.061 (2020).
Article Google Scholar
Kebede, S. D. et al. Prediction of contraceptive discontinuation among reproductive-age women in Ethiopia using Ethiopian Demographic and Health Survey 2016 Dataset: A machine learning approach. BMC Med. Inform. Decis. Mak. 23, 9. https://doi.org/10.1186/s12911-023-02102-w (2023).
Article PubMed PubMed Central Google Scholar
Wang, K. et al. Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP. Comput. Biol. Med. 137, 104813. https://doi.org/10.1016/j.compbiomed.2021.104813 (2021).
Article PubMed Google Scholar
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30 (2017).
Li, Q., Zhang, Y., Kang, H., Xin, Y. & Shi, C. Mining association rules between stroke risk factors based on the Apriori algorithm. Technol. Health Care. 25, 197–205. https://doi.org/10.3233/thc-171322 (2017).
Article PubMed Google Scholar
Chandir, S. et al. Using predictive analytics to identify children at high risk of defaulting from a routine immunization program: Feasibility study. JMIR Public Health Surveill. 4, e9681 (2018).
Article Google Scholar
Mutua, M. K., Kimani-Murage, E. & Ettarh, R. R. Childhood vaccination in informal urban settlements in Nairobi, Kenya: Who gets vaccinated?. BMC Public Health 11, 6. https://doi.org/10.1186/1471-2458-11-6 (2011).
Article PubMed PubMed Central Google Scholar
Landoh, D. E. et al. Predictors of incomplete immunization coverage among one to five years old children in Togo. BMC Public Health 16, 968. https://doi.org/10.1186/s12889-016-3625-5 (2016).
Article PubMed PubMed Central Google Scholar
Pavlopoulou, I. D., Michail, K. A., Samoli, E., Tsiftis, G. & Tsoumakas, K. Immunization coverage and predictive factors for complete and age-appropriate vaccination among preschoolers in Athens, Greece: A cross- sectional study. BMC Public Health 13, 908. https://doi.org/10.1186/1471-2458-13-908 (2013).
Article PubMed PubMed Central Google Scholar
Zewdie, A., Letebo, M. & Mekonnen, T. Reasons for defaulting from childhood immunization program: A qualitative study from Hadiya zone, Southern Ethiopia. BMC Public Health 16, 1240. https://doi.org/10.1186/s12889-016-3904-1 (2016).
Article PubMed PubMed Central Google Scholar
Tauil, M. D. C., Sato, A. P. S. & Waldman, E. A. Factors associated with incomplete or delayed vaccination across countries: A systematic review. Vaccine 34, 2635–2643. https://doi.org/10.1016/j.vaccine.2016.04.016 (2016).
Article PubMed Google Scholar
Shrestha, S., Shrestha, M., Wagle, R. R. & Bhandari, G. Predictors of incompletion of immunization among children residing in the slums of Kathmandu valley, Nepal: A case-control study. BMC Public Health 16, 970. https://doi.org/10.1186/s12889-016-3651-3 (2016).
Article PubMed PubMed Central Google Scholar
Chhabra, P., Nair, P., Gupta, A., Sandhir, M. & Kannan, A. T. Immunization in urbanized villages of Delhi. Indian J. Pediatr. 74, 131–134. https://doi.org/10.1007/s12098-007-0004-3 (2007).
Article PubMed Google Scholar
Aregawi, H. G., Gebrehiwot, T. G., Abebe, Y. G., Meles, K. G. & Wuneh, A. D. Determinants of defaulting from completion of child immunization in Laelay Adiabo District, Tigray Region, Northern Ethiopia: A case-control study. PLoS One 12, e0185533. https://doi.org/10.1371/journal.pone.0185533 (2017).
Article CAS PubMed PubMed Central Google Scholar
Verrier, F. et al. Vaccination coverage and risk factors associated with incomplete vaccination among children in Cambodia, Madagascar, and Senegal. Open Forum Infect. Dis. 10, ofad136. https://doi.org/10.1093/ofid/ofad136 (2023).
Article PubMed PubMed Central Google Scholar
Tesfaye, F., Tamiso, A., Birhan, Y. & Tadele, T. Predictors of immunization defaulting among children age 12–23 months in Hawassa Zuria District of southern Ethiopia: Community based unmatched case control study. Int. J. Public Health 3 (2014).
Atnafu Gebeyehu, N. et al. Incomplete immunization and its determinants among children in Africa: Systematic review and meta-analysis. Hum. Vaccines Immunother. https://doi.org/10.1080/21645515.2023.2202125 (2023).
Desalew, A., Semahegn, A., Birhanu, S. & Tesfaye, G. Incomplete vaccination and its predictors among children in Ethiopia: A systematic review and meta-analysis. Glob. Pediatr. Health 7, 2333794x20968681. https://doi.org/10.1177/2333794x20968681 (2020).
Mrisho, M. et al. The use of antenatal and postnatal care: perspectives and experiences of women and health care providers in rural southern Tanzania. BMC Pregnancy Childbirth 9, 10. https://doi.org/10.1186/1471-2393-9-10 (2009).
Article PubMed PubMed Central Google Scholar
Mbengue, M. A. S. et al. Determinants of complete immunization among senegalese children aged 12–23 months: Evidence from the demographic and health survey. BMC Public Health 17, 630. https://doi.org/10.1186/s12889-017-4493-3 (2017).
Article PubMed PubMed Central Google Scholar
Sarker, A. R., Akram, R., Ali, N., Chowdhury, Z. I. & Sultana, M. Coverage and determinants of full immunization: Vaccination coverage among senegalese children. Medicina (Kaunas, Lithuania). https://doi.org/10.3390/medicina55080480 (2019).
Article PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We are grateful to measure of Demographic and Health Survey Program for their tremendous work in making the survey data accessible to the general public for study.

Author information

Authors and Affiliations

Department of Health Informatics, School of Public Health, College of Medicine and Health Science, Samara University, Samara, Ethiopia
Zinabu Bekele Tadese
Department of Health Informatics, Institute of Public Health, College of Medicine and Health Science, University of Gondar, Gondar, Ethiopia
Araya Mesfin Nigatu & Tirualem Zeleke Yehuala
Department of Information Technology, Faculty of Science and Technology, Charles Darwin University, Darwin, Australia
Yakub Sebastian

Authors

Zinabu Bekele Tadese
View author publications
You can also search for this author in PubMed Google Scholar
Araya Mesfin Nigatu
View author publications
You can also search for this author in PubMed Google Scholar
Tirualem Zeleke Yehuala
View author publications
You can also search for this author in PubMed Google Scholar
Yakub Sebastian
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Z.B., A.M., Y.S. and T.Z. made invaluable contributions to this work either in conceptualization, study design, data extraction, execution, analysis, interpretation or in all of these areas; took part in drafting, revising or critically reviewing the article. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zinabu Bekele Tadese.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Tadese, Z.B., Nigatu, A.M., Yehuala, T.Z. et al. Prediction of incomplete immunization among under-five children in East Africa from recent demographic and health surveys: a machine learning approach. Sci Rep 14, 11529 (2024). https://doi.org/10.1038/s41598-024-62641-8

Download citation

Received: 20 August 2023
Accepted: 20 May 2024
Published: 21 May 2024
DOI: https://doi.org/10.1038/s41598-024-62641-8

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Safety outcomes following COVID-19 vaccination and infection in 5.1 million children in England

Post-COVID conditions following COVID-19 vaccination: a retrospective matched cohort study of patients with SARS-CoV-2 infection

A guide to vaccinology: from basic principles to new developments

Introduction

Methods

Study design and setting

Source and study population

Inclusion criteria

Data source, Sample size and sampling procedure

Data source

Sample size determination and sampling procedure

Study variables

Operational definition

Data management and analysis

Machine learning framework for prediction of incomplete immunization

Data collection and preprocessing method

Model development methods

Ethics considerations and consent to participate

Results

Sociodemographic characteristics

Reproductive (obstetrics) history characteristics

Socioeconomic characteristics

Machine learning analysis of incomplete immunization

Model development and evaluation

Visualization of feature importance

Predicting incomplete immunization

Association rule mining

Discussion

Strength and limitation of the study

Conclusion

Data availability

Abbreviations

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Rights and permissions

About this article

Cite this article

Share this article

Comments

Search

Quick links