Introduction

Soil-transmitted helminth (STH) infections, considered as neglected tropical diseases, remain one of the most common global public health issues in developing countries. It is estimated that 1.5 billion people (20% of the global population1) are infected with STH worldwide, with the poorest and most deprived communities typically affected2. The main STH species that infect human populations include Ascaris lumbricoides (roundworm), Trichuris trichiura (whipworm) and hookworm (the two species). These parasites are transmitted by eggs shed in human faeces which contaminate the environment in areas with poor hygiene and sanitation.

In Indonesia, one-fifth or more of the population is estimated to be infected with A. lumbricoides or hookworms and approximately 12% infected with T. trichuria3.

The World Health Organization (WHO) recommends periodic mass drug administration (MDA) targeting at-risk populations such as school-aged children and adults in STH endemic areas2,4,5. Administration of anthelmintics such as albendazole is considered a safe and effective control strategy to tackle STH infections at a community level. Supplemental prevention strategies are often encouraged in addition to periodic deworming, including improvements in environmental sanitation, provision of clean water supplies, promoting latrine use, access to health services and provision of health education. Such initiatives have been encouraged as important components of efforts to ensure sustainable and effective STH control programs6,7,8,9,10.

Health promotion and health education focusing on STH have been found to be cost-effective and affordable measures for disadvantaged communities, in combination with other intervention strategies. Health education programs can be designed to reduce contamination of soil and to promote the use of latrines and individual hygienic behaviour, thereby helping to prevent transmission and re-infection by STH11,12,13,14.

However, there is an urgent need for systematic and validated measurement tools to measure knowledge, attitudes and practices (KAP) related to STH. Relevant studies typically design a knowledge questionnaire including the following aspects: signs and symptoms, transmission of infection and perceived causes, treatment and prevention measures with respect to STH infections8,15,16,17. Measurement tools for practices or behaviour associated with STH prevention vary and the included contents are often inconsistent. These usually include questions about personal hygiene practices such as hand washing, washing fruits and vegetables, boiling water for drinking, wearing shoes and participant history of deworming, but often lack standardized scales8,17,18. We have observed that recent studies present simple frequency distributions of responses to all sub-questions under each KAP section. Only a few studies conduct univariate analyses (in relation to socio-demographic characteristics) and assess associations between knowledge or practices with STH outcomes8,16,19. Evidence is limited in determining the relationship between knowledge and practices, as well as their influence on STH outcomes. There is also a lack of established quantitative tools to assess STH-related knowledge and practices in large populations. The objective of this study was therefore to identify the socio-demographic predictors for knowledge and practices related to STH and validate the quantitative tools in population use.

Methods

Study design and data collection

A cross-sectional study design was employed to collect survey data from residents of 16 villages involved in an overarching sanitation improvement study in Central Java, Indonesia, during February-April 2014. As reported in a recent publication, participants were recruited through random selection of households in the villages. Face-to-face interviews were carried out by trained research assistants, consisting of local nurses, midwives and public health workers20.

Two structured questionnaires (for adult and child) were developed in English and translated to Bahasa Indonesia (Appendix A and B). The adult questionnaire (for participants aged 13 years and above) elicited details on demographics, latrine use, water access and usage, hand washing practices, knowledge and behaviour associated with STH, together with a section for interviewers to observe and record the condition of respondent’s hands and nails (i.e. nail biting and cleanliness). The child questionnaire (for respondents aged 2–12 years) was identical, with the exception that the sections on housing conditions and animals were omitted. Children were also not asked questions on water usage beyond the source of water for drinking. Parents of children below the age of five were asked to answer on behalf of their child20.

Measurements of knowledge and practices

STH knowledge and behaviour (hand washing and hygiene practices) were measured using two scales containing 18 and 10 questions respectively and the total scores were calculated based on the number of correct responses to those questions. Further details of the knowledge and practice scales have been presented elsewhere20.

STH infection status

Stool samples were collected from all children and adult respondents. Faecal sample collection and processing are detailed in the Research Team’s previously published article20. Prevalence of each type of STH infection is determined based on the percentage of individuals receiving positive test results out of the total respondents in each cohort group. The outcomes included Ascaris, Trichuris, hookworms and any STH infection (the presence of at least one of the previous three types of infections).

Data analysis

Data analysis was performed using SPSS Statistics 24. Descriptive statistics including frequency distribution, mean and standard deviation (SD) were calculated for participant demographic characteristics and each of the outcome variables, knowledge (about STH) and behaviour (hygiene and sanitation practices in preventing STH) scores. In order to make better sense of the comparisons of knowledge and behaviour scores between subgroups within the study samples, the total scores were converted to percentages of correct responses in the scales [(total correct responses/total number of question) × 100%] for each participant. Independent sample t-test (or Mann–Whitney U test as the non-parametric alternative) and One-Way ANOVA (or Wilcoxon signed Rank test) were carried out to assess the mean differences in the outcome scores between different subgroups (different categories under each demographic variable, e.g. male and female subgroups under ‘gender’). In addition, Spearman’s correlation analyses were performed to examine the relationships between age, knowledge and practice scores. Due to the hierarchical nature of the data, the measurements taken from members within the same household are likely to be correlated (they are not independent) but different to those of participants from other households. Thus, two mixed effects models were performed to identify the significant factors in predicting knowledge and behaviour scores in consideration of individual and household characteristics in the adult and child combined samples. Histogram distributions, scatterplots and Pearson’s correlations were applied to compare the observed and predicted values of knowledge and behavior scores for model validation. Statistical significance was set at p < 0.05.

Ethics

Ethical approval for this study was obtained beforehand from the Semarang City authorities (ref. 070/613/IV/2011), and from the Human Research Ethics Committees at Diponegoro University and at Griffith University (ref. PBH/17/11/HREC). Methods were carried out in accordance with relevant guidelines and regulations and informed consent was obtained from all subjects and/or their legal guardians(s).

Results

Table 1 displays the socio-demographic characteristics of respondents. A total of 6,466 individuals (from 2195 households) responded to the survey, of which 82% were adults (including adolescents, n = 5303) and 18% were children aged 12 and younger (n = 1161). Of the study participants, 50.2% (n = 3229) were males. The child sample cohort had a slightly higher proportion of males than the adult sample cohort (52.5% vs. 49.7%). The age rage of the participants were from 2 to 93 and the mean ages were 39.3 (± 17.2) and 7.0 (± 3.0) between the two samples, respectively.

Table 1 Demographic characteristics of respondents.

The majority of the respondents had attained education up to the end of junior school (54.4%of the overall respondents) and 17.2% of them were considered well educated with college and higher levels of education. As the child sample in this study consisted of children aged 12 and younger, the highest level of education in this sample was elementary education (54.7%). Of the 1161 children, 35.9% was aged 5–11. In the children sample, 0.2% in the younger sub-group (aged 5–11) and 85.2% in the older sub-group (aged 6–12) were at school at the time of interview (data not shown). The employed adults worked in various occupation categories such as self-employed (11.3%), private sector (9.4%), farmers (6.6%) and public sector (1.0%). Almost one third of the adults indicated ‘other or unspecified’ category to describe their occupations. In addition, over 70% of the responding adults were students (11.1%), doing home duties (16.6%) self-employed (11.3%) or classified as other or unspecified occupation (32.8%). In terms of household income, only adult participants responded to this question. Over 60% (n = 3017) of the study participants reported a monthly household income of ≥ 1 M IDR (equivalent to approximately $2.77 USD based on the average 2014 exchange rate).

Knowledge of STH (knowledge score) and demographic factors

The comparisons of participants’ levels of knowledge of STH across different socio-demographic factors are outlined in the first part of Table 2. Adults had a significantly higher mean knowledge than children (57.3% vs. 42.9%, p < 0.001). Overall, a weak positive association between knowledge score and age was observed (rho: 0.16, p < 0.001). However, the results also indicated a strong positive correlation between knowledge score and age in the child cohort (rho: 0.51, p < 0.001) but a negative association in the adult cohort (rho: − 0.22, p < 0.001).

Table 2 Knowledge and behaviour scores by demographic factors.

Figure 1A depicts the pattern of change in knowledge score by age group. A steep increase was observed among children and a stable but gradually decreasing trend with age was found in adults (except a sharp decrease from aged > 60). Significant differences in mean knowledge scores were also identified between education, occupation and income groups (p < 0.001 in One-Way ANOVA with post-hoc tests, Table 2). Knowledge scores increase with education levels. The mean knowledge score among adults with college, or above, levels of education was almost twice that of the score for children not at school (mean scores: 59.9 and 31.9 respectively). The post-hoc tests indicated a significant difference between any pair of the education groups, except the two highest levels of education (Grades 10–12 and college and higher, p = 0.53, data not shown). Knowledge scores were lowest among students, farmers and unemployed participants and highest among public sector employees.

Figure 1
figure 1

Mean scores of observed knowledge and behavior scores (%) by age group.

Hand washing and hygiene practices (behaviour score) and demographic factors

The comparisons of participants’ behavior scores measuring their hygiene and sanitation practices with regard to STH (abbreviated as behaviour score) across different socio-demographic factors are presented in the second part of Table 2. Adult participants had a significantly higher mean behaviour score than children (69.0% vs. 61.2%, p < 0.001). Behaviour scores were positively associated with age in the whole sample (rho: 0.12 p < 0.001). Further analysis revealed that a positive association was found between age and behaviour score in children but a weak negative association was identified in adults (rho: 0.35 and − 0.03, p < 0.001 and p = 0.038 respectively). Figure 1B indicated that the mean behaviour score in children aged < 6 was significantly lower than those of the other age groups (51.7 vs. score range: 66–70). Behaviour score (in Table 2) increased with education (p < 0.001, with no difference between any pair of the three highest education levels in post-hoc comparisons, p > 0.05, data not shown). Significant differences in mean behaviour scores were also identified between occupation and income groups (p < 0.001 in One-Way ANOVA tests). The results of post-hoc analyses showed that public sector employees practiced greater levels of hygiene compared to other occupation categories, whilst students, self-employed and unemployed participants practiced the lowest levels (excluding children not at school).

STH outcomes and mean differences in knowledge and behaviour scores between infected and uninfected individuals

Table 3 presents the results of STH prevalence and the differences in knowledge and behavior (practice) scores between infected and uninfected participants in the child and adult samples. The overall STH infection (infected with at least one type of STH) rates were 32.7% (n = 380) in children and 34.1% (n = 1810) in adults. The results also showed that children without any STH infections had consistently higher knowledge and practice scores than those infected with different types of STHs. Uninfected children showed a significantly higher knowledge score than those infected with at least one type of STH (43.74 vs. 41.24, p = 0.025) and than those with Ascaris infections (43.62 vs. 41.12, p = 0.034). In the adult sample, uninfected participants show a much higher level of hand washing and hygiene practices than those infected with Trichuris (69.11 vs 64.08, p < 0.001).

Table 3 Results of STH outcomes and comparisons of knowledge and practice scores between infected and uninfected groups.

Results of mixed effects model in predicting knowledge scores

The mixed effects model approach takes into account the variables at both individual (fixed effects) and household levels (random effects) to predict the changes in knowledge and behavior scores. In the process of finding the best fitting model for knowledge score, various combinations of multiple demographic factors were included in different models to compare the model performances with the null model (including household only) [− 2Restricted Log Likelihood (− 2LL), Akaike’s Information Criterion (AIC) and Schwarz’s Bayesian Criterion (BIC) as the indicators for model comparison]. Due to strong correlations, education, occupation and income status were included independently in different models together with gender, adult/child and age to avoid multi-collinearity among these three variables. In order to consider the very different patterns in the change of knowledge scores with age between adult and child samples, an interaction term between adult/child and age (age* adult/child) was added into the model (Table 4). All the model assessment indicators show the best fitting outcome when education was included with the first three demographic variables (gender, adult/child and age) in the model. The final model including education performed slightly better than the model including occupation (htly better than the m2LL, AIC and BIC values: 46,656–46,675 and 47,347–47,364) and improved significantly compared with the null model (htly better than the m2LL, AIC and BIC values: 51,259–51,276; data not shown).

Table 4 Estimates of fixed and random effects for knowledge score (percentage).

In the final model, the results of ANOVA tests for the fixed effects of the included demographic variables show that the intercept and all the included demographic variables made significant contributions to the model (p value for gender: 0.013, all the other p values < 0.001) (data not shown). Table 4A summarizes the test results and parameter estimates for the fixed effects in the model. The baseline intercept indicates that the mean knowledge score percentage for youngest female adults with college level of education was 61.49 (using reference categories for the included variables, 95%CI: 60.60–62.39). Male respondents have a slightly lower score than females (mean score reduced by 0.57, p = 0.013). Being a child would have a significantly reduced mean score by 29.42 points (p < 0.001) than adults. In general, knowledge score increases with education levels. For example, the scores for participants who were not at school, had education levels between grades 1–6 and grades 7–9 were significantly lower than that of participants with college and above levels (p values < 0.001). However, the knowledge score among respondents with education between grades 10–12 was not significantly lower than those with college level education (p = 0.458). Age and (age*adult/child) were also significant (both p values < 0.001) in predicting knowledge score. The covariance parameters in Table 4B suggest that both residual (at the individual level) and intercept (at the household level) variance components are significant (both p < 0.001), meaning that the average knowledge scores vary among participants and also between households. About 44.8% [56.49/(69.63 + 56.49) = 44.79%)] of the total variance could be attributed to the differences in knowledge score between households.

Results of mixed effects model in predicting behaviour scores

Similar to the process of model development for knowledge score, a series of model testing and assessment were applied to identify the best fitting model to predict participants’ behavior scores. In addition to the demographic factors and age* adult/child included in the previous model (Table 5), knowledge score was also added in the model considering its strong correlation with behavior score (rho: 0.48, p < 0.001). Along with household variance, practice score was also entered to test for random effects in explaining behavior score in the final model. The model assessment criteria (htly better than the m2LL, AIC and BIC) reduced from 51,461–51,419 for the null model to around 47,609–47,626 for the final model (htly better than the m-2LL, AIC and BIC data not shown), suggesting an improvement in model fitting. The results of ANOVA tests for the fixed effects of the intercept and all the included variables contributed significantly in explaining behaviour score (all p values ≤ 0.001) except education (p = 0.67, data not shown). The intercept of 43.36 indicates the mean behaviour score at baseline (young adult females and reference for all included variables). Being a male and a child would have a reduced mean score by 2.26 and 6.0 points than females and adults respectively (both p < 0.001). Both Age and (age* adult/child) were significant in predicting practice score (p values 0.002 and < 0.001 respectively). Knowledge score is also a significant predictor (p < 0.001). Every point increase in knowledge score increases behaviour score by 0.4. The covariance parameters in Table 5B show that both residual (individual characteristics) and intercept (household variations) covariances are significant (both p < 0.001), which indicated that behavior scores vary between households and among respondents. Between-households variability explained a large proportion (97.15/(71.27 + 97.15) = 57.68%) of the overall variance in practice score.

Table 5 Estimates of fixed and random effects for behaviour score (percentage).

Validation of the developed models in predicting knowledge and behaviour scores

The two final mixed effects models were used to compute predicted knowledge and behaviour scores for each participant. The predicted scores were compared with the knowledge and behaviour scores reported by the participants (observed scores). Figure 2 illustrates the comparison of observed and predicted frequency distributions of knowledge scores. The observed distribution has a distinct peak at the score 64 (mode value in the sample), while the predicted distribution peaking around 58–60, both slightly left skewed with mean values at 54.7 and 54.5, respectively. The comparison of quartile values (Supplementary Table 1A) show that the predicted scores tend to underestimate the observed scores (median values: 56.92 vs. 60). Spearman correlation analysis was used to assess how closely the overall predicted knowledge scores are to the observed scores at the individual level. The correlation coefficient of nearly 85% (see Supplementary Fig. 1, rho: 0.847, p < 0.001) indicates a strong correlation between the observed and predicted knowledge scores, suggesting a well-fitted model for the sample data.

Figure 2
figure 2

Distribution of observed and predicted knowledge scores (%).

The results from the final model, in consideration of the positive relationship between knowledge and behaviour scores, have confirmed that knowledge score was a significant predictor for behaviour score. Figure 3 presents the comparison between observed frequency distribution and predicted frequency distribution of knowledge scores calculated based on the developed mixed effects model. Overall, the predicted distribution was reasonably similar to the pattern of the observed distribution, with very accurate sample mean prediction (67.5 vs 67.6). Despite the minor under- or overestimates in predicting the quartile values (difference ranging from – 1.78 to 1.83, in Supplementary Table 1B), the result of Spearman correlation further confirmed well matched predicted and observed scores (Supplementary Fig. 2, rho: 0.883, p < 0.001). The validation results indicate that the final model fits the sample data (including adult and child sub-samples) and can be used to predict behaviour scores among the study respondents.

Figure 3
figure 3

Distribution of observed and predicted behaviour scores (%).

Discussion

The findings of this study revealed that the participants’ socio-demographic variables such as age, gender and education in conjunction with household variability are significant in predicting individual’s knowledge level of STH. In addition, participant’s knowledge score coupled with their socio-demographic characteristics are significant predictors for their hand washing and hygiene practices, taking into account household-level variability. The findings are in line with previous studies which identified age, gender and educational level as the common associated demographic factors8,19,21,22. However, none of these studies combined adults and children in their analyses and examined the effect of household-level variation on respondents’ levels of knowledge and practices in relation to STH infections and prevention. Further, most of these studies assessed associations between socio-demographic characteristics and each single component/item (such as a specific transmission route or symptom) in their KAP questionnaires. Only one study19 used a summated scale (a combination of several questions) to measure general knowledge and explores its associations with children’s gender and age. The present study provided additional information to examine how knowledge interacted with participants’ socio-demographic characteristics (including being a child or an adult) to predict their washing and hygiene practices.

About one third of children and adults in the study communities had parasitic worm infections (32.7% and 34.1% respectively), and Ascaris was identified as the predominant species (27.7% in children and 25.6% in adults). The study also suggests that children without STH infections had consistently higher knowledge and practice scores than those with different types of STH infections, despite some statistically insignificant results. This finding is in agreement with a study showing lower hookworm prevalence among schoolchildren with better hygiene knowledge than those who lacked such knowledge in rural Cote d’Ivoire19. However, the insignificant differences in knowledge and practice scores between groups in terms of infection status with Trichuris (knowledge scores in uninfected group vs. infected group: 42.98 and 39.60; practice scores: 61.31 and 53.54., respectively), as well as level of practice in relation to hookworm infection (uninfected: 61.32 vs. infected: 58.37) could be due to small number of infected individuals (n = 20, 96 and 58 respectively) in the infected group for these analyses. The findings support the potential benefits of health education or STH prevention programs targeting children to increase their knowledge of STH, leads to protective hygiene practices to prevent STH12,23. The findings of an intervention study revealed that the implementation of health-education package to prevent worm infections in China demonstrated an effective reduction of STH infections in the participating schoolchildren24.

On the other hand, our study indicated that the knowledge and practice scores between infected and uninfected adults did not differ significantly except for Trichuris infection. Adults infected with Trichuris tended to have a significantly lower practice score than those without the infection. It is unclear why the knowledge and practice scores were consistently higher in the uninfected children than the infected ones, but such patterns were not observed in adults. For example, adults infected with Ascaris had higher knowledge and behaviour levels than those uninfected adults. However, those adults with Trichuris infections tended to have a significantly lower behaviour score than the uninfected ones (64.08 vs. 69.11). As noted by recent studies on examining KAP and parasitic worm infections, acquired knowledge did not necessarily translate into behavioural changes15,19,25. For instance, Sady et al.21 argued that poverty could have contributed to infected individuals’ delay in seeking treatment even though they showed a higher level of knowledge than those without infections. This was because they did not have enough money to pay the cost of transport and medical services, especially for residents living in rural or remote communities. In such communities, inadequate sanitation and poor environmental conditions (such as a lack of access to clean water) could be the key barriers to community deworming or STH health education programs. In addition, there was also misconception regarding the effect and accessibility of treatment. For example, some study participants considered that the effect of preventative chemotherapy was permanent and repeated treatments were not required19; or the treatment was ineffective, inaccessible and not affordable12,15,18,22. As a result, STH re-infections could be a common issue among adult residents in rural communities, even though they might have previously been exposed to STH prevention and health education programs and had good knowledge of prevention of STH infections. As Acka et al.15 suggested, STH prevention and treatment programs should combine community-based health education campaigns and school-based interventions to enhance and sustain positive sanitation and hygiene practices in rural settings.

The model validation results further confirmed that the quantitative measurement tools developed in this project and subsequent related projects are suitable for assessing knowledge and behaviour associated with STH. The validation findings in this study demonstrated great model performances in predicting knowledge and behavior scores, when applying the scales to the study population. Using the basic demographic variables of the study sample, we can correctly predict respondents' knowledge and practice scores to a great extent (85–88% agreement between observed and predictive scores). The first validated model showed that education level is a significant predictor for knowledge scores. This finding is consistent with previous KAP studies that education was significantly associated with KAP on STH or schistosomiasis infections8,21,22. However, these studies did not establish association based on overall knowledge or practice scores. They examined the associations based on individual questions included in the scales. Nevertheless, this finding highlights the importance of health education programs in communities. Our finding also indicated that existing relevant studies often displayed frequency distributions of individual KAP items under each sub-category of knowledge or practice questionnaires8,15,16,21. To our knowledge, there are no validated quantitative instrument available to measure the levels of STH-related knowledge and behaviour. There is also a lack of standardized scales or tools to determine adequate levels of STH-related knowledge and practices. In order to inform future STH prevention (in relation to specific forms of STH infections), further research efforts can be made to determine the optimal cut-off points for adequate/inadequate levels of knowledge and practices with regard to STH to identify the target population for further population-based health education programs.

In conclusion, the results of this study have demonstrated promising validation outcomes for the knowledge and practice scales included in the BALatrine questionnaire with the study population (adults and children). They can be used to aid population health education programs and to increase sustainability of mass deworming and environmental improvements. The findings further confirmed the importance of integrating school-based (targeting children) and community-based (targeting adults and households) health education or STH prevention programs to maximize the effect of STH prevention and intervention programs.