Model validation for a knowledge and practices survey towards prevention of soil-transmitted helminth infections in rural villages in Indonesia

The rate of soil-transmitted helminth (STH) infection is estimated to be around 20% in Indonesia. Health promotion and health education are cost-effective strategies to supplement STH prevention and control programs. Existing studies suggest that quantitative tools for knowledge, attitudes and practices (KAP) are important to monitor effective community-based STH interventions. However, evidence is limited regarding the applicability of such tools. This study aims to identify the socio-demographic predictors of STH-related knowledge and practices and to validate the quantitative tools for population use. A cross-sectional study was conducted among residents of 16 villages in Central Java, Indonesia. Adult and child respondents were interviewed to assess general knowledge and practices in relation to STH. Two mixed effects models identified the significant factors in predicting knowledge and practice scores. The model-predicted knowledge and practice scores were compared with the observed scores to validate the quantitative measurements developed in this study. Participants' socio-demographic variables were significant in predicting an individual's STH-related knowledge level and their hand washing and hygiene practices, taking into account household-level variability. Model validation results confirmed that the quantitative measurement tools were suitable for assessing STH-associated knowledge and behaviour. The questionnaire developed in this study can be used to support school- and community-based health education interventions to maximize the effect of STH prevention and control programs.


Methods
Study design and data collection. A cross-sectional study design was employed to collect survey data from residents of 16 villages involved in an overarching sanitation improvement study in Central Java, Indonesia, during February-April 2014. As reported in a recent publication, participants were recruited through random selection of households in the villages. Face-to-face interviews were carried out by trained research assistants, consisting of local nurses, midwives and public health workers 20.
Two structured questionnaires (one for adults and one for children) were developed in English and translated to Bahasa Indonesia (Appendix A and B). The adult questionnaire (for participants aged 13 years and above) elicited details on demographics, latrine use, water access and usage, hand washing practices, and knowledge and behaviour associated with STH, together with a section for interviewers to observe and record the condition of the respondent's hands and nails (i.e. nail biting and cleanliness). The child questionnaire (for respondents aged 2-12 years) was identical, with the exception that the sections on housing conditions and animals were omitted. Children were also not asked questions on water usage beyond the source of water for drinking. Parents of children below the age of five were asked to answer on behalf of their child 20.
Measurements of knowledge and practices. STH knowledge and behaviour (hand washing and hygiene practices) were measured using two scales containing 18 and 10 questions respectively, and the total scores were calculated based on the number of correct responses to those questions. Further details of the knowledge and practice scales have been presented elsewhere 20.
STH infection status. Stool samples were collected from all child and adult respondents. Faecal sample collection and processing are detailed in the Research Team's previously published article 20. The prevalence of each type of STH infection was determined as the percentage of individuals with positive test results out of the total respondents in each cohort group. The outcomes included Ascaris, Trichuris, hookworms and any STH infection (the presence of at least one of the previous three types of infection).
Data analysis. Data analysis was performed using SPSS Statistics 24.
Descriptive statistics including frequency distribution, mean and standard deviation (SD) were calculated for participant demographic characteristics and each of the outcome variables: knowledge (about STH) and behaviour (hygiene and sanitation practices in preventing STH) scores. To make the comparisons of knowledge and behaviour scores between subgroups within the study samples easier to interpret, the total scores were converted to percentages of correct responses in the scales [(total correct responses/total number of questions) × 100%] for each participant. Independent sample t-test (or Mann-Whitney U test as the non-parametric alternative) and One-Way ANOVA (or Wilcoxon signed-rank test) were carried out to assess the mean differences in the outcome scores between different subgroups (different categories under each demographic variable, e.g. male and female subgroups under 'gender'). In addition, Spearman's correlation analyses were performed to examine the relationships between age, knowledge and practice scores. Due to the hierarchical nature of the data, the measurements taken from members within the same household are likely to be correlated (they are not independent) but different to those of participants from other households. Thus, two mixed effects models were fitted to identify the significant factors in predicting knowledge and behaviour scores, taking individual and household characteristics into account, in the combined adult and child samples.
Table 1 displays the socio-demographic characteristics of respondents. A total of 6,466 individuals (from 2195 households) responded to the survey, of which 82% were adults (including adolescents, n = 5303) and 18% were children aged 12 and younger (n = 1161). Of the study participants, 50.2% (n = 3229) were males. The child sample cohort had a slightly higher proportion of males than the adult sample cohort (52.5% vs. 49.7%).
The age range of the participants was 2 to 93 years, and the mean ages were 39.3 (± 17.2) and 7.0 (± 3.0) years in the adult and child samples, respectively. The majority of the respondents had attained education up to the end of junior school (54.4% of the overall respondents) and 17.2% of them were considered well educated with college and higher levels of education. As the child sample in this study consisted of children aged 12 and younger, the highest level of education in this sample was elementary education (54.7%). Of the 1161 children, 35.9% were aged 5-11. In the child sample, 0.2% of the younger sub-group (aged 5-11) and 85.2% of the older sub-group (aged 6-12) were at school at the time of interview (data not shown). The employed adults worked in various occupation categories such as self-employed (11.3%), private sector (9.4%), farmers (6.6%) and public sector (1.0%). Almost one third of the adults indicated the 'other or unspecified' category to describe their occupations. In addition, over 70% of the responding adults were students (11.1%), doing home duties (16.6%), self-employed (11.3%) or classified as other or unspecified occupation (32.8%). Only adult participants responded to the question on household income. Over 60% (n = 3017) of the study participants reported a monthly household income of ≥ 1 M IDR (equivalent to approximately $2.77 USD based on the average 2014 exchange rate).
www.nature.com/scientificreports/
Knowledge of STH (knowledge score) and demographic factors. The comparisons of participants' levels of knowledge of STH across different socio-demographic factors are outlined in the first part of Table 2. Adults had a significantly higher mean knowledge score than children (57.3% vs. 42.9%, p < 0.001). Overall, a weak positive association between knowledge score and age was observed (rho: 0.16, p < 0.001).
However, the results also indicated a strong positive correlation between knowledge score and age in the child cohort (rho: 0.51, p < 0.001) but a negative association in the adult cohort (rho: −0.22, p < 0.001). Figure 1A depicts the pattern of change in knowledge score by age group. A steep increase was observed among children, whereas a stable but gradually decreasing trend with age was found in adults (apart from a sharp decrease after age 60). Significant differences in mean knowledge scores were also identified between education, occupation and income groups (p < 0.001 in One-Way ANOVA with post-hoc tests, Table 2). Knowledge scores increased with education level. The mean knowledge score among adults with college or higher levels of education was almost twice that of children not at school (mean scores: 59.9 and 31.9 respectively). The post-hoc tests indicated a significant difference between every pair of education groups, except the two highest levels of education (Grades 10-12 and college and higher, p = 0.53, data not shown). Knowledge scores were lowest among students, farmers and unemployed participants and highest among public sector employees.

Results
Hand washing and hygiene practices (behaviour score) and demographic factors. The comparisons of participants' behaviour scores, which measure their hygiene and sanitation practices with regard to STH, across different socio-demographic factors are presented in the second part of Table 2. Adult participants had a significantly higher mean behaviour score than children (69.0% vs. 61.2%, p < 0.001). Behaviour scores were positively associated with age in the whole sample (rho: 0.12, p < 0.001). Further analysis revealed a positive association between age and behaviour score in children but a weak negative association in adults (rho: 0.35 and −0.03, p < 0.001 and p = 0.038 respectively). Figure 1B shows that the mean behaviour score in children aged < 6 was significantly lower than those of the other age groups (51.7 vs. score range: 66-70). Behaviour score (in Table 2) increased with education (p < 0.001, with no difference between any pair of the three highest education levels in post-hoc comparisons, p > 0.05, data not shown). Significant differences in mean behaviour scores were also identified between occupation and income groups (p < 0.001 in One-Way ANOVA tests). The results of post-hoc analyses showed that public sector employees practiced greater levels of hygiene compared to other occupation categories, whilst students, self-employed and unemployed participants practiced the lowest levels (excluding children not at school).
Table 3 presents the results of STH prevalence and the differences in knowledge and behaviour (practice) scores between infected and uninfected participants in the child and adult samples. The overall STH infection (infected with at least one type of STH) rates were 32.7% (n = 380) in children and 34.1% (n = 1810) in adults.
The results also showed that children without any STH infections had consistently higher knowledge and practice scores than those infected with different types of STH. Uninfected children showed a significantly higher knowledge score than those infected with at least one type of STH (43.74 vs. 41.24, p = 0.025) and than those with Ascaris infections (43.62 vs. 41.12, p = 0.034). In the adult sample, uninfected participants showed a much higher level of hand washing and hygiene practices than those infected with Trichuris (69.11 vs. 64.08, p < 0.001).
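As a quick arithmetic check, the overall prevalence figures reported above follow from dividing the number of infected respondents by the cohort size. The sketch below is illustrative only and is not the analysis code used in the study (which was run in SPSS); the function name `prevalence_percent` is an assumption introduced here, and the cohort sizes (1161 children, 5303 adults) are those given in the description of the sample.

```python
def prevalence_percent(n_positive: int, n_total: int) -> float:
    """Return prevalence as the percentage of respondents testing positive."""
    if n_total <= 0:
        raise ValueError("n_total must be a positive count")
    if not 0 <= n_positive <= n_total:
        raise ValueError("n_positive must lie between 0 and n_total")
    return 100.0 * n_positive / n_total


# Counts of respondents with at least one STH infection, as reported in the text.
children = prevalence_percent(380, 1161)
adults = prevalence_percent(1810, 5303)

print(f"Children: {children:.1f}%")  # Children: 32.7%
print(f"Adults: {adults:.1f}%")      # Adults: 34.1%
```

The same function applies to each species-specific outcome (Ascaris, Trichuris, hookworms) once the corresponding counts are taken from Table 3.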

STH outcomes and mean differences in knowledge and behaviour scores between infected and uninfected individuals.
Figure 1. (A) Knowledge score by age group. (B) Behaviour score by age group.
Due to strong correlations, education, occupation and income status were included separately in different models, together with gender, adult/child and age, to avoid multicollinearity among these three variables. To account for the very different patterns of change in knowledge score with age between the adult and child samples, an interaction term between adult/child and age (age*adult/child) was added to the model (Table 4). All the model assessment indicators showed the best fit when education was included with the first three demographic variables (gender, adult/child and age) in the model. Table 4A summarizes the test results and parameter estimates for the fixed effects in the model. The baseline intercept indicates that the mean knowledge score percentage for the youngest female adults with college level of education was 61.49 (using reference categories for the included variables, 95% CI: 60.60-62.39). Male respondents had a slightly lower score than females (mean score lower by 0.57, p = 0.013). Being a child was associated with a mean score 29.42 points lower than that of adults (p < 0.001). In general, knowledge score increased with education level. For example, the scores for participants who were not at school, or who had education levels between grades 1-6 or grades 7-9, were significantly lower than that of participants with college and above levels (p values < 0.001). However, the knowledge score among respondents with education between grades 10-12 was not significantly lower than that of respondents with college level education (p = 0.458). Age and (age*adult/child) were also significant (both p values < 0.001) in predicting knowledge score.
The covariance parameters in Table 4B suggest that both the residual (individual-level) and intercept (household-level) variance components are significant (both p < 0.001), meaning that the average knowledge scores vary among participants and also between households. About 44.8% [56.49/(69.63 + 56.49) = 44.79%] of the total variance could be attributed to the differences in knowledge score between households.
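The household share of variance quoted above is the between-household variance divided by the sum of the between-household and residual variances. The following minimal sketch reproduces that arithmetic; the function name `household_variance_share` is an assumption introduced for illustration, and the inputs are the variance components reported in the text rather than values re-estimated from data.

```python
def household_variance_share(household_var: float, residual_var: float) -> float:
    """Proportion of total variance attributable to differences between households."""
    if household_var < 0 or residual_var < 0:
        raise ValueError("variance components must be non-negative")
    total = household_var + residual_var
    if total == 0:
        raise ValueError("total variance must be greater than zero")
    return household_var / total


# Knowledge score model: household (intercept) variance 56.49, residual variance 69.63.
knowledge_share = household_variance_share(56.49, 69.63)
print(f"{knowledge_share:.2%}")  # 44.79%
```

Applied to the variance components reported for the behaviour score model (97.15 and 71.27), the same function gives 57.68%.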

Results of mixed effects model in predicting behaviour scores.
Similar to the process of model development for knowledge score, a series of model tests and assessments was applied to identify the best fitting model to predict participants' behaviour scores. In addition to the demographic factors and age*adult/child included in the previous model (Table 5), knowledge score was also added to the model considering its strong correlation with behaviour score (rho: 0.48, p < 0.001). Along with household variance, practice score was also entered to test for random effects in explaining behaviour score in the final model. The model assessment criteria (-2LL, AIC and BIC) reduced from 51,461-51,419 for the null model to around 47,609- Males and children had mean scores 2.26 and 6.0 points lower than females and adults, respectively (both p < 0.001). Both age and (age*adult/child) were significant in predicting practice score (p values 0.002 and < 0.001 respectively). Knowledge score was also a significant predictor (p < 0.001): each one-point increase in knowledge score was associated with a 0.4-point increase in behaviour score. The covariance parameters in Table 5B show that both the residual (individual characteristics) and intercept (household variations) variance components are significant (both p < 0.001), which indicates that behaviour scores vary between households and among respondents. Between-household variability explained a large proportion (97.15/(71.27 + 97.15) = 57.68%) of the overall variance in practice score.

Validation of the developed models in predicting knowledge and behaviour scores. The two final mixed effects models were used to compute predicted knowledge and behaviour scores for each participant. The predicted scores were compared with the knowledge and behaviour scores reported by the participants (observed scores). Figure 2 illustrates the comparison of observed and predicted frequency distributions of knowledge scores. The observed distribution has a distinct peak at the score 64 (mode value in the sample), while the predicted distribution peaks around 58-60; both are slightly left skewed, with mean values of 54.7 and 54.5, respectively. The comparison of quartile values (Supplementary Table 1A) shows that the predicted scores tend to underestimate the observed scores (median values: 56.92 vs. 60). Spearman correlation analysis was used to assess how close the predicted knowledge scores were to the observed scores at the individual level. The correlation coefficient of approximately 0.85 (see Supplementary Fig. 1, rho: 0.847, p < 0.001) indicates a strong correlation between the observed and predicted knowledge scores, suggesting a well-fitted model for the sample data.
The results from the final model, which takes into account the positive relationship between knowledge and behaviour scores, confirmed that knowledge score was a significant predictor of behaviour score. Figure 3 presents the comparison between the observed frequency distribution and the predicted frequency distribution of behaviour scores calculated from the developed mixed effects model. Overall, the predicted distribution was reasonably similar to the pattern of the observed distribution, with a very accurate prediction of the sample mean (67.5 vs. 67.6). Despite the minor under- or overestimates in predicting the quartile values (differences ranging from −1.78 to 1.83, Supplementary Table 1B), the result of the Spearman correlation further confirmed that predicted and observed scores were well matched (Supplementary Fig. 2, rho: 0.883, p < 0.001). The validation results indicate that the final model fits the sample data (including adult and child sub-samples) and can be used to predict behaviour scores among the study respondents.
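The individual-level comparison described above rests on Spearman's rank correlation between observed and predicted scores. The sketch below shows one way such a coefficient can be computed, as the Pearson correlation of average ranks; it is a plain-Python illustration, not the SPSS procedure used in the study. The function names and the toy score values are assumptions introduced here and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (sx * sy)


# Hypothetical observed and model-predicted percentage scores (toy values only).
observed = [64, 50, 72, 40, 58, 64]
predicted = [60.1, 52.3, 66.0, 45.2, 57.9, 59.5]
print(f"rho = {spearman_rho(observed, predicted):.3f}")
```

Because the coefficient depends only on ranks, it measures whether the model orders respondents in the same way as their observed scores; it does not by itself show that predicted and observed values are numerically equal, which is why the means and quartiles are compared separately.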

Discussion
The findings of this study revealed that participants' socio-demographic variables such as age, gender and education, in conjunction with household variability, are significant in predicting an individual's level of knowledge of STH. In addition, participants' knowledge scores coupled with their socio-demographic characteristics are significant predictors of their hand washing and hygiene practices, taking into account household-level variability. The findings are in line with previous studies which identified age, gender and educational level as the common associated demographic factors 8,19,21,22. However, none of these studies combined adults and children in their analyses or examined the effect of household-level variation on respondents' levels of knowledge and practices in relation to STH infections and prevention. Further, most of these studies assessed associations between socio-demographic characteristics and each single component or item (such as a specific transmission route or symptom) in their KAP questionnaires. Only one study 19 used a summated scale (a combination of several questions) to measure general knowledge and explored its associations with children's gender and age. The present study provided additional information on how knowledge interacted with participants' socio-demographic characteristics (including being a child or an adult) to predict their hand washing and hygiene practices. About one third of children and adults in the study communities had parasitic worm infections (32.7% and 34.1% respectively), and Ascaris was identified as the predominant species (27.7% in children and 25.6% in adults). The study also suggests that children without STH infections had consistently higher knowledge and practice scores than infected children. The findings support the potential benefits of health education or STH prevention programs targeting children to increase their knowledge of STH, which in turn leads to protective hygiene practices to prevent STH 12,23.
An intervention study in China found that implementing a health-education package to prevent worm infections effectively reduced STH infections among the participating schoolchildren 24.
On the other hand, our study indicated that the knowledge and practice scores of infected and uninfected adults did not differ significantly except for Trichuris infection. Adults infected with Trichuris tended to have a significantly lower practice score than those without the infection. It is unclear why the knowledge and practice scores were consistently higher in the uninfected children than the infected ones, whereas such patterns were not observed in adults. For example, adults infected with Ascaris had higher knowledge and behaviour levels than uninfected adults. However, adults with Trichuris infections tended to have a significantly lower behaviour score than the uninfected ones (64.08 vs. 69.11). As noted by recent studies examining KAP and parasitic worm infections, acquired knowledge did not necessarily translate into behavioural changes 15,19,25. For instance, Sady et al. 21 argued that poverty could have contributed to infected individuals' delay in seeking treatment even though they showed a higher level of knowledge than those without infections. This was because they did not have enough money to pay the cost of transport and medical services, especially for residents living in rural or remote communities. In such communities, inadequate sanitation and poor environmental conditions (such as a lack of access to clean water) could be the key barriers to community deworming or STH health education programs. In addition, there were also misconceptions regarding the effect and accessibility of treatment. For example, some study participants considered that the effect of preventative chemotherapy was permanent and repeated treatments were not required 19, or that the treatment was ineffective, inaccessible and not affordable 12,15,18,22.
As a result, STH re-infections could be a common issue among adult residents in rural communities, even though they might have previously been exposed to STH prevention and health education programs and had good knowledge of prevention of STH infections. As Acka et al. 15 suggested, STH prevention and treatment programs should combine community-based health education campaigns and school-based interventions to enhance and sustain positive sanitation and hygiene practices in rural settings.
The model validation results further confirmed that the quantitative measurement tools developed in this project and subsequent related projects are suitable for assessing knowledge and behaviour associated with STH. The validation findings in this study demonstrated good model performance in predicting knowledge and behaviour scores when applying the scales to the study population. Using the basic demographic variables of the study sample, we can predict respondents' knowledge and practice scores closely (Spearman's rho of 0.85-0.88 between observed and predicted scores). The first validated model showed that education level is a significant predictor of knowledge scores. This finding is consistent with previous KAP studies reporting that education was significantly associated with KAP on STH or schistosomiasis infections 8,21,22. However, these studies did not establish associations based on overall knowledge or practice scores; they examined the associations based on individual questions included in the scales. Nevertheless, this finding highlights the importance of health education programs in communities. We also noted that existing relevant studies often displayed frequency distributions of individual KAP items under each sub-category of knowledge or practice questionnaires 8,15,16,21. To our knowledge, there is no validated quantitative instrument available to measure the levels of STH-related knowledge and behaviour. There is also a lack of standardized scales or tools to determine adequate levels of STH-related knowledge and practices. In order to inform future STH prevention (in relation to specific forms of STH infections), further research is needed to determine the optimal cut-off points for adequate and inadequate levels of knowledge and practices with regard to STH, so that the target population for further population-based health education programs can be identified.
In conclusion, the results of this study have demonstrated promising validation outcomes for the knowledge and practice scales included in the BALatrine questionnaire with the study population (adults and children). They can be used to aid population health education programs and to increase sustainability of mass deworming and environmental improvements. The findings further confirmed the importance of integrating school-based (targeting children) and community-based (targeting adults and households) health education or STH prevention programs to maximize the effect of STH prevention and intervention programs.

Data availability
Data collected as part of this study are not publicly available but can be shared upon reasonable request by emailing the corresponding author.