Factors associated with sufficient knowledge of antibiotics and antimicrobial resistance in the Japanese general population

We conducted two online surveys about antibiotics targeted at the Japanese general population in March 2017 and February 2018. In total, 6,982 participants completed the questionnaire. Factors associated with knowledge of antibiotics, knowledge of antimicrobial resistance (AMR) and appropriate behavioural changes were evaluated by a machine learning approach using DataRobot. Factors strongly associated with the three dependent variables in the model were extracted based on permutation importance. We found that the strongest determinant of knowledge of antibiotics and AMR was education level. Knowledge of antibiotics was strongly associated with the frequency of internet use. Exposure to primary information was associated with motivation for appropriate behavioural changes. Improving the availability of primary information would be a beneficial intervention. Individuals lacking higher education and without opportunities to obtain primary information should be considered a target population for effective interventions.

We generated three datasets from the original data with three partition types. For over 30 models, the fit was compared for each dependent variable (knowledge of antibiotics, knowledge of AMR, and behavioural change). We obtained the model with the largest AUC value for each dependent variable in each dataset, yielding nine models for three dependent variables.
Table 2. Responses to questions about AMR. Numbers in parentheses represent percentages. AMR; antimicrobial resistance. *One point is assigned for correct answers to both questions 1 a) and b). Choices A, B, C, and D for question 2 are one point each (maximum four points). If F is chosen for question 2, the final score for question 2 is 0, regardless of other choices. E does not affect the final score for question 2. The total score is the sum of scores for questions 1 and 2 (range, 0-5). Total scores of three, four, and five indicate sufficient knowledge of AMR.
The factor that was most strongly associated with knowledge of antibiotics and AMR was education level (permutation importance was 1.0 in all datasets for both). Knowledge of antibiotics was strongly associated with the frequency of internet use (permutation importance was 0.62, 0.49, and 0.35 in the three datasets). Exposure to primary information was strongly associated with motivation to make appropriate behavioural changes (permutation importance was 1.0, 0.86, and 1.0 in the three datasets). The details of the models and factors strongly associated with each dependent variable are shown in Tables 4 and 5.
Table 5. Common variables strongly associated* with the dependent variables in all datasets. Numbers represent permutation importance. AMR; antimicrobial resistance. *Variables ranked among the top 10 in permutation importance in all three datasets.
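The selection rule in the Table 5 footnote (variables ranked in the top 10 by permutation importance in all three datasets) can be sketched as a set intersection. This is only an illustration: the function names, the variable names `x1` to `x3`, and the toy importance values are hypothetical and are not taken from the study's data.

```python
def top_k(importance, k=10):
    """Return the k variable names with the largest importance values."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return set(ranked[:k])


def common_top_variables(importances_by_dataset, k=10):
    """Variables that appear in the top k of every dataset."""
    top_sets = [top_k(imp, k) for imp in importances_by_dataset]
    return set.intersection(*top_sets)


# Toy example with hypothetical variables and k=2 (the paper uses k=10).
dataset_1 = {"x1": 1.0, "x2": 0.6, "x3": 0.1}
dataset_2 = {"x1": 1.0, "x3": 0.7, "x2": 0.2}
dataset_3 = {"x2": 1.0, "x1": 0.9, "x3": 0.3}
common = common_top_variables([dataset_1, dataset_2, dataset_3], k=2)
```

Only `x1` is in the top two of every toy dataset, so it is the only variable retained.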

Discussion
This study provides the first evidence for an association between socioeconomic factors and sufficient knowledge of AMR in Japan. It should be noted that age is not consistently regarded as a determinant of sufficient knowledge, although several previous studies have reported that antibiotic use tends to be more frequent in older individuals than in relatively young individuals [7]. A few recent studies have reported an association between education level and knowledge of antibiotics and AMR [10], which is consistent with our results. Therefore, targeted interventions based on education level might be more effective than existing interventions aimed at the general public as a whole. For example, educational programmes at primary schools such as e-Bug by Public Health England [12,13] are promising. Our results also showed that the source of information about antibiotics and AMR might play an important role in the knowledge level of the general public. In previous studies of sources of information about antibiotics and AMR [14-17], the Internet has not been identified as an important source. These results highlight the importance of information source for the implementation of educational interventions targeting the general public.
Napolitano and colleagues pointed out that parents who do not refer to physicians as an information source tend to use antibiotics more frequently [14]. Hoffmann and colleagues reported that patients who obtain information from general practitioners are more informed about antibiotics and AMR than those who obtain information from other sources [15]. Additionally, that study showed that higher education level is more strongly associated with a higher score than information source. Those findings are compatible with our present results indicating that education level and exposure to primary information are strongly associated with the motivation for appropriate behavioural changes. These findings taken together show that exposure to primary information, especially information provided by healthcare professionals, may promote effective changes in antibiotic consumption. However, internet use has not previously been linked to sufficient knowledge of antibiotics and AMR. One of our key findings was that frequent internet use is an independent factor for having sufficient knowledge about antibiotics; this seems to contradict previous findings that primary information is important for sufficient knowledge about AMR and appropriate behavioural changes. It is possible that the existence of antibiotics has become general knowledge and therefore the public can easily access information about antibiotics. However, the concept of AMR is relatively novel for the general public because only 3 years have passed since the National Action Plan was issued by the Japanese government. In addition, people who use the Internet frequently might have a higher sensitivity to information than those who do not use the Internet and tend to search for new terms more frequently.
As a result, internet-based information tends to present both advantages and disadvantages, and primary information from healthcare professionals becomes more reliable. Although further research about the effect of information source is needed, our data clearly indicate that the frequent provision of primary information would substantially improve the knowledge and behaviour of the general public.
This study has some strengths and limitations. One important strength that differentiates our analysis from previous studies is its methodology. DataRobot, which we used to conduct all analyses, can rapidly compare many model types. Accordingly, we can choose the optimal model among dozens of models and avoid arbitrariness in model selection. However, it is important to note that the AUC values of the models were relatively low, despite this strict model selection process. A larger number of variables is probably needed to explain the association of knowledge and behaviour with socioeconomic factors. In addition, the questionnaire was administered online. Therefore, respondents were internet literate and might represent a biased sample.
In conclusion, our results confirm the importance of education level in having sufficient knowledge of antibiotics and AMR. The use of primary information was strongly associated with both knowledge and behavioural changes, suggesting that the information source could be an important factor for improving the general public's knowledge of AMR. Although our results should be interpreted with caution, they could help health policy decision-makers conduct educational interventions directed at the general public.

Methods
Data source. A web-based questionnaire was developed to collect responses anonymously. When participants first visited the survey website, the policy for the use of the collected data and the protection of personal information was displayed. The details of the questionnaire are available as Supplementary File S1.
Table 7. Responses to questions about beliefs and behaviours related to antibiotics and AMR. Numbers in parentheses indicate percentages. AMR; antimicrobial resistance. *Information provided by healthcare professionals, research institutes, and governmental organizations. **Information provided by family, friends, private individuals, private companies, and mass media (television and journals).
Scientific Reports | (2020) 10:3502 | https://doi.org/10.1038/s41598-020-60444-1 | www.nature.com/scientificreports
The online nationwide cross-sectional survey was conducted twice, in March 2017 and February 2018. The participants were selected among Japanese adults aged 20-69 years from a public panel in which 7.6 million people were registered by a research company (INTAGE Corporation). A total of 6,982 complete survey responses were obtained. Individuals aged 70 years or older were excluded owing to potential difficulties in responding to online surveys and to match the age criterion applied in a similar European study (20-69 years) [4]. In addition, the participants were selected to reflect the general population (based on the national population census of Japan in 2015) in terms of sex, age, place of residence and population size. Participant characteristics and their responses to other questions related to antibiotics and AMR are shown in Tables 6 and 7.
To examine factors associated with sufficient knowledge of antibiotics and AMR, data from all 6,982 respondents were used. To identify factors associated with appropriate behavioural changes, data from 2,962 respondents were used. The sample sizes differed because the questions addressing behavioural changes could only be answered by those who initially answered that their behaviour had changed.
Data preparation and pre-processing. Outcome. Knowledge of antibiotics was determined based on four questionnaire items. Two or fewer correct answers indicated a lack of knowledge whereas three or more correct responses indicated sufficient knowledge.
Knowledge of AMR was determined based on the results of two questions. The first question consisted of two small questions that were considered correct only if both were answered correctly. The second question asked respondents about the causes of AMR, and comprised a multiple-choice question with six options, four of which were regarded as correct statements. The other two options were "Others" and "Do not know". The former was excluded from the calculation of the number of correct answers because the correctness depends on precise responses. If the respondent selected "Do not know", the number of correct answers was 0, regardless of other choices. Accordingly, the total number of correct answers ranged from 0 to 5. Respondents with two or fewer correct answers were regarded as lacking knowledge about AMR whereas those with three or more correct responses were regarded as having sufficient knowledge about AMR.
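The AMR scoring rule described above can be written out as a short function. This is a minimal sketch, not the authors' code: the option labels A to F follow the Table 2 footnote, the mapping of E to "Others" and F to "Do not know" is inferred from that footnote, and the function and argument names are hypothetical.

```python
# Options for question 2 that are regarded as correct statements (one point each).
CORRECT_CAUSES = {"A", "B", "C", "D"}
OTHERS = "E"        # assumed label for "Others": ignored in the score
DO_NOT_KNOW = "F"   # assumed label for "Do not know": forces question 2 to 0


def amr_score(q1a_correct, q1b_correct, q2_choices):
    """Total AMR knowledge score in the range 0-5."""
    # Question 1 earns one point only if both sub-questions are correct.
    q1 = 1 if (q1a_correct and q1b_correct) else 0
    chosen = set(q2_choices)
    if DO_NOT_KNOW in chosen:
        q2 = 0
    else:
        # "Others" is not in CORRECT_CAUSES, so it never changes the count.
        q2 = len(chosen & CORRECT_CAUSES)
    return q1 + q2


def has_sufficient_amr_knowledge(score):
    """Three or more correct answers indicate sufficient knowledge."""
    return score >= 3
```

For example, a respondent who answers both parts of question 1 correctly and selects A and F scores 1, because selecting "Do not know" sets the question 2 score to 0.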
As for behavioural change after obtaining knowledge about AMR, among respondents who had the opportunity to obtain knowledge about AMR, those who indicated two or more behavioural changes were regarded as having changed their behaviour. The details of the question about behavioural change are available in Supplementary File S1.
Pre-processing. Various pre-processing methods were automatically applied to the data. For categorical values, the pre-processing methods included 'one-hot encoding' [18] and 'ordinal encoding'. For numerical values, 'standardization' [19], 'constant splines' [20], and 'imputing missing values' [21] were used.
Validation. Data were separated into training and validation sets. Five-fold cross-validation was used for model construction and evaluation, and the partitions were determined with stratified sampling. Each of the three partition types was obtained with a different random seed.
As an optimization metric, logarithmic loss was used.
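The validation scheme and the optimization metric can be illustrated in plain Python. This sketch does not reproduce DataRobot's internal partitioning: the fold-assignment strategy, the seed values 0 to 2, the function names, and the toy outcome vector are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each row to a fold so that every fold has similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)  # the seed determines which rows land in which fold
        for position, idx in enumerate(indices):
            fold_of[idx] = position % n_folds
    return fold_of


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean logarithmic loss for binary outcomes and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Toy binary outcome (hypothetical): 40 "sufficient" and 60 "insufficient".
labels = [1] * 40 + [0] * 60
# Three partition types, one per (hypothetical) random seed.
partitions = {seed: stratified_folds(labels, n_folds=5, seed=seed) for seed in (0, 1, 2)}
```

With this toy outcome, every fold in every partition contains 8 positive and 12 negative rows, so the class balance is preserved regardless of the seed.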
Model building by machine learning. Models were created using the automated machine learning platform DataRobot. It was used to create over 40 models, including "blender models" obtained by using several machine learning algorithms. A blender model, sometimes referred to as an ensemble model, increases accuracy by combining the predictions of two or more models. The best model of all developed models was selected based on the largest area under the curve (AUC) value. All analyses were conducted on 4 December 2019. The details of variables included in each analysis are available in Table 8.
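Selecting the model with the largest AUC can be sketched as follows. The AUC here is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is a standard equivalent definition; the model names and toy predictions are hypothetical, and this is not DataRobot's implementation.

```python
def auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative cases (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best(y_true, predictions_by_model):
    """Return the name of the model with the largest AUC and all AUC values."""
    aucs = {name: auc(y_true, s) for name, s in predictions_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Toy validation outcomes and predicted probabilities from two hypothetical models.
y_valid = [0, 0, 1, 1]
predictions = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}
best_name, auc_values = select_best(y_valid, predictions)
```

In this toy example `model_b` ranks every positive case above every negative case, so it is selected over `model_a`.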
Evaluation. Factors were identified based on permutation importance [22]. Permutation importance measures how much worse a model would perform if DataRobot made predictions after randomly shuffling the elements in a given column (while leaving the other columns unchanged). DataRobot normalizes the scores by setting the value of the most important column to 1. The influence of changes in values for each factor on the outcome was also evaluated based on partial dependence [23]. A partial dependence plot was generated to show the marginal effect of features on the predicted outcome of a machine learning model.
Sensitivity analysis. We conducted sensitivity analyses using a multi-class classification approach, in which all participants were classified according to the score each of them obtained. This approach did not substantially improve accuracy or precision, as judged by the AUC value, so we retained the original approach for our main results. The details of the sensitivity analyses are available in Supplementary File S3.
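The permutation importance procedure described in the Evaluation paragraph can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not DataRobot's: accuracy is used as the performance measure, negative drops are clipped to zero, the shuffle is repeated 20 times and averaged, and the toy model and data are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Normalized drop in accuracy after shuffling each column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    raw = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Replace only this column; all other columns stay unchanged.
            permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(model, permuted, labels))
        raw.append(max(0.0, sum(drops) / n_repeats))  # clipping is an assumption
    top = max(raw)
    # Normalize so that the most important column has importance 1.
    return [v / top if top > 0 else 0.0 for v in raw]


# Toy data: the outcome depends only on column 0; column 1 is irrelevant.
rows = [(i % 2, i % 3) for i in range(60)]
labels = [r[0] for r in rows]
toy_model = lambda r: r[0]
importance = permutation_importance(toy_model, rows, labels)
```

Shuffling column 1 never changes the toy model's predictions, so its importance is 0, while column 0 is the only informative column and is normalized to 1.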
Ethics approval. This research was approved by the institutional ethical review board of the National Center for Global Health and Medicine, Tokyo, Japan and was conducted in accordance with the approved guidelines.
Transparency declarations. The first and corresponding author (ST) affirms that this manuscript is an honest, accurate, and transparent account of the study; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.