## Introduction

Coronavirus disease 2019 (COVID-19) is a severe respiratory syndrome caused by the new betacoronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)1,2,3. The typical clinical manifestations of COVID-19 are nonspecific and can mimic other diseases resembling flu symptoms2,4,5,6,7. Disease onset may result in progressive respiratory failure with alveolar damage and, in some cases, death2. Severe disease is more likely in elderly people, people with impaired immunity and those suffering from respiratory system diseases, cardiovascular diseases, cancer and diabetes8.

Indisputably, COVID-19 has already had a tremendous impact on regional and global health systems and economies. In many countries, bans and restrictions have been introduced to stop the spread of the virus6. However, the direct threats to physical health and life posed by COVID-19 are not the only impacts of the pandemic on worldwide health care systems. Certain aspects of the pandemic are believed to have major impacts on the mental health of the population. The need for isolation, the related economic issues and the fear that immediate family members may become infected or die are undoubtedly significant in the context of the mental health of the global population9,10,11,12. Moreover, due to the dramatically changing situation, it is anticipated that psycho-emotional disturbances, somatization (which can manifest as a headache) and parafunctional oral behaviors (which can contribute to orofacial pain and temporomandibular disorders) arising from and intensified by the current epidemiological situation will become significant problems for global and regional healthcare systems in the future9,11,12. The available research shows that socio-demographic and socio-economic factors can be associated with adverse effects of the pandemic13,14,15. The lockdown situation unfortunately hampers the performance of extensive and sound research. Conducting online surveys seems to be the only safe large-scale solution available. Some evidence of COVID-19-related mental health issues and headache in the general public has been published10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27. Most of these studies were surveys showing increased symptoms of depression, anxiety, and stress related to COVID-19, mainly as a result of psychosocial stressors such as fear of negative health impacts, the loss of life and economic issues.
The results of the abovementioned studies have been heterogeneous, probably because of differences in methods, locations and timing with regard to the stage of the pandemic. There have been only a few studies published regarding the influence of the pandemic on oral behaviors, such as bruxism28,29. None of the results of the published surveys investigated headache as a symptom of somatization related to the COVID-19 pandemic.

All the aspects of the COVID-19 pandemic discussed above and the insufficiency of well-documented and reliable research on its global impacts on mental health and headache indicate the need for a large-scale study on the indirect effects of the chronic fear induced by the pandemic. In the present study, we aimed to identify the predictors, risk factors and factors associated with mental disorders, headache and potentially stress-modulated parafunctional oral behaviors among the adult residents of North America and Europe using validated questionnaires administered as an online survey during the COVID-19 pandemic. We used an author-designed sociodemographic survey, the Hospital Anxiety and Depression Scale (HADS), the Migraine Disability Assessment Questionnaire (MIDAS) and the Oral Behavior Checklist (OBC). Furthermore, we performed a decision tree analysis to identify multidimensional dependencies among the investigated characteristics. We hope that this study will help prepare global and local healthcare systems by enabling the identification of high-risk groups and therefore the more effective prevention of the secondary effects of the pandemic.

## Results

### Background characteristics of the sample

During the study period, a total of 1642 subjects responded to the questionnaire. In total, 1130 were from North America and Europe, and 99.91% (N = 1129) fully completed the questionnaire; 843 subjects from North America responded to the questionnaire, out of whom 100% (N = 843) fully completed it, and 287 subjects from Europe responded to the questionnaire, out of whom 99.65% (N = 286) fully completed it. All respondents in both groups (North America and Europe) were adults; 47.74% (N = 539) were men and 52.26% (N = 590) were women. Among the respondents from North America, 43.42% (N = 366) were men and 56.58% (N = 477) were women. Among the respondents from Europe, 60.49% (N = 173) were men and 39.51% (N = 113) were women. The age of the respondents ranged from 18 to 72 years, with a mean age ± standard deviation (SD) of 32.59 ± 9.18 years. The age of respondents from North America ranged from 18 to 72 years, with a mean age ± SD of 32.65 ± 9.11 years. The age of respondents from Europe ranged from 18 to 66 years, with a mean age ± SD of 32.09 ± 9.37 years.

#### Participant age

We observed statistically significant negative correlations between participant age and the MIDAS score (r(1129) = −0.08, p < 0.007), HADS-Anxiety (HADS-A) score (r(1129) = −0.14, p < 0.0001), HADS total score (r(1129) = −0.09, p = 0.002) and OBC score (r(1129) = −0.24, p < 0.0001). Older participants experienced less disability from migraines and less anxiety, obtained lower scores on the general HADS scale and reported a lower intensity of oral behaviors. The association between respondent age and HADS-Depression (HADS-D) scores was statistically nonsignificant (r(1129) = −0.03, p = 0.40).
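For readers who want to reproduce this kind of age-versus-score analysis, the following is a minimal sketch of Spearman's rank correlation (the coefficient named in the statistical analysis section) in plain Python. The function names and the toy data are illustrative assumptions and not the study's code or data; ties receive average ranks, which is the usual convention.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data (NOT from the study): age and OBC score.
ages = [19, 24, 31, 38, 45, 52]
obc_scores = [30, 22, 25, 14, 12, 9]
rho = spearman_rho(ages, obc_scores)
```

A negative `rho` on such data corresponds to the pattern reported above, with older respondents scoring lower. The sketch does not compute a p value.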

#### Participant gender

We found statistically significant differences between the genders with respect to the MIDAS score (FWelch(1, 1055.99) = 26.45, p < 0.00001), HADS-D score (F(1, 1127) = 10.57, p = 0.001), HADS-A score (FWelch(1, 1086.59) = 63.81, p < 0.0001), HADS total score (FWelch(1, 1073.79) = 39.66, p < 0.0001) and OBC score (F(1, 1127) = 29.55, p < 0.0001). In all cases, women had higher scores than men. Detailed results are presented in Table 1.

#### Europe vs. North America

Whereas European and North American respondents had similar HADS-D scores (FWelch(1, 459.0) = 1.39, p = 0.24), there were statistically significant differences between European and North American respondents in the MIDAS score (FWelch(1, 884.47) = 20.94, p < 0.00001), HADS-A score (F(1, 467.32) = 66.68, p < 0.0001), HADS total score (FWelch(1, 458.49) = 27.32, p < 0.0001) and OBC score for intensity (F(1, 1127) = 28.35, p < 0.0001). In all cases, North American residents scored higher than Europeans (Table 2).

#### Marital status

We did not observe any statistically significant differences in MIDAS (F(1, 1127) = 0.39, p = 0.53), HADS-D (F(2,1127) = 0.69, p = 0.41), HADS-A (F(2,1127) = 2.81, p = 0.09), HADS total (F(2,1127) = 0.29, p = 0.59) and OBC (F(2,1126) = 2.55, p = 0.07) scores between participants who were in a relationship and those who were single. The details are presented in Table 3.

#### Education level

Only 7 participants reported having a primary level of education. These respondents were included in the high school group. We observed a statistically significant effect of education level on the HADS-D (F(2,1126) = 5.20, p = 0.006), HADS-A (F(2,1126) = 9.06, p = 0.0001), HADS total (F(2,1126) = 8.61, p = 0.0002) and OBC (F(2,1126) = 8.97, p = 0.0001) scores. Participants with higher education levels had lower scores on all of the analyzed measures. While moderate scores were characteristic of people with a college education, the highest scores were identified in people with a high school education. In all cases, the post hoc Tukey test showed that differences between people with higher education and a high school education were statistically significant (all p values < 0.005). The differences between people with higher education and a college education and between those with a college education and a high school education were statistically nonsignificant (all p's > 0.09) except for HADS-A. Here, respondents with higher education differed significantly from those with a college education (p = 0.048). The difference between the latter group and people with a high school education was statistically nonsignificant (p = 0.14). The effect of education was statistically nonsignificant for the MIDAS score (FWelch(2, 612.90) = 2.94, p = 0.054). The details are presented in Table 4.

### Trees for each target variable

#### Assumptions

To determine which groups of people are at a high risk for migraine-related disability, depression, anxiety and oral behaviors, we built separate decision trees for all analyzed variables. We expected each tree to maintain a proper balance between specificity and generalizability.

If a tree is too specific (detailed), the model can be overfitted: it may work well with the current dataset but not with new datasets. An adequate level of generalizability, however, allows us to capture the underlying structure of the data and draw conclusions that can be applied to new observations. Considering the above, we decided that the maximum depth of a tree should be 4 and that the minimum number of samples required to form a leaf should be 10% of the sample size, which was 113 individuals (these values were selected arbitrarily).
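To make the two constraints concrete, the sketch below derives the minimum leaf size from the reported sample size and shows how a single split could be chosen under that constraint. It assumes a CART-style regression split that minimizes the summed squared error. The text does not name the algorithm or software used, so `best_split`, `sse` and the toy data are hypothetical illustrations only.

```python
import math

N_RESPONDENTS = 1129   # completed questionnaires reported in the text
MAX_DEPTH = 4          # maximum tree depth stated in the text
MIN_LEAF = math.ceil(0.10 * N_RESPONDENTS)  # 10% of the sample, i.e. 113


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(feature, target, min_leaf):
    """Return (threshold, total_sse) for the best binary split, or None.

    Candidate thresholds are midpoints between consecutive distinct feature
    values; any split leaving fewer than `min_leaf` samples on a side is
    skipped. ASSUMPTION: squared-error criterion, as in CART regression.
    """
    pairs = sorted(zip(feature, target))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        left_x, right_x = pairs[i - 1][0], pairs[i][0]
        if left_x == right_x:
            continue  # identical feature values cannot be separated
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = ((left_x + right_x) / 2, total)
    return best


# Hypothetical toy data (NOT from the study): age and a questionnaire score.
toy_age = [20, 22, 25, 27, 30, 33, 36, 40]
toy_score = [14, 15, 14, 15, 9, 10, 9, 10]
toy_split = best_split(toy_age, toy_score, min_leaf=2)
```

Under this assumption, thresholds fall halfway between two observed ages, which would be consistent with cut points such as 28.5 or 34.5 years. A larger `min_leaf` rules out fine-grained splits and therefore limits overfitting.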

#### Hospital anxiety and depression scale (HADS) anxiety

An explanation of the decision tree construction and the method of tree interpretation is presented in section "Validation of the classification procedure". With regard to the HADS-A score (Fig. 1), place of residence and gender were important features (higher in Europe and higher in females). In Europe, age also affected the HADS-A score. Interestingly, older female individuals had lower HADS-A scores (12) than younger female individuals (14), but among males, the highest scores were identified in those between 29 and 34 years old (13). Younger males and older males had relatively lower scores (12 and 10, respectively). The lowest HADS-A score (9.0) was in the group of American males. The highest HADS-A score (15.0) was in European females younger than 27.5 years.

#### Hospital anxiety and depression scale (HADS) depression

An explanation of the decision tree construction and the method of tree interpretation is presented in section "Validation of the classification procedure". In regard to the HADS-D score (Fig. 2), for both males and females, education, age and relationship status were important. Interestingly, among males with higher education levels than high school, younger individuals (< 28.5) had lower HADS-D scores (6.0) than older individuals (≥ 29), who had an average score of 7.0. Among females with the same level of education, younger individuals had higher HADS-D scores (9.0) than older individuals (8.0). Males with higher education levels than high school who were younger than 28.5 years old had the lowest HADS-D scores (6.0). The highest HADS-D scores (9.0) were identified in females with a high school education or those with a relatively higher education level who were younger than 28.5 years.

#### Oral behavior checklist (OBC)

An explanation of the decision tree construction and the method of tree interpretation is presented in section "Validation of the classification procedure". In terms of the OBC scores (Fig. 3), age and gender were important. Among females, younger people had higher OBC scores than older people. Among males, middle-aged people (30–34 years old) had the highest OBC score (13.0). Younger males (12.0) and older males (9.0) had relatively lower OBC scores, with the latter having the lowest OBC scores in the entire population. The highest OBC score (14.0) was found in younger (under 28.5 years old) females.

#### Migraine disability assessment (MIDAS)

An explanation of the decision tree construction and the method of tree interpretation is presented in section "Validation of the classification procedure". The tree presented in Fig. 4 shows that among males, age was important for the MIDAS score, but among females, relationship status and education level were important. The tree shows that the lowest MIDAS score, which was 0, was identified in males older than 34.5 years. The highest score (average 15.5) was in females who were not single, had education levels other than higher education and were younger than 28.5 years. The groups with intermediate MIDAS scores can be seen on the tree.

Based on each tree, we calculated the feature importance of each characteristic for each analyzed variable. These calculations are presented in Table 5. The most informative features (those with the highest impact on the analyzed value) are highlighted.

### Validation of the classification procedure

When the assigning rule was applied to the dataset for Europeans, the most vulnerable group accounted for 27.6% of the population. The most vulnerable group in North America using the same rule accounted for 45.6% of the population. In other words, if help were provided to the entire population in Europe, 72.4% of the people who received it would not have needed it because they were not in the most vulnerable group. Similarly, in North America, the proportion of people who were not in the most susceptible group was 54.5%.

In this step, classifiers were generated separately for Europe and North America. As mentioned in the methods, we used 60% of the data to train the classifiers (create rules). The remaining 40% of the sample was used to test the efficiency of the classifier.

After the classification was applied to the remaining 40% of the sample, we obtained the predicted probability of the respondents belonging to the high-risk group.

As mentioned in the methods, we further selected 1/3 of the individuals from the test sample who were predicted to belong to the group at highest risk. In this subsample, we calculated the precision metric, which is the fraction of positive respondents (i.e., those who were classified as high-risk individuals) among those selected from the highest risk subsample. The operation was performed separately for North American and European residents. For North American residents, 59.8% of the people were properly classified as the most vulnerable. For European residents, 44.7% were properly classified.
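The precision calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: "positive" is taken to mean respondents who actually belong to the high-risk group, the size of the selected third is rounded to the nearest integer, and the function name and toy data are hypothetical.

```python
def precision_at_top_fraction(probabilities, is_high_risk, fraction=1 / 3):
    """Precision among the respondents with the highest predicted probability.

    probabilities: predicted probability of belonging to the high-risk group
    is_high_risk:  True if the respondent actually belongs to that group
    fraction:      share of the test sample to select (1/3 in the text)
    """
    if len(probabilities) != len(is_high_risk):
        raise ValueError("inputs must have the same length")
    # ASSUMPTION: round to the nearest whole respondent, at least one.
    n_selected = max(1, round(len(probabilities) * fraction))
    ranked = sorted(
        zip(probabilities, is_high_risk), key=lambda pair: pair[0], reverse=True
    )
    selected = ranked[:n_selected]
    return sum(1 for _, label in selected if label) / n_selected


# Hypothetical toy test sample (NOT from the study).
toy_probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6, 0.5]
toy_labels = [True, False, True, False, True, False, False, True, False]

top_third_precision = precision_at_top_fraction(toy_probs, toy_labels)
base_rate = sum(toy_labels) / len(toy_labels)  # share found without any model
```

Comparing the precision with the base rate shows how much the classifier improves on helping everyone indiscriminately. In the text, the corresponding figures are 59.8% against 45.6% for North America and 44.7% against 27.6% for Europe.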

## Discussion

Although COVID-19 primarily affects physical health, the secondary influence of issues related to the pandemic on mental health should also receive attention. Previously published surveys showed increased symptoms of depression, anxiety, and stress related to COVID-19, as a possible result of psychosocial stressors such as the fear of the disease, the loss of life and economic issues10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27. Moreover, the results of the aforementioned studies were heterogeneous, probably because of differences in methods, locations and the timing of the studies with regard to the stage of the pandemic16. Only two studies on oral behaviors have been published28,29. None of the results of the published surveys assessed headache as a somatic symptom related to the COVID-19 pandemic. However, due to a large amount of worrisome information in social media, the general public is also experiencing overwhelming psychological pressure, which may lead to a variety of psychological conditions, stress-related parafunctional oral behaviors and somatization. Moreover, mental health aspects are relegated to the background due to the heavy burden on local and global health systems imposed by the immediate effects of the pandemic. Therefore, it is essential to establish a simple method of identifying the group at high risk of developing psychiatric disorders and to deliver early preventive measures or treatment to avoid further consequences. The present online survey was conducted to identify the predictors, risk factors and factors associated with mental disorders (anxiety, depression), headache, and oral parafunctional behaviors among the adult residents of North America and Europe during the COVID-19 pandemic.

As one of the most important results, we observed high HADS-A scores in the entire study group. The results are in agreement with data on the prevalence of anxiety during other epidemiological or natural catastrophes, such as the Ebola outbreak30, tsunamis31 and the September 11th attacks32. Moreover, the most recent data on mental health problems in China during the COVID-19 outbreak have also shown increased risks of anxiety and depression19.

In the present study, we observed higher anxiety scores in younger participants and in females. The available literature shows that early-onset generalized anxiety disorder (GAD) is associated with female gender, higher education levels and higher levels of neuroticism, while late-onset GAD is associated with physical illnesses33. Moreover, gender-based differences in anxiety have been consistently found, and females are approximately twice as likely as males to have mood disorders34.

Numerous studies have shown that health and mortality outcomes for married persons are better than those for single persons35, especially among men36. In a recent study investigating the relationship between marriage and quality of life, single men were found to have a worse quality of life than married men, whereas single women were found to have a better quality of life than married, separated or divorced women37. However, data concerning marital status and anxiety are controversial. A higher anxiety level was observed in single individuals than in married persons among patients with epilepsy38. However, the level of anxiety was similar in married and single patients receiving palliative radiotherapy39. In the present study, we did not observe a relationship between the anxiety score and relationship status.

In the present study, the anxiety score was also associated with educational level. Participants with higher education levels had lower anxiety scores than those with primary and high school educations. A similar effect of education on anxiety has been demonstrated previously. The HUNT study showed that a higher educational level may protect against anxiety and depression40. Also, Cekirdekci and Bugan showed higher anxiety scores in patients with lower education levels in the population diagnosed with cardiac syndrome X41.

The place of residence also seems to influence the anxiety score. In the available literature, the prevalence of anxiety has been shown to be higher in North America than in European countries42. We observed the same effect of place of residence: in the present study, North American respondents had generally higher anxiety levels than respondents from Europe. On the other hand, taking into account the decision tree analysis and predictors, the highest anxiety level was identified in European females who were younger than 27.5 years.

In the present study, we also tried to establish similar relationships for depression scores, as mood disorders are highly prevalent in the global population, with prevalence ranging from 5.4 to 7.8%43. The available literature shows that the lifetime prevalence of depression in females is twice that in males44. Females with depression tend to have a younger age of onset45, longer duration46, more severe and recurrent episodes47 and lower quality of life48 than male patients. Education level has been associated with the risk of depression49. However, this relationship is not consistent, and some studies have shown that a lower education level is not related to a higher prevalence of major depression50.

In the present study, effects of gender and education level on the depression score were observed. The highest depression score was observed in females with a high school education or with an education level other than high school who were younger than 28.5 years. Taking into account the place of residence, depression scores were similar in North American and European respondents.

Another aspect studied in this manuscript was headache, as headache can be associated with somatization and is highly prevalent worldwide51. Numerous studies in the general population have consistently demonstrated that headache is more prevalent in women than in men52,53. The most important risk factors for headache include the overuse of acute migraine medication, ineffective acute treatment, obesity, depression, stressful life events, age, and low education level53. In the present study, females and subjects with lower education levels had higher MIDAS scores. As the effect of gender and education level on migraine has been previously described51,52, the results of this study are in line with the findings of previous studies. Taking into account the decision tree analysis, the highest MIDAS scores were observed in the group of non-single females with education levels other than higher education who were younger than 28.5 years.

Another aspect examined in the present study was potentially stress-related oral behaviors. To the best of our knowledge, this is the first study to investigate the prevalence of oral behaviors during the COVID-19 pandemic. Oral behaviors are frequently observed in the general population and can lead to serious clinical implications including temporomandibular disorders and orofacial pain54. The relationship between oral behaviors and temporomandibular disorder has been reported by several authors in children, adolescents and adults54,55,56,57. Oral parafunctions include teeth clenching, lip biting, thumb sucking, nail biting and other oral habits. Bruxism is the most common oral motor activity and is anticipated to be present in 31% of the general population58. Nail biting and holding objects in the mouth are other oral parafunctions observed frequently in children and adolescents59. Winocur et al.55 found that biting hard objects and nail biting were associated with tired jaws in adolescent females. Atsü demonstrated that TMD signs and symptoms were relatively more frequent in the adolescent female group (47.8%), and these results may be explained by biological differences, hormone levels and higher pain sensitivity in women59. In the present study, the effects of gender, age, place of residence, and education level on the OBC score were observed. Similar to the results regarding anxiety levels, younger females with lower education levels were in the highest risk group for parafunctional oral behavior.

The highest OBC scores were observed in females younger than 28.5 years.

The design of the study was thorough and enabled the authors to obtain results online in an easy way and in a short time period. It allowed us to define risk groups rapidly, which, in the future, may allow the establishment of precisely targeted risk groups and the provision of the necessary prophylactic or treatment measures to the highest-risk population, thereby preventing the development of mental disturbances during this and other global crises. It is very important to achieve scientific advances even when access to patients is difficult, and assistance cannot be provided to everyone. The present study is the first to consider potentially stress-related parafunctional oral behaviors, the occurrence of headaches and the prevalence of mental disorders during the COVID-19 pandemic. Moreover, this is the first study to highlight the risk factors for mental disorders and high-risk groups who could potentially develop mental disorders during global crises. The strength of the study is the fact that it was conducted on a large and representative group of respondents and compared residents in two different continents: North America and Europe. The questionnaires used in the study were validated, established and highly specific tools. The obtained results are novel, interesting and clinically useful.

Despite its novelty and many strengths, this study is not without limitations. First, the study was performed as an online survey, which, despite widespread access to the Internet, could have shaped or limited the study group. This form of data collection is expected to be more accessible to younger respondents, as the elderly may be less proficient with technology. This could introduce bias and affect the reliability of the data. Additionally, the fact that the survey was conducted in English was a limitation, especially for residents of Europe.

The present study demonstrated the levels of anxiety, depression, headache, and oral behaviors during the COVID-19 pandemic in both North American and European residents. For the first time, we have also shown increased levels of oral parafunctional habits during the COVID-19 pandemic, which may result in an increased prevalence of orofacial pain and temporomandibular disorders in the future. Therefore, health care systems should be prepared for more patients with mental disorders, headache, orofacial pain and temporomandibular disorders during the current pandemic and future global crises. The results obtained in this study facilitated the identification of the group at highest risk for the mentioned secondary effects of the pandemic. This group was composed of females younger than 28.5 years old, especially those who were single, less well educated and living in Europe. These results indicate the need to perform further research in this population. Determining this risk group may allow the implementation of screening tests and the faster implementation of preventive and treatment measures, with the aim of reducing the long-term negative effects of this and future global crises. Because performing screening tests and gaining access to large populations can be very difficult in almost every crisis, the authors' clinical recommendation based on the present findings is to screen for psycho-emotional disturbances and somatization first in the defined highest-risk groups. This will allow faster detection of people presenting disturbing symptoms and faster and more accurate implementation of interventions.

## Methods

This study was conducted in accordance with the principles of the Declaration of Helsinki. The study was approved by the Ethics Committee of Wroclaw Medical University in Poland (ID: KB-302/2020). All the study participants provided informed consent before being included in the study.

### Data collection procedure

To collect the data, the authors created an online questionnaire on the Google Original Platform (Google Forms) because (1) it is the platform with which they have the greatest experience and (2) according to them, it is the most user-friendly for both researchers and respondents. The questionnaire was posted on Reddit, an American social news aggregation platform that also allows users to be involved in discussions. The authors posted links to the questionnaire on several Reddit pages called “subreddits”, including local American and European forums as well as SARS-CoV-2-related forums. Reaching out to these internet communities and explaining why such data are important enabled rapid data collection: in 3 days, from March 22nd to March 25th, 2020, the authors collected 1642 answers. It is worth mentioning that the Redditors (as Reddit users call themselves) who took part in the survey were satisfied with their participation, and many of them decided to share the link to the questionnaire with their families or friends. The authors chose to post the questionnaire on Reddit because they noticed that subreddits were thriving in the first quarter of 2020; for example, according to subredditstats.com, a website with statistical data about subreddits, /r/Coronavirus, which is currently the largest SARS-CoV-2-related subreddit, was created on the 20th of January 2020. It had 6 subscribers that day, but by the end of April 2020, it had grown to have more than two million subscribers. The questionnaire was anonymous, and the authors did not collect any data that allowed them to identify the respondents.

### Questionnaires

#### Author sociodemographic survey

The sociodemographic portion of the survey asked basic questions about gender, age, place of residence (name of country), marital status (in a relationship, married or single), education level (primary school, high school or college or higher), and existing medical conditions. As educational level information could vary between countries due to differences between their independent educational systems, we tried to list all the possible universal types/levels of education: primary school, high school, college graduate, higher education (professional or post-graduate level). Then, for the purposes of statistical analysis, we grouped the possible answers into 3 categories: primary education (primary school), secondary education (high school or college graduate) and higher education (professional or post-graduate level). This accounted for the differences between the educational systems of individual countries.

#### Hospital anxiety and depression scale (HADS)

The HADS is a widely used self-assessment of anxiety and depressive symptoms, focusing mostly on the cognitive and psychological aspects60,61,62. Somatic concerns and physical symptoms are not assessed by this scale. It is commonly used in general medical populations as well as in healthy populations63. The psychometric properties, including the internal consistency, discriminatory ability, validity and test–retest correlations, are considered satisfactory; thus, the HADS is one of the most commonly used self-assessment questionnaires for anxiety/depression symptom screening62.

The HADS consists of a total of 14 items in 2 separate subscales: anxiety (HADS-A) and depression (HADS-D), each of which includes 7 items. All items were scored by the participant using a Likert scale (4 points, from 0 to 3 points). The total score varies from 0 to 42 points, and both subscale scores vary from 0 to 21 points.

The originally recommended cutoff scores for the subscales were as follows: a score from 0 to 7 indicates a noncase, a score from 8 to 10 indicates a possible case, and a score from 11 to 21 indicates a probable case63. Currently, the categorization system includes more groups: 0–7, normal; 8–10, mild; 11–15, moderate; and ≥ 16, severe61.

In this study, scores of 11 or more were considered to indicate a “high risk of anxiety/depression”, according to the cutoff values described above.
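The HADS scoring rules described above can be expressed as a short sketch. The function names are hypothetical, and a score of 16 is placed in the severe band so that the four bands cover the whole 0–21 range.

```python
HADS_HIGH_RISK_CUTOFF = 11  # cutoff used in this study for either subscale


def score_hads_subscale(item_scores):
    """Sum the 7 item responses (each 0-3) of one subscale into a 0-21 score."""
    if len(item_scores) != 7:
        raise ValueError("each HADS subscale has 7 items")
    if any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("each item is scored from 0 to 3")
    return sum(item_scores)


def hads_category(subscale_score):
    """Map a subscale score to the four-band categorization in the text."""
    if not 0 <= subscale_score <= 21:
        raise ValueError("subscale scores range from 0 to 21")
    if subscale_score <= 7:
        return "normal"
    if subscale_score <= 10:
        return "mild"
    if subscale_score <= 15:
        return "moderate"
    return "severe"  # 16-21; treating 16 as severe closes the gap in the bands


def is_high_risk_hads(subscale_score):
    """True if the score meets the study's high-risk cutoff of 11 or more."""
    return subscale_score >= HADS_HIGH_RISK_CUTOFF
```

For example, item responses of `[3, 2, 1, 0, 3, 2, 1]` sum to 12, which falls in the moderate band and meets the high-risk cutoff. The HADS total score is the sum of the two subscale scores.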

#### Migraine disability assessment (MIDAS)

The MIDAS is a short, 5-item tool designed for the rapid assessment of the consequences of migraine for a patient, focusing on time lost (in terms of lack of productivity) due to the headache. The patient indicates the number of days with significant disability due to migraines during the last 3 months before the assessment. The score is obtained by summing days mentioned in the responses to the 5 items, and this total score is classified in one of four clinical groups: little or no disability (0–5 days), mild disability (6–10 days), moderate disability (11–20 days) and severe disability (21 days or more)64,65.

All properties, including the internal consistency, test–retest correlations and validity, are considered satisfactory and have been confirmed in several studies65,66.

In this study, scores of 21 or more were considered to indicate a “high risk of headache” with a significant impact of those headaches on daily functioning.
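As a minimal sketch of the MIDAS scoring described above (function names are hypothetical), the total is the sum of the days reported for the 5 items, and the grade follows the four clinical groups:

```python
MIDAS_HIGH_RISK_CUTOFF = 21  # cutoff used in this study


def midas_score(days_per_item):
    """Sum the days reported for each of the 5 MIDAS items."""
    if len(days_per_item) != 5:
        raise ValueError("MIDAS has 5 items")
    if any(d < 0 for d in days_per_item):
        raise ValueError("days cannot be negative")
    return sum(days_per_item)


def midas_grade(score):
    """Map a total MIDAS score to the four clinical groups in the text."""
    if score < 0:
        raise ValueError("score cannot be negative")
    if score <= 5:
        return "little or no disability"
    if score <= 10:
        return "mild disability"
    if score <= 20:
        return "moderate disability"
    return "severe disability"


def is_high_risk_headache(score):
    """True if the score meets the study's cutoff of 21 or more."""
    return score >= MIDAS_HIGH_RISK_CUTOFF
```

With this mapping, the study's high-risk cutoff coincides with the lower bound of the severe disability group.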

#### Oral behavior checklist (OBC)

The OBC is a self-assessment tool designed for the evaluation of the frequency of different oral behaviors during the day or at night. It consists of 21 items, out of which 2 refer to night-time behaviors, while the rest refer to daily oral function. For each item, a participant provides an answer describing the frequency of this behavior: during the night (how many nights in a week such behavior appears) or during the day (none of the time/a little of the time/some of the time/most of the time/all of the time). For each item, a score of 0–4 points is assigned, yielding a total sum in the range from 0 to 84 points. The score is interpreted as follows: 0—no risk of parafunctional oral activity, 1–24—low risk of parafunctional oral activity, 25–84—high risk of parafunctional oral activity.

During the design of the study, the internal consistency, test–retest correlations and validity were found to be good, and the OBC is the tool most commonly used for the assessment of oral behaviors67.

In this study, a score of 25 points or more was used as a cutoff value for a high risk of parafunctional oral activity.
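The OBC scoring can be illustrated in the same way. The sketch below assumes each of the 21 responses has already been coded as an integer from 0 to 4; the function names are hypothetical.

```python
def obc_total(item_scores):
    """Sum 21 OBC item scores, each coded 0-4, into a 0-84 total."""
    if len(item_scores) != 21:
        raise ValueError("the OBC has 21 items")
    if any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("each OBC item is scored from 0 to 4")
    return sum(item_scores)


def obc_risk(total: int) -> str:
    """Interpret an OBC total using the cutoffs given in the text."""
    if total == 0:
        return "no risk"
    if total <= 24:
        return "low risk"
    return "high risk"  # 25-84: the study's high-risk cutoff
```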

### Target group definitions

Given that the HADS does not include questions on somatic concerns, we defined several target “high-risk” groups by combining a score indicating a high risk of mental health issues (HADS-D/A) with a score indicating a high risk of somatic/physical issues (the MIDAS or OBC). In this way, 4 different high-risk target groups were established:

• anxiety and headaches (HADS-A score of 11 or more AND MIDAS score of 21 or more);

• anxiety and oral parafunctional activity (HADS-A score of 11 or more AND OBC score of 25 or more);

• depression and headaches (HADS-D score of 11 or more AND MIDAS score of 21 or more);

• depression and oral parafunctional activity (HADS-D score of 11 or more AND OBC score of 25 or more).

Based on the definitions of these high-risk target groups, in the third stage of the research, we defined the group at highest risk of negative effects and applied a classification algorithm to predict if the likelihood of belonging to such a group was associated with any basic characteristics (age, gender, place of residence, relationship status and education level).

In this way, the possible combinations of the risks of two different mental disorders with the risks of two different somatic/physical manifestations were exhaustively analyzed. It is noteworthy that the use of the HADS does not allow any diagnosis of depressive or anxiety disorders; it only indicates a high risk of anxiety/depressive symptoms62 and suggests that professional/institutional help should be administered.

### Statistical analysis

We analyzed our data in three stages. First, we analyzed the background characteristics of the entire sample. When our data did not satisfy the assumptions of standard parametric analyses, we used either nonparametric tests or relevant alternatives to classic parametric methods. Accordingly, Spearman's rank correlation coefficients were used to assess the relationships between nonnormally distributed continuous variables (participant age and MIDAS, HADS and OBC scores). To test for significant differences between groups of respondents defined by categorical variables (i.e., respondent gender, place of residence, relationship status and education level), we used either classic one-way ANOVA (if variances in the compared groups were homoscedastic) or Welch's one-way ANOVA (if group variances were heteroscedastic). We report the results from Welch's ANOVA analyses with the appropriate remarks. If Welch's ANOVA indicated statistically significant results and more than two groups were analyzed, post hoc pairwise comparisons were performed with the Games-Howell test. Post hoc pairwise comparisons for the classic one-way ANOVA were conducted with the Tukey test.
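The first-stage workflow might be sketched as follows. This is not the authors' code: the data are synthetic, Levene's test is assumed as the check for homoscedasticity (the text does not name one), `welch_anova` is a hypothetical helper that implements the standard Welch formula, and the post hoc tests are omitted.

```python
import numpy as np
from scipy import stats


def welch_anova(*groups):
    """Welch's one-way ANOVA; returns (F, df1, df2, p-value)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])
    w = n / variances                         # precision weights
    grand_mean = np.sum(w * means) / np.sum(w)
    lam = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    numerator = np.sum(w * (means - grand_mean) ** 2) / (k - 1)
    denominator = 1 + 2 * (k - 2) * lam / (k**2 - 1)
    f_stat = numerator / denominator
    df1, df2 = k - 1, (k**2 - 1) / (3 * lam)
    return f_stat, df1, df2, stats.f.sf(f_stat, df1, df2)


rng = np.random.default_rng(0)
# Synthetic stand-ins for participant age, a questionnaire score and
# the scores of three groups defined by a categorical variable.
age = rng.integers(18, 80, size=90)
score = rng.poisson(8, size=90)
groups = [rng.normal(10, 2, 30), rng.normal(11, 6, 30), rng.normal(12, 3, 30)]

# Spearman's rank correlation for nonnormally distributed variables.
rho, p_rho = stats.spearmanr(age, score)

# Choose the ANOVA variant according to the variance check.
_, p_levene = stats.levene(*groups)
if p_levene < 0.05:
    f_stat, df1, df2, p_anova = welch_anova(*groups)   # heteroscedastic
else:
    f_stat, p_anova = stats.f_oneway(*groups)          # homoscedastic
```

With two groups, the Welch statistic reduces to the square of Welch's t statistic, which is a convenient sanity check on the helper.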

### Decision tree-based analysis

In the second stage, we generated a regression decision tree to identify the multidimensional dependencies among all characteristics identified in the first stage (age, gender, place of residence, relationship status and education level) and the analyzed values (MIDAS, HADS and OBC).

The primary goal of using decision trees is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features. A decision tree is one of many predictive modeling methods. The clear advantage of the decision tree model over other methods (linear regression, support vector machines, artificial neural networks and others) is its graphical representation, which enables the straightforward interpretation of the rules explaining dependencies among variables. Unlike in the standard procedure, in which one subset of data is used to train the model and the second subset is used to validate its efficiency, here we built a model on the entire available dataset to visualize and discern the influence of the variables on the analyzed values. In this study, we used the Python programming language (version 3.7.4) and generated decision trees with the CART algorithm. We used the decision tree implementation provided in the scikit-learn library, one of the most popular machine learning libraries on GitHub (https://scikit-learn.org/).

The decision trees could be built only with numerical variables; therefore, the categorical features (gender, place of residence, relationship status and education level) required prior transformation and were encoded as a numeric array. We used the label binarization method to transform each categorical variable (LabelBinarizer method in sklearn.preprocessing, https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelBinarizer.html). If the variable had 2 possible values, e.g., gender (female, male), one result feature was generated. In the example of gender, the result would be 1 if the person is male and 0 if the person is female. When the initial variable has more options, e.g., education level (higher, college, high school), the result of label binarization is a set of features, one for each possible option, and each is assigned the value of 1 if the original row has that option and 0 if it has a different one.
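The encoding can be illustrated with a short sketch. The sample values are made up for illustration; only `LabelBinarizer` itself comes from the text.

```python
from sklearn.preprocessing import LabelBinarizer

# Made-up example rows, not study data.
gender = ["female", "male", "female"]
education = ["higher", "college", "high school", "higher"]

# Two possible values: a single 0/1 column is produced.
gender_encoder = LabelBinarizer()
gender_encoded = gender_encoder.fit_transform(gender)

# More than two values: one 0/1 column per option,
# ordered as in education_encoder.classes_.
education_encoder = LabelBinarizer()
education_encoded = education_encoder.fit_transform(education)

print(gender_encoder.classes_, gender_encoded.ravel())
print(education_encoder.classes_)
print(education_encoded)
```

The classes are sorted alphabetically, which is why "female" maps to 0 and "male" maps to 1.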

Although decision trees are a very informative and compelling method of data exploration and data mining68, they are not very common in the biomedical literature; therefore, in Fig. 5, a hypothetical decision tree is presented, and brief guidelines for the appropriate interpretation of the results obtained from the trees generated in our study are presented.

Each decision tree includes a root node, subnodes and leaf nodes. They are connected with branches. Nodes include the predicates (e.g., male ≤ 0.5), the size of a sample (samples), the predicted value on a current level (value) and the mean absolute error (mae). Prediction with a decision tree is performed by navigating down the tree through the logical results of the predicates until a leaf is reached69. If the logical test of the predicate results in “true”, we follow the left-hand option; otherwise, we follow the right-hand option. The predicted value is presented in a leaf node. To use the presented tree to predict the MIDAS score for a single (single = 1) woman (male = 0), we first evaluate male ≤ 0.5 (the predicate in the root node). In our case, the logical value “true” is returned. Then, we follow the “true” branch (left-hand side). The predicted value for females is 10.0. The next step in the prediction process is to evaluate single ≤ 0.5. In the given example (single = 1), the result is “false”, so we follow the right-hand side. The final prediction is 6.0. We also know that there are 126 observations of single females in the initial dataset. The indicated mae is high, so this prediction is likely inaccurate.
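The walk-through can be reproduced on a toy dataset. The sketch below uses made-up scores chosen so that the fitted tree mirrors the hypothetical one (10.0 for females, 6.0 for single females). The splitting criterion is left at the library default, which is an assumption: the mae values shown in the nodes suggest an absolute-error criterion, whose parameter name differs between scikit-learn versions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Columns: male (0/1), single (0/1). Values are illustrative only.
X = np.array([[0, 0], [0, 0], [0, 1], [0, 1],
              [1, 0], [1, 0], [1, 1], [1, 1]])
y = np.array([14.0, 14.0, 6.0, 6.0, 3.0, 3.0, 2.0, 2.0])

# As in the text, the tree is fitted on the whole dataset for exploration.
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# Text rendering of the predicates: the "<= 0.5" branch is the "true" branch.
print(export_text(tree, feature_names=["male", "single"]))

# Single woman: male = 0 (true at the root), single = 1 (false at the subnode).
prediction = tree.predict(np.array([[0, 1]]))[0]
print(prediction)
```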

The mae is a measure of the error between observations expressing the same phenomenon. It is calculated with the following formula:

$$MAE = \sum_{i = 1}^{n} \frac{\left| y_{i} - x_{i} \right|}{n}$$

where $y_i$ is the prediction, $x_i$ is the actual value and $n$ is the sample size.
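As a worked example of the mae, assume two predictions of 10 and 6 with actual values of 12 and 3; the result is (2 + 3) / 2 = 2.5. The helper name below is hypothetical.

```python
import numpy as np


def mean_absolute_error(predictions, actual):
    """Average absolute difference between predictions and actual values."""
    predictions = np.asarray(predictions, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.mean(np.abs(predictions - actual))


mae = mean_absolute_error([10, 6], [12, 3])
```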

Having generated a decision tree, we are able to evaluate the importance of each feature in the prediction process. Feature importance evaluation always pertains to a generated decision tree. To perform such an evaluation, the Gini importance score is calculated. Splits in a decision tree are determined by choosing the feature and splitting criterion that result in the greatest reduction in total impurity, which ultimately indicates the importance of that feature in the specific tree. A split that generates a large decrease in impurity is considered important; therefore, variables used to determine important splits are also considered important. Based on this idea, the importance for each variable X in terms of the reduction in impurity is computed as the sum of all the measures of the decrease in impurity at all nodes in the tree at which a split occurs based on X70.

When the Gini importance score is 1, it means that one feature is sufficient to predict the analyzed value. If it is 0, such a feature is not represented in a tree at all. The sum of all Gini scores for all features is 1. The higher the Gini score, the more informative (important) a feature is (the more influence it has on an analyzed value).
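The properties of the importance score can be checked on a toy tree. The sketch below uses made-up data with a third, constant column that can never be used in a split; `feature_importances_` is scikit-learn's normalized impurity-decrease importance, and the default splitting criterion is assumed.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Columns: male, single and an uninformative constant feature (illustrative).
X = np.array([[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 1, 0],
              [1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0]])
y = np.array([14.0, 14.0, 6.0, 6.0, 3.0, 3.0, 2.0, 2.0])

tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
importances = tree.feature_importances_

for name, value in zip(["male", "single", "constant"], importances):
    print(f"{name}: {value:.3f}")
```

Here the scores sum to 1, the feature used at the root receives the largest share, and the unused feature receives 0.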

### Validation of the prediction of the high-risk group

The third stage of the analysis was the validation of the accuracy of the prediction of the high-risk group and involved the application of the knowledge acquired in previous stages. The target group definition presented earlier was used in this stage.

The primary aim of identifying dependencies among all characteristics selected in the first stage (age, gender, place of residence, relationship status and education level) and the analyzed scales was to determine the most vulnerable individuals to enable the precise and efficient targeting of support. In this stage of the analysis, we validated the efficiency of the developed classification procedure. This was accomplished in several steps. First, we defined a rule assigning a person to the most vulnerable group according to the mentioned cutoff points for each of the analyzed measures. Consequently, the rule assigning a given person to the most vulnerable group was as follows:

• IF

• (HADS ANXIETY > 10 AND MIDAS > 20) OR

• (HADS ANXIETY > 10 AND OBC > 24) OR

• (HADS DEPRESSION > 10 AND MIDAS > 20) OR

• (HADS DEPRESSION > 10 AND OBC > 24)

• THEN

• EXPOSED = True

• ELSE

• EXPOSED = False

This rule was based on the 4 different high-risk target groups defined earlier.
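The rule can be written out as a short sketch that also returns the four underlying target groups. The function and key names are hypothetical; the thresholds are those given above.

```python
def target_groups(hads_anxiety, hads_depression, midas, obc):
    """Return membership flags for the four high-risk target groups."""
    anxiety = hads_anxiety > 10        # HADS-A score of 11 or more
    depression = hads_depression > 10  # HADS-D score of 11 or more
    headache = midas > 20              # MIDAS score of 21 or more
    parafunction = obc > 24            # OBC score of 25 or more
    return {
        "anxiety_headache": anxiety and headache,
        "anxiety_parafunction": anxiety and parafunction,
        "depression_headache": depression and headache,
        "depression_parafunction": depression and parafunction,
    }


def is_exposed(hads_anxiety, hads_depression, midas, obc):
    """EXPOSED is True if the person belongs to any of the four groups."""
    return any(target_groups(hads_anxiety, hads_depression, midas, obc).values())
```

For instance, `is_exposed(12, 3, 5, 30)` is true through the anxiety and oral parafunctional activity group, whereas high HADS scores alone are not sufficient.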

Having identified this group, we assumed that relevant interventions should be delivered separately to the entire populations of North America and Europe. However, the efficiency of such an approach is questionable since a great deal of effort will probably be devoted to diagnosing and providing help to people who do not actually need it. The efficiency of such an approach is calculated as the percent of the population that truly needs the help.

The next step involved building a classifier using part of the initial dataset (the so-called training set) to train the model. The training set was 60% of the initial dataset. During the training process, we fed the algorithm the basic characteristics of the respondents: gender, relationship status, and education level. The target variable was whether an individual belonged to the most vulnerable group.

The goal of the classifier is to assign a person who has not been diagnosed to one of two groups (the most vulnerable group or the rest of the population), knowing only the mentioned basic characteristics and lacking an actual diagnosis.

After the classifier is trained, its efficiency should be verified. Classification quality is calculated based on the remaining part of the initial dataset that was not used in the training process. The test set was composed of the remaining 40% of respondents. During testing, classification was performed on that 40% of the samples. The classifier predicts the probability that respondents belong to the high-risk group. Assuming that help is delivered primarily to the group predicted by the classifier as the most vulnerable persons, we calculated the efficiency and compared it to the initial situation where we assumed the delivery of interventions to the entire population. We assumed that support should be provided to 1/3 of the population due to resource-related limitations, which is why the results were calculated for the group of respondents for whom the classifier predicted the highest probability of a high level of risk.

The classifiers were created separately for North America and Europe and were validated in the corresponding sets.
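The validation procedure might be sketched as follows. This is an illustration under several assumptions and not the authors' pipeline: the data are synthetic, the three binary features stand in for the encoded basic characteristics, and a decision tree classifier with an arbitrary depth is assumed because the text does not name the classification algorithm.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 600
# Hypothetical encoded features: male, single, higher_education.
X = rng.integers(0, 2, size=(n, 3))
# Synthetic EXPOSED labels whose probability depends on the features.
p_exposed = 0.1 + 0.3 * (X[:, 0] == 0) + 0.2 * X[:, 1]
exposed = rng.random(n) < p_exposed

# 60% of respondents for training, the remaining 40% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, exposed, test_size=0.4, random_state=0
)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Predicted probability of belonging to the high-risk group.
proba = clf.predict_proba(X_test)[:, 1]

# Support is limited to 1/3 of the population: take the highest probabilities.
k = len(y_test) // 3
top_third = np.argsort(-proba)[:k]

baseline_efficiency = y_test.mean()             # help delivered to everyone
targeted_efficiency = y_test[top_third].mean()  # help delivered to the top third
print(f"baseline: {baseline_efficiency:.2f}, targeted: {targeted_efficiency:.2f}")
```

Efficiency is expressed here as the share of helped respondents who truly belong to the most vulnerable group, so the two printed values correspond to the untargeted and the classifier-targeted scenarios.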