Introduction

The face is arguably the most important part of the human body, playing a significant role in determining an individual’s attractiveness1. Facial deformities or disfigurements can have severe psychological consequences2. Individuals affected by facial nerve palsy, in particular, can suffer greatly from such facial deformities. These patients may experience impairments across various domains of their quality of life, including diminished self-esteem, psychological and social challenges, as well as difficulties in performing daily activities3,4,5. In order to fully assess these impairments clinically, it is crucial not only to rely on objective measurement methods but also to consider the subjective perception of patients, particularly their self-perception of their appearance. Since the development and validation of disease-specific patient-reported outcome measures (PROMs), significant progress has been made, and these measures have been implemented in many clinical settings in the field of otorhinolaryngology6. For patients with facial palsy, the questionnaires best validated for this purpose to date, the facial disability index (FDI) and the facial clinimetric evaluation (FaCE) scale, have been widely used in clinical routines for several years5. A limitation of both questionnaires is the assessment of self-perception of appearance. Since it has been shown that patients with facial palsy may have a higher risk of developing body dysmorphic disorder, it is even more important to capture the appearance from the patient’s point of view7. In this context, the patient-reported outcome instrument FACE-Q Paralysis questionnaire, which was developed by Klassen et al. in 2020, stands out as a comprehensive tool for capturing the patient’s perspective in the areas of Appearance, Facial Function, Health-related Quality of Life, and Adverse Effects8,9. The FACE-Q Paralysis is part of the more comprehensive FACE-Q Craniofacial questionnaire, which is an outcome measure designed for patients with visible and/or functional facial distinction9. The knowledge gap left by the established questionnaires FDI and FaCE is the assessment of appearance from the patient’s point of view. This study attempts to fill that gap for clinical use in German-speaking patients. The focus of the FACE-Q Paralysis on the Appearance domain (five out of 16 subdomains) makes it a valuable addition to the existing validated PROMs for patients with facial palsy. The original version of the questionnaire is available in English and has been translated into multiple languages, but not yet into German8,10.

In the present study, our goal was therefore to validate a German version of the FACE-Q Craniofacial, of which the FACE-Q Paralysis forms one part, among patients with facial palsy, by comparison with the FDI and FaCE questionnaire. In addition, the aim was to investigate possible independent predictors that could influence the response to the FACE-Q Paralysis questionnaire.

Materials and methods

This prospective observational study was performed at the Department of Otorhinolaryngology, Jena University Hospital, Jena, Germany. Approval for the study was obtained through the local institutional ethics review board, the Ethics committee of the Friedrich-Schiller-University, Jena, Germany (No. 2022-2695-Bef). Written informed consent was obtained from all study participants, and/or their legal guardians/caregivers. All experimental procedures with human subjects followed the institutional research committee’s ethical standards and the 1964 Helsinki Declaration and its later amendments.

Translation of the FACE-Q craniofacial module and of the FACE-Q paralysis module

The FACE-Q Craniofacial questionnaire is a PROM instrument intended for patients with visible and/or functional facial differences between the ages of 8 and 29 years. The questionnaire consists of four domains representing appearance, function, health-related quality of life (HRQOL) and adverse effects. Each domain is composed of several subdomains, which consist of multiple independently functioning scales9,11. While the Craniofacial module is designed to address facial differences overall, the FACE-Q Paralysis module, which is part of the larger FACE-Q Craniofacial, specifically focuses on patients with facial paralysis and has no age limitation. The FACE-Q Paralysis questionnaire consists of the same four domains as the FACE-Q Craniofacial, each with several subdomains. These domains include Appearance (includes subdomains Eyes, Face, Forehead, Lips, Smile), Facial Function (includes subdomains Breathing, Eating/Drinking, Eyes, Face, Speech), Health-related quality of life (includes subdomains Appearance Distress, Psychological, Social, Speech Distress) and Adverse Effects (includes subdomains Eyes, Face). The questionnaire thus includes 16 subdomains made up of a total of 146 questions. Each FACE-Q scale comprises multiple items that are rated on a 3- to 4-point Likert scale. Depending on the subdomain, the patient could choose between various response options on the Likert scale (e.g. Not at all, a little bit, quite a bit, very much). The raw scores of each scale are converted into a range from 0 (worst) to 100 (best) based on the findings of Rasch analysis12. Exceptions are the subdomains Eye Function, Eye Adverse Effects and Face Adverse Effects, which are checklists for identifying problems experienced by the patients. These checklists cannot be converted based on Rasch analysis because the sets of items may not function together statistically8,10.

For this study, the entire FACE-Q Craniofacial was translated first, but validation in German was done for the FACE-Q Paralysis. A German version of the FACE-Q Craniofacial questionnaire was produced from the original English version. The translation process and cross-cultural adaptation followed the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) guidelines13 as requested by the Q-Portfolio team. Two separate forward translations were performed by native German speakers who were fluent in English. Based on their translations, a reconciled version was agreed on. A backward translation was done by a native English speaker. The original English version and the backward translation of the questionnaire were then compared by the Q-Portfolio Team at McMaster University (Hamilton, Canada) and The Hospital for Sick Children (Toronto, Canada), publisher of the original questionnaire. A pilot study was performed to test the comprehensibility of the German version. Six patients, who were fluent in German, were recruited from the outpatient clinic of the Facial-Nerve-Center, Jena University Hospital, Jena, Germany. After completing the questionnaire, these patients were interviewed to identify any potential difficulties in comprehension and gather suggestions for improving the translation. Any issues that these patients raised, such as single terms or filler words that hindered comprehension, were corrected before finalizing the questionnaire for use in this study. Each of the six patients was able to complete the questionnaire and answer the associated comprehension questions. The German version of the questionnaire can now be requested via the website of the Q-Portfolio Team (https://qportfolio.org/face-q/paralysis/).

The other PROMs: facial disability index (FDI) and facial clinimetric evaluation (FaCE)

In addition to the FACE-Q Paralysis, the two other questionnaires of the survey, the facial disability index (FDI) and the facial clinimetric evaluation (FaCE), have already been validated in German14. The FDI consists of 10 questions with Likert-scale response options, subdivided into two parts: Physical Function and Social/Well-being Function. The Physical Function scale ranges from − 25 (worst) to 100 (best), while the Social/Well-being Function scale ranges from 0 (worst) to 100 (best)15. The FaCE consists of 15 questions with 5-point-Likert scale responses, subdivided into six domains: Facial Movement, Facial Comfort, Oral Function, Eye Comfort, Lacrimal Control, and Social Function. Each scale ranges from 0 (worst) to 100 (best) and a total score is obtained16.

Patient selection and survey

Selection criteria for the study were fluent German-speaking patients over 8 years of age presenting to the Facial-Nerve-Center in Jena between 2018 and 2022 who were diagnosed with facial nerve disorders. Individuals with both acute and chronic facial nerve palsy were included. Furthermore, the study also invited patients who had already recovered from their facial palsy, thus representing individuals with milder symptoms. The survey consisted of 22 pages, including a one-page cover letter to the patients, one page consisting of seven questions on personal data and the three questionnaires FACE-Q Paralysis, FDI and FaCE (total of 20 pages). A total of 800 patients were contacted by mail between November 2022 and February 2023, of whom 214 patients (response rate: 27%) participated in the survey and returned at least one of the paper-based questionnaires by mail. Inclusion criteria were defined as follows: an age of at least 8 years and completion of at least one of the three questionnaires. For the FDI and FaCE questionnaires, full completion was mandatory, while for the FACE-Q questionnaire a minimum of 50% completion for each subdomain was necessary. Missing items were then filled in with the most common response given within the respective domain, following the provided instructions for use9. One patient had to be excluded because he was under the age of 8 years. By reading the instructions and the information on the data protection policy, and by completing the questionnaires, each patient gave their consent to the collection and processing of their data.
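The completion rule described above can be illustrated with a short sketch. It is not taken from the FACE-Q instructions for use; the function name `impute_scale`, the use of `None` for an unanswered item, and the tie-breaking rule (lowest response option when two answers are equally common) are assumptions made only for illustration.

```python
from collections import Counter
from typing import List, Optional, Sequence


def impute_scale(
    responses: Sequence[Optional[int]], min_fraction: float = 0.5
) -> Optional[List[int]]:
    """Fill unanswered items (None) with the most common given answer.

    Returns None when fewer than `min_fraction` of the items were answered,
    i.e. when the scale would not be scored at all.
    """
    answered = [r for r in responses if r is not None]
    if not responses or len(answered) < min_fraction * len(responses):
        return None

    counts = Counter(answered)
    highest_count = max(counts.values())
    # Assumption (not stated in the source): ties between equally common
    # answers are resolved by taking the lowest response option.
    fill_value = min(v for v, c in counts.items() if c == highest_count)
    return [fill_value if r is None else r for r in responses]
```

For example, `impute_scale([1, None, 3, 3])` fills the gap with 3, whereas a scale with only one of four items answered is returned as `None` and would be treated as missing.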

Statistical analyses

All statistical calculations were performed with IBM SPSS Statistics (Version 29.0; IBM Corp., USA). Unless otherwise stated, descriptive data are presented as mean, standard deviation (SD), median and range, and relative data as percentages. In order to assess the internal consistency of the questions within the domains of the German version of the FACE-Q, Cronbach’s alpha coefficient was calculated and the 95% confidence interval is provided. Generally, a Cronbach’s alpha coefficient value above 0.7 is considered to indicate acceptable internal consistency17. Spearman’s rank correlation coefficients (Spearman’s Rho) were calculated to assess the correlations between items within the FACE-Q and against FDI and FaCE. It is commonly accepted that a correlation coefficient of 0.30–0.59 represents a fair correlation, 0.60–0.79 represents a moderate correlation and values exceeding 0.8 indicate a very strong correlation18. Nominal p-values for two-sided testing were used, with a significance level set at p < 0.05. In order to determine which clinical parameters exhibited a statistically significant impact (p < 0.05) on the FACE-Q results, a univariate analysis was performed using the non-parametric Mann–Whitney U Test. This involved dichotomizing the clinical parameters, such as dividing age into two groups above and below the respective median. For multiple significant results in the univariate analyses, the corresponding parameters were subsequently tested for their influence using multiple binary regression analysis. In each case, the regression coefficient B, 95% confidence interval, standard error and significance p are presented. Subdomains that had no significant effect in the univariate analyses and those that were significant only for the two parameters physical therapy and surgery were not considered in the further multivariate analyses. Normality tests were performed for all scales of each questionnaire. Skewness, kurtosis and test values for the Shapiro–Wilk test were reported. Maximum likelihood factor analysis was performed on all 16 items of the FACE-Q questionnaire. The Kaiser–Meyer–Olkin value (KMO > 0.5) and Bartlett’s test of sphericity (p < 0.05) were used to show the appropriateness of the values for factor analysis19. The number of factors was determined and presented as a rotated factor matrix.
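The two reliability and correlation measures named above can be sketched in a few lines. The study itself used SPSS; the following standard-library Python version is only an illustration of the textbook formulas, with hypothetical function names. The label "below fair" and the treatment of a coefficient of exactly 0.80 as "very strong" are assumptions, since the cited bands leave these cases open.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; items[i][j] is respondent j's answer to item i."""
    k = len(items)
    if k < 2 or len({len(col) for col in items}) != 1:
        raise ValueError("need at least two items of equal length")
    totals = [sum(answers) for answers in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / (sx * sy)


def describe_rho(rho: float) -> str:
    """Map |rho| to the bands quoted in the text."""
    r = abs(rho)
    if r >= 0.80:  # assumption: exactly 0.80 counts as very strong
        return "very strong"
    if r >= 0.60:
        return "moderate"
    if r >= 0.30:
        return "fair"
    return "below fair"  # assumption: the text gives no name for this band
```

Applied to the coefficients reported later, `describe_rho(0.836)` gives "very strong" and `describe_rho(0.518)` gives "fair". Confidence intervals and p-values are deliberately left out of this sketch.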

Results

Patients’ characteristics

Table 1 presents the characteristics of the 213 included patients. The median age of the participants was 57 years. More female than male patients were included (61.5%). The most frequent etiology was an idiopathic facial palsy (44.6% of patients). The duration between initial diagnosis of facial nerve paresis and survey was 72.0 ± 75.8 months (range 1 to 560 months). Therefore, acute and chronic palsy were represented. Over half of the respondents had received physical therapy at some point (59.2%) and a similar proportion had participated in a facial palsy training (58.2%).

Table 1 Characteristics for the participants with facial nerve palsy.

Questionnaires: FACE-Q, FDI and FaCE

The results of the FACE-Q, as well as the FDI and FaCE questionnaires, are presented in Fig. 1. The lowest mean FACE-Q subdomain score was observed for Appearance Smile with a mean score of 35.7 ± 27.7, while the lowest mean score within the three checklists was found for Eye Function (21.4 ± 5.1). For the remaining subdomains Eyes, Face, Forehead and Lips of the domain Appearance, mean scores between 48.9 ± 21.2 and 59.3 ± 19.3 could be determined. It can be observed that, on average, the respondents reported the highest level of impairment in the domain of Appearance. The best results could be reached for the subdomains Eating/Drinking (79.9 ± 23.3) and Speech Distress (79.2 ± 21.0) associated with the domain Facial Function. For FDI, a lower mean score was found for the domain Social/Well-being Function (69.8 ± 20.5) compared to the domain Physical Function (75.0 ± 19.1). The lowest mean score for FaCE was found for the domain Facial Movement (52.2 ± 30.1), while the highest score was found for the domain Social Function (81.6 ± 23.2).

Figure 1 Results of the FACE-Q, FDI and FaCE questionnaires. All domains are presented with mean values and standard deviation.

Internal consistency of the three questionnaires

The internal consistency of the three questionnaires is shown in Supplementary Table 1. Cronbach’s alpha values for FACE-Q were ≥ 0.771 for 12 scales. Only the subdomain Breathing showed a lower Cronbach’s alpha of 0.609. The internal consistency of the FDI and FaCE questionnaires both showed high Cronbach’s alpha values of ≥ 0.759 for FaCE and ≥ 0.791 for FDI. Therefore, the internal consistencies for both questionnaires are similar to those shown in the original German validation of the FDI (≥ 0.835) and FaCE (≥ 0.667)14.

Correlation between the three questionnaires and within the FACE-Q

Supplementary Table 2 shows the correlation between FACE-Q, FDI, and FaCE. The correlation from FACE-Q to both questionnaires varied greatly. For FDI, correlations ranged from rho = 0.316 to rho = 0.758. Best correlations of the domain Physical Function existed with rho > 0.630 to the FACE-Q subdomains Eating/Drinking (rho = 0.758; p < 0.001) and Facial Function (rho = 0.743; p < 0.001). In the domain Social Function, the best correlation existed to the FACE-Q subdomain Social Function (rho = 0.655; p < 0.001). For FaCE, correlations ranged from rho = 0.203 to rho = 0.828. The domains Total Score, Oral Function and Facial Movement showed the best correlation to FACE-Q. These domains correlated to the FACE-Q subdomains Facial Function (rho = 0.828; p < 0.001), Eating/Drinking (rho = 0.787; p < 0.001) and Facial Function (rho = 0.786; p < 0.001), respectively. On average, correlations from FACE-Q to FaCE (rho = 0.501) were slightly better compared to FDI (rho = 0.495; except for one correlation, all p < 0.001). Correlations within the questionnaire FACE-Q are shown in Supplementary Tables 3–5. Correlations ranged from rho = 0.150 (correlation Breathing to Eyes) to rho = 0.836 (correlation Smile to Face). The average for FACE-Q was rho = 0.518 (all p < 0.025).

Univariate analysis of associations between clinical parameters and FACE-Q subdomains

The results of the univariate analysis indicate that 10 of 12 variables had a statistically significant univariate impact on the outcomes of individual subdomains of the FACE-Q (all p < 0.05; cf. Table 2). Variables that affected any of the subdomains were: sex, age, duration, idiopathic cause, neoplastic cause, postoperative cause, drug treated, participation in facial mimic training, physical therapy and surgery. Participation in physical therapy was significantly associated with most subdomains (all subdomains except Breathing). Patients who received physical therapy had lower mean scores (range, 20.1 ± 4.9 to 76.7 ± 21.5), indicating more impairments within these subdomains, than patients who did not (range, 23.3 ± 4.8 to 82.8 ± 19.7). Neither of the variables inflammatory cause (yes/no) and any therapy (yes/no) reached significance in any subdomain (all p > 0.05). The subdomain Breathing is the only subdomain that was not associated with any of the variables studied (all p > 0.05).
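The univariate procedure behind these results (a median split of a clinical parameter followed by a Mann–Whitney U comparison of the scores) can be sketched as follows. This is an illustrative stand-in for the SPSS routine, not the authors' code. The function names are hypothetical, and placing values equal to the median in the lower group is an assumption based on the later description of patients "aged 57 and younger". Only the U statistics are computed; the p-value is omitted.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_at_median(
    predictor: Sequence[float], outcome: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Split outcome scores by whether the predictor is <= or > its median."""
    cut = median(predictor)
    # Assumption: values equal to the median belong to the lower group.
    low = [o for p, o in zip(predictor, outcome) if p <= cut]
    high = [o for p, o in zip(predictor, outcome) if p > cut]
    return low, high


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (U_a, U_b) by pair counting; ties contribute one half."""
    u_a = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )
    return u_a, len(a) * len(b) - u_a
```

Pair counting is quadratic in the group sizes but is equivalent to the rank-sum formula and keeps the sketch short. For a cohort of about 200 patients this is negligible.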

Table 2 Associations between the clinical parameters and FACE-Q outcomes.

Multivariate analysis of independent associations between clinical parameters and FACE-Q subdomains

Tables 3, 4 and 5 present the results of the multivariate analysis. Three models were calculated, with the first model (Table 3) including univariate significant clinical parameters, the second model (Table 4) including physical therapy and the third model (Table 5) including surgery in addition. Significant associations between clinical parameters and FACE-Q domains were found in the linear regression models. The interval between facial palsy onset and the survey as well as idiopathic cause were significantly associated with the subdomain Appearance Face, with a longer interval having a negative effect and an idiopathic cause having a positive effect on the results. For each additional month that the patient was affected by facial palsy, the Rasch score decreased by 8.6 points (95% CI 2.09–15.18; p = 0.010), i.e. the longer the interval since onset, the worse the FACE-Q score. Patients with idiopathic cause scored 8.79 points higher (95% CI 2.19–15.37; p = 0.009), i.e. better, than patients with known etiology. Patients with a shorter duration (< 55 months) scored higher compared to patients with a longer duration (> 55 months) in six domains: Appearance Eyes, Appearance Face, Appearance Smile, Appearance Distress, Eye Function, Facial Function. An idiopathic cause of facial nerve palsy was significantly associated with higher scores in all domains except the subdomains Speech, Breathing and Speech Function. Gender was independently related to three domains, including Appearance Lips, Appearance Distress, and Eating/Drinking. In these domains, males obtained higher scores compared to females. Age was only associated with the domain Eating/Drinking, with patients aged 57 and younger achieving higher scores than older patients.

Table 3 Multivariate model 1: independent associations between clinical parameters and the domains of the FACE-Q.
Table 4 Multivariate model 2: Independent associations between clinical parameters including also physical therapy and the domains of the FACE-Q.
Table 5 Multivariate model 3: independent associations between clinical parameters including also surgery and the domains of the FACE-Q.

Additionally, physical therapy was included in another multivariate model (Table 4). It emerged as an independent predictor for the subdomains of Appearance (Eyes, Face, Forehead, Lips, Smile), as well as Appearance Distress, Psychological Function, Eating/Drinking and Eye Function. Patients who participated in physical therapy showed lower scores, indicating more impairment, than those who did not undergo physical therapy. The strongest effect was seen in the subdomain Facial Function, where patients who participated in physical therapy scored 12.89 points lower (95% CI 6.47–19.32; p < 0.001) than those who did not. Even taking these effects into account, the parameter idiopathic cause remained an independent predictor for all mentioned FACE-Q domains. Similarly, when the parameter surgery was added in a third model (Table 5), it was found to be independently and significantly associated with subdomains Appearance Face, Appearance Eyes, Appearance Smile, Appearance Distress, Psychological Function, Social Function, Eating/Drinking and Facial Function. Patients who did not undergo surgery displayed higher scores than those who received surgery. The strongest effect was seen in the subdomain Eating/Drinking, where patients who underwent surgery scored 12.63 points lower (95% CI 5.41–19.86; p = 0.001) than those who did not.

Factor analysis

Supplementary Table 6 shows the normality tests for the questionnaire scales. The KMO = 0.889 confirmed the suitability for factor analysis. Bartlett’s test of sphericity (χ² = 2645.98; p < 0.001) showed significantly high correlations between items for maximum likelihood factor analysis (MLFA). Three factors in combination were able to explain 65.72% of the variance. The scree plot justified keeping three factors. Supplementary Table 7 shows the factor loadings after rotation. Based on the given original structure of the FACE-Q (main domains), the items that cluster on the same factor suggest that factor 1 represents appearance, factor 2 function and factor 3 quality of life. We performed the same analysis twice, once including all items and once excluding the two checklists (Eye Adverse Effect, Face Adverse Effect). Both analyses gave very similar results (KMO = 0.897; χ² = 2389.57; p < 0.001) and, importantly, both yielded three factors.

Discussion

Patient-reported outcome (PROM) instruments can provide valuable information about patients’ subjective quality of life. Patients with facial palsy often suffer from the disease for several months or even life-long, and their overall quality of life is impaired20. The results of the PROMs can be used clinically with these patients to tailor therapy to their individual needs and improve their quality of life. To ensure reliable use, an instrument must be validated and its reliability demonstrated in the target language and cultural context13.

The present study showed that the translated German version of FACE-Q Paralysis has good to excellent consistency, as described for the original English version8. No difficulties were encountered in the translation process due to cultural differences. Patients had no difficulty in clearly understanding individual questions in German. The German version showed good to excellent internal validity, except for the subdomain Breathing. Cronbach’s alpha ranged from 0.77 to 0.97. For the original English version, the values ranged from 0.78 to 0.96. In both this study and the original version, the subdomain Breathing had a lower alpha of 0.61 and 0.71, respectively8. Thus, the values for the German version were in a similar range as those of the original English version. A possible reason for the lower internal consistency for this subdomain could be that it assesses many different aspects related to breathing, such as breathing while eating, sleeping or exercising. Thus, the construct measured may be highly diverse, resulting in a lower internal consistency21. The correlations within the FACE-Q Paralysis show higher intercorrelations within the scales of each domain than with other domains, as was also shown for the original English version8.

A major aim of this study was to investigate the domain Appearance, which is not covered by other validated PROMs such as the FDI and the FaCE5,15,16. On average, the domain Appearance consistently produced lower scores across all subdomains. The mean scores within the domain Appearance ranged from 37.5 to 59.3, while the mean scores of the other domains ranged from 59.5 to 79.9. The subdomain Smile had the lowest mean score of 37.5. According to previous research, the visibility of the teeth and the position of the upper lip are crucial predictive variables of attractiveness22. In patients with facial palsy, these aspects are often affected, as they are unable to achieve a meaningful excursion when smiling, even with maximum effort23. The patient’s perception of this altered smile can be confirmed by the outcome scores of the FACE-Q domain Smile. In contrast, the subdomain Forehead had the highest score of 59.3. The forehead is known to have a lower correlation with overall attractiveness than other facial features24. It is often less visibly affected by motor impairments and can be easily covered by hairstyles, hats or other accessories. While the appearance of the forehead may remain relatively unchanged, there may be limitations in the ability to furrow or raise the eyebrows. In general, the lower the score, the greater the impairment and the higher the level of distress about one’s appearance. Dissatisfaction with one’s appearance may contribute to lower self-esteem. This has already been shown in a previous study by Norris et al.25. It is important to clinically assess this aspect of self-perception at an early stage in order to offer psychological support to patients if needed.

To determine which parameters might predict FACE-Q Paralysis scores, we further investigated the influence of potentially contributing factors on the scores. Indeed, the following factors were found to be independent negative predictors for the PROM: longer interval since the onset of the palsy (> 55 months), female sex, age (> 57 years), physical therapy and surgery. On the other hand, an idiopathic cause of facial paralysis correlated positively with FACE-Q scores. In contrast, other etiologies and other adjuvant therapies did not significantly predict scores. In the domain Appearance, the presence of an idiopathic cause of facial palsy was unexpectedly a significant predictor in all subdomains, correlating with higher scores on the FACE-Q. The subdomains Appearance Distress, Psychological Function, Social Function, Eating/Drinking, Eye Function, Facial Function and Eye Adverse Effect were also positively influenced by the presence of an idiopathic cause. The median interval from the onset of the palsy to the survey in patients with idiopathic cause was 4 years. Given that idiopathic paresis has the best prognosis and therefore the highest likelihood of recovery, it may well be that many of the patients with idiopathic causes were already within the range of probable recovery and therefore suffered less disability26. In addition, patients with idiopathic facial palsy are more often affected by paresis than by paralysis. This could also lead to higher scores for this group27. Other etiologies were not found to be significant in this study.

Longer interval to the onset of the palsy (> 55 months) correlated with lower scores in the subdomains Appearance Eye, Face and Smile, as well as Appearance Distress, Eye Function and Facial Function. As all recruited patients presented to the Facial Nerve Centre, it can be assumed that patients with long-term problems, who are not affected by rapid recovery, were more likely to be included in this study28. Another reason for lower scores in the domain Appearance could be the shift in focus due to a longer interval to the onset of the paralysis. The focus on facial functions, which is mainly present in the acute phase, may be compensated by a longer interval or by a habituation effect, and be replaced by a focus on limitations in appearance28.

As expected from previous studies, longer interval to onset of the palsy was not shown to be a significant predictor in the subdomains Psychological and Social Function, among others29. These aspects of quality of life may have adapted over time30. Satisfaction with appearance, on the other hand, which is negatively affected by longer interval, may remain unchanged or even worsen over time.

The two variables physical therapy and surgery turned out to be independent predictors in several subdomains. Patients who had undergone physical therapy or surgery had lower scores, indicating greater impairment in these subdomains. Rather obviously, patients who are in need of such therapies tend to have greater severity of facial palsy. Reconstruction surgery is mainly performed in patients with facial paralysis, with much more severe facial dysfunction. This is reflected in lower scores. It would be interesting to carry out further research and compare the results of the FACE-Q before and after respective treatment to determine possible changes in quality of life due to the treatment.

The main limitations of the study are, on the one hand, the selection of the parameters considered as possible predictors and, on the other hand, the presence of selection bias. Since all included patients were recruited through a facial nerve center, it can be assumed that more severe cases, more chronic than acute cases and patients specifically seeking therapy to improve their impairments were included in this study than would be expected in a more representative sample of individuals affected by facial palsy28. A more comprehensive examination should be carried out to identify further possible factors influencing the questionnaire, such as comorbidity and current status of palsy. Furthermore, we cannot exclude selection bias due to the fact that 73% of the patients contacted did not answer at least one of the questionnaires. In addition, no objective assessment of facial nerve function was recorded. In a future survey, this could even be provided by the patients themselves, for example using the Sunnybrook grading, to investigate the relationship between subjective perception and a functional assessment31.

Conclusion

The German version of the FACE-Q paralysis module works well in adult patients with facial nerve palsy. We were able to identify predictors in our cohort for the different scales. Knowledge of these influencing factors can be useful for clinicians in order to reduce the psychological impact of facial nerve palsy and provide early supportive interventions in areas of individual importance to patients.