Abstract
Rating scales are the dominant tool for the quantitative assessment of mental health. They are often believed to have higher validity than language-based responses, even though language is the natural way of communicating mental states. Furthermore, it is unclear how difficulties in articulating emotions (alexithymia) affect the accuracy of language-based communication of emotions. We investigated whether narratives describing emotional states are more accurately classified by question-based computational language assessment (QCLA) than by commonly used rating scales, and how this is affected by alexithymia. In Phase 1, participants (N = 348) generated narratives describing events related to depression, anxiety, satisfaction, and harmony. In Phase 2, another set of participants summarized the emotions described in the Phase 1 narratives using five descriptive words and rating scales (PHQ-9, GAD-7, SWLS, and HILS). The words were quantified with a natural language processing model (latent semantic analysis, LSA) and classified with machine learning (multinomial logistic regression). The results showed that language-based responses can be more accurate than rating scales in classifying the emotional states. The degree of alexithymia did not influence the correctness of classification based on words or rating scales, suggesting that QCLA is not sensitive to the evaluator's alexithymia. However, narratives generated by people with high alexithymia were more difficult to classify than those generated by people with low alexithymia. These results suggest that the assessment of mental health may be improved by language-based responses analyzed with computational methods compared to currently used rating scales.
Introduction
Quantitative assessments of psychiatric conditions are typically based on standardized questionnaires. Major depressive disorder (MDD) and generalized anxiety disorder (GAD) are two leading psychiatric conditions1. Improving the accuracy of screening tools is paramount, as more accurate and faster assessment is associated with better treatment response and long-term outcomes2. Previous research has identified that the inability to recognize and describe one’s own emotions is strongly associated with depression and anxiety3. This trait is known as alexithymia and occurs in ~10% of the general population4. Individuals with alexithymia tend to lack the mental representations needed to experience emotions as identifiable and describable feelings, and thus may also experience them to a lesser extent5.
Rating scales are well-established assessment tools in current research. MDD and GAD are largely diagnosed and assessed through rating scales, where these scales are based on the DSM-56,7. The employment of these scales is effective as they allow symptoms of psychopathology to be quantified. Perhaps one of the biggest advantages of rating scales is that they allow clinicians to assess specific constructs, as well as the symptoms or perceived causes of a symptom. In general, rating scales can be advantageous as they are easy to utilize, efficient to apply, and simple in their processing and evaluation. However, rating scales also have disadvantages as they require participants to translate their mental states into one-dimensional answers that may not be relevant to the unique and complex mental states of the patients. This raises the question of whether rating scales should be the preferred method for the assessment of mental health8.
Expressing feelings and emotions with language may be more intuitive than numeric ratings, as language is the natural way to communicate mental states. Language-based statements are a source of greater inter-individual variation than rating scales. Psychometric properties of rating scales are often challenged, which also applies to scales measuring alexithymia9. Therefore, unidimensional surveys may not comprehensively assess the complexity of mental states. Rating scale statements such as “I am sad” may be too rigid and lack sufficient nuance to create an accurate image of one’s feelings and emotions8. Following this, language-based measurements may have better validity than rating scales in assessing emotional states, and thus serve as a good addition to current practice.
Verbal language has been shown to allow individuals to communicate their inner thoughts and feelings to others10. Similarly, open-ended language-based questions permit the assessment of mental states without priming participants with the to-be-measured construct8. Recent research has shown that computational models that create multidimensional quantifications of text (e.g., latent semantic analysis, LSA) can be used to assess depression and anxiety, either independently or jointly with rating scales8.
Emotional states such as anxiety and depression, or harmony and satisfaction, are closely related concepts, which typically show high correlations using rating scales. However, combining the assessment of all four constructs may allow for a more complex assessment of MDD and GAD8. By evaluating these closely related concepts together, a fuller picture of an individual’s emotional landscape could be gained. This integrated approach helps in identifying the complex interplay of positive and negative emotional states, enhancing the accuracy of diagnosis and potentially leading to more tailored and effective treatment strategies. Measuring both negative and positive emotional states provides a more comprehensive assessment of the patient’s emotional functioning than assessing negative states alone. Life satisfaction is strongly associated with the absence of depression and anxiety11. Similarly, harmony acts as a buffer against depression and anxiety12. Furthermore, MDD and GAD show high comorbidity, with up to 75% of individuals suffering from a depressive disorder also having a lifetime comorbid anxiety disorder13. The DSM-5 separates MDD and GAD; however, the assessment of MDD also includes symptoms of GAD14.
The construct of alexithymia was introduced by Nemiah and Sifneos15. People who are high on this trait have deficits in the cognitive processing and regulation of emotions16, as well as less organized and differentiated emotional schemata17. This entails a lower ability to make accurate interpretations of their own and others’ emotions. Alexithymia is also connected to a deficit in language expression and reception16.
Multiple Code Theory18 posits that different types of information, including emotions, are processed using different coding systems. It offers a framework for understanding some of the challenges faced by individuals with alexithymia. Multiple Code Theory suggests that emotional information is one of the types of information that the brain encodes and processes using a specific code. In individuals with alexithymia, there may be a disruption or deficiency in this emotional coding system. This disruption could lead to difficulties in recognizing and labeling emotions, as these individuals may not process emotional information as effectively as those without alexithymia19. Language-based assessment of emotional states involves writing a short text about an event connected to an emotional state. As alexithymia relates to individuals’ linguistic abilities, it may lower the accuracy of language-based assessments. Previous research on language-based computational assessment of mental health has shown high validity in a normal population8. As this and previous studies have not controlled for alexithymia, there is a risk that participants with high levels of alexithymia may not be a suitable group for computational language-based assessment of mental health. Wotschack and Klann-Delius20 investigated the conceptualization of emotions in alexithymia using an emotion identification task and showed that participants with high levels of alexithymia have difficulties in assessing anxiety or depression in themselves and others. Participants with high alexithymia scores produced fewer emotional words and synonyms than participants scoring low on alexithymia20. This suggests that alexithymia may be connected to a decreased and less varied vocabulary for conceptualizing and expressing emotions, which, in turn, may make it harder for rating scales to quantify depression or anxiety. However, research shows that the fundamental ability to name emotions might not be compromised in individuals with high levels of alexithymia. One study21 indicated that people with high levels of alexithymia were able to name emotions similarly to a control group without alexithymia. This suggests that while experiencing and processing emotional states might be impaired in these individuals, their ability to cognitively label emotions may remain intact. This observation aligns with the Referential Process19, which describes the transition from experiencing emotions on a bodily, subsymbolic level to articulating them symbolically through language. While alexithymic individuals may cognitively recognize emotions, they face challenges in connecting these labels to actual emotional experiences and bodily sensations. Additionally, the same study indicated that alexithymic individuals exhibit pronounced difficulties with emotional empathy, a component of emotional processing that demands a deeper understanding and sharing of emotions, as opposed to more straightforward cognitive empathy tasks such as emotion labeling. These findings highlight that although basic emotion naming is not necessarily impaired, the richer language and emotional openness required for nuanced expression and empathetic communication are limited in alexithymia.
Consequently, despite possessing the capability for basic affect labeling, individuals with alexithymia may struggle to express emotions in texts. A literature review found that alexithymia was associated with lower verbal expressiveness in 14 out of 15 studies10. Furthermore, individuals with high alexithymia scores are less open about their emotions compared to participants with low alexithymia scores22. Individuals with higher levels of alexithymia also tend to rate more extreme emotions, such as anger and fear, as less intense compared to those with lower alexithymia scores. This diminished emotional sensitivity suggests that individuals with high alexithymia levels require greater emotional intensity to perceive certain emotions accurately23,24. Given these findings, it is unclear whether language-based measures apply to individuals with alexithymia.
Previous studies of question-based computational language assessment (QCLA) of narratives have shown a high correlation with rating scales25. Here we investigated whether QCLA has higher accuracy in the classification of emotional constructs in narratives compared to rating scales, and to what extent accuracy depends on low or high alexithymia. Based on the previous research, the following hypotheses were formed:
H1: Computational assessment of language-based responses is more accurate in classifying emotional narratives compared to rating scales.
H2: Emotional narratives written by participants with high, compared to low, alexithymia scores are more difficult to classify with language-based computational methods.
H3: Language-based computationally assisted classification of emotional narratives is less accurate when the word or rating scale assessments are made by participants with high, compared to low, alexithymia scores.
Results
Classification
The percentage of correct classifications of emotional states was significantly higher, and almost twice as large, when based on word responses (62%) than when based on rating scales (33%) (χ²(1, N = 231) = 19.48, p < 0.0001, φ = 0.29; Table 1 and Fig. 1). The baseline probability for correct classification (i.e., the likelihood of guessing) is 25%. Basing the categorization on the individual items of the four rating scales (26 items in total, i.e., 9 items for PHQ-9, 7 for GAD-7, and 5 each for SWLS and HILS) did not change the categorization accuracy (30%) compared to using the summed scores (χ²(1, N = 231) = 0.24, p = 0.6236, φ = 0.03). There was no significant difference between using only word responses and combining rating scales with word responses. The percentage of correct classification divided into the four emotions is shown in Table 2.
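To make this type of accuracy comparison concrete, the minimal Python sketch below tests the difference between two classification accuracies with a chi-square statistic and a phi effect size. The counts and the use of scipy.stats.chi2_contingency are illustrative assumptions only; the paper does not specify the exact test implementation, and these are not the study's data.

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts of correct/incorrect classifications (placeholder data):
# row 1 = word-based classification, row 2 = rating-scale-based classification.
table = np.array([[143, 88],    # roughly 62% correct
                  [76, 155]])   # roughly 33% correct
chi2, p, dof, expected = chi2_contingency(table, correction=False)
phi = np.sqrt(chi2 / table.sum())  # effect size for a 2x2 table
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.2g}, phi = {phi:.2f}")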
The confusion matrix for the Phase 2 data is found in Table 3. This table shows that the rating scales (upper part of the table) have more errors than the word measures (lower part of the table).
Various classification measures are found in Table 4, including accuracy, precision, specificity, sensitivity, and F1 scores.
Test–retest of narratives in Phases 1 and 2
A multiple linear regression was conducted to predict the empirical rating scale scores. The results showed significant Pearson correlations between the empirical ratings and the estimated values, ranging from r = 0.33 to r = 0.62 (r(pred) in Table 5). The Pearson correlations between the Phase 1 and Phase 2 estimates based on word responses were higher (0.48 ≤ r ≤ 0.77) for all emotions than the corresponding correlations for the rating scales (0.17 ≤ r ≤ 0.48).
Alexithymia of narrators in Phase 1
All participants were divided into low and high alexithymia based on PAQ scores using a median split (i.e., PAQ ≤ 68 and PAQ > 68, respectively). Participants in Phase 2 had a higher percentage of correct classifications when the evaluated narratives were written by participants (in Phase 1) with low compared to high alexithymia. This was true both when classification was based on words (68% versus 55% correct classification, t(226) = 2.03, P = 0.04, two-tailed) and when it was based on rating scales (39% versus 26% correct classification, t(226) = 2.10, P = 0.04, two-tailed) (Table 1).
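A minimal sketch of this kind of analysis is shown below, assuming a median split on the narrators' PAQ scores and an independent-samples t-test on per-evaluation correctness; all variable names and data are illustrative placeholders, not the study's data or exact procedure.

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
narrator_paq = rng.integers(24, 169, size=228)   # hypothetical PAQ totals of the narrators
correct = rng.integers(0, 2, size=228)           # 1 = narrative classified correctly

split = np.median(narrator_paq)                  # median split into low/high alexithymia
low = correct[narrator_paq <= split]
high = correct[narrator_paq > split]
t, p = ttest_ind(low, high)
print(f"t({len(low) + len(high) - 2}) = {t:.2f}, p = {p:.3f}")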
Alexithymia of evaluators in Phase 2
Participants with high and low PAQ scores in Phase 2 did not differ in the percentage of correct classifications. T-tests showed no significant differences, neither when classifications were based on words nor when they were based on rating scales (Table 1).
Mean measure of emotions divided into high and low alexithymia scores for Phases 1 and 2
The mean rating scores for each of the four rating scales were significantly higher for the participants with low PAQ scores, compared to those with high PAQ scores; however, no differences were found for the associated language-based estimates of these scores (Table 6).
Standard deviations measure of emotions divided into high and low alexithymia scores
The standard deviations of HILS and GAD-7 were larger for the participants with low PAQ scores compared to those with high scores, where the latter finding remained significant after Bonferroni correction for multiple comparisons. However, there were no differences in the word estimates of these scores (Table 7).
Emotional word clouds
Word clouds were generated to describe the words indicative of the four emotional states. Figure 2 shows the 25 words with the highest estimates of HILS (black), SWLS (red), PHQ-9 (blue), and GAD-7 (turquoise) scores, respectively. High PHQ-9 scores were associated with depression, sad, etc., whereas high GAD-7 scores were associated with anxious, worried, nervous, etc. Both HILS and SWLS were associated with happy and content, whereas the former was also associated with calm, peaceful, etc., and the latter with satisfied, grateful, etc. All aforementioned words have significantly higher estimates than the remaining words on the respective scales following Bonferroni correction for multiple comparisons. Font size is coded by word frequency in the dataset.
Word clouds of alexithymia and depression
Figure 3 shows four word clouds, where the leftmost are associated with low PAQ scores (≤68) and the rightmost with high PAQ scores (>68). The Pearson correlation between the PAQ estimated from words and the PAQ rating scale was r = 0.18 (P < 0.001). The upper clouds are associated with high PHQ-9 scores and the lower with low PHQ-9 scores, where the Pearson correlation between the PHQ-9 scores estimated from word responses and the PHQ-9 rating scale was r = 0.62 (P < 0.001).
Discussion
The present study aimed to investigate whether question-based computational analysis of language-based responses or rating-scale-based measures shows higher validity in classifying emotional narratives, and how this relates to alexithymia. The results indicate that language-based responses to emotional narratives can have higher accuracy than rating scales in distinguishing levels of depression, anxiety, satisfaction, and harmony. Participants with high alexithymia scores generated narratives that were more difficult to classify than those with low scores. However, the alexithymia score of the evaluators did not influence the probability of correct classification. The word-based measure had a higher test-retest correlation than the rating scales. Finally, the means and standard deviations of the emotion rating scales depended on the alexithymia score, whereas the corresponding semantic estimates did not.
The present study shows that question-based computational analysis of language-based responses classifies the four examined emotional states more accurately than rating scales. This finding goes beyond the results of Kjell et al.8, who showed reasonably high correlations with rating scales. Kjell et al.8 also showed higher validity of computational language-based responses than rating scales in the classification of emotional states in pictures of faces, when the emotional states were generated by actors. However, our study is the first to show higher classification accuracy based on language material generated by the participants themselves. Furthermore, the effect size in ref. 8 was rather small (i.e., an improvement of 5%, where language-based responses showed 83% correct classification compared to 79% for rating scales) compared to the current study (i.e., an improvement of almost 30%). In addition, the test–retest correlation between Phases 1 and 2 was considerably larger for language-based responses than for the rating scales for each of the four emotions.
Language-based measures also have other benefits than higher validity and reliability compared to rating scales. Perhaps most importantly, language is the natural way that people communicate their mental states, whereas numerical communication of these states rarely occurs in real-life situations. Empirical data shows that patients prefer communicating their feelings of depression using language as they find it to be more accurate, precise, and natural compared to rating scales26. In addition, one aspect where rating scales outperform language is the ease of usage and speed26. Furthermore, language-based communication of mental health allows for person-centered care, where the unique situation of patients can be communicated, whereas rating scales are unidimensional and do not allow free expressions.
Another aspect of language-based communication is that it allows for both assessment and treatment of mental health. It is well established that repetitive, expressive writing about self-experienced traumas improves mental health as measured by depression, anxiety, or post-traumatic stress disorder (PTSD)27. Combining the expressive writing phenomenon with the results from the current study indicates that prompted language-based responses can be used both to assess and to treat mental health.
Another aspect of language-based responses is that they can be used to visualize and define mental constructs using word clouds, which is not possible with rating scales. Figure 2 shows indicative words related to narratives of depression, anxiety, satisfaction, and harmony. Depression and anxiety clearly show different word patterns, where these words provide meaningful descriptions or definitions of the two mental states. Furthermore, they also clearly differ from words related to harmony and satisfaction. However, the word clouds for harmony and life satisfaction are rather similar.
Natural language processing methods have recently been significantly improved by deep neural network models based on attention mechanisms, known as transformers. For example, bidirectional encoder representations from transformers (BERT)28, which is perhaps the most widely cited transformer-based language model, has shown great performance on a number of language tasks. However, the main advantage of transformer-based models is that they capture the grammatical context present in free text, whereas the current data are based on context-free descriptive words, where non-contextual models such as LSA perform well. The current data were also analyzed using BERT; although the results were similar, it did not improve over the LSA model used.
The alexithymia scores did not influence how accurately the participants classified the emotions, neither based on rating scales nor on language-based responses. These results uphold the notion that simply being able to identify emotions by name does not automatically negate the possibility of alexithymia, since the disorder encompasses a deeper challenge in emotional consciousness and engagement beyond mere verbal identification21.
However, our results found that the participants with high alexithymia scores generated narratives that were more difficult to classify than those of the participants with low scores. This was true both when the classification was based on rating scales and on word responses. This is also in line with previous research showing that individuals with alexithymia often struggle with the Referential Process, which involves translating subsymbolic emotional experiences into symbolic, linguistic representations. According to Multiple Code Theory, this difficulty arises because those with alexithymia may not efficiently bridge the gap between the bodily, affective experience of emotions and their cognitive, verbal expression. Consequently, their emotional narratives lack the richness and clarity seen in individuals without alexithymia, making their emotional language more ambiguous and harder to categorize using standard linguistic and psychological tools19.
These results are consistent with previous findings showing that alexithymia involves a deficit in language expression16. However, our results do not support previous findings implying that individuals with alexithymia have difficulties taking the perspective of another person, which in turn would make the identification of other people’s emotions in a text more difficult29. One study found that individuals with alexithymia did not show more narrative engagement when reading a first-person story compared to a third-person story. However, individuals with low alexithymia scores showed more narrative engagement29. The authors argued that the reason for this effect is that individuals with alexithymia have difficulties in mentally simulating another person’s perspective. In our study, alexithymia was not found to have an effect on judging someone else’s emotional text written in the first-person perspective.
Furthermore, alexithymia is linked to mentalizing deficits, which relate to the inability to imagine other people’s mental states30. Therefore, it was expected that participants with high alexithymia scores would show significantly lower classification accuracy on the word measure than participants with low scores, which was not found in our data. The difference in correct predictions depending on whether alexithymia related to the participants generating the narratives or to those evaluating them may be understood by the fact that it is more difficult to generate than to evaluate emotional narratives20.
A limitation of the study is that it was conducted in a normal population. A pre-screening with a selection of participants with alexithymia scores above certain criteria might have generated stronger effects. Nevertheless, the prevalence of alexithymia in the general population is ~10%4.
Additionally, self-reports were used as a measure of alexithymia. While they may have biases, especially in self-evaluative capacities, which can be limited in those with alexithymia, they still offer valuable information on how people view their emotional abilities and challenges31. Using self-reports also allows for consistency with existing literature, facilitating comparison and contributing to the broader understanding of alexithymia9. Self-report measures are practical for large-scale data collection and are well validated in psychological research, making them suitable for studying alexithymia. They provide crucial insights into individuals’ self-perceived emotional experiences, which is especially relevant for alexithymia, characterized by difficulties in processing and communicating emotions. Despite potential limitations in capturing the full spectrum of emotional awareness, self-report data remain a useful and important aspect of psychological research on emotional processing disorders.
Furthermore, the study was conducted online, leaving less possibility to monitor and control how the participants conducted the task. To minimize this effect, we used control questions to detect participants who did not follow the instructions.
Finally, perhaps the biggest limitation is that the rating scales in Phase 2 were not filled in by multiple people for each text. This means that the inter-rater reliability cannot be assessed, and thus the overall reliability of the method cannot be fully assessed. In addition, the texts were not rated by clinicians, and further research would benefit from assessing the inter-rater reliability both on a general level as well as between the general population and clinicians.
Overall, as our study did not include clinicians rating the texts, the results cannot be generalized to clinical assessment. It is likely that clinicians who have had comprehensive training in psychopathological symptoms would have more success in rating the states correctly. Future research should aim to include clinicians for this reason as well.
Finally, it is likely that constructs that are incorporated into rating scales rarely come up in a spontaneous text. For example, anhedonia, while a central symptom of depression, is not something that one typically thinks to report when asked to write about an experience of depression. This means that the use of free text may not capture all symptoms of depression and anxiety. Therefore, the use of free text in this form should be limited to the triage of psychopathology at this stage, as it cannot replace the use of rating scales by a qualified clinician.
Improving the assessment of mental health with language-based methods has great potential. Future research should aim at computational assessment of language to improve the evaluation of mental health. More accurate, earlier, and more reliable measures of mental health may allow more specific treatment and increase the possibility of recovery and shorter sick leaves. Future research should also evaluate whether computational language-based methods are more accurate in the assessment of depression and anxiety according to the DSM-5 criteria in clinical settings. Language-based measurements could also be compared to methods other than self-report measures, such as psychological interviews, to improve the diagnostic process. Narratives generated for specific emotional states could offer insight into less common manifestations of depression and anxiety, which are missed when using traditional methods. Research into semantic methods is nascent, and its generalizability to different groups should be further examined.
Future research may investigate the validity of language-based responses in measuring other psychological constructs than depression, anxiety, satisfaction, and harmony. For instance, an interesting question would be if language-based responses may improve the validity of measuring the BIG-5 personality traits. Another possibility is to investigate whether language-based models are a valid method for the assessment of other complex disorders, such as personality disorders.
The present study indicates that computational classification of emotions based on word responses can be more accurate than classification based on rating scales. The classification accuracy did not depend on the evaluators’ alexithymia scores; however, people with high alexithymia scores generated narratives that were more difficult to classify. This suggests that computational assessment of language may be a valid method for classifying emotional states that is also applicable to individuals with alexithymia. Future research should investigate whether computationally assisted language-based responses improve the accuracy of mental health assessment in clinical settings.
Methods
Transparency and openness
We report how we determined our sample size, all data exclusions, and all measures in the study, and we follow JARS32. Data were analyzed using the online software Semantic Excel (semanticexcel.com; see also ref. 33) and IBM SPSS Statistics for Windows, Version 27. This study’s design and its analysis were not pre-registered.
Participants
The inclusion criteria were adult US residents with English as their first language and an age of 18 years or older. Participants gave written consent to participate in the study and were excluded from the analysis if they did not consent, did not complete the entire survey, or answered the control questions incorrectly. They were also excluded if they did not follow the instructions in their free-text or descriptive-word responses, for example, if the text contained only a single sentence or did not indicate one of the four emotional states (e.g., “Anxiety is a worthless emotion.”; “I don’t feel anxiety.”). Based on these criteria, 34 of a total of 150 participants were excluded in Phase 1. In Phase 2, after excluding 18 of 250 participants, 232 participants were included in the analysis.
Of the participants included in the analysis in Phase 1, 78 were female, 31 were male, five were nonbinary and two preferred not to say. The age range was from 20 to 73 years (M = 39.7, SD = 13.9). In Phase 2, there were 150 females, 73 males, seven nonbinary, and two preferred not to say. The age range was from 18 to 79 years (M = 32.66, SD = 11.15).
Materials
Symptoms of anxiety and depression were assessed with the Generalized Anxiety Disorder scale (GAD-7)34 with seven items such as “How often in the past two months have you felt nervous, anxious or on edge?” (α = 0.93) and the Patient Health Questionnaire (PHQ-9)6 with nine items such as “How often in the past two months have you been bothered by little pleasure or interest in doing things?” (α = 0.90). Both were rated on a scale with the response options not at all (0), several days (1), more than half the days (2), and nearly every day (3).
For measuring harmony, the Harmony in Life Scale (HILS)25 was used. It uses five items, such as “My lifestyle allows me to be in harmony” (α = 0.94). The Satisfaction with Life Scale (SWLS)35 contains five items, such as “In most ways, my life is close to my ideal” (α = 0.93).
For the assessment of alexithymia, the Perth Alexithymia Questionnaire (PAQ)17 was used. It consists of 24 items with questions such as “I tend to ignore how I feel” (α = 0.96). The PAQ, HILS, and SWLS were all rated on a 7-point Likert scale ranging from strongly disagree (1) to strongly agree (7).
The narratives that participants composed in Phase 1 (~5 sentences), describing a recent experience of depression, anxiety, satisfaction, or harmony within the last two months, were used as material to be evaluated by another set of participants in Phase 2. All participants were asked to write five descriptive words indicating the emotional state described in the narrative (see the Supplementary Information for the exact instructions). The semantic questions for depression, anxiety, life satisfaction, and harmony in life have been developed and validated in the general population by Kjell et al.8. The exact phrasing of these questions can be found in the Supplementary Information.
Procedure
Participants were recruited using Prolific for a compensation of £2 for Phase 1 and £1.25 for Phase 2 (based on a rate of £6/h). The study was conducted in English. Every participant gave consent at the beginning of the study. The survey was published on Qualtrics and was accessed via a link.
The study consisted of two phases. During Phase 1, the participants were asked to write one autobiographical text (~5 sentences) about a period of their life within the last two months during which they experienced one of the four emotional states: harmony, satisfaction, depression, or anxiety. They were randomly assigned to one emotional state and instructed not to include the exact word for the emotional state in their narrative (i.e., not to write harmony, satisfaction, depression, or anxiety). Participants were informed that the text they wrote would later be read by other participants. Thereafter, each participant was asked to write down five words describing the same emotional state as in their text. Additionally, they were instructed to fill out the rating scales HILS, SWLS, GAD-7, PHQ-9, and PAQ based on the emotions from the narrative. Finally, they were asked about their demographic information. Phase 1 took ~20 min to complete.
During Phase 2, another set of participants first read one of the texts about an emotional state composed by a participant from Phase 1. Then, participants were asked to describe the emotional state of the narratives using five descriptive words. Afterward, they filled out the same rating scales as stated in Phase 1 based on the text they had read to assess the author’s emotional state. Phase 2 took approximately 12 min to complete.
Data analysis
Creation of semantic representations
The descriptive words generated by the participants were quantified using latent semantic analysis (LSA), following the procedure by Kjell et al.8 (see also ref. 36). LSA generates a high-dimensional representation of words based on how they co-occur in a corpus. We selected a database consisting of words generated by participants in experimental conditions related to the five-word generation task in the current experiments (e.g., “Describe whether you are depressed/anxious/satisfied/in harmony with three/five/ten descriptive words”), taken largely from Kjell et al.8 but also from other related publications. This dataset consisted of 69,167 words in total and 6630 unique words, generated from 7088 responses to open-ended questions.
Based on this database, a word-by-word co-occurrence frequency table was generated, where the contexts were the word responses given by participants to a given question. The cells in the table were normalized by taking the logarithm of the frequency plus one. A data compression algorithm, singular value decomposition (SVD), was applied to this table, where the dimensions are ordered by the amount of variance accounted for in the original matrix. The first 300 dimensions were used. This resulted in a semantic representation where each word is represented by a high-dimensional vector, which was normalized to a length of one.
A semantic representation of the five words used to describe the narratives in the current study was generated by adding the associated word vectors from the semantic representation and normalizing the length of the resulting vector to one.
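The pipeline described above can be sketched in Python as follows. This is an illustrative approximation, not the authors' implementation: a toy corpus, a word-by-response count matrix, and scikit-learn's TruncatedSVD stand in for the actual corpus and software used in the study.

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD

# Toy stand-in for the 7088 open-ended word responses used to train the semantic space.
responses = [
    "sad lonely tired hopeless empty",
    "calm peaceful content grateful happy",
    "worried nervous anxious tense restless",
    "satisfied grateful happy fulfilled proud",
]

# Word-by-context frequency table, here approximated as words x responses.
vectorizer = CountVectorizer()
freq = vectorizer.fit_transform(responses).T.astype(float)
freq.data = np.log1p(freq.data)                      # normalize cells with log(count + 1)

# Compress with SVD; the study used the first 300 dimensions (2 here for the toy corpus).
svd = TruncatedSVD(n_components=2, random_state=0)
word_vectors = svd.fit_transform(freq)
word_vectors /= np.linalg.norm(word_vectors, axis=1, keepdims=True)  # unit-length word vectors
vocab = {w: i for i, w in enumerate(vectorizer.get_feature_names_out())}

def represent_response(words):
    # Semantic representation of a word response: sum the word vectors and renormalize.
    vec = np.sum([word_vectors[vocab[w]] for w in words if w in vocab], axis=0)
    return vec / np.linalg.norm(vec)

print(represent_response(["sad", "tired", "empty"]))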
Classifying the emotions of the narratives
Multinomial logistic regression was used to classify the participants’ responses into one of the four possible emotions. The word-based classifications used the semantic representation of the five words as input, and the rating-scale-based classifications used the total scores of the four rating scales. In addition, we also created a model based on the combination of the total scores of the rating scales and the semantic representations. The multinomial logistic regression was evaluated with a nested cross-validation procedure in which 10% of the data was left out37. A model was generated based on 90% of the data and then evaluated on the left-out 10%. To avoid overfitting, the grouping was made so that all responses to a specific narrative were always either in the training or the test dataset. This procedure was repeated ten times, so that classifications were made for all data points. In each training fold, a nested cross-validation procedure was used in which the number of leading semantic dimensions was optimized; the mean number of dimensions used was 33.8 with a standard deviation of 6.5. To increase the size of the training dataset, we included data from a related study with the same number of participants (N = 348)38 that used the same procedure as in this article; however, the data from that study did not include a measure of alexithymia. The training was conducted on data from Phases 1 and 2, but the evaluation of correct classification was only done on the Phase 2 data. Thus, the size of the training dataset was N = 732, whereas this article presents results for the data that include the alexithymia measure (N = 348).
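The classification step can be sketched as follows, assuming scikit-learn: GroupKFold keeps all responses to the same narrative in either the training or the test fold, and an inner loop selects the number of leading semantic dimensions. The data and the parameter grid below are placeholders, not the study's data or exact settings.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.normal(size=(n, d))               # semantic representations (placeholder)
y = rng.integers(0, 4, size=n)            # four emotional states
groups = rng.integers(0, 40, size=n)      # narrative identity for each response

dims_grid = [10, 20, 30, 40, 50]
correct = np.zeros(n, dtype=bool)

for train, test in GroupKFold(n_splits=10).split(X, y, groups):
    # Inner (nested) cross-validation: choose the number of leading dimensions.
    best_dims, best_acc = dims_grid[0], -1.0
    for dims in dims_grid:
        accs = []
        for tr, va in GroupKFold(n_splits=5).split(X[train], y[train], groups[train]):
            clf = LogisticRegression(max_iter=1000).fit(X[train][tr, :dims], y[train][tr])
            accs.append(clf.score(X[train][va, :dims], y[train][va]))
        if np.mean(accs) > best_acc:
            best_acc, best_dims = np.mean(accs), dims
    # Refit with the selected dimensionality and evaluate on the held-out narratives.
    clf = LogisticRegression(max_iter=1000).fit(X[train][:, :best_dims], y[train])
    correct[test] = clf.predict(X[test][:, :best_dims]) == y[test]

print(f"Proportion correctly classified: {correct.mean():.2f}")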
Word clouds were generated by the same multinomial logistic regression as described above. However, here the semantic representation was based on individual words.
Multiple linear regression was also conducted to predict empirical rating scales, where the number of dimensions was optimized using the same cross-validation and leave-out methods as the multinomial logistic regression.
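A minimal sketch of this regression step is shown below, assuming scikit-learn's LinearRegression and cross_val_predict; r(pred) is then the Pearson correlation between the cross-validated predictions and the empirical scale scores. The data are random placeholders, not the study's data.

import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(232, 30))            # semantic dimensions (placeholder)
phq9 = rng.integers(0, 28, size=232)      # empirical PHQ-9 sum scores (placeholder)

predicted = cross_val_predict(LinearRegression(), X, phq9, cv=10)
r, p = pearsonr(predicted, phq9)
print(f"r(pred) = {r:.2f}, p = {p:.3f}")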
Data availability
All data have been made publicly available at the Open Science Framework and can be accessed at https://osf.io/7ukf9/.
Code availability
The analysis conducted in this paper can be done in the online software Semantic Excel (www.semanticexcel.com; see also ref. 33). A high-level code can also be found at https://osf.io/7ukf9/.
References
Liu, Q. et al. Changes in the global burden of depression from 1990 to 2017: findings from the Global Burden of Disease study. J. Psychiatr. Res. 126, 134–140, (2020).
Kraus, C., Kadriu, B., Lanzenberger, R., Zarate, C. A. Jr. & Kasper, S. Prognosis and improved outcomes in major depression: a review. Transl. Psychiatry 9, 1–17 (2019).
Sagar, R., Talwar, S., Desai, G. & Chaturvedi, S. K. Relationship between alexithymia and depression: a narrative review. Indian J. Psychiatry 63, 127–133 (2021).
Hickman, L. The importance of assessing alexithymia in psychological research. PsyPAG Q. 111, 29–32 (2019).
Taylor, G. J., Bagby, R. M. & Parker, J. D. What’s in the name ‘alexithymia’? A commentary on “Affective agnosia: expansion of the alexithymia construct and a new opportunity to integrate and extend Freud’s legacy”. Neurosci. Biobehav. Rev. 68, 1006–1020 (2016).
Kroenke, K., Spitzer, R. L. & Williams, J. B. The PHQ-9: validity of a brief depression severity measure. J. Gen. Intern. Med. 16, 606–613 (2001).
Mordeno, I. G. et al. Development and validation of a DSM-5-based generalized anxiety disorder self-report Scale: investigating frequency and intensity rating differences. Curr. Psychol. 40, 5247–5255 (2021).
Kjell, O. N., Kjell, K., Garcia, D. & Sikström, S. Semantic measures: Using natural language processing to measure, differentiate, and describe psychological constructs. Psychol. Methods 24, 92 (2019).
Preece, D. A. et al. Do self-report measures of alexithymia measure alexithymia or general psychological distress? A factor analytic examination across five samples. Personal. Individ. Differ. 155, 109721 (2020).
Welding, C. & Samur, D. Language processing in alexithymia. In Alexithymia: Advances in Research, Theory, and Clinical Practice (eds Luminet, O., Bagby, R. M., Taylor, G. J.) 90-104 (Cambridge University Press, 2018).
Koivumaa-Honkanen, H., Kaprio, J., Honkanen, R., Viinamäki, H. & Koskenvuo, M. Life satisfaction and depression in a 15-year follow-up of healthy adults. Soc. Psychiatry Psychiatr. Epidemiol. 39, 994–999 (2004).
Carreno, D. F., Eisenbeck, N., Pérez-Escobar, J. A. & García-Montes, J. M. Inner harmony as an essential facet of well-being: a multinational study during the COVID-19 pandemic. Front. Psychol. 12, 911 (2021).
Lamers, F. et al. Comorbidity patterns of anxiety and depressive disorders in a large cohort study: the Netherlands Study of Depression and Anxiety (NESDA). J. Clin. Psychiatry 72, 3397 (2011).
Choi, K. W., Kim, Y. K. & Jeon, H. J. Comorbid anxiety and depression: clinical and conceptual consideration and transdiagnostic treatment. Anxiety Disord. 219–235 https://doi.org/10.1007/978-981-32-9705-0_14 (2020).
Nemiah, J. C., & Sifneos, P. E. Affects and fantasy in patients with psychosomatic disorders. In Modern Trends in Psychosomatic Medicine (ed. Hill, O.) (London: Butterworths 1970).
Luminet, O., Nielson, K. A. & Ridout, N. Cognitive-emotional processing in alexithymia: an integrative review. Cogn. Emot. 35, 449–487 (2021).
Preece, D., Becerra, R., Allan, A., Robinson, K. & Dandy, J. Establishing the theoretical components of alexithymia via factor analysis: introduction and validation of the attention-appraisal model of alexithymia. Personal. Individ. Differ. 119, 341–352 (2017).
Bucci, W. The multiple code theory and the psychoanalytic process: a framework for research. Annu. Psychoanal. 22, 239–259 (1994).
Di Trani, M., Mariani, R., Renzi, A., Greenman, P. S. & Solano, L. Alexithymia according to Bucci’s multiple code theory: a preliminary investigation with healthy and hypertensive individuals. Psychol. Psychother. 91, 232–247 (2018).
Wotschack, C. & Klann-Delius, G. Alexithymia and the conceptualization of emotions: a study of language use and semantic knowledge. J. Res. Personal. 47, 514–523 (2013).
Alkan Härtwig, E., Aust, S., Heekeren, H. R. & Heuser, I. No words for feelings? Not only for my own: diminished emotional empathic ability in alexithymia. Front. Behav. Neurosci. 14, 112 (2020).
Wagner, H. & Lee, V. Alexithymia and individual differences in emotional expression. J. Res. Personal. 42, 83–95 (2008).
Luminet, O., Nielson, K. A. & Ridout, N. Having no words for feelings: alexithymia as a fundamental personality dimension at the interface of cognition and emotion. Cogn. Emot. 35, 435–448 (2021).
Rigby, S. N., Jakobson, L. S., Pearson, P. M. & Stoesz, B. M. Alexithymia and the evaluation of emotionally valenced scenes. Front. Psychol. 11, 1820 (2020).
Kjell, O. N., Daukantaitė, D., Hefferon, K. & Sikström, S. The harmony in life scale complements the satisfaction with life scale: expanding the conceptualization of the cognitive component of subjective well-being. Soc. Indic. Res. 126, 893–919 (2016).
Sikström, S., Pålsson Höök, A. & Kjell, O. Precise language responses versus easy rating scales-comparing respondents’ views with clinicians’ belief of the respondent’s views. PLoS ONE 18, e0267995 (2023).
Pennebaker, J. W. Expressive writing in psychological science. Perspect. Psychol. Sci. 13, 226–229 (2018).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Vol. 1 (Long and Short Papers) 4171–4186 (Association for Computational Linguistics, 2019).
Samur, D., Luminet, O. & Koole, S. L. Alexithymia predicts lower reading frequency: the mediating roles of mentalising ability and reading attitude. Poetics 65, 1–11 (2017).
Samur, D., Tops, M., Slapšinskaitė, R. & Koole, S. L. Getting lost in a story: how narrative engagement emerges from narrative perspective and individual differences in alexithymia. Cogn. Emot. 35, 576–58 (2021).
Lumley, M. A., Neely, L. C. & Burger, A. J. The assessment of alexithymia in medical settings: implications for understanding and treating health problems. J. Personal. Assess. 89, 230–246 (2007).
Kazak, A. Journal article reporting standards. Am. Psychol. https://doi.org/10.1037/amp0000263 (2018).
Sikström, S., Kjell, O., & Kjell, K. SemanticExcel.com: an online software for statistical analyses of text data based on natural language processing. In Statistical Semantics: Methods and Applications (eds Sikström, S. & Garcia, D.) (Springer, 2020).
Spitzer, R. L., Kroenke, K., Williams, J. B. & Löwe, B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch. Intern. Med. 166, 1092–1097 (2006).
Diener, E., Emmons, R. A., Larsen, R. J. & Griffin, S. The satisfaction with life scale. J. Personal. Assess. 49, 71–75 (1985).
Landauer, T. K. & Dumais, S. T. A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. 104, 211–240 (1997).
Stone, M. Cross-validatory choice and assessment of statistical predictions. J. R. Stat. Soc. B (Methodol.) 36, 111–147 (1974).
Sikström, S., Valavičiūtė, I., & Kajonius, P. Five small words capture the big five: personality assessment using natural language processing (submitted).
Funding
Open access funding provided by Lund University.
Author information
Contributions
S.S.: conceptualization, methodology, software, formal analysis, investigation, project administration, supervision, writing, revision. M.I., J.A., S.N.: data collection, writing, visualization, revision. L.S.: revision. All authors have read and approved the submitted version of the manuscript. Each author believes that the manuscript represents honest work, and any discrepancies have been resolved through discussion and consensus among all authors.
Ethics declarations
Competing interests
The author, Sverker Sikström, was a shareholder and founder of Ablemind.co. All other authors declare no competing interests.
Ethics
The study was approved by the Swedish Ethical Review Authority (Dno. 2021-04627, Validering av semantiska mått).