Introduction

Major depressive disorder (MDD) and bipolar disorder (BD) are two prevalent mood disorders characterized by significant abnormalities in mood and emotional states [1]. However, the early identification of these disorders remains a challenging due to the lack of valuable biological markers. Many individuals with MDD are not diagnosed early, with only 47.3% being identified by general practitioners [2, 3]. Similarly, BD often presents as a depressive episode, leading to misdiagnosis in 25% of BD patients, with a substantial delay of 5-10 years before receiving the correct diagnosis [2, 4]. Extensive research has been conducted in various fields, including neuroimaging, biochemistry, genetics, and epigenetics, to identify biological markers for mood disorders [5,6,7,8]. However, the practical application of these markers is limited by factors such as high cost, poor feasibility, or inadequate specificity. In light of these challenges, electrodermal activity has emerged as a potential biomarker for mood disorders due to its non-invasiveness, ease of monitoring, affordability, and high sensitivity to the emotional perception [9,10,11].

Electrodermal activity is a sensitive physiological indicator of changes in sympathetic nervous system activity, reflecting the electrical phenomenon occurring in the sweat glands, dermis, and epidermis tissues innervated by the sympathetic nervous system [12, 13]. It provides valuable insights into physiological arousal and, due to its subconscious nature, offers and objective means to investigate emotions [14]. Skin conductance detection and skin potential (SP) detection are the methods employed to measure electrodermal activity. The former measures the changes in electrical resistance by applying a small external current to two skin-attached electrodes, while the latter directly measures the natural potential difference of the limb skin without any external current. Previous studies have demonstrated that patients with MDD exhibit a diminishing performance of skin conductance in response to repeated non-significant stimulus [15]. Another study found that incorporating skin conductance differences between resting and task states into a decision tree classification model achieved a 74% accuracy in distinguishing depression patients from healthy controls [16]. However, the measurement of skin conductance may cause discomfort due to the additional electrical stimulation, and its results may be influenced by physiological factors such as gender and age [17, 18].

On the other hand, SP has been suggested to provide unique psychological information. However, detecting SP signals can be challenging and complex due to variations in both positive and negative directions and the existence of monophasic, biphasic, and triphasic responses [19,20,21,22]. In this study, we have developed a wearable wireless device to measure SP signals between the middle finger and the left wrist. Preliminary experiments have demonstrated the accurate identification of four distinct emotional states (happiness, sadness, anger, fear), elicited by video stimuli using participants’ SP signals alone, achieving a 75% accuracy [23]. These findings suggest that SP signals hold promise as a viable approach for the objective assessment of emotions. Recent hypotheses propose that the SP difference may be attributed to a high concentration of superoxide radicals in the connective tissue, and elevated levels of oxidants, including superoxide free radicals and nitric oxide, have been implicated in oxidative stress and the pathogenesis/physiology of mood disorders [24,25,26]. Therefore, exploring the application of SP as a potential biomarker for the diagnosis of mood disorders is warranted.

To date, few studies have utilized machine learning models to differentiate patients with mood disorders from healthy controls using SP signals during task states. In this study, we recruited patients with major depressive disorder (MDD), bipolar depression (BPD), and healthy controls, and measured SP signals during various emotion-inducing tasks using a customized wearable device. Multiple discriminant models were constructed to distinguish patients from healthy controls by the difference in SP signals. Additionally, we explored the correlation between participants’ SP signals and blood indicators of oxidative stress (Fig. 1).

Fig. 1: Research steps to distinguish patients with mood disorders from healthy controls using machine-learning discriminant models based on task-state skin potential data.
figure 1

In the present study, hematological samples of healthy controls were not collected. A total of five discriminant models were constructed, and the SVM model differentiated groups with the highest classification accuracy. The accuracy of this model in distinguishing major depressive disorder from healthy controls was 78%, and the accuracy in correctly classifying participants into three groups was 59%. BPD, bipolar depressive disorder; MDD, major depressive disorder (unipolar depression); CON, healthy controls.

Methods

Participant

Seventy-seven patients with BPD and 53 patients with MDD were recruited in the First Affiliated Hospital of Zhejiang University School of Medicine, Zhejiang, China. A total of 79 healthy controls were recruited through word-of-mouth or recruitment advertisements at Zhejiang University and the local community. All participants were between the ages of 15 and 55 and had received at least a middle school education. Diagnosis of the disease was based on the Diagnostic and Statistical Manual of Mental Disorder (DSM-IV) criteria. The researchers conducted the Mini International Neuropsychiatric Interview (MINI) screening for all patients to confirm their diagnosis. Healthy participants were asked to confirm that there were no first-degree relatives with schizophrenia, bipolar disorder, or depression. Exclusion criteria for the patient group included: 1) meeting diagnostic criteria for schizophrenia and related spectrum disorders; 2) Young Mania Rating Scale score greater than 6; 3) history of severe head injury (loss of consciousness for more than 5 minutes), current or past epilepsy, intracranial hypertension, or other severe neurological disorders; 4) history of alcohol or substance abuse/dependence within the 6 months prior to testing.

The following scales were used to assess the severity of participants’ anxiety and depressive mood states, including the Beck Depression Inventory (BDI-II), the Beck Anxiety Inventory (BAI), the 17-item version of the Hamilton Depression Rating Scale (HAMD) and the Hamilton Anxiety Rating Scale (HAMA). Blood samples from the participants were collected to test the following indicators: cortisol levels, adrenocorticotropic hormone, uric acid, indirect bilirubin levels, direct bilirubin levels, and prealbumin. All participants were tested for cognitive function, and the assessment instruments included the Trail Making Test, the Symbol Coding Test, the Hopkins Verbal Learning Test-Revised (HVLT-R), the Neuropsychological Assessment Battery-mazes subtest (MAZES), the Stroop Color and Word Test, and the Continuous Performance Test-Identical Pair (CPT-IP). Please see Table S2 in the supplementary material for a detailed description of the cognitive functioning tests.

Ethical approval was granted by the Hospital Ethical Committee in The First Affiliated Hospital of Zhejiang University School of Medicine. All participants or their guardians agreed to participate in this study and signed a written informed consent form.

Apparatus and stimuli

The collection of skin potential data was completed using a customized professional device (model: CTP008) consisting of a small box that contains electronic components and two electrodes. The two electrodes were attached to the middle finger and left wrist of all participants. Please refer to our previous paper for more information about the signal acquisition principle of this device and its details [23]. The device collects SP signals at a sampling rate of 5 Hz. The data was transmitted via Bluetooth to the Mood Monitor Assistant app installed on a smartphone. The collection of skin potential data is synchronized with the presentation time of the stimuli. All participants were instructed to view six stimulus tasks in sequence, which consisted of pictures, videos, or text with specific emotional content. For a detailed description of these six tasks, please refer to Table S1 in the supplementary materials. They are: 1) free viewing task, 2) positive and negative emotion recognition task, 3) semantic stimulation task, 4) situational intervention task, 5) emotion induction task, and 6) text context stimulation task. Stimuli were displayed on a 24-inch monitor with a refresh rate of 60 Hz and were presented in the center of the screen. All participants viewed stimuli at 100 cm. The eyes are aligned with the upper third of the display screen. A camera was placed on the monitor to record the participants’ facial expression information. The researchers used the video data to confirm that the participants were viewing the content on the monitor during the experiment. (Figure S1)

Feature extraction

The data extraction and pre-processing procedures were conducted using Python 3.1. These procedures involved outlier removal, interpolation, and normalization. Each sample data was normalized separately based on specific tasks. Furthermore, to enhance the quality of the SP data, denoising techniques were applied. This involved utilizing a low-pass filtering module in the hardware as well as employing wavelet transform. Wavelet transform is a commonly used method in bio-signal analysis studies [23, 27].

In our study, a total of 18 feature values were derived for each task, including 10 time-domain features and 8 time-frequency domain features. Table 1 provides a detailed overview of these feature values. For the time-domain features, all characteristics, except for the maximum and minimum values of raw voltage, were computed after normalizing the raw SP signal. The computation of time-frequency domain feature energy values was performed using the PyWavelets library in Python. Wavelet packet decomposition, which encompasses dual time-frequency analysis, multi-resolution analysis, and arbitrary multi-scale transformation, which utilized to facilitate the time-frequency analysis of SP signals.

Table 1 Measures and definition of skin potential characteristics.

Data analysis

Basic statistical analysis was performed in R 4.2.2 (https://www.R-project.org/). One-way analysis of variance (ANOVA) was employed to analyze quantitative data, such as age and SP features. Post hoc multiple comparisons were performed using Bonferroni or Tamhane’s T2 correction for inter-group comparisons. Independent two-sample t-tests were utilized to compare two sets of quantitative data, such as HAMD scores. Chi-square tests or rank-sum tests were employed to compare qualitative data, including gender, education level, and clinical characteristics. Statistical significance was defined as a two-tailed p-value less than 0.05.

For discriminant analysis and model performance comparison, the scikit-learn machine learning library in Python was utilized. Several discriminant models, including K-Nearest Neighbor, Linear Discriminant Analysis, Support Vector Machine (SVM), Logistic Regression, and Gradient Boosting Decision Tree, were evaluated for their accuracy in distinguishing patients form healthy controls using SP data. The reported discriminant results in this study were obtained using the SVM model.

SVM is a machine learning algorithm that finds the optimal hyperplane to separate data points into different classes. It creates a decision boundary with the maximum possible margin from the sample points, thereby improving the generalization performance. The hyperparameter settings of the SVM model we set as follows: a penalty coefficient of 1.3, a kernel function parameter of 0.0085, and the Radial Basis Function as the choice of kernel function. To circumvent class imbalance, 50 participants were randomly selected from each group of BPD patients, depression patients, and healthy controls for inclusion in the discriminant model. Leave-one-out cross-validation was used to evaluate the performance of the machine learning model. Sensitivity, specificity, accuracy, F1 score, and area under the ROC were employed as performance metrics for the model evaluation.

Results

Demographic and clinical characteristics

A total of 77 patients with bipolar depression (BPD), 53 patients with major depressive disorder (MDD), and 79 healthy controls participated in this study. Detailed demographic and clinical characteristics are summarized in Table 2. The mean age of BPD patients was significantly lower than the other two groups (P < 0.01). Healthy controls had a significantly higher level of education compared to BPD and MDD patients (P < 0.01). There was no significant statistical difference in the Hamilton Depression Rating Scale (HAMD) and Hamilton Anxiety Rating Scale (HAMA) scores between BPD and MDD patients. Both BPD and MDD patients performed significantly worse than healthy controls in all six cognitive tests (P < 0.05), but there was no statistical difference between the two patient groups (P > 0.05).

Table 2 Demographic and clinical characteristics of patients with bipolar depressive disorder, major depressive disorder, and healthy controls.

The mean duration of BPD in this study was 3.9 years. At the start of the study, 2 BPD patients did not use any psychiatric-related medications, 32 patients used only 1 mood stabilizer, 34 patients used 2 mood stabilizers, and 9 patients used more than 2 mood stabilizers. The mean duration of MDD in the patients in this study was 3.7 years. Until the start of the study, 1 depressed patient did not use any psychiatric-related medication, 46 patients used only 1 antidepressant, and 6 patients used 2 or more antidepressants.

Comparison of SP characteristics under task conditions

The comparison results of SP characteristics in BPD, MDD, and healthy controls under different task conditions are summarized in Table 3 and Table S3-S8 (supplementary materials). No significant differences in SP characteristics were found among the three groups during the task of viewing textual stimuli. However, SP abnormalities were detected in MDD patients compared to healthy controls in all other tasks. Task-related SP abnormalities were also observed in BPD patients during the free viewing and emotion induction tasks. No significant differences in task-related SP were found between MDD and BPD patients in pairwise comparisons among the three groups.

Table 3 The task state skin potential characteristics with significant differences between three groups comparing patients with BPD, MDD, and healthy controls.

Predictive performance of the discriminant model

Fifty participants were randomly selected from each of the three groups (BPD, MDD, and healthy controls), and their SP data were used to construct disease discriminant models for separating patients from healthy controls. Five different models were developed to assess their discriminatory validity. In this study, a total of 18 SP features were detected for each task, and all 108 indicators for the 6 tasks were included in the discriminant model. The discriminant results of the five models are summarized in Table S9. Given support vector machine (SVM) model outperformed other models, the results of SVM will be primarily reported. (Table 4, Fig. 2)

Table 4 The confusion matrix of the support vector machine model for discriminant analysis of patients with BPD, MDD, and healthy controls based on skin potential characteristics.
Fig. 2: ROC curves of the support vector machine model discriminating bipolar depression, unipolar depression, and healthy controls based on skin potential characteristics.
figure 2

ROC receiver operating characteristic, AUC area under the curve, BPD bipolar depressive disorder, MDD major depressive disorder (unipolar depression); CON healthy controls.

The SVM model built with SP features achieved an accuracy of 78% (sensitivity 78%, specificity 82%) in distinguishing MDD patients from healthy controls. It achieved an accuracy of 65% (sensitivity 62%, specificity 62%) in classifying BPD patients and healthy controls. Moreover, the SVM model demonstrated a 69% accuracy (sensitivity 76%, specificity 62%) in differentiating MDD patients from BPD patients. When classifying BPD patients, MDD patients, and healthy controls into three groups, the SVM model achieved an accuracy of 59% (sensitivity 59%, specificity 79%). (Table 4, Fig. 2).

Correlation analysis

The correlation analysis results between SP features under task conditions and clinical features, cognitive performance, and hematological indicators are summarized in Figure S2 and Figure S3 in the supplementary material. Significant correlations were found between several SP features under different tasks and symptom scale scores of BPD and MDD patients. No significant correlations were detected between cognitive performance and SP features in either BPD or MDD patients. In BPD patients, significant positive correlations were observed between their prealbumin levels and SP time-frequency indicators under most tasks, included positive and negative emotion recognition task, semantic stimulation task, emotion induction task, and text context stimulation task (P < 0.05).

Furthermore, in MDD patients, significant positive correlations were found between hematological oxidative stress measures and skin conductance time-frequency indicators under tasks such as free viewing task, positive and negative emotion recognition task, emotion induction task, and text context stimulation task (P < 0.05).

Discussion

In this study, we investigated skin conductance abnormalities in patients with major depressive disorder (MDD) and bipolar depression (BPD) during emotion-inducing tasks and explored the feasibility of differentiating patients from healthy controls based on these abnormalities. We designed six different video, pictorial, and textual emotion-inducing tasks participants to view, and we used a self-designed device to collect SP data. A total of 18 analysis measures were extracted from the time and time-frequency domains for each task. Our findings revealed that an SVM discriminant model based on SP measures could differentiate MDD patients from healthy controls with a 79% accuracy rate. Furthermore, the model achieved a 59% accuracy rate in distinguishing among three groups of MDD patients, BPD patients, and healthy controls. Additionally, we found significant correlations between oxidative stress indicators in the blood of patients and certain SP features.

Previous research has indicated that mood disorders are associated with abnormal processing of emotional experiences triggered by real-life stimuli. Patients with MDD tend to be overly concerned with negative emotions and show reduced sensitivity to positive emotions, while Patients with bipolar disorder may exhibit difficulties regulating emotional response when confronted with emotional stimuli [28, 29]. Emotion regulation involves the control of the limbic system by the individual’s cerebral cortex. Studies have found no difference in attention maintenance between MDD and BPD patients when viewing positive and negative facial expressions, but MDD patients showed increased activity in the dorsal anterior cingulate gyrus when viewing neutral facial expressions [30]. These findings suggest that the physiological mechanisms underlying the processing of different emotional stimuli may vary among mood disorders. Considering that SP may serve as an objective biological indicator of emotional information, we designed six different emotion-inducing tasks to investigate SP characteristics in MDD, BPD patients during the task states.

Our previous studies have demonstrated the successful acquisition of SP signals using our self-developed device [23]. In selecting of skin potential indicators, we calculated both time domain indicators and time-frequency domain measures (frequency range 0-0.0625 Hz), providing a comprehensive characterization of SP features. This approach differs from traditional analysis methods used in skin electrical activity, which typically employ Fourier transform, wavelet basis, or autoregression on raw data to obtain bioelectric signal characteristics [31]. In this study, we applied wavelet transform to obtain time-frequency domain data, as this algorithm offers good denoising capabilities and provides a stable characterization of the SP signals [27]. Our discriminant analysis demonstrated the value of these SP indicators in distinguishing patients from healthy controls, providing valuable parameters for subsequent SP studies.

Prior research suggests that a machine-learning approach to disease identification based solely on brain data should achieve an accuracy rate exceeding 75% [32]. Our study found that an SVM classifier based on SP data achieved a 79% accuracy rate in differentiating MDD patients from healthy controls. This finding is notable for two reasons. First, it highlights that electrodermal activity abnormalities are core pathological features in the affective disorders [15, 16, 33]. Second, the SVM model we constructed proved to be suitable for discriminant analysis using this type of data, as the other models we employed in this study did not achieve the desired accuracy rates. It is worth mentioning that the accuracy obtain by Kim, et al. [16], who used a decision tree classifier to separate MDD patients from healthy controls based on skin conductance data, was only 74% [16]. Another significant finding of our study was that the SVM classifier achieved a 58% accuracy rate in discriminating between bipolar depression, MDD, and healthy controls. Although this accuracy rate is nearly double that of random discrimination among the three groups (33%), it falls short of the ideal accuracy level. The overlapping pathological mechanisms between depression and bipolar disorder, as well as the fact that some first episodes of bipolar disorder present as depressive episodes pose, challenges for distinguishing between the two in both clinical practice and scientific research [1]. The average age recruited MMD participants in our study was 23 years, and it is possible that some patients may develop a manic episode in the future. Therefore, whether skin potentials are used as an indicator of disease state or non-disease state, it is difficult to achieve high accuracy in using skin potentials to differentiate between bipolar depression and depression because of these factors. Future studies could consider recruiting older depressed patients who have experienced more than two depressive episodes and have previously responded to a single antidepressant to collect SP data and validate the discrimination between MDD and BPD.

The physiological mechanisms underlying skin electrical potential generation remain unclear, and the oxidative stress hypothesis has emerged as one potential explanation. Our study found a significant positive correlation between SP measures and blood prealbumin levels in four tasks among BPD patients. Additionally, SP characteristics in depressed patients were positively correlated with various oxidative stress-related blood indicators, including prealbumin, uric acid, indirect bilirubin, direct bilirubin, and albumin. These findings support the hypothesis that SP abnormalities in individuals are closely related to their oxidative stress status of individuals. The selected indicators in the blood that reflect the level of oxidative stress were based on previous studies in the literature [34,35,36,37]. In addition to enzymatic antioxidants such as superoxide dismutase, non-enzymatic antioxidants, including uric acid, bilirubin, and albumin play a crucial role in the body’s endogenous antioxidant defense system. These non-enzymatic antioxidants account for more than 85% of the antioxidant capacity in plasma [36]. Prealbumin, known as transthyretin, which is a carrier protein for the transport of thyroxine and retinol and is also considered a non-enzymatic antioxidant [37]. Oxidative stress, which reflects an imbalance between reactive oxygen species and antioxidant mechanisms, has been suggested to be closely associated with MDD and BD [25]. Bartoli, et al. [38] found that patients with BD had significantly higher levels of uric acid compared to other psychiatric disorders [38]. Another study found that plasma bilirubin was lower in patients with BD compared to healthy controls [39]. Alice et al. found that non-enzymatic antioxidants may have value in differentiating bipolar disorder from depression with 74.9% accuracy based on a decision tree model. And patients with elevated indirect bilirubin and decreased direct bilirubin suggest a higher risk of being diagnosed with BD [40]. The evidence from these previous studies suggests that skin potential merits further investigation as a non-invasive, potentially highly disease-recognizing electrophysiological indicator of affective disorder pathology.

Limitations

Several limitations should be considered in interpreting the findings of this study. Firstly, we did not collect SP data during the resting state (i.e., non-task) of participants, focusing solely on the task state. Future studies should include the collection of resting state data and compare it with the task state to gain a more comprehensive understanding of SP characteristics. Secondly, due to the exploratory nature of this study, we were unable to strictly match the age of participants across groups. While the impact of age on SP remains uncertain, it would be valuable to analyze the potential effects of covariates such as age and current emotional experience on SP characteristics. Although previous research has not provided conclusive evidence regarding the influence of age on SP, it is worth exploring in future studies. Additionally, while facial videos with eye movements were recorded during the experiment, data on participants’ self-reports of their emotional responses to the presented stimuli were not collected. Therefore, further investigations analyzing the possible effects of covariates, including age and current emotional experience, on SP characteristics are warranted. Thirdly, this study employed a cross-sectional design, and the persistence of SP abnormalities in BPD and MDD patients during the task state as their condition improves remains unknown. Longitudinal studies are needed to investigate the temporal dynamics of SP abnormalities in these patients. Lastly, the present study did not examine the differences in SP characteristics between patients with affective disorders and patients with other psychiatric disorders. Future research should include comparisons with other non-affective psychiatric disorders to further validate the specificity of SP abnormalities in affective disorders.

Conclusion

Patients with depression and bipolar depression have abnormalities in task-state skin potential that partially reflect the pathological mechanism of the illness, and the abnormalities are potential biological markers of affective disorders.