Abstract
Due to the effect of emotions on interactions, interpretations, and decisions, automatic detection and analysis of human emotions based on EEG signals plays an important role in the treatment of psychiatric diseases. However, the low spatial resolution of EEG recorders poses a challenge. To overcome this problem, in this paper we model each emotion by mapping from scalp sensors to brain sources using a Bernoulli–Laplace-based Bayesian model. The standardized low-resolution electromagnetic tomography (sLORETA) method is used to initialize the source signals in this algorithm. Finally, a dynamical graph convolutional neural network (DGCNN) is used to classify emotional EEG, in which the sources of the proposed localization model are considered as the underlying graph nodes. In the proposed method, the relationships between the EEG source signals are encoded in the DGCNN adjacency matrix. Experiments on our EEG dataset recorded at the Brain-Computer Interface Research Laboratory, University of Tabriz, as well as on the publicly available SEED and DEAP datasets, show that brain source modeling by the proposed algorithm significantly improves the accuracy of emotion recognition, achieving a classification accuracy of 99.25% for the two classes of positive and negative emotions. These results represent an absolute 1–2% improvement in classification accuracy over the existing approaches in both subject-dependent and subject-independent scenarios.
Introduction
In human daily life, emotions affect interactions, interpretations, and decision-making^{1}. In addition, information about emotional states is essential for a more natural human–computer interface. To reduce the gap in human–machine interactions, the design of emotion recognition systems has become a major research field in recent decades^{2}. This area lies at the intersection of artificial intelligence and human communication analysis. Facial expressions and speech are mostly used to convey people's emotional states in daily life. However, these cues can be intentionally altered, so relying on this information is likely to lead to false classification of emotional states^{3,4}. Electroencephalography (EEG), as a non-invasive physiological signal, is suitable for direct measurement of the electrical activity of the brain in an emotional state. Hence, the study of these signals makes it possible to truly detect human emotions^{5}. Despite the high temporal resolution of the EEG signal, its low spatial resolution poses a challenge when it is used in studies of functional brain activity. To increase the spatial resolution of the EEG signal, this information is mapped from the sensor space to the space of brain sources. However, due to the limited number of sensors, the number of brain sources is always larger than the number of sensors^{6}. This makes the EEG signal mapping an underdetermined problem. EEG source imaging (ESI) becomes possible by solving an ill-posed inverse problem (Fig. 1)^{7}. ESI is a computational method for three-dimensional localization of the sources of electrical activity of the cerebral cortex in the brain volume, also called EEG source localization. The estimation accuracy of this method depends on the choice of the head model and the inverse solution.
The current due to postsynaptic potentials propagates simultaneously from the pyramidal neurons according to Poisson's equation, but this propagation is not homogeneous: the high resistance of the skull attenuates the current, and this attenuation must be modeled in the calculations. MRI is used to determine the thickness of the skull and the resulting local conductivity properties. These properties are taken into account in the lead field to determine the relationship between the electrical activity at a particular electrode and the activity of the various sources in the brain. Accuracy in determining this lead field leads to increased source localization accuracy. Among the various ways of inducing emotions, such as watching movies or viewing images, listening to music is a better approach to stimulating the brain, because the sensory content of music reaches the audience directly and does not require translation or another medium^{8,9}.
In^{10}, the spatial and temporal distribution of the emotional EEG signal was calculated by applying Independent Component Analysis (ICA) algorithms to the results of the standardized low-resolution electromagnetic tomography (sLORETA) algorithm. In this study, specific independent components (ICs) were identified for listening to a piece of music and to musical scales. Significant differences were observed between these ICs and the ICs calculated for resting-state EEGs^{10}. Active brain regions were calculated using low-resolution electromagnetic tomography (LORETA) in^{11}. Hjorth parameters, power spectral density, and wavelet coefficients were used as features extracted from these data to classify emotions using a support vector machine (SVM) classifier, yielding an accuracy of 57.30%. The accuracy score for this method in another study^{12} was reported as 85.92%. The study of the effect of age on neural activation and response to various emotional stimuli in^{13,14} showed that aging affects the limbic area and thus changes emotional processing and the N170 amplitude. In that study, the database of facial emotion images (POFA)^{15}, which includes 110 black-and-white images of facial expressions, was used as emotional stimuli. The brain areas involved during emotional interference conditions were investigated in^{16}, where the brain source activities were computed using sLORETA. Considerably decreased activity (p < 0.05/66) with respect to baseline was observed in eighteen gyri in face-word interference and fifty-four gyri in word-face interference, respectively^{16}. To detect EEG emotions, a dynamical graph convolutional neural network (DGCNN) is presented in^{17}. The DREAMER dataset (a database for detecting emotions through EEG and ECG signals)^{18} and the Shanghai Jiao Tong University (SJTU) emotion EEG dataset (SEED)^{19} were used to evaluate this method.
The results show mean accuracies of 86.23%, 84.54%, and 85.02% for valence, arousal, and dominance classification, respectively. The effect of gender differences on EEG spectral power and source locations is evaluated in^{20}.
Watching emotional music videos was used as the emotional stimulus in that research. In another study^{21}, a technique combining electrode frequency distribution maps (EFDMs) with the short-time Fourier transform (STFT) was proposed. To classify emotions, a deep convolutional neural network based on residual blocks (CNN) was utilized in this approach. The average classification scores of this technique were 90.59% and 82.84% for the SEED dataset and the database for emotion analysis using physiological signals (DEAP)^{22}, respectively. Researchers in^{23} used the smoothed pseudo-Wigner–Ville distribution (SPWVD) to convert filtered EEG signals into images. These images were used as input to AlexNet, ResNet50, and VGG16, along with a customized CNN. The reported results show 90.98%, 91.91%, 92.71%, and 93.01% accuracy for AlexNet, ResNet50, VGG16, and the customized CNN, respectively. An instance-adaptive graph (IAG) approach was suggested in^{24}, in which sparse graphical representations of the input EEG data are constructed. According to the results, the accuracy of this method is 86.30%. A regularized graph neural network (RGNN) was offered in^{25} for emotional EEG classification, achieving accuracies of 73.84% on the SEED-IV dataset and 85.30% on the SEED dataset. In^{26}, channel-wise features are applied as the input of a two-layer stacked long short-term memory (LSTM) network. Accuracies of 98.93% and 99.10% were attained for the two-class classification of valence and arousal on the DEAP dataset, respectively, and an accuracy of 99.63% for the three-class classification on the SEED dataset.
Examining existing methods for emotion recognition reveals that high spatial resolution of the EEG signal is crucial for extracting sufficient information in the feature selection and extraction process. As mentioned above, mapping the EEG signal from the scalp sensor space to the brain source space can accurately show the pattern of the brain areas involved during emotional stimuli. The relationship between different brain areas can be determined from the relations among these source signals. The pattern of brain activity during emotional stimulation is determined by creating a graph based on the relationships among the brain sources, and these graphs are used to separate different emotions. In^{17}, raw EEG signal information is used as input to the DGCNN network, but in our proposed method, the raw EEG signal is first given as input to an EEG source localization algorithm, namely a Bayesian model based on the Bernoulli–Laplace prior. The output of this algorithm contains the spatiotemporal information of the emotional EEG sources. Accordingly, in addition to temporal information, topographic and spatial information about the electrical activity of the brain enters the recognition process. This information is encoded in a graph: the relationships between the brain sources extracted by the localization algorithm are used to weight the adjacency matrix of this graph, and the result is used in the DGCNN algorithm to classify emotions. The potentials recorded at the electrodes actually represent the superposition of these brain source activities. As a result, the information obtained from the localization algorithm is more accurate and informative than the raw EEG signal information.
In this study, features obtained from the sources extracted by the Bernoulli–Laplace-based Bayesian model are considered as the signals of the dynamical graph convolutional neural network (DGCNN) nodes. By encoding the inter-source relations of the EEG source signals in the adjacency matrix, the pattern of activity in different brain areas is used to increase the accuracy of emotion classification. This algorithm allows the classification of unseen emotional EEG signals into negative and positive emotional classes.
The main sections of this study are summarized as follows: in Section "Mathematical background", the mathematical background of EEG source localization and dynamical graph convolutional neural networks (DGCNN) is introduced. Then, the proposed approach for emotional state classification is provided in Section "Emotional EEG source recognition based on DGCNN". In the "Simulation results" section, the results of the proposed method are explained. Finally, the results are discussed.
Mathematical background
In this section, the basic theory of EEG source localization and dynamical graph convolutional neural networks will be presented.
EEG source localization
The EEG source localization method provides spatiotemporal information about the activity of different areas of the brain. Brain source localization improves the non-invasive detection of functional, mental, and even physiological abnormalities related to the brain in clinical applications^{27}. In these methods, the sources are considered as several discrete current dipoles in the three-dimensional space of the brain. One of the most common methods in this field is the LORETA method. The basic hypothesis of this method is that the current density of a brain source at any point in the cortex is close to the average current density of its neighbors. A major problem with this method is its low spatial resolution and the blurring and scattering of point sources in the resulting images^{28}. To solve this problem, using the current density standardization hypothesis, the sLORETA method has been proposed as a generalization of the LORETA method^{29}. Since the electric potential at any point on the scalp is a linear combination of the dipole amplitudes of the brain sources, the relationship between the scalp potentials and the dipole amplitudes of the sources is defined as follows^{30,31}:

$$ {\mathbf{y}} \, = \, {\mathbf{Hx}} \, + \, {\mathbf{e}} $$(1)
where \({\mathbf{y}} \in {\mathbb{R}}^{N}\) is the EEG data of N electrodes and \({\mathbf{x}} \in {\mathbb{R}}^{M}\) contains the amplitudes of the M dipoles in 3D space. The N × M lead field matrix H models the propagation of the electromagnetic field from the sources to the sensors, and the noise of the recorded EEG data is considered as an additive white Gaussian noise e^{32,33}.
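As an illustration, the forward model can be simulated with a random stand-in lead field; the dimensions and noise level below are hypothetical (a real lead field comes from a BEM head model), and only the structure y = Hx + e is taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 21, 5614                       # hypothetical: 21 electrodes, 5614 cortical sources
H = rng.standard_normal((N, M))       # stand-in lead field (a real one comes from a BEM head model)

x = np.zeros(M)                       # sparse source vector: only a few active dipoles
active = rng.choice(M, size=3, replace=False)
x[active] = rng.standard_normal(3)

sigma_n = 0.01
e = sigma_n * rng.standard_normal(N)  # additive white Gaussian noise

y = H @ x + e                         # scalp potentials: y = Hx + e
```

With far fewer sensors (N) than sources (M), recovering x from y is the underdetermined inverse problem discussed next.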
As mentioned above, the inverse problem is underdetermined due to the limited number of electroencephalogram sensors and the large number of brain sources, which makes it harder to obtain a unique solution. Proper regularization is usually required to solve an ill-posed inverse problem. Solutions based on the usual \(l_{2}\) norm have low computational complexity. However, in many cases the actual activity of the brain is believed to be concentrated in a few focal areas, and in such situations the \(l_{2}\) norm overestimates the spatial extent of the active areas. To solve this problem, the promotion of sparse solutions has been proposed, for example based on the \(l_{1}\) norm, which can easily be handled by optimization techniques. In^{34}, a combined \(l_{0} + l_{1}\) norm is used to enforce sparse source activity (ensuring that a small number of nonzero elements are present in the solution) while regularizing the nonzero amplitudes of the solution. More precisely, the \(l_{1}\) norm limits the amplitude values of the nonzero elements while the \(l_{0}\) pseudo-norm controls their positions. Using a Bernoulli–Laplace prior, the hybrid \(l_{0} + l_{1}\) norm is introduced in the Bayesian framework. The proposed Bayesian model uses a Markov chain Monte Carlo sampling technique to estimate the model hyperparameters. It has been proven that this model favors sparsity. It is very common to consider an additive white Gaussian noise with variance \(\sigma_{n}^{2}\) in EEG analysis^{30}.
\({{\varvec{\uptheta}}} \, = \, \left\{ {{\mathbf{x}},\sigma_{n}^{2} } \right\}\) is the vector of unknown parameters of the proposed model (1). The priors of these parameters for Bayesian inference are given as follows:

(1) Dipole amplitudes prior: an \(l_{0} + l_{1}\) regularization using the Bernoulli–Laplace prior distribution for each element of \({\mathbf{x}}\) is introduced to encourage sparse solutions whose nonzero elements have small amplitudes. The corresponding pdf for the ith element of \({\mathbf{x}}\) is
$$ f(x_{i} |\omega ,\lambda ) \, = \, (1 - \omega )\delta (x_{i} ) \, + \frac{\omega }{2\lambda }\exp \left( { - \frac{{\left| {x_{i} } \right|}}{\lambda }} \right) $$(2)where \(\lambda\) is the parameter of the Laplace distribution and \(\delta \left( . \right)\) is the Dirac delta function. The weight ω balances the effects of the Laplace distribution and the Dirac delta function.

(2) Noise variance prior: a noninformative Jeffreys prior is considered for the noise variance:
$$ f(\sigma_{n}^{2} ) \propto \frac{1}{{\sigma_{n}^{2} }}1_{{{\mathbb{R}}^{ + } }} (\sigma_{n}^{2} ) $$(3)where \(1_{{{\mathbb{R}}^{ + } }} (\xi ) \, = \, 1{\text{ if }}\xi \in {\mathbb{R}}^{ + }\) and 0 otherwise. This is a very common choice for a noninformative prior^{35}. Note that a more informative prior based on the signal-to-noise ratio could also be considered.
The hyperparameter vector of the previous priors is \({{\varvec{\Phi}}} \, = \, \left\{ {\omega ,\lambda } \right\}\).
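To make the spike-and-slab structure of the Bernoulli–Laplace prior (2) concrete, the following sketch draws samples from it: with probability 1 − ω an element is exactly zero (the Dirac spike), otherwise it is Laplace-distributed. The values ω = 0.05 and λ = 1 are illustrative only:

```python
import numpy as np

def sample_bernoulli_laplace(omega, lam, size, rng):
    """Draw samples from the Bernoulli-Laplace prior: with probability
    (1 - omega) an element is exactly zero (the Dirac spike), otherwise
    it follows a Laplace(0, lam) distribution."""
    active = rng.random(size) < omega
    samples = np.zeros(size)
    samples[active] = rng.laplace(loc=0.0, scale=lam, size=int(active.sum()))
    return samples

rng = np.random.default_rng(42)
x = sample_bernoulli_laplace(omega=0.05, lam=1.0, size=100_000, rng=rng)
sparsity = float(np.mean(x == 0.0))   # close to 1 - omega = 0.95
```

Most draws are exactly zero, which is precisely the behavior that promotes sparse source estimates.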
The joint posterior distribution of the model can be obtained from the previously introduced likelihood and priors using the following hierarchical construction:

$$ f({{\varvec{\uptheta}}},{{\varvec{\Phi}}}|{\mathbf{y}}) \propto f({\mathbf{y}}|{{\varvec{\uptheta}}})f({{\varvec{\uptheta}}}|{{\varvec{\Phi}}})f({{\varvec{\Phi}}}) $$(4)
where \(\left\{ {{{\varvec{\uptheta}}},{{\varvec{\Phi}}}} \right\}\) is the vector of model parameters and hyperparameters. The Bayesian estimators of \(\left\{ {{{\varvec{\uptheta}}},{{\varvec{\Phi}}}} \right\}\) cannot be calculated in simple closed form because of the complexity of this posterior distribution. In order to sample the joint posterior distribution (4), a Markov chain Monte Carlo (MCMC) method can be used. This method uses the generated samples to build Bayesian estimators of the unknown model parameters. For this purpose, a Gibbs sampler^{35} is considered, which repeatedly generates samples from the conditional distributions of (4), i.e., from \(f(\sigma_{n}^{2} |{\mathbf{y}},{\mathbf{x}})\), \(f(\lambda |{\mathbf{x}})\), \(f(\omega |{\mathbf{x}})\) and \(f(x_{i} |{\mathbf{y}},{\mathbf{x}}_{ - i} ,\omega ,\lambda ,\sigma_{n}^{2} )\).
The likelihood and the prior distribution of x are used to calculate the conditional distribution of each signal element \(x_{i}\). This distribution can be written as follows:

$$ f(x_{i} |{\mathbf{y}},{\mathbf{x}}_{ - i} ,\omega ,\lambda ,\sigma_{n}^{2} ) = w_{1,i} \delta (x_{i} ) + w_{2,i} {\mathcal{N}}_{ + } (\mu_{i, + } ,\sigma_{i}^{2} ) + w_{3,i} {\mathcal{N}}_{ - } (\mu_{i, - } ,\sigma_{i}^{2} ) $$(5)
where \({\mathcal{N}}_{ + }\) and \({\mathcal{N}}_{ - }\) denote Gaussian distributions truncated on \({\mathbb{R}}^{ + }\) and \({\mathbb{R}}^{ - }\), respectively. The vector x can be decomposed on the orthonormal basis B = {n_{1}, … ,n_{M}} such that \({\mathbf{x}} \, = \, {\tilde{\mathbf{x}}}_{ - i} \, + \, x_{i} {\mathbf{n}}_{i}\), where \({\tilde{\mathbf{x}}}_{ - i}\) is obtained by setting the ith element of x to 0. Defining \({{\varvec{\upnu}}}_{i} \, = \, {\mathbf{y}} \, - \, {\mathbf{H\tilde{x}}}_{ - i}\) and \({\mathbf{h}}_{i} \, = \, {\mathbf{Hn}}_{i}\), the weights are defined as
where
and
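The Gibbs sampling loop described above can be sketched as follows. The hyperparameter draws use standard conjugate forms, but the per-element x_i step here is a simplified hard-threshold surrogate for the exact Dirac/truncated-Gaussian mixture draw, so this is a structural sketch only, not the actual sampler of the paper:

```python
import numpy as np

def gibbs_sampler(y, H, n_iter=100, rng=None):
    """Structural sketch of the Gibbs sampler: cycle through the
    conditionals of sigma_n^2, omega, lambda, and each x_i.
    The x_i step is a hard-threshold surrogate, not the exact
    Dirac/truncated-Gaussian mixture draw."""
    if rng is None:
        rng = np.random.default_rng(0)
    N, M = H.shape
    x = np.zeros(M)
    sigma2, lam, omega = 1.0, 1.0, 0.1
    for _ in range(n_iter):
        # sigma_n^2 | y, x : inverse-gamma under the Jeffreys prior (3)
        resid = y - H @ x
        sigma2 = 1.0 / rng.gamma(N / 2.0, 2.0 / (resid @ resid + 1e-12))
        # omega | x : Beta(1 + #nonzero, 1 + #zero) under a uniform prior
        n1 = int(np.count_nonzero(x))
        omega = rng.beta(1 + n1, 1 + M - n1)
        # lambda | x : inverse-gamma given the Laplace part of the prior
        if n1 > 0:
            lam = 1.0 / rng.gamma(n1 + 1, 1.0 / np.abs(x).sum())
        # x_i | rest : surrogate spike-and-slab step
        for i in range(M):
            h_i = H[:, i]
            nu_i = y - H @ x + h_i * x[i]        # residual excluding source i
            mu = (h_i @ nu_i) / (h_i @ h_i)      # least-squares amplitude
            x[i] = mu if abs(mu) > lam else 0.0  # hard-threshold surrogate
    return x, sigma2, lam, omega

rng = np.random.default_rng(3)
H = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[4, 17]] = [2.0, -1.5]
y = H @ x_true + 0.01 * rng.standard_normal(20)
x_est, sigma2, lam, omega = gibbs_sampler(y, H, n_iter=30)
```

The point of the sketch is the sampling order: each conditional is drawn given the current values of all the other variables, and the chain of samples is then used to build the Bayesian estimators.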
Dynamical graph convolutional neural network
Network data can easily be modeled as a graph signal. In this situation, the underlying network topology is represented by a graph, and data values are assigned to the graph nodes. An undirected graph \({\mathcal{G}} = ({\mathcal{V}},{\mathcal{D}},{\mathbf{W}})\) is defined by a node set \({\mathcal{V}} = \left\{ {1,...,M} \right\}\), an edge set \({\mathcal{D}} \subseteq {\mathcal{V}} \times {\mathcal{V}}\), and a weighted adjacency matrix \({\mathbf{W}} \in {\mathbb{R}}^{M \times M}\) that describes the connections between any two nodes in \({\mathcal{V}}\). \(w_{ij}\) is the entry of \({\mathbf{W}}\) in the ith row and jth column. The set of nodes that share an edge with node \(i \, \in \, {\mathcal{V}}\) is called the neighborhood of node i and is defined as \(C_{i} = \left\{ {j \in {\mathcal{V}}:(j,i) \in {\mathcal{D}}} \right\}\).
A common signal processing method for operating on graph data is graph convolution, or spectral graph filtering, in which the graph Fourier transform (GFT)^{36} is typically used. Let L denote the Laplacian matrix of the graph \({\mathcal{G}}\), which can be represented as follows:

$$ {\mathbf{L}} \, = \, {\mathbf{S}} \, - \, {\mathbf{W}} $$(9)
where the ith diagonal element of the diagonal matrix \({\mathbf{S}} \in {\mathbb{R}}^{M \times M}\) is computed as \({\mathbf{S}}_{ii} = \sum\nolimits_{j} {w_{ij} }\). The GFT of a given signal \({\mathbf{x}} \in {\mathbb{R}}^{M}\) is represented as:

$$ {\hat{\mathbf{x}}} \, = \, {\mathbf{U}}^{T} {\mathbf{x}} $$(10)
where \({\hat{\mathbf{x}}}\) is the transformed signal in the frequency domain. The singular value decomposition (SVD) of the graph Laplacian matrix L yields an orthonormal matrix U as follows^{37}:

$$ {\mathbf{L}} \, = \, {\mathbf{U\Lambda U}}^{T} $$(11)
By considering (10), the inverse GFT can be declared as follows:

$$ {\mathbf{x}} \, = \, {\mathbf{U\hat{x}}} $$(12)
For two signals \({\mathbf{x}}\) and \({\mathbf{z}}\), the convolution \(*_{{\mathcal{G}}}\) on the graph can be defined as follows^{38}:

$$ {\mathbf{x}} *_{{\mathcal{G}}} {\mathbf{z}} \, = \, {\mathbf{U}}\left( {({\mathbf{U}}^{T} {\mathbf{x}}) \odot ({\mathbf{U}}^{T} {\mathbf{z}})} \right) $$(13)
where \(\odot\) denotes the element-wise Hadamard product.
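The Laplacian of (9), the GFT, and its inverse can be verified on a toy graph; the weights and the 4-node signal below are arbitrary illustrations:

```python
import numpy as np

# toy undirected weighted graph with M = 4 nodes (arbitrary weights)
W = np.array([[0.0, 1.0, 0.0, 0.5],
              [1.0, 0.0, 2.0, 0.0],
              [0.0, 2.0, 0.0, 1.0],
              [0.5, 0.0, 1.0, 0.0]])

S = np.diag(W.sum(axis=1))   # degree matrix: S_ii = sum_j w_ij
L = S - W                    # graph Laplacian, as in L = S - W

# decomposition L = U Lambda U^T (U is orthonormal since L is symmetric)
eigvals, U = np.linalg.eigh(L)

x = np.array([1.0, -2.0, 0.5, 3.0])  # a graph signal on the 4 nodes
x_hat = U.T @ x                      # GFT
x_rec = U @ x_hat                    # inverse GFT recovers the signal
```

Because U is orthonormal, applying the inverse GFT after the GFT returns the original signal exactly, which is the property the spectral filtering below relies on.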
The optimal adjacency matrix \({\mathbf{W}}^{ * }\) can be learned. The spectral filtering \({\text{g}} ({\mathbf{L}}^{ * } )\) defines the graph convolution of the signal x with the vector \({\mathbf{U}}^{ * } {\mathbf{g}}({{\varvec{\Lambda}}}^{ * } )\), which can be written as follows:

$$ {\text{g}}({\mathbf{L}}^{ * } ){\mathbf{x}} \, = \, {\mathbf{U}}^{ * } {\mathbf{g}}({{\varvec{\Lambda}}}^{ * } ){\mathbf{U}}^{ * T} {\mathbf{x}} $$(14)
where \({\text{g}} ({{\varvec{\Lambda}}})\) is defined as

$$ {\text{g}}({{\varvec{\Lambda}}}) \, = \, {\text{diag}}([g(\lambda_{0} ), \cdots ,g(\lambda_{N - 1} )]) $$(15)
where \({\mathbf{L}}^{ * }\) is computed from \({\mathbf{W}}^{ * }\) based on (9), and \({{\varvec{\Lambda}}}^{ * } = {\text{diag}} ([\lambda_{0}^{ * } , \cdots , \, \lambda_{N - 1}^{ * } ])\) is a diagonal matrix. Since direct calculation of the \({\mathbf{g}}({{\varvec{\Lambda}}}^{ * } )\) expression is difficult, we use the \(K_{\psi }\)-order Chebyshev polynomials to efficiently calculate a polynomial expansion of \({\mathbf{g}}({{\varvec{\Lambda}}}^{ * } )\) as follows^{38}:

$$ {\mathbf{g}}({{\varvec{\Lambda}}}^{ * } ) \approx \sum\limits_{k = 0}^{{K_{\psi } - 1}} {\theta_{k} T_{k} ({\tilde{{\varvec{\Lambda}}}}^{ * } )} $$(16)
where \(\theta_{k}\) are the coefficients of the Chebyshev polynomials and \(T_{k} (x)\) is calculated using the following recursion:

$$ T_{0} (x) = 1,\quad T_{1} (x) = x,\quad T_{k} (x) = 2xT_{k - 1} (x) - T_{k - 2} (x) $$(17)
Therefore, (16) is used to rewrite the graph convolution operation of (14) as follows:

$$ {\text{g}}({\mathbf{L}}^{ * } ){\mathbf{x}} \approx \sum\limits_{k = 0}^{{K_{\psi } - 1}} {\theta_{k} T_{k} ({\tilde{\mathbf{L}}}^{ * } ){\mathbf{x}}} $$(18)
where \({\tilde{\mathbf{L}}}^{ * } \, = \, 2{\mathbf{L}}^{ * } /\lambda_{max}^{ * } \, - \, {\mathbf{I}}_{M} .\)
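A minimal sketch of the Chebyshev-approximated graph filtering follows, using the recursion T_k(x) = 2xT_{k-1}(x) − T_{k-2}(x); the coefficients θ_k are placeholders that would normally be learned:

```python
import numpy as np

def chebyshev_filter(L, x, theta):
    """Approximate spectral graph filtering with Chebyshev polynomials:
    g(L)x ~ sum_k theta_k T_k(L_tilde) x, with L_tilde = 2L/lambda_max - I
    and the recursion T_k = 2 L_tilde T_{k-1} - T_{k-2}. Needs len(theta) >= 2."""
    M = L.shape[0]
    lam_max = np.linalg.eigvalsh(L).max()
    L_tilde = 2.0 * L / lam_max - np.eye(M)
    T_prev, T_curr = x, L_tilde @ x          # T_0 x = x, T_1 x = L_tilde x
    out = theta[0] * T_prev + theta[1] * T_curr
    for k in range(2, len(theta)):
        T_next = 2.0 * L_tilde @ T_curr - T_prev
        out = out + theta[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out

W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W
x = np.array([1.0, -1.0, 2.0])
y_identity = chebyshev_filter(L, x, [1.0, 0.0])   # theta = [1, 0] leaves x unchanged
```

Note that the recursion only requires matrix-vector products with the rescaled Laplacian, which is what makes this approximation cheaper than the eigendecomposition of (14).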
The backpropagation (BP) method is used to iteratively optimize the network parameters, which are updated until optimal or suboptimal solutions are attained. For this purpose, a loss function is expressed based on the cross-entropy cost. In order to dynamically learn the optimal adjacency matrix \({\mathbf{W}}^{ * }\) of the DGCNN model in the BP method, we must calculate the partial derivative of the loss function with respect to \({\mathbf{W}}^{ * }\). The updating formula of the optimal adjacency matrix \({\mathbf{W}}^{ * }\) can then be expressed as:

$$ {\mathbf{W}}^{ * } \leftarrow {\mathbf{W}}^{ * } \, - \, \psi \frac{{\partial Loss}}{{\partial {\mathbf{W}}^{ * } }} $$(19)
where ψ is the learning rate of the network.
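The adjacency update rule can be sketched as a plain gradient step; the symmetrization and clipping steps below are illustrative choices to keep the graph undirected and its edge weights nonnegative, not details prescribed by the paper:

```python
import numpy as np

def update_adjacency(W, grad_W, psi=0.01):
    """One gradient step on the learned adjacency matrix,
    W* <- W* - psi * dLoss/dW*. Symmetrization and clipping are
    illustrative post-processing choices, not taken from the paper."""
    W_new = W - psi * grad_W
    W_new = 0.5 * (W_new + W_new.T)     # keep the graph undirected
    return np.clip(W_new, 0.0, None)    # keep edge weights nonnegative

W = np.array([[0.0, 0.8],
              [0.8, 0.0]])
grad = np.array([[0.0, 0.5],
                 [0.1, 0.0]])
W_next = update_adjacency(W, grad, psi=0.1)
```

In a full training loop this step would run once per batch, with grad_W supplied by automatic differentiation of the cross-entropy loss.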
Emotional EEG source recognition based on DGCNN
This section describes in detail how the signals of the emotional brain sources are extracted, how the DGCNN algorithm is used to classify the emotional states, and the data used in this study.
Proposed classification algorithm using DGCNN and Bayesian model based emotional EEG source
Considering the challenges of feature selection and extraction in previous methods and the need to increase the accuracy of classifying positive and negative emotions, this section presents a method based on EEG source localization and graph theory. Figure 2 shows a block diagram of the proposed method for classifying the two emotional classes:

Emotional EEG source localization using the Bernoulli–Laplace-based Bayesian model: the brain sources that generate the EEG signal are calculated using the Bernoulli–Laplace-based Bayesian model algorithm, initialized using the sLORETA method.

Graph generation: in the proposed method, a graph signal at each graph node is obtained from the corresponding extracted source signal. The graph adjacency matrix is weighted based on the correlation calculated between the extracted source signals.

Graph pattern classification using the DGCNN algorithm: the weighted adjacency matrix and the graph corresponding to the extracted source signals are given as input to the DGCNN algorithm for recognizing and classifying emotions.
In this study, the active areas of the brain during two kinds of emotional stimuli are identified using the proposed Bayesian model based on the Bernoulli–Laplace prior. The sLORETA method is applied to initialize the source signals in this algorithm. To calculate the results of the sLORETA algorithm, we use the Colin27 brain atlas model from the Montreal Neurological Institute (MNI) and the OpenMEEG BEM head model^{39,40}. The localization solution space is restricted to the gray matter of the cortex. A resolution of 5 mm/voxel with 5614 voxels in MNI coordinates is used for this space in the localization. If the number of vertices in the localization solution space increases, the recognition accuracy of the active areas during emotion induction increases. The differences between the active brain sources for the recorded dataset in the Brodmann areas (BA) of the cerebral cortex^{41} for the sLORETA and Bernoulli–Laplace-based Bayesian model methods are shown in Fig. 3. The lateral view of the active brain areas for subject 1 during positive and negative emotional stimulation using the sLORETA method is shown in Fig. 3a and c, respectively. In addition, the lateral view of the active brain areas for subject 1 during positive and negative emotional stimulation using the Bayesian model based on the Bernoulli–Laplace prior is presented in Fig. 3b and d, respectively.
In the sLORETA topographic images, the areas including the auditory cortex, lingual gyrus, and amygdala, located in the inferior and middle temporal cortex and the middle occipital cortex, show the most activity during emotional stimulation of the brain. Considering the results of the sLORETA method, 26 Brodmann regions are considered as the regions of interest (ROI) for feature extraction: BA 5, 6, 7, 9, 10, 11, 18, 19, 21, 22, 29, 37, 38, 39, and 40 in both hemispheres (Fig. 4).
However, the Bayesian model based on the Bernoulli–Laplace prior concentrates the active areas and thus reduces their number. Unlike previous methods, this method simplifies the complex pattern of the most active brain areas. The differences in the brain areas activated during positive and negative stimuli indicate that a spatial-information-aware classifier can be used to accurately classify emotions. In the proposed method, most of the activity is seen in BA 19, 37, and 18 under the induction of negative emotions, and in BA 20, 21, and 22 under the induction of positive emotions. The Bayesian model based on the Bernoulli–Laplace prior calculates the current source density (CSD) for each voxel (in amperes in each region). In order to reduce the computational load of the proposed method and identify the set of powerful dipoles and their corresponding neighbors, we calculate the energy of all source signals and keep only those sources whose power is greater than 50% of the maximum power of the activity amplitude; the remaining sources are discarded to reduce computational complexity. The signal from each retained source is used as input to the classification algorithm. From the above, it is clear that forming a graph of the source signals can provide a pattern of the activity of different areas during an emotional stimulus for classifying emotions. In this case, there is one graph of brain sources for each emotional stimulus. For this purpose, an adjacency matrix describing the relationships between the nodes is needed: in this matrix, if there is an edge between nodes i and j, \({\mathbf{A}}_{ij} = {\mathbf{A}}_{ji} = 1\); otherwise \({\mathbf{A}}_{ij} = {\mathbf{A}}_{ji} = 0\) (Fig. 5).
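The 50% power-thresholding step for selecting strong sources can be sketched as follows; the synthetic source matrix is purely illustrative:

```python
import numpy as np

def select_strong_sources(sources, ratio=0.5):
    """Keep only the source signals whose power exceeds `ratio` times
    the maximum source power (50% in the text). `sources` is an
    (M, T) array of M source time courses of length T."""
    power = np.mean(sources ** 2, axis=1)   # mean power per source
    keep = power >= ratio * power.max()
    return sources[keep], np.flatnonzero(keep)

rng = np.random.default_rng(1)
src = rng.standard_normal((10, 500))
src[3] *= 5.0                               # one clearly dominant source
strong, idx = select_strong_sources(src, ratio=0.5)
```

Only the retained rows (and their indices, which identify the graph nodes) move on to the graph-construction stage.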
The locations of the vertices in MNI coordinates are considered as the graph nodes, and the corresponding source signal is considered as the graph signal on that node. In the proposed approach, the correlation between the signals of the two nodes of an edge is considered as the initial value of the edge weight. More precisely, the correlation between the \(i\)th source \({\mathbf{x}}_{i}\) and the \(j\)th source \({\mathbf{x}}_{j}\) can be computed as follows:

$$ w_{ij} \, = \, \frac{{\sum\nolimits_{t = 1}^{T} {(x_{i} (t) - \overline{x}_{i} )(x_{j} (t) - \overline{x}_{j} )} }}{{\sqrt {\sum\nolimits_{t = 1}^{T} {(x_{i} (t) - \overline{x}_{i} )^{2} } } \sqrt {\sum\nolimits_{t = 1}^{T} {(x_{j} (t) - \overline{x}_{j} )^{2} } } }} $$(20)
where \(i,j \in \left\{ {1,...,M} \right\}\) and \(t \in \left\{ {1,...,T} \right\}\).
Here, we define a threshold \(\beta\) such that when \(w_{ij} > \beta\), the \(i\)th source is linked with the \(j\)th source in the constructed graph \(\mathcal{G}\). In this paper, a model based on graph-structured data is proposed to learn and classify the patterns of the EEG sources. In DGCNN, unlike the traditional graph convolutional neural network (GCNN), in which the adjacency matrix is determined before model training, the adjacency matrix is updated together with the graph model parameters during training according to (19) in order to learn the relationships between the EEG source signals. This approach improves the classification results. In the proposed algorithm, the network parameters are repeatedly updated to achieve optimal or suboptimal solutions according to (16). The structure of the proposed algorithm is shown in Fig. 6, which includes the graph filtering layer, convolutional layers, and one fully connected layer. The detailed procedure of the proposed algorithm is summarized in Algorithm 1.
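Putting the graph construction together, a correlation-weighted adjacency matrix with threshold β can be sketched as below; the β value and the use of the absolute Pearson correlation are illustrative assumptions, not values fixed by the text:

```python
import numpy as np

def build_source_graph(sources, beta=0.3):
    """Weight the adjacency matrix with pairwise correlations between
    the source signals and drop edges with w_ij <= beta. The absolute
    Pearson correlation and beta = 0.3 are illustrative assumptions.
    `sources` is an (M, T) array of source time courses."""
    W = np.abs(np.corrcoef(sources))   # |correlation| between all source pairs
    np.fill_diagonal(W, 0.0)           # no self-loops
    W[W <= beta] = 0.0                 # keep only edges with w_ij > beta
    return W

rng = np.random.default_rng(0)
s = rng.standard_normal((5, 200))
s[1] = s[0] + 0.05 * rng.standard_normal(200)   # a strongly correlated pair
W = build_source_graph(s, beta=0.3)
```

The resulting W serves only as the initial adjacency matrix; during DGCNN training it is refined by the gradient update of (19).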
Emotional EEG datasets
SEED dataset
In this dataset, the EEG signals of 15 Chinese subjects (8 females and 7 males, age 23.27 ± 2.37) were recorded while watching 15 video clips. Chinese film clips with three types of emotion, i.e., negative, positive, and neutral, were shown to the subjects. The sampling rate is 200 Hz. After watching each clip, participants immediately chose emotional labels from the positive, neutral, and negative attributes. A band-pass frequency filter from 0 to 75 Hz was applied. A non-overlapping Hamming window of fixed duration was used to divide each signal into 8 data segments.
DEAP dataset
DEAP is a database containing physiological signals for analyzing emotions. EEG and peripheral physiological signals were recorded from 32 healthy participants (16 males and 16 females, aged 19 to 37 years) while each watched 40 one-minute music videos. Thirty-two active AgCl electrodes (placed according to the international 10–20 system) with a sampling rate of 512 Hz were used for EEG recording. This database also includes peripheral nervous system signals: GSR, respiratory rate, skin temperature, electrocardiogram, blood volume by plethysmography, zygomatic and trapezius muscle electromyograms, and the electrooculogram (EOG). The 32-channel EEG data were downsampled to 128 Hz, and the EOG was removed by filtering the data to 4.0–45.0 Hz. Then, a 5 s non-overlapping Hamming window was used to divide each signal into 12 data segments.
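The windowing described above (60 s at 128 Hz, 5 s non-overlapping windows, giving 12 segments per trial) can be sketched as:

```python
import numpy as np

def segment_trial(trial, fs=128, win_sec=5):
    """Split one DEAP trial (channels x samples) into non-overlapping
    windows: 60 s at 128 Hz with 5 s windows yields 12 segments."""
    win = fs * win_sec
    n_seg = trial.shape[1] // win
    segs = trial[:, :n_seg * win].reshape(trial.shape[0], n_seg, win)
    return segs.transpose(1, 0, 2)     # (segments, channels, samples)

trial = np.zeros((32, 60 * 128))       # one 32-channel, one-minute trial
segs = segment_trial(trial)
```

Each of the 12 resulting segments is then treated as one classification example.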
Recorded EEG
In the database of the Brain-Computer Interface Research Laboratory, University of Tabriz, Iran, the EEG signals of 16 people without a history of mental illness (6 women and 10 men, between 21 and 28 years old) were recorded while listening to emotional music^{42,43}. The 21-channel Encephalan Medicom device was used to record the EEG signal. The sampling rate in this experiment is 250 Hz. The international standard 10–20 system was used to arrange the electrodes on the head (Fig. 7). The Self-Assessment Manikin (SAM)^{44} questionnaire with a 9-point scale was used during the test process to assess positive and negative emotions. In addition, the participants completed the Beck Depression Inventory (BDI) questionnaire^{45}. The SAM results and a description of the BDI test are presented in Table 1. Details of the music selected for each theme are given in Table 2. The sequence in which the musical stimuli were played for the participants is shown in Fig. 8. A fifteen-second silence was inserted between consecutive pieces of music. A band-pass filter with cutoff frequencies of 0.5 and 70 Hz was used to extract the useful EEG signal information. According to Fig. 8, the amount of data for the neutral class is less than for the positive and negative classes, which causes an imbalance between the data and may cause overfitting. In addition, an imbalance between the data of the classes leads to bias in the classification results and a decrease in accuracy. To solve this problem, using overlapping methods, all the epochs corresponding to each emotion are concatenated to form one long signal. Rectangular windows of a specific duration and overlap are then applied so that the number of epochs collected is equal for each of the emotion classes. In the proposed method, for each channel, 5 min of the recorded signal (as shown in Fig. 3) is selected for each emotion.
In this case, we have two data classes (negative and positive) with 75,000 sample points per channel. The data are then split into 8 s intervals per channel, using the overlap technique to prevent overfitting.
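The overlapping epoching can be sketched as follows; the 50% overlap (a 4 s step) is an illustrative choice, since the exact overlap is not stated in the text:

```python
import numpy as np

def overlapping_epochs(signal, fs=250, win_sec=8, step_sec=4):
    """Cut one long channel signal into overlapping windows, as used to
    balance the emotion classes. The 50% overlap (step_sec=4) is an
    illustrative choice; the exact overlap is not stated in the text."""
    win, step = fs * win_sec, fs * step_sec
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

x = np.zeros(75_000)                   # 5 min of one channel at 250 Hz
epochs = overlapping_epochs(x)         # each epoch is 2000 samples (8 s)
```

Shortening the step increases the number of epochs per class, which is how the overlap technique equalizes the class sizes.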
Simulation results
The Brainstorm toolbox in MATLAB R2019a was used to calculate the active brain regions with the sLORETA method. The results of this method are used as the initial value for the Bayesian model based on the Bernoulli–Laplace prior. A server with an NVIDIA 1080 Ti GPU and an Intel Core i7 CPU was used to implement the proposed algorithm in TensorFlow 2.0.0 in the Python programming language. The results of the proposed algorithm for automatic detection of emotions are presented in the remainder of this section. In this study, unlike many studies, the evaluation results of the proposed method are presented for emotions induced by both music and images; therefore, to fairly compare the proposed method with the existing state-of-the-art methods, we implement both categories of approaches on our recorded data and on the SEED and DEAP datasets. The sources with less than 50% of a subject's maximum power are eliminated to reduce the computational cost of the algorithm.
The proposed method is evaluated in both subject-dependent and subject-independent scenarios. In the subject-dependent scenario, 4 out of 10 trials are randomly selected as the training set and the remaining 6 trials are used as the testing set. In the subject-independent scenario, the data of 40% of the subjects are used for training, 40% for testing, and 20% for validation of the proposed method. Finally, the accuracy of the proposed method averaged over all subjects is reported.
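The subject-independent 40/40/20 split described above can be sketched as follows. The percentages come from the paper; the random seed and the rounding rule for 16 subjects (6/6/4) are illustrative assumptions.

```python
import numpy as np

def subject_split(n_subjects=16, seed=0):
    """Disjoint train/test/validation split over whole subjects (40/40/20).

    For the 16 subjects of the recorded dataset this gives 6/6/4;
    the exact rounding rule is an assumption, as the paper only
    states the percentages.
    """
    rng = np.random.default_rng(seed)
    ids = rng.permutation(n_subjects)
    n_train = round(0.4 * n_subjects)
    n_test = round(0.4 * n_subjects)
    return ids[:n_train], ids[n_train:n_train + n_test], ids[n_train + n_test:]

train, test, val = subject_split()
print(len(train), len(test), len(val))  # 6 6 4
```

Splitting by subject (rather than by epoch) ensures that test data come from entirely unseen subjects, which is what makes this scenario harder than the subject-dependent one.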
The accuracies of the proposed method and the existing methods^{11,12,17,21,23,25,26} in the subject-dependent scenario are compared in Fig. 9. The lowest accuracy in this comparison, 67.7%, belongs to the method in^{11}, while the method in^{26} achieves an average accuracy of 96.87%. It can be seen that, across all subjects, the highest accuracy belongs to the proposed method with 98.95%. The proposed method and the methods in^{11,12,17,21,23,25,26} are compared under the subject-independent scenario in Fig. 10. As can be seen, the best subject-independent accuracy among the existing methods, 95.83%, belongs to the method in^{26}, whereas our proposed algorithm achieves 97.91% in this scenario.
These results indicate the robustness of the proposed algorithm against cross-subject variations. The accuracy in the subject-independent scenario is lower than in the subject-dependent scenario because unseen subjects' data are used for testing. It is clear from the results that the accuracy of the proposed algorithm in both scenarios is better than that of the methods in^{11,12,17,21,23,25,26}. The evaluation results of the proposed method and the existing methods on the SEED and DEAP datasets are presented in Tables 3 and 4, respectively. For our proposed method, the subject-independent accuracies are 98.51% and 98.32%, and the subject-dependent accuracies are 99.25% and 98.96%, on the SEED and DEAP datasets, respectively. Among the existing methods, the highest subject-dependent and subject-independent accuracies, obtained by the method in^{26}, are 98.51% and 97.77%, respectively. The accuracy of our proposed method therefore exceeds that of the other methods. As shown in Table 5, when the Bernoulli–Laplace-based Bayesian model is used for source localization, the accuracy of the proposed algorithm is higher than when sLORETA is used. Also according to Table 5, if a CNN classifier is used instead of the DGCNN, the accuracy of the proposed algorithm is lower.
Discussion and conclusion
In this study, we propose an algorithm based on DGCNN and EEG sources to recognize emotions. A mapping from scalp sensors to brain sources is performed to extract the pattern of each emotion, using a Bayesian model based on the Bernoulli–Laplace prior. The results of the sLORETA method are used to initialize this model. In the proposed method, a DGCNN is used to classify emotional EEG, in which the sources estimated by the Bernoulli–Laplace-based Bayesian model are considered as the underlying graph nodes. Finally, emotional EEG signals are divided into negative and positive classes using this approach. The proposed method is compared with existing standard methods in subject-dependent and subject-independent experiments on our emotional EEG dataset and on the DEAP and SEED datasets.
Feature extraction from EEG data is a major challenge in all previous methods. In this study, to solve this problem, the spatiotemporal information of the emotional EEG sources is encoded in a graph, and the DGCNN algorithm is then used to classify these graphs. Using the proposed approach, acceptable accuracy is obtained without the need to design a feature extraction process. According to the results, the proposed technique localizes the brain areas involved in emotion processing more precisely, and significant differences can be seen between the areas involved during the induction of positive and negative emotions. This significantly increases the accuracy of emotion classification. Another advantage of the proposed method is the updating of the adjacency matrix in the DGCNN algorithm, which in itself improves the emotion classification accuracy.
Based on the results of previous studies in the field of EEG signal processing^{46}, increasing the number of electrodes used to record the signal improves classification accuracy. However, high-density EEG sensor arrays are costly and time-consuming to use in a clinical or field environment. In this study, we therefore use the source localization technique to increase the spatial information in EEG recordings: their spatial resolution can be improved by increasing the number of estimated sources, which carry rich spatiotemporal information. As reported in the results section, the accuracies of the subject-dependent and subject-independent scenarios for our proposed method are 99.25% and 98.51%, respectively, which are greater than the values obtained by the existing state-of-the-art methods.
The use of video or music video to induce emotions involves, in addition to the areas related to emotion processing, the visual and memory areas^{11,12,17,21,23,25,26}. Given this issue and the results of music-based emotion induction in this study, auditory induction appears to be an easier and more appropriate way to induce emotions. In this study, the weight of each graph edge is determined by calculating the correlation between the graph signal sources. In future studies, other features of the graph signals could be used as criteria to compute the edge weights.
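The correlation-based edge weighting described above can be sketched as below. The paper states that correlation between source signals is used; taking the absolute Pearson correlation and zeroing the diagonal are assumptions made here to obtain a valid non-negative adjacency matrix without self-loops.

```python
import numpy as np

def correlation_adjacency(S, threshold=0.0):
    """Build a weighted adjacency matrix from source time courses.

    S: (n_sources, n_samples). Edge (i, j) is weighted by the absolute
    Pearson correlation between sources i and j. The absolute value,
    the zeroed diagonal, and the optional sparsification threshold
    are assumptions; the paper only says correlation is used.
    """
    A = np.abs(np.corrcoef(S))
    np.fill_diagonal(A, 0.0)   # no self-loops
    A[A < threshold] = 0.0     # optional sparsification
    return A

S = np.random.randn(10, 2000)  # e.g. 10 sources, one 8-s epoch at 250 Hz
A = correlation_adjacency(S)
print(A.shape)  # (10, 10), symmetric, zero diagonal
```

A matrix of this form can serve as the trainable initial adjacency of a DGCNN, which then updates the edge weights during training.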
References
Marg, E. Descartes' Error: Emotion, reason, and the human brain. Optom. Vis. Sci. 72, 847–848 (1995).
Marrero-Fernández, P., Montoya-Padrón, A., Jaume-i-Capó, A. & Buades Rubio, J. M. Evaluating the research in automatic emotion recognition. IETE Tech. Rev. 31, 220–232 (2014).
Darwin, C. & Prodger, P. The Expression of the Emotions in Man and Animals (Oxford University Press, 1998).
Tian, Y.-I., Kanade, T. & Cohn, J. F. Recognizing action units for facial expression analysis. IEEE Trans. Pattern Anal. Mach. Intell. 23, 97–115 (2001).
Liu, Y., Sourina, O. & Nguyen, M. K. Transactions on Computational Science XII 256–277 (Springer, 2011).
Michel, C. M. & Murray, M. M. Towards the utilization of EEG as a brain imaging tool. Neuroimage 61, 371–385 (2012).
da Silva, F. L. EEG and MEG: Relevance to neuroscience. Neuron 80, 1112–1128 (2013).
Williams, D. & WilliamsMorris, R. Racism and mental health: The African American experience. Ethn. Health 5, 243–268 (2000).
Rotton, J. & Frey, J. Air pollution, weather, and violent crimes: Concomitant time-series analysis of archival data. J. Pers. Soc. Psychol. 49, 1207 (1985).
Jäncke, L. & Alahmadi, N. Detection of independent functional networks during music listening using electroencephalogram and sLORETA-ICA. NeuroReport 27, 455–461 (2016).
Padilla-Buritica, J. I., Martinez-Vargas, J. D. & Castellanos-Dominguez, G. Emotion discrimination using spatially compact regions of interest extracted from imaging EEG activity. Front. Comput. Neurosci. 10, 55 (2016).
Chen, G., Zhang, X., Sun, Y. & Zhang, J. Emotion feature analysis and recognition based on reconstructed EEG sources. IEEE Access 8, 11907–11916 (2020).
Ekman, P. Are There Basic Emotions? (Springer, 1992).
Tsolaki, A. C. et al. Age-induced differences in brain neural activation elicited by visual emotional stimuli: A high-density EEG study. Neuroscience 340, 268–278 (2017).
Batabyal, T., Muthukrishnan, S., Sharma, R., Tayade, P. & Kaur, S. Neural substrates of emotional interference: A quantitative EEG study. Neurosci. Lett. 685, 1–6 (2018).
Song, T., Zheng, W., Song, P. & Cui, Z. EEG emotion recognition using dynamical graph convolutional neural networks. IEEE Trans. Affect. Comput. 11, 532–541 (2018).
Goshvarpour, A. & Goshvarpour, A. EEG spectral powers and source localization in depressing, sad, and fun music videos focusing on gender differences. Cogn. Neurodyn. 13, 161–173 (2019).
Wang, F. et al. Emotion recognition with convolutional neural network and EEGbased EFDMs. Neuropsychologia 146, 107506 (2020).
Khare, S. K. & Bajaj, V. Time-frequency representation and convolutional neural network-based emotion recognition. IEEE Trans. Neural Netw. Learn. Syst. 32, 2901–2909 (2020).
Song, T., Liu, S., Zheng, W., Zong, Y. & Cui, Z. Proceedings of the AAAI Conference on Artificial Intelligence, 2701–2708.
Zhong, P., Wang, D. & Miao, C. EEG-based emotion recognition using regularized graph neural networks. IEEE Trans. Affect. Comput. 1, 1 (2020).
Jin, L. & Kim, E. Y. Interpretable cross-subject EEG-based emotion recognition using channel-wise features. Sensors 20, 6719 (2020).
Plummer, C., Harvey, A. S. & Cook, M. EEG source localization in focal epilepsy: Where are we now?. Epilepsia 49, 201–218 (2008).
Pascual-Marqui, R. D., Michel, C. M. & Lehmann, D. Low resolution electromagnetic tomography: A new method for localizing electrical activity in the brain. Int. J. Psychophysiol. 18, 49–65 (1994).
Pascual-Marqui, R. D. Standardized low-resolution brain electromagnetic tomography (sLORETA): Technical details. Methods Find Exp. Clin. Pharmacol. 24, 5–12 (2002).
Grech, R. et al. Review on solving the inverse problem in EEG source analysis. J. Neuroeng. Rehabil. 5, 1–33 (2008).
Hallez, H. et al. Review on solving the forward problem in EEG source analysis. J. Neuroeng. Rehabil. 4, 1–29 (2007).
Kiebel, S. J., Daunizeau, J., Phillips, C. & Friston, K. J. Variational Bayesian inversion of the equivalent current dipole model in EEG/MEG. Neuroimage 39, 728–741 (2008).
Mosher, J. C., Leahy, R. M. & Lewis, P. S. EEG and MEG: Forward solutions for inverse methods. IEEE Trans. Biomed. Eng. 46, 245–259 (1999).
Costa, F., Batatia, H., Chaari, L. & Tourneret, J.-Y. Sparse EEG source localization using Bernoulli Laplacian priors. IEEE Trans. Biomed. Eng. 62, 2888–2898 (2015).
Casella, G. & Robert, C. P. Monte Carlo Statistical Methods (Springer, 1999).
Shuman, D. I., Narang, S. K., Frossard, P., Ortega, A. & Vandergheynst, P. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30, 83–98 (2013).
Chung, F. R. & Graham, F. C. Spectral Graph Theory (American Mathematical Society, 1997).
Defferrard, M., Bresson, X. & Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. Adv. Neural. Inf. Process. Syst. 29, 3844–3852 (2016).
Talairach, J. Co-planar Stereotaxic Atlas of the Human Brain: 3-Dimensional Proportional System. An Approach to Cerebral Imaging (1988).
Brodmann, K. Vergleichende Lokalisationslehre der Grosshirnrinde in ihren Prinzipien dargestellt auf Grund des Zellenbaues (Barth, 1909).
Sheykhivand, S., Mousavi, Z., Rezaii, T. Y. & Farzamnia, A. Recognizing emotions evoked by music using CNN-LSTM networks on EEG signals. IEEE Access 8, 139332–139345 (2020).
Bradley, M. M. & Lang, P. J. Measuring emotion: The Self-Assessment Manikin and the semantic differential. J. Behav. Ther. Exp. Psychiatry 25, 49–59 (1994).
Beck, A. T., Steer, R. A. & Brown, G. K. Beck Depression Inventory (BDI-II) Vol. 10 (Pearson, 1996).
Romanowicz, K., Kozłowska, K. & Wichniak, A. Psychomotor retardation in recurrent depression and the related factors. Adv. Psychiatr. Neurol. 28, 208–219 (2019).
Author information
Contributions
S.A. and T.Y.R. conceived of the presented idea. S.A. developed the theory and performed the computations. T.Y.R., S.B. and S.M. supervised the findings of this work. All authors discussed the results and contributed to the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Asadzadeh, S., Yousefi Rezaii, T., Beheshti, S. et al. Accurate emotion recognition using Bayesian model based EEG sources as dynamic graph convolutional neural network nodes. Sci Rep 12, 10282 (2022). https://doi.org/10.1038/s41598-022-14217-7