Introduction

In recent years, the study of emotion recognition has become increasingly popular among researchers from diverse backgrounds. This is likely due to the ability of emotion recognition to reveal important aspects of individual behavior and mental states. Affective computing is a relatively new research field that aims to use computer systems to detect, analyze, and interpret the emotional information people provide1. This form of computing enables researchers to understand how people feel, what triggers their feelings, and how to design more responsive and better systems to meet people's needs. However, one of the most challenging aspects of this technology is the development of computational methods and approaches that support natural interaction between computers and humans. This field is known as human–computer interaction (HCI), as it analyzes and evaluates the emotional exchanges and emotional states that arise between a person and a machine2.

For a better understanding of emotional states and exchanges, two categories of methods have been proposed. The first utilizes affective behavioral characteristics, such as speech intonation, facial gestures, and body language, to detect emotions. The second considers the physiological activity recorded by non-invasive sensors, detecting emotions as electrical responses3. Emotions have been represented in two main ways in the emotion recognition literature. The first approach categorizes emotions as distinct states, including the six fundamental emotions suggested by Ekman and Friesen4. According to the second approach, emotion is expressed as a continuous 4-D space of valence, arousal, dominance, and liking5. However, most investigations reduce this space to 2-D, applying the valence and arousal dimensions6.

Many researchers have found that the generation and activity of emotions are closely related to the activity of the brain's cortex7, and the electroencephalogram (EEG) has recently been utilized for monitoring brain activity due to its high detection sensitivity in comparison with other methods8. However, the number of channels and the various frequency bands in the recordings complicate the analysis and require advanced tools. Since machine learning (ML) and deep learning (DL) are effective approaches due to their ability to relate features to decisions, they are frequently used for analyzing EEG signals to solve the problem mentioned above9. However, ML comprises data preparation, feature selection, and classification steps, which generally require manual procedures and can cause the loss of relevant data while increasing the computational cost of data preparation10. Numerous models, including K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), cover a wide spectrum of machine learning techniques. Since KNN is nonparametric, it makes no assumptions about the underlying dataset; it is a supervised learning algorithm well known for its simplicity and efficacy. To predict the class of unlabeled data, a labeled training dataset with data points categorized into several classes is supplied11.

Support Vector Machines (SVM) are machine learning techniques that perform binary linear classification by separating classes with the boundary that maximizes the distance to the nearest instances of each class. It is called the optimal margin classifier for this reason12.
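As an illustration (not part of the original study), a minimal scikit-learn sketch of how these two classical baselines might be trained on pre-extracted EEG feature vectors is given below; the feature matrix `X` and binary labels `y` are hypothetical placeholders.

```python
# Hypothetical baseline sketch: KNN and SVM on pre-extracted EEG features.
# X (n_samples x n_features) and y (binary valence labels) are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1280, 70))    # e.g., 14 channels x 5 band powers
y = rng.integers(0, 2, size=1280)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)  # nonparametric, distance-based
svm = SVC(kernel="linear").fit(X_train, y_train)                 # maximum-margin classifier

print("KNN accuracy:", knn.score(X_test, y_test))
print("SVM accuracy:", svm.score(X_test, y_test))
```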

On the contrary, recent advances in deep learning have made it very successful at recognizing images, speech, and text. This is because deep learning can learn complex, high-level features on its own and requires less time to extract the features of a particular object. The reasonable and superior results achieved by 1D convolutional neural networks (CNN) have made them one of the benchmark algorithms in EEG analysis10. Extracting features within pre-defined windows (periods) of the recordings in the convolutional layer decreases the computational cost of data preparation and reduces the noise dependency of the analysis. However, the similarity of the signals and the channel variety require an approximation of the signals to improve the accuracy of emotion recognition. Deep bidirectional LSTMs (Bi-LSTM) advance conventional LSTM models by processing the input with two LSTMs: one processes the sequence in its original order, and the other processes it in reverse. Using the LSTM twice improves the learning of long-term dependencies, thereby improving the model's accuracy13.
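A minimal Keras sketch of the Bi-LSTM idea described above follows; the input shape and layer sizes are illustrative assumptions rather than values from the cited papers.

```python
# Illustrative Bi-LSTM sketch: forward and backward LSTM passes over a sequence.
# Input shape (timesteps, channels) is an assumed placeholder.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 14)),                   # 128 time steps, 14 channels
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # forward + backward LSTM
    tf.keras.layers.Dense(2, activation="softmax"),           # binary emotion classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```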

On the other hand, neural networks and fuzzy systems are both universal approximators suited to modelling nonlinear functions. Therefore, the fuzzy neural network (FNN) is a hybrid model that merges the capabilities of neural networks and fuzzy logic into a single, cohesive system14. The FNN offers the benefit of improving the efficacy of function estimation.

This paper proposes a hybrid Convolutional Fuzzy Neural Network (CFNN) model for emotion classification on the DEAP database5. The proposed model aims to provide a more comprehensive and accurate understanding of emotions within the dataset by combining multiple approaches and techniques. In the proposed method, signals are used as inputs of the CFNN for the first time, and the method comprises a layer for converting the flattened features obtained in the convolutional layers into fuzzy quantities (fuzzification), followed by a layer for converting the fuzzy sets back into crisp values (defuzzification). The suggested approach enhances classification accuracy by utilizing fuzzy neural networks, which can produce not just precise values but also fuzzy values. This implies that fuzzy sets potentially contain additional information, leading to improved accuracy. Moreover, the model is adept at managing noise in the data and therefore demonstrates improved recognition and classification capabilities. Additionally, the study highlights the significant impact of using the Fast Fourier Transform (FFT) as a feature extraction technique: applying the FFT to the input signals extracts relevant frequency-domain information, enhancing the classification model's discriminative power.

The rest of this paper is organized as follows: section “Related works” summarizes the related works in the literature in a timeline, highlighting the problems. Section “Methods” presents the materials and methods considered in this paper. Section “Proposed method” describes the proposed method in detail. The “Results” section presents the analysis and discussion of the model's results. Finally, the last section is the “Conclusion” of the proposed work.

Related works

Several studies have been conducted on emotion recognition, and most have focused on ML or DL approaches in the last decade.

Machine learning-related works

Numerous machine learning techniques are used to classify EEG signals, including K-Nearest Neighbor (KNN)15, Support Vector Machine (SVM)16, Decision Tree (DT)17, and Random Forest (RF)18. Traditional EEG-based emotion detection algorithms primarily concentrate on extracting individual EEG features from different domains.

Ismael et al.19 proposed a technique for categorizing EEG data based on a two-stage majority vote. First, bandpass filters were used to reduce noise in the raw EEG data, and afterward, low-pass filters were used to extract the rhythms. The rhythms were analyzed based on their fractal dimensions and wavelet-based entropy features, which were evaluated using KNN on the DEAP dataset. In other studies20,21, researchers applied SVM using multichannel feature fusion and a dimensional model to detect diverse emotions.

Amiri et al.22 proposed applying the DWT approach to extract the EEG signal’s properties. They used the DEAP dataset to categorize real-time affective responses using the arousal–valence dimensional model. Two distinct classifiers, SVM and KNN, were used in this study and achieved reasonable accuracy. They concluded that the gamma band, as a high-frequency band, provided higher accuracy than the other frequency bands.

To investigate the effects of the various frequency bands and number of channels on accuracy, Li et al.23 divided the DEAP dataset into four frequency bands. Then, as an input characteristic for a KNN classifier, the entropy and energy of each band were computed. The authors concluded that the gamma frequency band exhibited the highest classification accuracy regardless of the valence or arousal dimension. Also, it was shown that the gamma frequency band, as opposed to the low-frequency band, was significant for the emotional state in the valence and arousal dimensions. Additionally, they demonstrated how adding more EEG channels could enhance the categorization precision of emotional states.

Furthermore, scientists applied three distinct methods24 to combine data from various channels; the Fusion after deep feature reduction (FaDFR) method, which combines reduced deep time–frequency features from EEG channels using an Inception-V325 CNN for deep feature extraction and an SVM for classification, produced superior results. The results demonstrated 88.6% accuracy on the DEAP dataset and 94.58% accuracy on the SEED dataset26.

Moreover, a novel approach was introduced in27 for emotion detection using multichannel EEG data. The framework utilized a linear EEG model and an emotion timing model to improve accuracy in emotion classification. Signal framing, a Hamming window, and power spectral density were used to extract features from the signals. They achieved 81.10% and 74.38% accuracy for valence and arousal on the DEAP dataset, respectively.

Deep learning-related works

As DL techniques progress rapidly, DL modules may eventually replace all or part of the abovementioned systems' components. Several networks have been proposed based on CNN28 and Long-Short Term Memory (LSTM) neural networks29.

Xiao et al.30 proposed an emotion recognition algorithm that relied on a CNN. EEG signals were mapped into 4D spaces, and Differential Entropy features were extracted from them. The next step involved obtaining spatial and spectral information by applying the CNN to each temporal slice. An LSTM was used to investigate the salient aspects of the various slices and identify their emotions by assigning different weights to diverse brain regions and frequency bands. The algorithm obtained excellent classification results. The study was implemented on both the DEAP and SEED datasets.

Cimtay et al.31 attempted end-to-end methods for the classification of emotions using a CNN. In this research, they developed the model by adding more layers to improve the classification performance. A median filter was utilized to eliminate false identifications along an emotional prediction interval, improving classification accuracy. In32, the model comprised a 1D convolution layer that collects electrode correlations throughout the spatial dimension and receives weighted combinations of contextual data on the DEAP and SEED datasets to overcome the limitations of nonlinear estimation and effectively extract features from frequency bands. The developed technique can effectively extract features from noisy EEG data while managing electrode correlations and temporal dependencies.

In another study33, a stacked Bi-LSTM was applied to the DEAP database to gain high accuracy and performance in emotion detection. The model’s efficiency was enhanced by extracting statistical, wavelet, and Hurst exponent features. The Binary Grey Wolf Optimizer (BGWO) algorithm managed the complexity and high dimensionality of the dataset, reducing classification time and improving the model's effectiveness.

A newly developed subject-independent deep learning framework34 comprises two main components. First, an unsupervised LSTM with a channel-attention autoencoder was suggested for obtaining a subject-invariant latent vector subspace for each subject. Second, a CNN with an attention framework was described for carrying out subject-independent emotion recognition based on the previous step. The method was evaluated on the DEAP, SEED, and CHB-MIT35 datasets.

Fusion of models in deep learning was a challenge faced by researchers36. The fused model employed multiple graph convolutional neural networks (GCNNs) to extract features from the graph domain. Additionally, LSTM cells were utilized to capture the evolving relationships between two EEG channels over time and perform temporal feature extraction. Finally, a dense layer was employed to classify emotions based on the extracted features. The results were superior to the state-of-the-art methods on the DEAP dataset. The authors computed an attention weight, which weights the emotional states that arise at particular moments, to identify the input component with the most impact on the output; the greater the importance of the relevant input component, the higher its magnitude during network training.

In another study37, a hybrid model combining CNN and LSTM was presented to construct a deep learning system for emotion identification, applied to the DEAP and DREAMER38 databases. This method applied CNN and channel-wise attention mechanisms to investigate spatial information. Additionally, an attention-based convolutional recurrent neural network (ACRNN) combined extended self-attention with an RNN to investigate temporal information in EEG signals. Iyer et al.39 employed differential entropy (DE) to extract different frequencies from EEG signals, and a hybrid model was produced by combining CNN and LSTM sub-blocks. They achieved 65% accuracy on the DEAP dataset and 97.16% on the SEED dataset.

A novel fuzzy rule-based categorization system that uses EEG to measure emotional characteristics was presented in40. This study’s purpose was to derive rules for fuzzy categorization automatically from EEG data. The proposed method extracted a set of rules from the EEG data using FCM. Applying the suggested fuzzy emotion classifier and the fuzzy extraction approach yielded a faster calculation time. Because it learns from data and builds rules, FCFCM can be used for every axis of three-dimensional models of human emotion. According to the results, the algorithm's accuracy was higher than that of SVM and fuzzy classification using pre-defined rules, with scores of 55.77%, 49.62%, and 54%, respectively.

Methods

This section presents the considered dataset and evaluation metrics. Additionally, CNN and FNN are briefly described before the proposed model.

Dataset

In this study, we considered a single but state-of-the-art dataset, the DEAP dataset5, which is one of the gold-standard and most common datasets for emotion recognition. The DEAP dataset makes it feasible to assess the quality of the extracted features and the performance of the proposed models.

The dataset consists of 32 subjects (16 male and 16 female) who each watched 40 videos to stimulate distinct emotions. The Biosemi ActiveTwo device was used to record EEGs from the individuals while they were watching the clips. Following each clip, they completed a Likert-scale questionnaire ranging from 1 (low) to 9 (high) to record their degree of arousal, valence, dominance, liking, and familiarity. Since there were adequate observations for each subject, this single dataset was sufficient to validate the approach and achieve accurate and reliable results.

The DEAP dataset is already pre-processed and is accessible to all researchers. Music videos are utilized as stimuli for triggering emotions in this dataset. The signals were recorded at 512 Hz and resampled to 128 Hz. In this investigation, the employed method involves a 2D emotional model that includes valence and arousal. Valence measures how positive or negative an emotion is, whereas arousal refers to the intensity of the related feelings and indicates the degree of excitement or apathy. Figure 1 illustrates the overall structure of the DEAP dataset.

Figure 1. Summary of the DEAP dataset content.

Evaluation metrics

Several evaluation metrics can be used to evaluate the performance of a model. In our study, we used four common metrics, namely accuracy, precision, recall, and F1-score, to evaluate and analyze the results of the proposed model and to enable comparison with existing recent studies.

Accuracy is used to measure the general recognition ability of the models. The formula for accuracy is given in Eq. (1).

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$
(1)

where TP, TN, FP, and FN represent True Positive, True Negative, False Positive, and False Negative samples of predicted data.

However, accuracy alone can be misleading regarding a model's actual performance, particularly for imbalanced data and multinomial classification. Therefore, precision and recall help us analyze how effectively the model recognizes positive or negative samples for each class separately. The formulae of precision and recall are given in Eqs. (2) and (3).

$$\text{Precision} = \frac{TP}{TP + FP}$$
(2)
$$\text{Recall} = \frac{TP}{TP + FN}$$
(3)

The F1-score is an effective evaluation metric, particularly for imbalanced datasets, for determining the actual performance and recognition ability of models. It reduces the effect of the majority class on the results and provides a more balanced score than accuracy. The formula of the F1-score is given in Eq. (4).

$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
(4)
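For concreteness, a small sketch of how these four metrics could be computed from a model's binary predictions is given below; scikit-learn is an assumed tooling choice, and the labels are made-up examples.

```python
# Sketch: computing accuracy, precision, recall, and F1 from binary predictions.
# y_true and y_pred are illustrative placeholders.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))   # (TP + TN) / all samples
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1-Score :", f1_score(y_true, y_pred))         # harmonic mean of the two
```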

Convolutional neural network

CNN plays a significant role in the field of deep learning. It is one of the most effective methods for image analysis, since it extracts features in its convolutional layers and classifies them in the fully connected layer. The one-dimensional implementation of CNN (1D-CNN) has gained significant importance for extracting features from time-series data and making robust predictions. This makes 1D-CNN a popular tool for analyzing sequential data such as EEG signals41. We employed a 1D-CNN to extract the characteristics of EEG signals and to predict emotions prior to the fuzzification.

Fuzzy neural network (FNN)

The Fuzzy Neural Network, which merges the beneficial properties of fuzzy logic and neural networks, is an essential approach for intelligent information processing42. As a result, the fuzzy neural network technique has a powerful potential for both direct data processing through self-learning and efficient representation of structural knowledge. A multi-input fuzzy neural network system evaluates each input unit according to the degree to which it belongs to each fuzzy set.

The fuzzy rules used in the design of the FNN are represented using the “If–then” format, and they are as follows (Eq. 5):

$$P_{j}: \text{IF } u_{1} \text{ is } P_{1j} \wedge \ldots \wedge u_{n} \text{ is } P_{nj}, \text{ THEN } y_{j} = w_{j}$$
(5)

where \(P_{j}\) represents a fuzzy rule, \(P_{ij}\) represents fuzzy sets, and \(w_{j}\) is a zero-order Takagi–Sugeno–Kang weight. The fuzzy set \(P_{ij}\) is defined using a Gaussian membership function (Eq. 6):

$$P_{ij}(u_{i}) = \exp \left\{ \frac{-\left( u_{i} - m_{ij} \right)^{2}}{2\sigma_{ij}^{2}} \right\}$$
(6)

where \(\exp(\cdot)\) is the exponential function, and \(m_{ij}\) and \(\sigma_{ij}\) are the mean and standard deviation of the fuzzy set \(P_{ij}\), respectively.
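As a quick illustration of Eqs. (5) and (6), a minimal NumPy sketch of evaluating Gaussian membership degrees and a zero-order Takagi–Sugeno–Kang rule output follows; the means, spreads, weight, and the product t-norm for the AND are illustrative assumptions.

```python
# Sketch of Eqs. (5)-(6): Gaussian memberships feeding a zero-order TSK rule.
# m, sigma, and w_j are made-up example parameters.
import numpy as np

def gaussian_mf(u, m, sigma):
    """Gaussian membership degree of input u in the fuzzy set (m, sigma)."""
    return np.exp(-((u - m) ** 2) / (2 * sigma ** 2))

u = np.array([0.3, 0.7])        # inputs u_1, u_2
m = np.array([0.5, 0.6])        # means m_1j, m_2j of rule j
sigma = np.array([0.2, 0.3])    # spreads sigma_1j, sigma_2j
w_j = 1.5                       # zero-order TSK consequent weight

firing = np.prod(gaussian_mf(u, m, sigma))  # AND of antecedents (product t-norm)
y_j = firing * w_j                          # rule j's contribution to the output
print(firing, y_j)
```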

Proposed method

The proposed method consists of pre-processing to minimize noise and artifacts, feature extraction to obtain the most informative training data, and an improved CFNN to classify emotions. Figure 2 gives a brief overview of the proposed methodology.

Figure 2. The proposed framework of emotion recognition in the current study.

Pre-processing and feature extraction

Even though the DEAP dataset is provided as a pre-processed edition of its raw EEG data, it still includes noise and several artifacts that can disrupt the analysis. In this study, EEG signals on 32 channels from 32 contributors viewing 40 videos in the DEAP dataset were used in the experiments. The EEG signals were first down-sampled to 128 Hz to capture accurate data content between 0 and 48 Hz5, and artifacts were purged from the down-sampled data. Other elements in the signal, such as incremental waves, were also isolated from the analysis after applying the bandpass filter.

One of the basic and challenging tasks in human emotion detection, which changes with emotion fluctuation, is determining the appropriate features and attributes8. The performance of an emotion recognition model is strongly influenced by the quality of the features, which explains why extracting features that are both strongly associated with emotion and accurately representative is the main component of emotion recognition43. Feature extraction decreases the dimensionality of the feature space by identifying the most valuable features for classification from an enormous amount of feature data, or by generating a collection of "few but precise" characteristics with an extremely low probability of classification error. Each feature derived from a signal provides specific details regarding the data and describes how the signal behaves. Feature extraction methods aim to produce a model with fewer features but more precision44. There are several techniques for extracting features from EEG signals, falling into three primary kinds: time-domain features (TDF), frequency-domain features (FDF), and time–frequency-domain features (TFDF)45.

A common signal processing technique used to convert time-domain signals to frequency-domain signals is Fourier analysis45. Fourier transformations are used in this study to break the EEG signal down into its frequency components. The FFT (Fast Fourier Transform), which calculates a sequence's DFT (Discrete Fourier Transform), is a widely used method for computing the Fourier transform46. It yields the same results as evaluating the DFT definition directly, except that it is significantly faster. The formula of the DFT is given in Eq. (7).

$$X_{k} = \sum_{i=0}^{N-1} x_{i}(n)\, e^{\frac{-j2\pi ik}{N}} \quad \text{for } k = 0, 1, 2, \ldots, N-1$$
(7)

where \(X_{k}\) denotes the discrete Fourier coefficient, \(N\) is the length of the accessible data, and \(x_{i}(n)\) is the input signal in the time domain47. The fraction of a signal's frequency bands that cannot be confirmed in the time domain can be confirmed once the signal is transformed into the frequency domain by the FFT (Eq. 7)48.

In this study, the FFT integrates information from the raw EEG database, taking the window size into account. First, all raw data from the DEAP dataset, specifically the 40 recorded channels, is loaded for a single subject. Subsequently, the selected 14 channels of EEG data are gathered. Third, each of the 14 channels of EEG data is plotted. In step four, the FFT converts each channel signal into the frequency domain using five power bands. In total, 14 channels were selected for this investigation, as shown in Table 1. The last stage combines the frequency-domain representations of the 14 channels. After the feature extraction process, the features were fed to the CFNN for classification.

Table 1 FFT description for parameters.
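A minimal sketch of this kind of windowed FFT band-power extraction is shown below; the band boundaries, window length, and placeholder signal are illustrative assumptions, not the exact parameters of Table 1.

```python
# Sketch: windowed FFT band-power features from one EEG channel.
# Band edges and window size are illustrative assumptions.
import numpy as np

FS = 128  # DEAP sampling rate after down-sampling (Hz)
BANDS = {"delta": (0, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 45)}

def band_powers(window, fs=FS):
    """Average spectral power of each band in one window of samples."""
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    power = np.abs(np.fft.rfft(window)) ** 2
    return [power[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in BANDS.values()]

signal = np.random.randn(60 * FS)   # placeholder for one channel's recording
win = 32                            # window size in samples (one tested setting)
features = [band_powers(signal[i:i + win]) for i in range(0, len(signal) - win, win)]
print(np.array(features).shape)     # (n_windows, n_bands)
```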

The proposed convolutional fuzzy neural network model

The CFNN model is a neural network structure that integrates fuzzy logic and convolutional neural networks. It is primarily developed to deal with ambiguous or fuzzy data. Our proposed architecture combines the features extracted by the CNN with the fuzzy engine of the fuzzy neural network (FNN). The benefits of both network designs are combined in this approach.

The utilization of 1D-CNNs has become increasingly significant in extracting features from time-series data and generating reliable predictions. The filters in the convolution layers extract features from 1D input sequences, and the most informative features are activated by the activation functions. The subsequent layers provide more significant and distinguishable features corresponding to their labels, enabling proper predictions in the fully connected layers. However, it is challenging for a 1D-CNN to optimize the extracted features due to the similarities or differences of time-series data. Even though a 1D-CNN can achieve reasonable results, improving or modifying the extracted features through fuzzification increases the recognition ability of the models.

Fuzzification refers to the process of adding a fuzzy layer to the model. This layer, which transforms the input matrix (the extracted features) into the fuzzy domain, allows high-dimensional feature extraction to be accomplished using a convoluted representation of the result, with the ability to handle noise in the data. The fuzzy set’s estimate follows Eqs. (8) and (9), which give the possibilities of the components existing in the domain of fuzzy numbers49.

$$\hat{X} = \mathrm{Fuzzification}\left( X_{i,j} \mid cx_{i,j} \right)$$
(8)
$$\hat{x}_{i,j} = \mathrm{possibility}\left( x_{i,j} \mid MF_{i,j} \right) = \max MF_{i,j}\, \delta\left( x - x_{i,j} \right)$$
(9)

Here, \(i\) and \(j\) represent the indices of element \(x\) in the input matrix \(X\), and \(cx\) is the center of the input fuzzy membership function. \(\delta(x - x_{i,j})\) represents the Kronecker delta function.

In the proposed model, the CNN architecture includes two 1D convolution layers. In the first convolution layer, we aimed to increase the dimension of the features using 64 filters with a kernel size of 5, and in the second convolutional layer, we compressed the feature map using 32 filters with a kernel size of 3. To minimize the computational cost and the dimension of the feature map, max-pooling was applied, ensuring the choice of pertinent features. The extracted features are flattened and then fed to the fuzzification layer to eliminate noise in the data.

After the fuzzification of the features, batch normalization is applied to avoid overfitting, and a dense layer with two nodes is used prior to the defuzzification to improve the classification performance.

Defuzzification is the process of converting the fuzzified values back into crisp values, through which the noise-free and more informative features are transmitted to the final output layer. The defuzzification procedure calculates the crisp value \(\nu_{i}\) using Eq. (10).

$$\nu_{i} = \mathrm{defuzzy}\left( x_{i} \right) = \frac{\sum C_{y}\, x_{i}}{\sum x_{i}}$$
(10)

where \(C_{y}\) represents the core of the defuzzification membership function, and \(W_{fc}\) denotes the weight assigned to the fully connected layer.
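To make the pipeline concrete, a compact Keras sketch of how such a CFNN could be assembled is shown below. The fuzzification and defuzzification layers are simplified interpretations of Eqs. (8)–(10) with learnable Gaussian centers and spreads; the input shape, the number of fuzzy sets, and all hyperparameters not stated in the paper are assumptions, so this is a sketch in the spirit of the proposed model rather than its exact implementation.

```python
# Sketch of a CFNN: Conv1D feature extraction, a Gaussian fuzzification layer,
# batch normalization, a two-node dense layer, and a centroid-style
# defuzzification. Shapes and hyperparameters are assumptions.
import tensorflow as tf

class Fuzzify(tf.keras.layers.Layer):
    """Maps each flattened feature to membership degrees of learnable fuzzy sets."""
    def __init__(self, n_sets=3):
        super().__init__()
        self.n_sets = n_sets

    def build(self, input_shape):
        d = input_shape[-1]
        self.m = self.add_weight(name="m", shape=(d, self.n_sets))  # centers
        self.s = self.add_weight(name="s", shape=(d, self.n_sets),
                                 initializer="ones")                 # spreads

    def call(self, x):
        x = tf.expand_dims(x, -1)                                    # (batch, d, 1)
        mu = tf.exp(-tf.square(x - self.m) / (2.0 * tf.square(self.s) + 1e-6))
        return tf.reshape(mu, (-1, mu.shape[1] * self.n_sets))       # flatten memberships

class Defuzzify(tf.keras.layers.Layer):
    """Centroid-style crisp output, following the form of Eq. (10)."""
    def build(self, input_shape):
        self.c = self.add_weight(name="c", shape=(input_shape[-1],))  # cores C_y

    def call(self, mu):
        return tf.reduce_sum(self.c * mu, -1, keepdims=True) / (
            tf.reduce_sum(mu, -1, keepdims=True) + 1e-6)

inputs = tf.keras.Input(shape=(70, 1))                # e.g., 14 channels x 5 band powers
x = tf.keras.layers.Conv1D(64, 5, activation="relu")(inputs)
x = tf.keras.layers.Conv1D(32, 3, activation="relu")(x)
x = tf.keras.layers.MaxPooling1D(2)(x)
x = tf.keras.layers.Flatten()(x)
x = Fuzzify()(x)                                      # fuzzification of flattened features
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Dense(2, activation="relu")(x)    # two-node dense layer
x = Defuzzify()(x)                                    # back to crisp values
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```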

Figure 3 demonstrates the general structure of the proposed CFNN model, and Table 2 presents detailed information about the parameters and layers of the proposed model.

Figure 3. Structure of the proposed convolutional fuzzy neural network.

Table 2 Detailed information about the parameters and layers of the proposed model.

Results

Experiments

Several experiments, varying from the training and testing split ratio to the feature extraction window size, were performed to analyze the proposed model's recognition ability and to enable comparative studies with state-of-the-art methods. The experiments were performed on a PC with an Intel(R) Core(TM) i5-5200U CPU @ 2.20 GHz, 64 GB of RAM, and an NVIDIA GeForce 840M GPU. The proposed model was implemented in Google Colaboratory, a hosted Jupyter Notebook service. The Python version is 3.10.12, and the TensorFlow version is 2.12.0.

In addition to the proposed improved CFNN method, we trained SVM, KNN, 1D-CNN, and Bi-LSTM models for the binary classification of emotions (arousal and valence) to provide a comparative study. All models were trained using 75% of the data obtained from the feature extraction process as training data and 25% as test data. The proposed model's learning rate and batch size were set to 0.01 and 256, respectively, and the Adam optimizer was used. Training was stopped after 100 epochs.

The window size of the FFT was varied between 4 and 128, doubling at each experiment, and the training/testing ratio was set to 90–10%, 80–20%, and 75–25%. In addition to the hold-out experiments, a five-fold cross-validation experiment was performed on the proposed model to obtain consistent results.
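As an illustration of this evaluation protocol, a brief sketch of the hold-out splits and the five-fold cross-validation using scikit-learn follows; `X`, `y`, and `build_model` are hypothetical placeholders rather than artifacts of the original code.

```python
# Sketch of the evaluation protocol: three hold-out ratios plus 5-fold CV.
# X, y, and build_model() are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import train_test_split, KFold

X = np.random.randn(1280, 70)
y = np.random.randint(0, 2, size=1280)

for test_size in (0.10, 0.20, 0.25):                  # 90-10, 80-20, 75-25 splits
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size, random_state=0)
    # model = build_model(); model.fit(X_tr, y_tr); model.evaluate(X_te, y_te)

kf = KFold(n_splits=5, shuffle=True, random_state=0)  # K = 5 cross-validation
for train_idx, test_idx in kf.split(X):
    X_tr, X_te = X[train_idx], X[test_idx]
    y_tr, y_te = y[train_idx], y[test_idx]
    # model = build_model(); model.fit(X_tr, y_tr); model.evaluate(X_te, y_te)
```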

Results

When the proposed model was trained using different training and testing ratios, the highest accuracy, F1-score, and precision were obtained in the five-fold cross-validation experiments (K = 5), where all data were considered in separate training and testing folds. The highest recall and the second-best scores on the other metrics were achieved with a 75:25 training/testing ratio. The other hold-out ratios could not outperform the scores obtained with K = 5. Based on these results, to minimize training time and speed up the analyses, the rest of the experiments were performed using a 75:25 hold-out ratio. Table 3 presents the results obtained in the hold-out and K = 5 experiments.

Table 3 Performance of the proposed CFNN using varying train/test splits.

Fluctuating recognition rates were obtained when the proposed model was trained with different window sizes (4, 8, 16, 32, 64, 128). This made it challenging to conclude which window size was more effective and informative; however, it should be noted that window sizes of 4 and 8 might not be suitable for classifying emotions. Table 4 presents the results obtained by the proposed method with different window sizes.

Table 4 Performance of the proposed CFNN using different window sizes.

Based on the hold-out and window size experiments, the proposed model with a 75:25 training/testing ratio and a window size of 32 achieved F1-scores of 98.39 and 97.93 for valence and arousal, respectively. The proposed CFNN model yielded enhanced classification capabilities, with an average accuracy of 98% when employing fusion techniques. This improved performance can be attributed to fuzzy logic’s capacity to emulate human reasoning. As the weights were modified periodically, overfitting was reduced, which gave the best outcomes. Table 5 shows the accuracy, precision, recall, and F1-score results of the proposed model for valence and arousal on the DEAP dataset.

Table 5 Results of the proposed CFNN method on the DEAP dataset (in 75–25%).

When the other comparative methods were considered, it was clear that the SVM and KNN failed to produce correct emotional classifications using the extracted features. The SVM obtained 55% and 54% accuracy for valence and arousal, while the KNN achieved 61.00% and 60.00%, respectively. Even though the Bi-LSTM improved the recognition rates, achieving 72.01% for valence and 70.42% for arousal, it could not outperform the 1D-CNN. Among those methods, the 1D-CNN achieved superior results (88.87% for valence and 83.35% for arousal); however, the optimal results were obtained by the proposed method, with 98.21% and 98.08%, respectively. Table 6 presents the results obtained in the comparative study with the state-of-the-art methods, and Fig. 4 visualizes the results.

Table 6 Comparison results of the proposed CFNN method with state-of-the-art methods.
Figure 4. Graphical visualization of the comparison results.

Limitations

Emotion recognition studies face different challenges, such as data acquisition procedures, tools, and aims. These differences create varied datasets, which prevents the use of external datasets to validate the proposed methods. The proposed model has not been applied to datasets with different characteristics, and the generalization of the results to real-time applications requires further investigation.

Discussion

Recent and the most accurate studies have focused on the deep learning approach; however, the pre-processing and related approaches also affect the recognition ability of the models. In the study by Al-Nafjan et al.50, PSD was used for feature extraction and a DNN for classification. They achieved 82% accuracy in both valence and arousal. Alhagry et al.51 utilized an LSTM to learn features from the signals and achieved 85.45% and 85.65% for valence and arousal. Xing et al.24 employed a SAE (Stacked AutoEncoder) with an LSTM-RNN to address the linear EEG signal problem; the accuracies obtained were 81.10% and 74.38% for valence and arousal, respectively. Furthermore, Iyer et al.39 implemented a CNN- and LSTM-based hybrid model in which DE was used for feature extraction, and they obtained 65% accuracy. In addition, Sharma et al.52 considered valence and arousal for classification and used PSD for feature extraction. They applied CNN and LSTM models, obtaining 85.23% and 86.50% for the CNN and 87.68% and 87.98% for the LSTM on valence and arousal, respectively. In one of the most influential studies, Singh et al.53 developed a 1D-CNN and Bi-LSTM model for classification and achieved 92.29% and 90.33% in two classes. Another influential and accurate study was performed by Yang et al.54, who suggested a multi-column structure to enhance the accuracy of a CNN-based model. They achieved 90.01% in valence and 90.65% in arousal.

However, the complexity of the signals and the variety of pre-processing approaches and classifiers produce fluctuating results in emotion recognition. In our study, converting the extracted features of the flattened layer into fuzzy values with learnable parameters and applying defuzzification made the representation of the features more informative. This improved the recognition ability of the model, and the proposed method produced results superior to the recent studies by 6–17%. Table 7 compares the proposed method with recent studies on the same dataset.

Table 7 Comparison of the proposed method with the recent studies for the DEAP dataset.

Conclusion

The rate of false positives and the possibility of image manipulation in image processing can affect the accuracy of results. However, such manipulation cannot occur with EEG brain wave signals, as they cannot be tampered with. For this reason, recording brain signals makes recognizing and investigating emotions easier. This study aimed to improve the accuracy of emotion recognition from EEG signals using deep learning and fuzzy logic. The EEG signals were pre-processed to eliminate noise, features were extracted in the frequency domain using the FFT, and classification was performed using the improved CFNN model.

A comparative study was performed using four trained state-of-the-art methods, and the results suggested that the proposed method outperformed the other methods on all metrics. Additionally, the proposed model was compared with recent studies and achieved superior results on the same dataset. The performance of the proposed method was recorded as 98.21% and 98.08% for valence and arousal, respectively. This performance suggests that converting and representing the extracted features as fuzzy values after the convolutional layers provides more informative features in the recognition phase. Our future work will include a multiclass implementation of the proposed method.