Deep learning-based automated detection and multiclass classification of focal interictal epileptiform discharges in scalp electroencephalograms

Detection and spatial distribution analyses of interictal epileptiform discharges (IEDs) are important for diagnosing, classifying, and treating focal epilepsy. This study proposes deep learning-based models to detect focal IEDs in electroencephalography (EEG) recordings of the frontal, temporal, and occipital scalp regions. This study included 38 patients with frontal (n = 15), temporal (n = 13), and occipital (n = 10) IEDs and 232 controls without IEDs from a single tertiary center. All EEG recordings were segmented into 1.5-s epochs and fed into 1- or 2-dimensional convolutional neural networks to construct binary classification models to detect IEDs in each focal region and multiclass classification models to categorize IEDs into frontal, temporal, and occipital regions. The binary classification models exhibited accuracies of 79.3–86.4%, 93.3–94.2%, and 95.5–97.2% for frontal, temporal, and occipital IEDs, respectively. The three- and four-class models exhibited accuracies of 87.0–88.7% and 74.6–74.9%, respectively. The F1-scores for temporal, occipital, and non-IEDs were 89.9–92.3%, 84.9–90.6%, and 84.3–86.0% in the three-class models and 86.6–86.7%, 86.8–87.2%, and 67.8–69.2% in the four-class models (frontal IED F1-score, 50.3–58.2%). The deep learning-based models could help enhance EEG interpretation. Although they performed well, resolution of region-specific focal IED misinterpretations and further model improvement are needed.

Interictal epileptiform discharges (IEDs) are electroencephalography (EEG) biomarkers of epilepsy important for diagnosing, classifying, and monitoring the disease and for selecting anti-seizure medication [1][2][3][4]. In current practice, IEDs are manually detected during EEG interpretation. This process is highly labor-intensive and time-consuming because it depends on visual interpretation by neurology specialists [5][6][7]. Ongoing investigations have endeavored to develop an automated technique that could efficiently detect IEDs in EEG recordings at an acceptable accuracy 8. Recently, deep learning techniques have been widely accepted as the main strategy for building automated IED detectors for scalp [9][10][11][12][13][14][15][16][17] and intracranial [18][19][20][21] EEG recordings.

Epilepsy, a chronic disorder of the brain that causes recurrent spontaneous seizures, is categorized as focal or generalized. Recurrent seizures originating within a neuronal network limited to one hemisphere, whether unifocal or multifocal, are core features of focal epilepsy. Analyzing the spatial distribution of IEDs in focal epilepsy is of fundamental importance for properly classifying it and determining the cortical generators of the epileptic activity. Nevertheless, the irritative zone might not exactly match the epileptogenic zone 22,23. Most deep learning-based investigations have performed binary classification of scalp EEG recordings into IED and non-IED, regardless of the IED location. Studies have reported automated detectors for centrotemporal IEDs, a characteristic EEG marker of self-limited epilepsy with centrotemporal spikes 13,16,24. One study reported an automated detector trained on frontal, temporal, parietal, and occipital IEDs in patients with focal epilepsy, although location-specific classification was not attempted.

IED annotation. A total of 4557 IEDs (2112 frontal, 1176 temporal, and 1269 occipital) were annotated from the 38 patients with focal epilepsy.
The number of IEDs per EEG recording was 141 ± 129, 90 ± 99, and 127 ± 68 in the frontal, temporal, and occipital regions, respectively. The mean lengths of frontal, temporal, and occipital IEDs were 0.45, 0.52, and 0.48 s, respectively. Detailed information on the patients with focal IEDs is presented in Table 1.
Classification models. Binary classification models were constructed individually for the frontal, temporal, and occipital IEDs. Multiclass classification models were constructed to distinguish focal IEDs from IEDs in other regions. One-dimensional (1D) and two-dimensional (2D) convolutional neural networks (CNNs) were adopted for the classification models, with multichannel EEG time series as input data. EEG recordings with IEDs were segmented into 1.5-s epochs that contained the IEDs at their center and spanned −0.75 s to +0.75 s from the IED center (hereafter referred to as focal IED epochs). The EEG recordings of the controls were segmented into 1.5-s epochs at random time points (hereafter referred to as non-IED epochs). Here, an epoch denotes a 1.5-s EEG segment.
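The epoching procedure above can be sketched as follows. This is our illustrative reconstruction, not the authors' code; the function names and the 200 Hz sampling rate (implied by the 19 × 300 input size for 1.5-s epochs reported later) are assumptions.

```python
import numpy as np

FS = 200                        # assumed sampling rate: 300 samples / 1.5 s
EPOCH_SAMPLES = int(1.5 * FS)   # 1.5-s epoch -> 300 data points

def ied_epoch(eeg, ied_center_s, fs=FS):
    """Cut a 1.5-s epoch (channels x time) centered on an annotated IED,
    spanning -0.75 s to +0.75 s around the IED center."""
    center = int(ied_center_s * fs)
    start = center - EPOCH_SAMPLES // 2
    return eeg[:, start:start + EPOCH_SAMPLES]

def random_non_ied_epoch(eeg, rng, fs=FS):
    """Cut a 1.5-s epoch at a random time point from a control recording."""
    start = rng.integers(0, eeg.shape[1] - EPOCH_SAMPLES)
    return eeg[:, start:start + EPOCH_SAMPLES]

rng = np.random.default_rng(0)
recording = rng.standard_normal((19, 60 * FS))   # 19-channel, 60-s recording
epoch = ied_epoch(recording, ied_center_s=30.0)  # IED annotated at 30 s
control = random_non_ied_epoch(recording, rng)
```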
The focal IED epochs were split randomly into training, validation, and test sets at a ratio of 6:2:2. Frontal, temporal, and occipital IED epochs were handled separately in the individual binary classifications of IED versus non-IED epochs. The full set of focal IED epochs was handled collectively in the multiclass classification models to classify frontal, temporal, and occipital IED and non-IED epochs. The focal IED epochs were augmented by random jittering between −50 ms and +50 ms from the center of each epoch to handle the imbalanced data distribution caused by the much smaller number of focal IED epochs compared with non-IED epochs. The non-IED epochs were randomly under-sampled to match the number of augmented focal IED epochs at a 1:1 ratio. We ensured that the focal IED epochs contained fully shaped IEDs to provide a sufficient number of cleanly labeled training data for the robustness of our deep learning-based classification 25. Representative focal IED and non-IED epoch images are shown in Fig. 1.
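The jitter augmentation and under-sampling steps can be sketched as below. This is a hypothetical illustration under the same assumed 200 Hz sampling rate; the helper names are ours, not the authors'.

```python
import numpy as np

FS = 200  # assumed sampling rate so that 1.5 s -> 300 samples

def jittered_epoch(eeg, ied_center, rng, fs=FS, max_jitter_s=0.05):
    """Re-cut a 1.5-s epoch with the window shifted by a random offset
    between -50 ms and +50 ms around the annotated IED center."""
    shift = rng.integers(-int(max_jitter_s * fs), int(max_jitter_s * fs) + 1)
    start = ied_center + shift - int(0.75 * fs)  # 0.75 s before shifted center
    return eeg[:, start:start + int(1.5 * fs)]

def undersample(non_ied_epochs, n_target, rng):
    """Randomly under-sample non-IED epochs to a 1:1 ratio with the
    augmented focal IED epochs."""
    idx = rng.choice(len(non_ied_epochs), size=n_target, replace=False)
    return [non_ied_epochs[i] for i in idx]

rng = np.random.default_rng(1)
recording = rng.standard_normal((19, 12000))
augmented = [jittered_epoch(recording, ied_center=6000, rng=rng) for _ in range(5)]
pool = [rng.standard_normal((19, 300)) for _ in range(20)]
balanced = undersample(pool, n_target=len(augmented), rng=rng)
```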
The performance of the binary classification models was assessed using sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC). The performance of the multiclass classification models was assessed using precision and recall, from which the F1-score was calculated. We implemented the classification models using Python 3.8 with TensorFlow 2.2 and CUDA (compute unified device architecture) 10.1 on four NVIDIA TITAN V graphics cards with 12 GB memory, and used the sklearn.metrics module for the performance evaluation.

CNN architecture. We adopted a CNN architecture consisting of three convolution layers with batch normalization and max pooling layers. Multichannel EEG time series were fed into the first convolution layer with an input size of 19 × 300 (number of channels × number of data points). Fully connected layers had output sizes of two (IED and non-IED epochs), three (temporal and occipital IED and non-IED epochs), or four (frontal, temporal, and occipital IED and non-IED epochs), in accordance with the classification type. For our 1D CNN-based classification, the three convolution layers had 64, 128, and 64 filters with a kernel size of 19 × 6 and a stride of 1. For our 2D CNN-based classification, the three convolution layers had 32, 64, and 32 filters with a kernel size of 1 × 6 and a stride of 1. We set the kernel sizes considering the minimum sizes of spikes and waves described in a previous study on the EEG characteristics of IEDs 26. 1D or 2D max pooling layers were applied after the three convolution layers with pooling sizes of 3, 2, and 1, in order, and a stride of 2. Batch normalization was applied to each convolution layer to provide regularization and enhance training speed. Categorical cross-entropy and softmax were used as the loss and activation functions, respectively, for both the binary and multiclass classifications.
Root mean square propagation (RMSprop) 27 was used as the optimizer with a learning rate of 1 × 10−5. The model with the highest validation accuracy during training was selected for testing using the Keras model checkpoint to avoid overfitting. The total number of training epochs was 300, with a batch size of 64.
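The described 2D CNN can be sketched in Keras as follows. This is our reconstruction under stated assumptions, not the authors' released code: the pooling padding, layer ordering within each block, and the absence of dense hidden layers before the softmax are guesses where the text is silent.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_2d_cnn(n_classes=4):
    """Sketch of the 2D CNN described in the text: three Conv2D layers
    (32/64/32 filters, 1x6 kernels, stride 1), batch normalization,
    max pooling (pool sizes 3, 2, 1 with stride 2), and a softmax output."""
    inputs = layers.Input(shape=(19, 300, 1))  # channels x data points
    x = inputs
    for n_filters, pool in zip((32, 64, 32), (3, 2, 1)):
        x = layers.Conv2D(n_filters, kernel_size=(1, 6), strides=1,
                          padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(pool_size=(1, pool), strides=(1, 2),
                                padding="same")(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = models.Model(inputs, outputs)
    # RMSprop with learning rate 1e-5 and categorical cross-entropy, per the text
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-5),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_2d_cnn(n_classes=4)
probs = model(tf.zeros((2, 19, 300, 1)))  # two dummy epochs
```

Training would then use `model.fit(..., epochs=300, batch_size=64)` with a `ModelCheckpoint(save_best_only=True, monitor="val_accuracy")` callback, matching the model-selection strategy described above.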

Feature visualization.
We applied t-distributed stochastic neighbor embedding (t-SNE) for 2D visualization of the frontal, temporal, occipital, and non-IED epoch features. t-SNE is a statistical tool for dimensionality reduction that minimizes the mismatch between the joint probabilities of high- and low-dimensional data points 28. We extracted the focal IED and non-IED epoch features from the flattened layers in the 1D and 2D CNN-based classification models. We used the sklearn.manifold.TSNE module in the scikit-learn library with default parameters, which included a component number of 2, a perplexity of 30, and an early exaggeration of 12. No additional statistical analyses were performed using the t-SNE results.
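The visualization step amounts to the following sketch, with random stand-in features in place of the flattened-layer activations (which we do not have):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for features extracted from a flattened CNN layer:
# 200 epochs x 64 feature dimensions (both sizes are illustrative).
rng = np.random.default_rng(0)
features = rng.standard_normal((200, 64))

# Default t-SNE settings as stated in the text:
# n_components=2, perplexity=30, early_exaggeration=12.
embedding = TSNE(n_components=2, perplexity=30.0,
                 early_exaggeration=12.0, random_state=0).fit_transform(features)
```

Each row of `embedding` is a 2D point that can be scatter-plotted and colored by class (frontal, temporal, occipital, or non-IED).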
Patient-level evaluation. We additionally performed leave-one-patient-out cross-validation for patient-level evaluation of the multiclass classification of focal IEDs. We excluded frontal IEDs to avoid potential effects of eye-related artifacts on the classification. Therefore, we performed the leave-one-patient-out cross-validation for the three-class classification of temporal, occipital, and non-IEDs using the EEG recordings of the 13 and 10 patients with temporal and occipital IEDs, respectively. We trained 2D CNN-based three-class classification models using the focal IED epochs of N − 1 patients, where N was the total number of patients with temporal or occipital IEDs, and then evaluated the performance of the models using the focal IED epochs of the remaining patient. Performance in the patient-level evaluation was assessed using a detection rate, defined as the number of correctly classified focal IED epochs divided by the total number of focal IED epochs for each patient. All the focal IED epochs in this procedure were augmented by random jittering as described above.
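The leave-one-patient-out protocol can be sketched with scikit-learn's `LeaveOneGroupOut`. The `train_and_predict` function below is a hypothetical placeholder standing in for training the 2D CNN; the data are random stand-ins.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

def train_and_predict(X_train, y_train, X_test):
    """Placeholder for the 2D CNN: predicts the majority training class."""
    values, counts = np.unique(y_train, return_counts=True)
    return np.full(len(X_test), values[np.argmax(counts)])

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 19, 300))    # 60 epochs from 6 stand-in patients
y = rng.integers(0, 3, size=60)           # temporal / occipital / non-IED
groups = np.repeat(np.arange(6), 10)      # patient ID for each epoch

detection_rates = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    pred = train_and_predict(X[train_idx], y[train_idx], X[test_idx])
    # detection rate: correctly classified epochs / total epochs of that patient
    detection_rates.append(float(np.mean(pred == y[test_idx])))
```

One detection rate is produced per held-out patient, matching the per-patient reporting in Table 4.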
To explore the effect of the source of the non-IED epochs on the binary classification, we examined additional performance using non-IED epochs from the patients with focal epilepsy. In this case, the non-IED epochs were selected at random time points outside the focal IED epochs in the patients' EEG recordings. The respective AUCs for the frontal, temporal, and occipital IEDs were 60.3%, 76.2%, and 90.1% for the 1D CNN-based classification and 63.7%, 72.9%, and 78.1% for the 2D CNN-based classification (Abnormal in Fig. 2), noticeably lower than the values reported above.
The 2D visualization of the frontal, temporal, occipital, and non-IED features is shown in Fig. 4. We qualitatively observed that these features were clearly separated from each other, particularly for the 2D CNN-based three-class classification. In contrast, large overlaps between frontal and non-IED features were observed for the four-class classification.

Discussion
This study implemented deep learning-based automated binary focal IED detectors with accuracies of 79.3–86.4%, 93.3–94.2%, and 95.5–97.2% for frontal, temporal, and occipital IEDs, respectively, and multiclass focal IED detectors with accuracies of 87.0–88.7% and 74.6–74.9% for the three- and four-class (frontal IED included) schemes, respectively, on scalp EEG recordings. Frontal IEDs were associated with low detection performance, probably owing to eye-related artifacts in the frontal region. The inclusion of spatial information by applying a 2D CNN to multichannel scalp EEG recordings had mixed effects on the focal IED detection performance.

The major finding of the current study was that the implemented individual binary IED detectors showed a significant discrepancy in focal IED detection performance among the three brain regions. Detection of IEDs in previous deep learning-based studies was mostly performed as binary classification [10][11][12]14,15,17, regardless of the IED location. Assessment of centrotemporal IEDs on scalp EEG recordings resulted in a sensitivity of 92.0%, a precision of 85.8%, and an F1-score of 88.5% in one study 13, and AUCs of 0.768–0.942 in another 16. Although it is difficult to directly compare performance among models, our good performance for temporal and occipital IED detection might be due to the exclusion of frontal IEDs, which have a low detection performance 12.

Figure 4. Feature visualization for the three-class (upper panels) and four-class (lower panels) classifications. Green, blue, yellow, and red dots represent frontal, temporal, occipital, and non-IEDs, respectively. Owing to the large number of epochs (32,709 in the three-class classification and 43,269 in the four-class classification), we randomly selected 1000 epochs of each class for visualization (3000 in the three-class classification and 4000 in the four-class classification). IED: interictal epileptiform discharge; t-SNE: t-distributed stochastic neighbor embedding.
In addition, in terms of the source of non-IED epochs, using the non-IED epochs from patients' EEG recordings resulted in a poorer performance, possibly due to the presence of the abnormal EEG signals in or near the source of the focal IEDs, such as slow activity, voltage attenuation, or alteration of the background synchrony of EEG signals 29,30 .
Another major strength of this study is that it is the first to implement multiclass IED detectors that enable location-specific IED classification. Recent deep learning-based studies have reported multiclass classification for various morphological characteristics of IEDs, such as spikes, sharp waves, broadly distributed sharp waves, and spike-and-wave complexes in scalp EEG recordings 18; and repetitive high-amplitude complexes, high-amplitude isolated spikes, and atypical epileptiform activities in intracranial EEG recordings 20. However, multiclass location-specific IED classification could have a clinical advantage over morphology-specific classification because its relevance can be more easily determined and can provide clues for epilepsy classification. Considering the clinical application of our multiclass IED detectors, we carried out a patient-level evaluation of the 2D CNN-based three-class classification, which showed the best performance among our multiclass classification approaches in Table 3. It was based on leave-one-patient-out cross-validation, which is known to be suitable for confirming a model's generalizability 31. Our three-class IED detectors provided considerably high detection rates (> 90%) in 57% of the patients with temporal or occipital IEDs, and low rates (< 66%) in only 13% of the patients, as shown in Table 4. We suggest that our deep learning-based automated multiclass IED detection approaches have potential for clinical application, provided that their detection rates are enhanced for a larger number of patients with a deeper understanding of inter-patient variability in electroencephalographic focal IED characteristics. Because we were concerned that eye-related artifacts could induce unclear interpretations, we did not include frontal IEDs in the patient-level evaluation.

Table 4. Performance of patient-level evaluation using leave-one-patient-out cross-validation for the 2D CNN-based three-class classification. CNN, convolutional neural network; IED, interictal epileptiform discharge; SD, standard deviation.

The main reason for the discrepancy in focal IED detection performance among brain regions in this study might be the different spatial distributions of EEG artifacts and normal EEG variations that can be misinterpreted as IEDs. A review of the major sources of artifacts and their potential implications might help improve detector performance by annotating those artifacts and normal EEG variations and discriminating them from IEDs in future studies.

Eye-related artifacts are well-known contaminants that can be erroneously interpreted as IEDs in scalp EEG recordings. Eye closure and eye blinks result in sharp positive waveforms in the frontal channels (Fp1, Fp2, F3, and F4), while lateral eye movements generate spiky, high-amplitude waveforms in the frontal (Fp1, Fp2, F3, and F4) and anterior temporal (F7 and F8) channels, mimicking epileptiform activity when combined with lateral rectus spikes 5,6. Additionally, eye flutter with myogenic artifacts during photic stimulation can mimic spike-and-wave complexes 32. Given that eye-related artifacts are predominantly distributed in the frontal region, they may explain the relatively low detection performance for frontal IEDs.
Normal EEG variations are another major cause of erroneous IED interpretation [32][33][34]. First, fragmented or sharply contoured variations of the background alpha rhythm can mimic spike-like waveforms in the occipital region, while alpha rhythms in the temporal and occipital regions may have an apiculate morphology [32][33][34][35]. The 6–11 Hz wicket waves in the mid-temporal regions are well-known normal variants commonly mistaken for IEDs owing to their sharply contoured morphology 32,36. A previous study reported that such wicket waves were incorrectly identified as epileptiform activity in more than 50% of patients 36. These normal EEG variations may have influenced the accuracy of distinguishing focal IEDs from non-IEDs in the temporal and occipital regions, more so in the binary classification than in the differentiation between temporal and occipital IEDs in the multiclass classification.
Misclassification of frontal IEDs in the binary classification tended to be primarily of the false positive type, whereas similar frequencies of the false positive and false negative types were found in the multiclass classification. Misclassification of temporal and occipital IEDs in multiclass classification tended to be of the false negative type. Although understanding the discrepancy between the false positive and false negative type frequencies for frontal IEDs was outside the scope of this study, different binary and multiclass classification schemes might have been its source 18 .
We adopted a CNN architecture with multichannel EEG time series as input data, considering the clinical environment in which neurologists usually review multichannel EEG time series displayed in channel × time form on their monitors to manually detect IEDs. Our EEG channel arrangement was the same as that used by our neurologists when they monitored EEG recordings. In addition, we adopted 1D and 2D CNN architectures to compare their effects on the performance of the classification models.
The main difference between our 1D and 2D CNN architectures is that the 1D CNN applied kernels to all channels simultaneously, whereas the 2D CNN applied kernels to each channel separately to extract features from the 19-channel EEG time series. Therefore, spatial information associated with EEG features from multiple scalp regions was better exploited in the 2D CNN-based classification. A previous study reported that a 2D CNN outperformed a 1D CNN in binary differentiation of IED from non-IED epochs because it combines temporal and spatial information 10. From this perspective, we hypothesized that our 2D CNN-based classification models would outperform the 1D CNN-based ones unless the EEG recordings were severely contaminated by artifacts, because the kernels of the 2D CNN could extract epileptic electroencephalographic features from each channel more precisely than those of the 1D CNN. However, the performance of the 2D CNN-based classification noticeably declined for frontal IED detection in both the binary and four-class classifications, possibly resulting from adverse effects of eye-related artifacts. To explore the difference in performance between the 1D and 2D CNN-based classification models in more detail, we visualized their corresponding feature maps using t-SNE. We qualitatively observed that the 2D CNN-based classification separated focal IED features more distinctly than the 1D CNN-based one, particularly for the three IED classes.
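The kernel-shape difference between the two architectures can be made concrete in Keras. This is an illustrative sketch of the general Conv1D/Conv2D weight layouts, under the input arrangements described earlier, not the authors' code:

```python
import tensorflow as tf

# 1D CNN: input arranged as (time, channels) = (300, 19). A kernel of
# length 6 spans all 19 channels at once, so every filter mixes spatial
# information across the whole montage (the "19 x 6" kernel in the text).
conv1d = tf.keras.layers.Conv1D(filters=64, kernel_size=6)
conv1d.build((None, 300, 19))
# weight layout: (kernel length, input channels, filters)
assert tuple(conv1d.kernel.shape) == (6, 19, 64)

# 2D CNN: input arranged as (channels, time, 1) = (19, 300, 1). A 1x6
# kernel slides along time within each scalp channel separately, keeping
# per-channel (spatial) features distinct until the fully connected layer.
conv2d = tf.keras.layers.Conv2D(filters=32, kernel_size=(1, 6))
conv2d.build((None, 19, 300, 1))
# weight layout: (kernel height, kernel width, input channels, filters)
assert tuple(conv2d.kernel.shape) == (1, 6, 1, 32)
```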
In this study, the frequency of false-positive frontal IEDs in the 2D CNN-based binary classification was higher than that in the 1D CNN-based classification, and the number of false-negative occipital IEDs in the 2D CNN-based multiclass classification was higher than that in the 1D CNN-based classification. In terms of the region-specific misinterpretation of focal IEDs, we suggest that 2D CNN captures the morphological characteristics of eye-related artifacts in frontal regions and normal variations in occipital regions more sensitively than 1D CNN. The number of false-positive frontal IEDs in the 2D CNN-based four-class classification was lower than that in the 1D CNN-based classification, while the number of falsely classified frontal IEDs as temporal and occipital IEDs was higher, indicating a decreased sensitivity in the 2D CNN-based multiclass classifications in detecting frontal IEDs. Although 2D CNN extracts spatial information better than 1D CNN, the additional spatial information probably provoked unintended and mixed region-specific effects on the performance of the classification procedures.
This study had several limitations. First, we limited the focal IEDs to three categories: frontal, temporal, and occipital. Second, the number of annotated IEDs may have been insufficient for generalizing the study results. Third, the patients with temporal IEDs were significantly older than the other subgroups. Fourth, there was no EEG-level clinical validation of the focal IED detectors. To address these limitations, we plan studies that will include centrotemporal or generalized IEDs to expand the region-specific detectability of the focal IED detectors, and that will utilize CNNs with more optimal hyperparameters or other deep learning techniques, such as long short-term memory (LSTM) networks, which effectively analyze time-series data, or combined CNN-LSTM models 13. We also plan semi-supervised approaches using clinician-initiated automated detectors to rapidly annotate IEDs and abundantly acquire training datasets; EEG-level evaluation of our IED detectors to improve their clinical applicability; and explainable artificial intelligence-based studies of spatiotemporal model interpretability, such as occlusion maps 37 and gradient-weighted class activation mapping 20. Additionally, studies that include a larger number of patients, perform group analyses of specific epilepsy syndromes, include complete clinical data, and handle different EEG channel arrangements could also help.

Conclusions
This study implemented deep learning-based automated focal IED detectors for detecting and localizing frontal, temporal, and occipital focal IEDs based on scalp EEG recordings of patients with epilepsy. Although we believe our detectors performed reasonably well, we still need to resolve the unintended EEG features that lead to region-specific misinterpretations of focal IEDs.

Data availability
The datasets generated and analyzed during the current study are not publicly available due to the retrospective design of the study (a waiver of informed consent was approved by the IRB) but are available from the corresponding author on reasonable request.