Person identification with arrhythmic ECG signals using deep convolution neural network

Over the past decade, the use of biometrics in security systems and other applications has grown in popularity. ECG signals in particular are attracting increased attention because they have the characteristics required for a trustworthy identification system. The majority of ECG-based person identification systems are evaluated without considering the health state of the individuals, and few consider person-by-person health-state annotation. This paper proposes a person identification system that uses health-state-annotated ECG signals in which each person's beats are spread across several arrhythmia classes. This overlap between the normal class and the arrhythmia classes makes it possible to isolate the normal beats in the train set from the arrhythmic beats in the test set. The paper therefore investigates the effect of arrhythmic heartbeats on biometric recognition. An effective lightweight CNN based on depth-wise separable convolution (DWSC) is proposed to enhance person identification performance for several common arrhythmia types using the MIT-BIH dataset. The proposed methodology has been tested on nine arrhythmia types and shows how different types of arrhythmia affect ECG-based biometric systems differently. The experimental results show excellent recognition performance on normal heartbeats (99.28%) and on arrhythmic heartbeats (93.81%), outperforming other models in terms of mean accuracy.

To the best of our knowledge, no previous study has separated the arrhythmic and normal heartbeats of the same patient into train and test sets to provide a complete analysis of the impact of different types of arrhythmia on an ECG biometric system. Our research therefore offers a thorough examination of how various types of arrhythmia affect an ECG-based biometric system. A deep-learning, beat-labeled ECG-based biometric system is also proposed in this study. The contributions of this research are as follows:
• A deep CNN model is designed based on the depth-wise separable convolution (DWSC) operation, with a good trade-off between low parameter complexity and high representation capacity, to achieve good person identification performance on ECG signals.
• Using the designed CNN model, a robust ECG-based identification system is developed to reduce the adverse impacts of four different types of arrhythmia.
• We isolate the arrhythmic heartbeat set and the normal heartbeat set of the same individuals in the train and test sets in order to analyze the effects of arrhythmia on the person identification system. To the best of our knowledge, no person identification system isolates the heartbeats of the same individual based on the heartbeat health-state annotation in the train and test sets.
• We found a relationship between the heartbeat's health state and the performance of the ECG-based person identification system: if the person does not suffer from cardiac disease, the performance is excellent.
• Experimentally, we show a negative impact of arrhythmia on the performance of the person identification system. The severity of the adverse effects varies depending on the type of arrhythmia.
The rest of the paper is structured as follows: the next section presents the related work. The proposed model is then described. The following section presents the results, the evaluation criteria, and the implementation details. The conclusion highlights the results and outlines future research.

Related work
ECG identification systems are built on either hand-engineered/machine-learning methodologies or deep learning methods. Many ECG-based recognition systems have been proposed since 2001, when the ECG signal was first introduced as a biometric modality 10. Most state-of-the-art studies propose solutions based on deep learning models for ECG biometric systems. CNN and BLSTM are the most widely applied and best-performing models in ECG-based biometric systems. Yazhao Li et al. 11 propose a CNN architecture for the identification and feature-learning process. Eko Ihsanto et al. 12 propose a Residual Depth-wise Separable Convolutional Neural Network (RDSCNN) deep learning approach for ECG-based recognition. Zhidong Zhao et al. 13 test their system under various dataset scenarios, using both a regular dataset and an atrial fibrillation dataset for system evaluation. The results of their method are good, at 99% and 98%, respectively.
Three algorithms have had their biometric performance evaluated separately for healthy and arrhythmic subjects. The authors in 14 proposed a methodology, tested it, and compared it with two other methodologies. They tested the three techniques on 112 PTB subjects (98 subjects with arrhythmic beats and 14 healthy subjects). The first methodology, proposed in 14, is the Pulse Active Transform (PAT). PAT is a transformation technique that decomposes a signal into a series of related cyclic triangular waveforms (pulse active feature sets). Matching scores are generated using the Euclidean distance as the distance measure. The second technique is proposed in 15, where the authors introduced an extensive set of ECG features shown to be invariant to anxiety state. These descriptors are the fiducial points, which contain physiological information about the heart. They developed a filtering technique that merges heuristic and quantitative information. The third technique is proposed in 10, where the authors use about twenty-one fiducial features from a single lead. These features include amplitudes, waves, and segments. Soft independent modeling of class analogy (SIMCA) is applied for classification. The first step in SIMCA is to apply PCA to each class to express the data variance. The next step is to build models of the different classes using measurement sets. The methods proposed in 10 and 15 are applied in 14 to the same data, and the results are shown in Table 1. The results show that, for all three methodologies, healthy subjects achieve better identification performance than arrhythmic subjects. They also show that PAT outperforms the other two methodologies.
The above review shows that deep learning methods achieve higher performance than hand-engineered methods. It is also noted that the same system performs differently depending on the health state of the data (normal/arrhythmic).
• The distance between each box's top and bottom is the interquartile range.
• The center red line denotes the sample median for each box.
• Box whiskers are lines that extend above and below the box. The whisker length is measured from the end of the interquartile range to the furthest observation.
• The outliers are those observations that fall outside the whisker length. An outlier is a value that is more than 1.5 times the interquartile range from either the top or the bottom of the box. An outlier is indicated by a red plus sign.
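The box-plot conventions listed above can be illustrated with a short sketch. It assumes the "inclusive" quartile convention of Python's `statistics.quantiles` and takes each whisker to end at the furthest observation that is not an outlier; the plotting tool behind the original figure may compute quartiles slightly differently, and the function name `box_plot_summary` is hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the quantities a box plot displays for one group of values."""
    data = sorted(values)
    # Quartiles with the 'inclusive' method (an assumption; tools differ).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # distance between the top and bottom of the box
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Observations beyond 1.5 x IQR from the box edges are outliers.
    inliers = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inliers),   # furthest non-outlier below the box
        "whisker_high": max(inliers),  # furthest non-outlier above the box
        "outliers": outliers,
    }
```

For example, `box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])` flags only the value 100 as an outlier, while the upper whisker stops at 9.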

Data preprocessing approach
The ECG signal is a recording of the electrical activity of the heart that represents the regular pattern of heartbeats made up of P-QRS-T waveforms. Some preprocessing is required to prepare the ECG signal for the following stages, including signal segmentation to obtain the most relevant segments. In this study, beats are segmented based on the locations of the R-peaks recorded in the index-annotation file of the MIT-BIH dataset. The R-peak is simpler to detect because of its lengthy duration and high amplitude 21. We apply the segmentation technique on a 181-sample-wide window (0.5 s) with the R-peak at its central point, following the research in [22][23][24], where this signal length gives the best result on the PTB arrhythmic dataset. The preprocessing approach adopted in this research is inspired by the information given in [25][26][27]. The R-peak is initially centered in the beat segmentation process and is later re-positioned according to the ECG intervals. The authors in 27 state that the PQ interval consumes 80 ms, which is roughly 22.22% of the ECG beat. The QR-interval consumes 120 ms, roughly 33.33% of the whole beat. The T-wave consumes 160 ms, which is roughly 44.44%. Therefore, the T-wave consumes about double the part consumed by the P-wave. In 25, the authors present similar calculations, suggesting that the QT-interval consumes about three times what the PQ-interval consumes. It is also shown that the PQ interval ranges from 0.6 to 0.79, which is roughly 25.6% of the whole ECG beat. The QT-interval spans from 0.81 to 1.25 (about 0.44), which is 31.43% of the whole ECG beat in 26. These calculations suggest that the R-peak is not supposed to be in the center of the ECG segment. The adopted approach is to place the R-peak before the center of the ECG segment. This approach has increased the recognition performance for four types of arrhythmic heartbeats. After beat segmentation, the ECG signal is transformed via a time-frequency transformation technique into a two-dimensional image. This representation, called a scalogram, is made up of the absolute value of the signal's continuous wavelet transform (CWT) coefficients. Figure 3 demonstrates the CWT transformation of ECG normal beats (right) and ECG specific-arrhythmia-type beats (left).
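The two preprocessing steps described above, cutting a fixed-length beat with the R-peak placed before the window center and turning the beat into a scalogram, can be sketched as follows. This is only an illustration under stated assumptions: the 40% pre-R-peak fraction, the Ricker (Mexican-hat) wavelet, and the helper names `segment_beat`, `ricker`, and `scalogram` are hypothetical, since the text does not state the exact R-peak offset or the wavelet that was used.

```python
import math

WINDOW = 181  # samples per beat segment (0.5 s window from the text)


def segment_beat(signal, r_index, window=WINDOW, pre_fraction=0.4):
    """Cut one beat so that the R-peak sits before the window center.

    pre_fraction is a hypothetical choice: the share of the window placed
    before the R-peak. A value of 0.5 would center the R-peak.
    """
    pre = int(round(pre_fraction * (window - 1)))
    start = r_index - pre
    end = start + window
    if start < 0 or end > len(signal):
        return None  # beat too close to the record boundary
    return signal[start:end]


def ricker(t, scale):
    """Ricker wavelet (up to a constant factor) at offset t for one scale."""
    u = t / scale
    return (1.0 - u * u) * math.exp(-0.5 * u * u) / math.sqrt(scale)


def scalogram(beat, scales):
    """Absolute CWT coefficients: one row per scale, one column per sample."""
    n = len(beat)
    rows = []
    for s in scales:
        row = []
        for t in range(n):
            # Direct (unoptimized) correlation of the beat with the wavelet.
            coeff = sum(beat[k] * ricker(k - t, s) for k in range(n))
            row.append(abs(coeff))
        rows.append(row)
    return rows
```

With these defaults the R-peak lands at sample 72 of the 181-sample segment, ahead of the center at sample 90. In practice a dedicated wavelet library would replace the direct double loop.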

Developing deep learning-based ECG biometric model
Deep learning has become the most widely used approach in ECG biometric systems. Different deep learning architectures adopted in ECG biometric systems have been investigated in 29. Deep learning-based systems typically outperform traditional hand-engineered and machine-learning-based systems. A deep learning-based system can be adapted in two ways. The first is to train a new network from scratch. The second uses transfer learning, which allows a previously trained CNN to be retrained on our dataset. High model efficiency calls for a straightforward architecture with fewer parameters. With a sufficient number of training samples and a suitable number of parameters, over-fitting and under-fitting can be avoided. The following subsections explain the proposed CNN and its components.

Depth-wise separable convolution (DWSC)
The proposed CNN includes several types of layers: convolution, batch normalization, and activation layers. The convolutional layer is the basic layer, and its convolution can be either standard or depth-wise. In standard convolution, channel-wise and spatial-wise computations are performed in one step, whereas depth-wise separable convolution splits them into two steps. The approach proposed in 30 served as the main inspiration for the suggested model; its authors propose a depth-wise separable convolution architecture. Figure 4 illustrates the basic building block of the proposed model, which contains depth-wise separable convolutions (DWSC) as proposed in 30,31. DWSC consists of two processes: depth-wise and point-wise convolution. Depth-wise convolution convolves each input feature map with a single convolutional filter. Point-wise convolution is then used to form a linear combination of the depth-wise convolution outputs. The first step in the proposed model applies a standard convolutional filter K to the input feature map F to produce the output feature map O. Depth-wise separable convolution is then applied by (1) applying a 3 × 3 depth-wise convolution to each input channel and (2) combining the depth-wise outputs with a 1 × 1 point-wise convolution that maps the m input feature maps to n output feature maps. Depth-wise separable convolution is presented in Fig. 5. The architecture of the proposed CNN model is shown in Table 2.
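For reference, the standard, depth-wise, and point-wise convolutions described above are commonly written as below. This is a sketch in the notation of the text (F is the input feature map, O the output feature map, K the standard kernel, m and n the numbers of input and output feature maps). The symbols \hat{K}, \tilde{K}, and \hat{O}, the index names, and the stride of 1 are assumptions borrowed from the usual depth-wise separable formulation and are not taken from the original equations.

```latex
% Standard convolution: spatial filtering and channel mixing in one step
O_{k,l,d} = \sum_{i,j} \sum_{c=1}^{m} K_{i,j,c,d} \, F_{k+i-1,\; l+j-1,\; c},
\qquad d = 1, \dots, n

% Depth-wise convolution: one 3 x 3 filter per input channel
\hat{O}_{k,l,c} = \sum_{i,j} \hat{K}_{i,j,c} \, F_{k+i-1,\; l+j-1,\; c},
\qquad c = 1, \dots, m

% Point-wise (1 x 1) convolution: linear combination of depth-wise outputs
O_{k,l,d} = \sum_{c=1}^{m} \tilde{K}_{c,d} \, \hat{O}_{k,l,c},
\qquad d = 1, \dots, n

% Weight counts for a 3 x 3 kernel (biases ignored)
\text{standard: } 9\,m\,n
\qquad
\text{depth-wise separable: } 9\,m + m\,n
\qquad
\frac{9\,m + m\,n}{9\,m\,n} = \frac{1}{n} + \frac{1}{9}
```

Under these assumptions the saving grows with the number of output feature maps n and approaches a factor of 9 for a 3 × 3 kernel.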

Activation function
Nonlinearity is introduced into the proposed model through two activation functions. (1) Scaled exponential linear units (SeLU), proposed in 32:

$$\mathrm{SeLU}(x) = \lambda \begin{cases} x, & x > 0 \\ \alpha\,(e^{x} - 1), & x \le 0 \end{cases}$$

where λ = 1.0507 and α = 1.6733. SeLU is one of the variants of ELU. Its important property is self-normalization 33; it is applied in the first two blocks of the proposed model. (2) Gaussian error linear units (GeLU), applied in the subsequent blocks to introduce regularization, such as dropout, together with the activation function. GeLU is computed by the following formula, where x is the input:

$$\mathrm{GeLU}(x) = x\,\Phi(x) = \frac{x}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right]$$

Here Φ(x) is the cumulative distribution function (CDF) of the standard Gaussian distribution 34. Instead of gating inputs by their sign, as in ReLU, GeLU weights each input by its value. The experiments in 34 show that models with GeLU match or exceed models with ReLUs or ELUs across fields such as NLP, computer vision, and automatic speech recognition.
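The two activation functions can be written in a few lines of plain Python. This is a minimal scalar sketch using the constants quoted in the text and the exact (erf-based) form of GeLU; deep learning frameworks apply the same functions element-wise to tensors and sometimes use a tanh approximation of GeLU instead.

```python
import math

SELU_LAMBDA = 1.0507  # scale constant quoted in the text
SELU_ALPHA = 1.6733   # alpha constant quoted in the text


def selu(x):
    """Scaled exponential linear unit for a single input value."""
    if x > 0:
        return SELU_LAMBDA * x
    return SELU_LAMBDA * SELU_ALPHA * (math.exp(x) - 1.0)


def gelu(x):
    """Gaussian error linear unit: the input weighted by the Gaussian CDF."""
    gaussian_cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * gaussian_cdf


def relu(x):
    """ReLU for comparison: gates the input purely by its sign."""
    return x if x > 0 else 0.0
```

Unlike `relu`, `gelu` lets small negative inputs through with a reduced weight, while `selu` saturates toward −λα for large negative inputs.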

Experiments and Results
Arrhythmic datasets are provided by PhysioNet 35 for a patient population. MIT-BIH 28 includes 47 participants in total: 25 men, ages 32 to 89, and 22 women, ages 23 to 89. Detailed statistics are presented in Table 3. This study focuses on this dataset because it is the only one that offers beat-level labeling (A or N) for every subject; every other arrhythmia dataset offers subject-level labeling.

Evaluation
The numbers of TP, FP, TN, and FN samples are obtained from the confusion matrix, where T is true, F is false, P is positive, and N is negative. The evaluation criteria are computed from these counts 33. The first step in this study was the evaluation of different pre-trained CNN identification systems on regular beats. The five pretrained networks were selected by considering the top results in the literature and the complexity of the networks. Table 4 demonstrates that the proposed model achieves recognition rates better than or competitive with the state-of-the-art networks, including Xception 36, Shufflenet 37, and MobilenetV2 38, which provided the best outcomes on the regular beats of the MIT-BIH dataset. The table also shows whether each network is a Directed Acyclic Graph (DAG) or a linear architecture network.
Exploring and evaluating the various ECG biometric system architectures on both normal and arrhythmic datasets helps us achieve our first goal. Therefore, as a second step, we compare the negative impact of different arrhythmia types on ECG biometric systems in Table 5, where Mobilenet 41 is tested on each arrhythmia type separately, isolating each subject's arrhythmic heartbeats from his/her normal heartbeats in the test set or train set. ROC curves are shown in Fig. 6, while Fig. 7 shows the cumulative distribution function (CDF) plot. Figure 8 shows the confusion charts for the four arrhythmia types. Table 5 also shows the case where both the train and test sets contain only arrhythmic beats; the results are excellent (97.67%). This is because the network trains on and learns the characteristics and morphologies of the same type of beats that appear in the test set.
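The confusion-matrix counts used in this section can be turned into per-class criteria as sketched below. The particular set of metrics (accuracy, precision, recall, false acceptance rate, false rejection rate) is an assumption based on common practice and on the FAR/FRR-based ROC curves reported in the paper; the function names are hypothetical, and zero denominators are mapped to 0.0 as one possible convention.

```python
def one_vs_rest_counts(confusion, cls):
    """Return (TP, FP, TN, FN) for one class in one-versus-rest mode.

    confusion[i][j] is the number of beats of true class i predicted as j.
    """
    n = len(confusion)
    tp = confusion[cls][cls]
    fn = sum(confusion[cls]) - tp
    fp = sum(confusion[r][cls] for r in range(n)) - tp
    tn = sum(map(sum, confusion)) - tp - fn - fp
    return tp, fp, tn, fn


def identification_metrics(tp, fp, tn, fn):
    """Common criteria computed from the four confusion-matrix counts."""
    def ratio(num, den):
        # Empty rows or columns give a zero denominator; return 0.0 then.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "far": ratio(fp, fp + tn),  # false acceptance rate
        "frr": ratio(fn, fn + tp),  # false rejection rate
    }
```

Sweeping a decision threshold and recomputing `far` and `frr` for each class against the rest is one way to trace ROC curves of the kind shown in Fig. 6.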

Results
At first, we tested the effect of different components of the proposed model on recognition performance using arrhythmic heartbeats. It is observed from Table 4 that the four arrhythmia types (J, A, R, and j) have less impact on the recognition performance. In general, the performance is excellent if the person does not have an arrhythmia, as presented in Table 3. However, the performance is worse if the person has an arrhythmia, and it varies depending on the type of arrhythmia. For instance, of all arrhythmias, atrial premature (A-arrhythmia) scores very well. This is because, as seen in Figs. 2a and 3b, A-arrhythmia does not significantly alter the form of the heartbeat. In contrast, premature ventricular contraction arrhythmia (V-arrhythmia) severely changes the morphology of heartbeats, as shown in Fig. 3a.
Depth-wise separable convolution (DWSC) has been applied in several studies related to image processing tasks 30,[42][43][44]. The authors in 30 evaluate their DWSC-based model on the ImageNet and CIFAR100 datasets and show that the depth-wise approach achieved the highest score while requiring far fewer parameters than other state-of-the-art approaches.
We evaluated the proposed model and six other pretrained networks with normal beats as the train set and arrhythmic beats (J, A, R, and j) as the test set; the results are presented in Table 6. In addition, we tested these networks and the proposed model with normal beats as the train set and arrhythmic beats of all arrhythmia types as the test set; these results are also presented in Table 6. Furthermore, to explore the effect of DWSC, we tested the proposed model with DWSC against the same model with standard convolution (SC). The results in Table 6 show that the proposed model with DWSC achieves competitive accuracy with about 5 × fewer learnable parameters than the model without the DWSC operation.
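The train/test isolation used in these experiments can be sketched as a simple split over beat-level annotations. The record layout (subject, label, segment), the function name, and the restriction to subjects that have both normal and arrhythmic beats are assumptions made for illustration; the beat labels follow the annotation letters used in the text (N for normal; J, A, R, j for the four arrhythmia types).

```python
from collections import defaultdict


def split_normal_vs_arrhythmic(beats, arrhythmia_types):
    """Build a train set of normal beats and a test set of arrhythmic beats.

    beats: iterable of (subject_id, beat_label, segment) tuples.
    Only subjects that have both kinds of beats are kept (an assumption),
    so every identity in the test set was seen during training.
    """
    normal = defaultdict(list)
    arrhythmic = defaultdict(list)
    for subject, label, segment in beats:
        if label == "N":
            normal[subject].append((subject, label, segment))
        elif label in arrhythmia_types:
            arrhythmic[subject].append((subject, label, segment))
    common = sorted(set(normal) & set(arrhythmic))
    train = [b for s in common for b in normal[s]]
    test = [b for s in common for b in arrhythmic[s]]
    return train, test
```

Swapping the returned lists gives the reverse scenario, with arrhythmic beats for training and normal beats for testing.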
The main contribution of this study is the result shown in Table 7, where the proposed model improves the performance of the person identification system on the four types of arrhythmia. Table 7 shows that the proposed model outperforms other models in terms of mean accuracy. Given the proposed model's simplicity (664 K learnable parameters) compared with the complexity of the other models (2 to 20 million learnable parameters), this is another achievement of this study in addition to the highest mean accuracy. The proposed model is also evaluated on normal heartbeats and achieves good recognition performance (99.6%), as presented in Table 8.

Discussion
The discriminative feature of the proposed model is its shape. It starts with a low number of channels in the first block, and the number of channels increases as the network gets deeper. The feature-map spatial resolution starts at the input image's size and becomes smaller deeper in the network, for simpler computations and better feature learning. The smaller spatial resolution (down-sampling) is achieved through stride = 2 in some convolution operations. The proposed architecture consists of five repeated building blocks based on 3 × 3 DWSC, as in Fig. 4. It is a lightweight model with roughly 664.6 K learnable parameters. We compared it with the same model based on standard convolution (SC), which has 3.8 million learnable parameters, and found that the proposed model has 5.7 × fewer learnable parameters than the SC model. Assuming N channels, DWSC uses filterSize² × 1 × 1 × N learnable parameters, against filterSize² × N² learnable parameters for the SC model. This reduction comes with the same or slightly higher accuracy compared with the SC model, as seen in Table 6. Therefore, the proposed model (DWSC model) provides competitive accuracy with a much lower number of parameters than the SC model and than other networks with millions of learnable parameters. This is due to the main benefit of the DWSC operation, which is a more efficient use of network parameters and an excellent trade-off between model complexity and representation capability 36,37. The point-wise 1 × 1 convolution is simply a linear projection that keeps the same spatial size and the same number of channels; however, it allows the model to include extra non-linearity through the activation function without affecting the receptive field.
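The parameter saving discussed above can be checked with a small calculation. The sketch counts convolution weights only (biases and batch-normalization parameters are ignored), adds the 1 × 1 point-wise term of the usual depth-wise separable formulation to the depth-wise count given in the text, and uses a hypothetical 64-to-128-channel layer, since the per-layer channel sizes of Table 2 are not reproduced here.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard convolution: kernel^2 * c_in * c_out."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights of a depth-wise conv plus a 1 x 1 point-wise conv."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per channel
    pointwise = 1 * 1 * c_in * c_out    # linear mix of the depth-wise outputs
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input and 128 output channels.
sc = standard_conv_params(3, 64, 128)          # 73728 weights
dwsc = depthwise_separable_params(3, 64, 128)  # 8768 weights
layer_ratio = sc / dwsc                        # roughly 8.4x fewer weights

# Whole-model ratio from the totals reported in the text.
model_ratio = 3_800_000 / 664_600              # roughly 5.7x
```

The whole-model ratio is smaller than the per-layer ratio because, as described earlier, the model starts with a standard convolution and also contains layers that DWSC does not shrink.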

Conclusion
The proposed lightweight CNN based on depth-wise separable convolution (DWSC) was tested on normal and different arrhythmic heartbeats and was found effective in enhancing person identification performance for several common arrhythmia types. The experimental results show excellent recognition performance on normal heartbeats (99.28%) as well as on arrhythmic heartbeats (93.81%). The model outperformed other commonly used models in terms of average accuracy, as shown in Table 7. We generated ROC curves (Fig. 6) using the FAR and FRR for each class compared with the remaining classes (one-versus-rest mode of classification). It is possible to balance the error rates for an authentication system by fine-tuning the network parameters, which we leave as future work. As other possible future work, we also aim to investigate challenges such as automatic scalability as the number of users increases, recognition with noisy signals, and the effect on recognition performance of ECG signal variability due to different physical conditions such as stress or exercise.
The model's scalability (i.e., adding more users/classes) is an important requirement of an identification system in real-world situations. Instead of requiring additional training of the entire model from scratch, the system should be able to adapt itself automatically. However, the automatic adaptation of an existing model presents a challenge, and we would like to look into a few solutions for this as future work. First, there have been some recent studies on Automated Machine Learning (AutoML) 48, which could be incorporated into biometrics. An alternative strategy is to use transfer learning to retrain the previously trained model on the new pattern, after adjusting the final fully connected layer and the classification layer of the old model to account for the additional class. This technique could involve optimizing the model's hyperparameters for training on the new class using optimization algorithms such as Bayesian optimization for automated hyperparameter tuning.

Figure 1. Block diagram of the proposed method.

Figure 2. Heartbeat morphology of the ECG signal around the R-peak for arrhythmic (top) vs. normal (bottom) beats for subjects of the MIT-BIH dataset: (a) atrial premature beats (A-arrhythmia) for subject #103; (b) aberrated atrial premature beats (a-arrhythmia) for subject #203; (c) paced beats (/-arrhythmia) for all common subjects; (d) nodal (junctional) premature beats (J-arrhythmia) for all common subjects. Subfigures (a) and (b) show the intra-individual similarities; inter-individual differences can also be observed from these two subfigures.

Figure 3. The CWT transformation images of ECG heartbeats from the MIT-BIH dataset 28 for: (a) normal beats; (b) premature ventricular contraction (V-arrhythmia); (c) atrial premature beat (A-arrhythmia); (d) paced beat (/-arrhythmia). Differences in heartbeat morphology among the three types of arrhythmia can be observed from the three subfigures.

Figure 6. ROC curves (validation) of the person identification system evaluated on the MIT-BIH dataset for: (a) normal beats (training set) versus beats of one arrhythmia type (test set); (b) beats of one arrhythmia type (training set) versus normal beats (test set).

Table 1. ECG-based biometric systems on normal and arrhythmic subjects.

Table 2. Architecture of the proposed CNN model.

Table 3. Statistical information of the MIT-BIH dataset.

Table 4. Deep network model parameters, time consumed, and recognition rate on regular heartbeats. M: millions, K: thousands. Significant values are in bold.

Table 5. ECG-based identification performance for different types of arrhythmia. N: normal beats. A: arrhythmic beats. *These values are affected by the zero values in the confusion matrix.

Table 6. Recognition accuracy of normal (training set) versus arrhythmic beats (test set) with and without DWSC. Significant values are in bold.

Stride = 2 is used in some convolution operations, but not in all of them, to avoid an aggressive reduction of the feature-map spatial size. This model shape is inspired by networks such as VGG16, VGG19, MobileNet, ShuffleNet, and Xception. Most of these networks are DAG networks, which have more complex architectures. The proposed model is light and simple; its architecture is linear, so each layer has a single input and each output goes to a single layer. Unlike DAG architectures, the proposed model is free of addition/concatenation operations and residual connections. Residual connections are not needed in the proposed model because of its simplicity and small number of layers.

Table 7. Identification results of the proposed methodology compared with other models (N vs. A). Significant values are in bold.

Table 8. Performance comparison of the proposed model and other models on the MIT-BIH dataset.