A Personalized Arrhythmia Monitoring Platform

Arrhythmia detection is the core of cardiovascular disease diagnosis. Though, there is no such generic solution for detecting the arrhythmias at the moment they occur which is due to the non-stationary nature and inter-patient variations of ECG signals. The feature extraction and classification techniques are significant tools widely used in the automated classification of arrhythmias. This study aims to develop a personalized arrhythmia monitoring platform allowing real-time detection of arrhythmias from the subject’s electrocardiogram (ECG) signal for point-of-care usage. A novel method, i.e. discrete orthogonal stockwell transform (DOST) technique for feature extraction is employed to capture the significant time-frequency coefficients to constitute the feature set representing each of the ECG signals. These coefficients or features are classified using artificial bee colony (ABC) optimized twin least-square support vector machine (LSTSVM) for classifying the different categories of ECG signals. The ABC optimizes the dimension of the feature set and the learning parameters of the classifier. The proposed method is prototyped on the commercially available ARM-based embedded platform and validated on the benchmark MIT-BIH arrhythmia database. Further, the prototype is evaluated under two schemes, i.e. class and personalized schemes which reported a higher overall accuracy of 96.29% and 96.08% in the respective schemes than the existing works to the state-of-art CVDs diagnosis.

The world health organization places the cardiovascular diseases (CVDs) as the leading cause of deaths globally. These CVDs occur due to the long-term effect of cardiac arrhythmias. Generally, the cardiac arrhythmias are not often life-threatening but can lead to cardiac death or heart failure in long run and required to be detected on time. Computerized electrocardiography (ECG) is a widely used diagnostic measure to interpret the function of subject's heart. The automatic monitoring of heartbeats is divided into four stages (i) pre-processing (ii) R-peak detection and ECG signal segmentation (iii) feature extraction and classification. The initial two stages i.e. preprocessing and R-peak detection and heartbeat segmentation are widely explored in the literature. However, the improvements can be done in the feature extraction and classification stages. Several methods have been throughly studied in the domain of efficient and automatic monitoring of heartbeats. The detection and recognition of ECG signals mainly consist of the combination of an efficient feature extraction technique followed by the machine learning algorithms [1][2][3][4][5][6][7][8][9][10][11][12][13] . In the feature extraction stage, time domain 7,14 , frequency domain 8 , time-frequency domain 9,10 , and statistical techniques [15][16][17] are commonly used to capture the significant features of the heartbeats. However, each of these techniques exhibit certain limitations in their domain of analysis. For a time-domain input signal, the classical Fourier transform (FT) fails to provide any information regarding the time of occurrence of the frequency components. The ambiguity of FT is overcome by short-time Fourier transform (STFT), however it is limited to stationary signals only due to the constant window length. The shortcoming of STFT is overcome by Wavelet transform (WT), however the choice of mother wavelet and the levels of decomposition of an input signal remains a challenge. The S-transform overcome the challenge in WT, however it provides a redundant representation of the time-frequency plane and hence, is computationally expensive. Therefore, an efficient version of ST i.e. DOST is employed in this study to extract significant characteristics from the ECG signals. These feature extraction methods are combined with the classification tools which include the artificial neural network (ANN) 11,18 , support vector machines (SVMs) 10 , k-nearest neighbor and many more. However, these techniques fail to provide a generic solution and suffer from the drawbacks such as (1) a general purpose modeling is infeasible due to high inter-subject variation of heartbeats (2) due to the non-stationary nature of ECG, the QRS complex, P waves, and RR-intervals vary from one signal to another depending upon the lifestyle of subject 3 and (3) the discriminative capability of the extracted features vary significantly for different heartbeat patterns (4) the training and testing dataset consists of heartbeats from the same patient which is not feasible for practical applications. This paper addresses the aforementioned issues by presenting a novel analysis method i.e. discrete orthogonal stockwell transform (DOST) based artificial bee colony (ABC) optimized least-square twin support vector machines (LSTSVMs) to detect and classify different categories of ECG signals. The DOST localizes the spectrum and retains the phase properties by extracting the time-frequency features from an input ECG signal. The artificial bee colony (ABC) technique gradually optimizes the learning parameters for the LSTSVM classifier to develop an efficient model for the extracted features. The proposed method is implemented on the commercially available ARM-based embedded platform and validated on the benchmark MIT-BIH arrhythmia database 19 . The developed platform is evaluated under two assessment schemes i.e. class and personalized schemes. The personalized scheme is suitable for practical applications, since the training and testing datasets comprises of records of different patients (i.e. considering the inter-individual variability in the data). The prototype is trained in off-line mode while the results are reported on the testing dataset. More particularly, this study proposes the validation of developed platform on the MIT-BIH data for real-time verification of the proposed method allowing monitoring of arrhythmias for point-of-care applications.

Results
Database Setup. The proposed method is validated over the benchmark MIT-BIH arrhythmia database 19 and evaluated under two analysis schemes i.e. setup I (class-oriented) and setup II (personalized) schemes respectively.
Setup I (Class-oriented scheme): In this scheme, all the 48 records of the database comprising beat 110109 labels are utilized for analysis. The experiments are carried out on the training and testing datasets constituted by randomly selecting a fraction of ECG signals from each class and further divided into sixteen clusters where each class of ECG signal represents each cluster. Here, each class of heartbeat is represented using each cluster. The training dataset comprises of 23996 signals, (i.e. 21.26%) while rest of heartbeats are utilized for the testing purpose. Table 1 presents the training and testing datasets utilized for the experimental purpose. Further, the optimal parameters for the classifier are determined by performing 14 fold cross-validation on the training dataset, while the testing is performed on the testing dataset 7 . The final accuracy is computed by averaging the accuracy of all the folds.
Setup II (Personalized scheme): In this scheme, the AAMI standard 14,20-22 is followed where four records i.e. 102, 104, 107, and 217 are excluded from the datasets, i.e. the analysis is carried out on the remaining 44 records. The 16 classes of ECG signals from the MIT-BIH arrhythmia database are mapped into five bigger classes namely N (beats originating in the sinus mode), S (supraventricular ectopic beats (SVEBs)), V (ventricular ectopic beats (VEBs)), F (fusion beats), and Q (unclassifiable beats). Under this scheme, two experiments are performed following Ince et al. 21 (Setup IIA) and Chazal et al. 7 (Setup IIB). In the setup IIA, the first 20 records (within a range of 100-124), which include representative samples of routine clinical recordings, are used to select representative beats to be included in the common training data. The rest 24 recordings (within a range of 200-234) contain ventricular, junctional, and supraventricular arrhythmias. A total of 83648 beats from all 44 records are used as test patterns for performance evaluation. a stopping criterion is decided which is the minimum train classification error level that is set to 1% to prevent over-fitting. In setup IIB, the training and testing datasets are constituted by equally splitting the records i.e. each dataset comprises of 22 records. Further, the optimal parameters for the classifier are determined by performing 22 fold cross-validation on the training dataset, while the testing is performed on the testing dataset 7 . The final accuracy is computed by averaging the accuracy of all the folds. Both these experiments are performed to have a fair comparison among the existing methods in the literature. Proposed Method Flow. The block diagram of the proposed method flow is depicted in Fig. 1.
• Data Acquisition Unit: The real-time input ECG signals are generated using MIT-BIH arrhythmia data and provided as input to the embedded platform for processing and analysis. Since, the monitoring platform is trained in off-line mode, therefore ECG signal from the testing datasets only for both the schemes i.e. setup I and II are generated. • ARM based embedded platform: The proposed method is implemented on the ARM embedded platform to provide real-time monitoring of electrocardiogram signals. The real-time input ECG signals are input of the ARM based embedded platform. The median filters and low-pass filter to remove the baseline wander and high-frequency noise from the corresponding ECG signals. Thenafter, the R-peak in the ECG signal is detected within the input ECG signals. The discrete orthogonal stockwell transform features (i.e. 1 × 256) are computed to represent the input ECG signals. The selection of features and tuning of the least square twin SVM classifier is done using the artificial bee colony (ABC) method. Once the feature reaches the classification stage, the computation gets started and the input signal is assigned to a particular class. The class of heartbeats detected by the developed platform for each tested signal is compared with the annotations file to formulate the results and reported in the confusion matrix as in Tables 2, 3 and 4 for setup I, IIA and IIB i.e. class-oriented and personalized assessment schemes respectively. This matrix maps the correctly classified and misclassified signals into their subsequent classes. In the confusion matrix, the column represents the predicted signals detected by the platform while the row represents the ground truth (i.e. labels provided in the database). In Table 2, among the 86113 testing signals, 82918 signals are correctly detected achieving a high accuracy of 96.14% with an error rate of 3.86%. In the personalized scheme (Setup IIA), an accuracy of 96.08% is achieved i.e. out of 83648 signals 80379 signals are correctly detected by the prototype summarized in Table 3. While in setup IIB as in Table 4, the platform reported an accuracy and error rate of 86.89% and 13.11% respectively i.e. out of 49711 signals, 43194 signals are correctly detected. The accuracy reported for the classes 'e' and 'Q' as in Tables 2, 3 and 4 is very low i.e. which is due to the less amount of data considered for training purpose.
Further, other metrics such as true positives (TP), false positives (FP) and false negatives (FN) parameters are computed for each of the schemes and are summarized in Tables 2, 4 and 3. In setup I scheme, the overall S E , P P , F s metrics for the 16 classes of ECG signals is computed and presented in Table 2 which is 96.29%, 96.29% and 76.06% respectively. In the personalized scheme i.e. setup IIA following the Ince et al. 21 , the overall S E , P P , F s are 96.08%, 96.08% and 96.08 respectively summarized in Table 3. In setup IIB following the Chazal et al. 7 , the overall S E , P P , F s is 86.89%, 86.89% and 86.89% respectively presented in Table 4.
These results reported under both the setup I, IIA and IIB (i.e. class and personalized schemes) are directly compared with the existing methods reported in the literature and presented in Tables 5 and 6. It can be concluded from Tables 5 and 6, that the proposed method reported higher classification accuracy under both the analysis schemes. In addition, the study classifies more number of ECG signals in the class-oriented scheme. It is evident that the features extracted are significant enough to efficiently represent the input ECG signal in time-frequency space for the developed classifier model, leading to higher accuracy and improved performance achieved by the prototype.

Discussion
As expected, the performance of developed personalized arrhythmia monitoring platform evaluated under setup II (i.e. personalized scheme) reported worse results than the setup I (i.e class-oriented) scheme, due to inter-individual variability in physiological characteristics between the data of different subjects. The personalized-analysis scheme is suitable for practical applications due to the fact that the training and testing dataset comprises of records of different patients. Further, due to a huge variation in the number of ECG signals among various classes used for training and testing purpose, the performance metrics can be considered as highly

Predicted Labels
Total     Cardiac Summary Report and Storage. The classes of arrhythmias detected by the platform under both the schemes (i.e. setup I and setup II) are stored in a text ('.txt') file to generate a daily cardiac summary report regarding the status of heart. The report provides the total number of ECG signals detected by the prototype, along with the number of ECG signals detected in each class. A 1 Gb memory SD card is provided to store the cardiac report summary report, thus allowing off-line analysis by an cardiology expert. In the off-line analysis, the expert can analyze the ECG data (i.e. in digital form) and their classes detected by the platform, thus allowing the reduction in time consumption required for providing preventive measures to the patients. It is to note that a 1 GB card can store the ECG data for 40 days 1 while the length of recording depend on the memory size of SD card used. Further, during the detection of an arrhythmia by the platform, an alarm is triggered which is a pop-up message warning notification to alert the user. This allows the user to do not continuously monitor the cardiac summary reports and provides a no emergency condition by detecting an arrhythmia at a earlier stage to prevent the users from serious ailments for cardiovascular diseases. The developed prototype can be fabricated to develop a wearable or handheld device for point-of-care use providing real-time feedback regarding the condition of heart to the end users.

Methods
In this study, a novel ECG signal recognition scheme, i.e. discrete orthogonal stockwell transform (DOST) based artificial bee colony optimized twin least-square support vector machine (ABC-LSTSVM) is explored for automated recognition of cardiac arrhythmias. The block diagram representation of the proposed method is depicted in Fig. 2. Further, the proposed method is implemented on the microcontroller platform to monitor the different categories of ECG signals validated on the benchmark MIT-BIH arrhythmia database.

RR-interval NN 90
Cvikl et al. 30  MIT-BIH Database. The benchmark MIT-BIH database is chosen to provide the real clinical situation. The database comprises of 48 different subjects. The data are band-pass filtered at 0.1 H-100 Hz and sampled with a rate of 360 Hz which is acquired with 11-bit resolution over 10 mV range. Further, the data acquired is bandpass filtered at 0.1-100 Hz while the database comprises a total of 110109 beat labels. The database also provides the class annotations. In this study, modified limb lead II (MLII) signals are used for the validation while their class annotations are used as ground truth to formulate the results. It is to note that the all the 48 records of the database are utilized to evaluate the proposed method.
Pre-processing. There are several kinds of noise associated with the raw ECG signal. Among them include the artifacts due to muscle contraction, power-line interference, baseline wander. Hence, pre-processing is essential for an efficient fiducial point detection and recognition of ECG signals. The pre-processing is applied to the noisy ECG signals to improve the signal-to-noise ratio (SNR) and remove the noise. Initially, to remove the baseline wander, the ECG signals are passed through two median filters. The first median filter of 200 ms removes the QRS complex and P wave while the second median filter of 600 ms removes the T wave within the ECG signal. The output signal of the second filter contains the baseline associated with the ECG signal and hence, it is subtracted from the original signal to generate baseline corrected signal. A 12-tap FIR low-pass filter with a cut-off frequency of 35 Hz with equal ripple in the pass and stop bands is used here to remove the high-frequency noise and power-line interference. Finally, the filtered ECG signal is used for processing and analysis.

R-peak detection and ECG Segmentation.
In the practical scenario, the automatic detection of R-peak is essential to evaluate the proposed method entirely for ECG signal analysis. For this purpose, a well established Pan-Tompkins (PT) method 23 is adopted for the subsequent detection of R-peak in the corresponding ECG signals. The PT method has reported good performance in noisy conditions with less complexity achieving a sensitivity of 99.87% than 24,25 . Therefore, the method is used as a ready-made solution for the R-peak detection in the consecutive ECG signals. A window of length 0.512 ms is taken across each R-peak to determine the size of each ECG signal. Here, each ECG signal consists of 256 samples i.e. 110 samples before and 145 samples after detected R-peak of the heartbeat. The R-peak detection stage is followed by the feature extraction and classification stages for an efficient recognition of heartbeats.  (NlogN + NlogN + N). The data flow of the DOST algorithm is depicted in Fig. 3. The time-frequency coefficient vectors are extracted from the corresponding heartbeats that are used as final feature set to recognize the heartbeats into different classes using the ABC-LSTSVM classifier model. Figure 4 shows the reconstruction of the normal and the abnormal i.e. LBBB and PVC signals. Here, the error in reconstruction for all the three classes is of the order of 10 −15 . The DOST features of vector of size N×1 are extracted for different categories of heartbeats. Therefore, it can be concluded from Fig. 4 that DOST coefficients exhibits better performance while reconstructing the ECG signals. Therefore, the DOST can be considered as a potential tool for extracting significant features from the corresponding ECG signals.
Feature classification using ABC Optimized least-square twin SVM. An optimal artificial bee colony optimized least-square twin SVM (ABC-LSTSVM) classifier model is developed to classify the DOST features into various categories under the two schemes. In this study, the radial basis function (RBF) kernel is adopted to perform non-linear analysis of the input data by mapping them into the high-dimensional feature spaces. In the recognition phase, directed acyclic graph (DAG) technique 27 is adopted to provide the multi-class solution to the classifier model. The DAG approach is suitable for practical applications due to the fact that its computational complexity is . The cost function of both the two hyperplanes are considered as equal i.e. c 1 = c 2 = C for significant reduction in computational load of selecting the parameters. The testing time involved is less than those of one-against-one and one-versus-all approaches 27 . In the training phase of DAG-SVM, N(N − 1) non-parallel hyper-planes and binary classifiers are constructed for a total number of N classes. While in testing phase, a binary rooted DAG is used consisting N(N − 1)/2 internal decision nodes (binary classifiers) and N leaves. The searching range of penalty C and kernel argument γ parameters are in the range of [2 −5 , 2 10 ] and [2 −10 , 2 5 ] respectively. The artificial bee colony (ABC) 28 technique is employed to search the optimal learning parameters for the least-square twin support vector machine classifier.
The ABC aims to gradually optimizes the least-square twin SVM classifier by a) employing the best features to distinguish between different categories of heartbeats automatically and b) selecting best learning parameters for the classification model. The flowchart of the ABC technique is presented in Fig. 5 and its pseudo code is summarized here.
In the ABC algorithm, the parameters considered for performing the experiments are summarized in Table 7. ABC evaluates the fitness of each food source at each iteration and determines the optimal network. In the training phase, the classifier parameters i.e C and γ are gradually tuned using the ABC algorithm following the m-fold cross-validation strategy 29 where the simple SV count is employed as a fitness criteria. The ten-fold cross-validation is performed on the training as well as testing dataset i.e. entire dataset using the optimal parameter values of C and γ to achieve better accuracy. It is to note that the whole procedure is performed for the datasets under both the analysis schemes presented in section 0 i.e. for both the schemes different values of C and γ are obtained. During testing phase, the testing dataset is utilized for estimating the classification accuracy for ECG signal class which is reported in terms of confusion matrix.   Hardware prototyping and implementation of proposed method. This study aims to develop microcontroller-based platform technology having the capability to perform real-time ECG reception, extraction of features, and heartbeat recognition. To demonstrate the feasibility of the idea, a proof-of-concept prototype is developed using commercially available ARM9 platform in the laboratory experimental setup.
Laboratory prototype. The proposed method is implemented on the ARM based hardware test platform in real scenario. The experiments are performed to evaluate the proposed method and monitor the different classes of ECG signals. The #C++ programming language is used to develop the proposed method while the debugging environment is linux to program the ARM processor. The gcc compiler compiles the program to generate the output file. While the output file of the program is executed, the processor platform starts predicting the ECG signals. The real-time ECG signals are generated using the arbitrary function generator (AFG 3252) while their class detected by the platform is observed on the liquid crystal display (i.e. 16 × 2) interfaced with the processor platform. The AFG 3252 provides two-channel analog output system from which these real-time signals using channel I are taken through BNC crocodile connection cable (i.e. with an impedance of 50 Ω) to the platform as input for further processing and analysis. In addition, the morphology of the signals can be seen on mixed signal oscilloscope (i.e. MSO 2024B). Figure 6  No. of Food sources 25 Table 7. Parameters in the ABC Technique.  schemes i.e. case I and case II will yield different values of C and γ. The platform is evaluated on the testing dataset signals to estimate the performance. The ECG signals from the testing dataset are randomized, re-annotated and generated in a text file which is transferred to the arbitrary function generator (i.e. AFG 3252) for real-time generation of input signals. In the MIT-BIH arrhythmia database, the ECG data available have a maximum dynamic range of ±5 mV. The data is converted in the range of 0-5 V using the AFG for its processing by a typical 10-bit analog-to-digital (ADC) converter. The sampling rate of the ADC is set to 500 Hz. Further, a pause is added; so that the packages should arrive at the same sampling rate as the ADC. The data is encoded with a fixed point math by using 32-bit representation. Thenafter, the proposed method is applied for processing the acquired input signal. The raw ECG signals are preprocessed to remove noise and improve the SNR of the ECG signals. In Fig. 7, shows the two channel mixed signal oscilloscope (MSO) in which the noisy along with the filtered ECG signals are observed in channels I and II respectively. Further, the R-peak of the filtered ECG is detected using the PT algorithm 23 which is depicted in Fig. 8(a). Figure 8(a), shows the R-peak detection in the consecutive ECG signals on the digital signal oscilloscope. The R-peak detection is followed by a window of length 256 samples, i.e. 0.712 ms chosen across each R-peak to determine the length of ECG segments and depicted in Fig. 8(b).
Further, the ECG segments are passed through the feature extraction and classification stages. In the feature extraction stage, the time-frequency features are extracted. The twin SVM model is developed and exploited in off-line mode for training purpose. The performance of SVM classifier is presented at convergence during the optimization procedure using ABC. The real-time testing is performed on the testing dataset where once the features reaches this stage, the computation gets started and the input ECG signals are recognized by the test platform. During testing, the category of ECG signals detected is displayed on the liquid crystal display (i.e. 16 × 2 LCD) module while their morphology is displayed on mixed signal oscilloscope (i.e. MSO 2024B) in the experimental laboratory setup depicted in Fig. 6. In Fig. 6, a right bundle branch block (RBBB) signal is detected at that instant of time and consequently, the other classes of the ECG signals are detected by the developed platform.

Conclusions
This study presents a novel method i.e. discrete orthogonal stockwell transform for feature extraction and artificial bee colony (ABC) optimized least-square support vector machines for the classification of the features extracted for each of the ECG signals into their subsequent categories. The proposed method is implemented on ARM processor platform, validated on the MIT-BIH arrhythmia database and evaluated under two schemes i.e. class and personalized schemes to monitor the different classes of arrhythmias. The evaluation under personalized scheme is suitable for practical scenario. A higher accuracy of 96.29% and 86.89% is reported under class and personalized schemes respectively than the existing works in the literature. The platform can be utilized by end users to maintain a daily record of the cardiac health status to enhance the healthcare for cardiovascular diseases and lead a healthy life style. Further, the platform can also be utilized in hospitals to analyze the long-term ECG recordings, thus reducing the time of cardiologist in providing the necessary preventive measures to the patients.