Reliable P wave detection in pathological ECG signals

Accurate automated detection of P waves in ECG enables fast and correct diagnosis of various cardiac arrhythmias and selection of a suitable treatment strategy. However, P wave detection is still a challenging task, especially in long-term ECGs with manifested cardiac pathologies. Software tools used in medical practice usually fail to detect P waves under pathological conditions, and most recently published approaches have not been tested on such signals at all. Here we introduce a novel method for accurate and reliable P wave detection that is successful in both normal and pathological cases. Our method uses the phasor transform of the ECG and innovative decision rules to improve P wave detection in pathological signals. The rules are based on deep knowledge of how the heart manifests in the ECG during various arrhythmias, such as atrial fibrillation and premature ventricular contraction. By involving the rules in the decision process, we are able to find the P wave in the correct location or, alternatively, not to search for it at all. In contrast to other studies, we use three highly variable annotated ECG databases, which contain both normal and pathological records, to objectively validate our algorithm. The results for physiological records are Se = 98.56% and PP = 99.82% for the MIT-BIH Arrhythmia Database (MITDB, with MITDB P-Wave Annotations) and Se = 99.23% and PP = 99.12% for the QT database. These results are comparable with other published methods. For pathological signals, the proposed method reaches Se = 96.40% and PP = 91.56% for MITDB and Se = 93.07% and PP = 88.60% for the Brno University of Technology ECG Signal Database with Annotations of P Wave (BUT PDB). In these signals, the proposed detector greatly outperforms other methods and thus represents a major step towards effective use of fully automated ECG analysis in real medical practice.


Methods
The entire algorithm for P wave detection consists of eight parts: (a) QRS complex detection, (b) T wave detection, (c) PVC detection, (d) AFIB detection, (e) pathology check, (f) normal P wave detection, (g) dissociated P wave detection, and (h) P wave verification. The complete architecture of the P wave detection algorithm is shown in the block diagram in Fig. 1. Each block is described in detail below. The integration of methods for AFIB and PVC detection and of several novel decision rules into the P wave detection algorithm is an important innovation of the proposed method.
Values and constants used during P wave detection were determined according to knowledge of cardiac activity (definition of the search areas for the P wave, T wave and PVC, checks of PVC and AFIB detection, etc.) and the properties of the phasor transform (PT; the value of $R_V$).

(a) QRS complex detection
First, QRS complexes are detected. The raw signal is preprocessed 34,35 and filtered by a bandpass FIR filter with a Hamming window and a passband of 12 to 19 Hz 36 in order to enhance the QRS complexes and suppress P and T waves. After filtering, the phasor transform (PT) is applied to the signal. PT enhances variations of the signal's components (such as P waves, T waves and QRS complexes) and makes the detection of these components easier 23 . PT transforms each sample of the signal into a complex value while preserving the signal information. The constant value $R_V$ is taken as the real part of the phasor signal, while the value $x(n)$ of the original ECG sample is taken as the imaginary part: $y(n) = R_V + j\,x(n)$. $R_V$ is set to a value in the interval 0 to 1, which determines the degree of wave enhancement in the ECG. For QRS detection, $R_V = 0.001$ is used. The phase (phasor) signal $PT(n)$ is then computed as $PT(n) = \tan^{-1}\left(\frac{x(n)}{R_V}\right)$. In the signal $PT(n)$, maxima are detected in a sliding window (300 ms long) and compared to an adaptive threshold set to twice the standard deviation calculated in a 2 s moving window. The positions of maxima that are higher than the threshold are considered to be the positions of QRS complexes (R waves). If the current RR interval RR(i) exceeds 1.75 times the previous one RR(i − 1), backward searching with a new threshold set to 30% of the last detected QRS amplitude is additionally applied to add possible missing detections. More detailed information about QRS detection via PT can be found in our previous work 31 . The output of this step (the QRS positions) is then used to demarcate the search areas for P and T waves and to detect PVC and AFIB.
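The phase computation and the thresholded peak search described above can be illustrated by the following minimal Python sketch. It is not the reference implementation: the function names, the centred placement of both windows and the handling of signal edges are assumptions made for illustration, and the preprocessing, the bandpass filter and the backward search are omitted.

```python
import numpy as np

def phasor_transform(x, rv):
    """Phase of y(n) = rv + j*x(n), i.e. PT(n) = arctan(x(n) / rv)."""
    x = np.asarray(x, dtype=float)
    return np.arctan(x / rv)

def find_peaks_above_adaptive_threshold(pt, fs, peak_win_s=0.3, std_win_s=2.0):
    """Return indices that are the maximum of a 300 ms window and exceed
    twice the standard deviation of a 2 s window (both centred: assumption)."""
    pt = np.asarray(pt, dtype=float)
    half_peak = int(round(peak_win_s * fs / 2))
    half_std = int(round(std_win_s * fs / 2))
    peaks = []
    for n in range(len(pt)):
        lo, hi = max(0, n - half_peak), min(len(pt), n + half_peak + 1)
        if pt[n] < pt[lo:hi].max():
            continue  # not the largest value in its 300 ms neighbourhood
        lo_s, hi_s = max(0, n - half_std), min(len(pt), n + half_std + 1)
        if pt[n] > 2.0 * pt[lo_s:hi_s].std():
            peaks.append(n)
    return peaks
```

With a small $R_V$ such as 0.001, even low-amplitude samples are mapped to phases close to π/2, which is why the same transform with a different $R_V$ can be reused for T and P waves. Note that this simplified search reports every sample of a flat-topped maximum; a practical detector would merge such neighbours.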

(b) T wave detection
The search area for T wave detection is determined from the position of the corresponding QRS complex R(i) as R(i) + 0.12 × RR(i) to R(i) + 0.57 × RR(i) + 60 ms. In this area, the ECG is transformed using the PT in the same way as for QRS detection, but with $R_V = 0.1$. The maximum of the phase signal $PT(n)$ within the demarcated segment is considered to be the position of the T wave. We have already used a similar method 31 , but here we introduce a different way of demarcating the search area. The T wave positions are further used to identify the area for P wave searching.
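A minimal sketch of this step is given below, assuming that positions and RR(i) are expressed in samples and that the caller supplies the RR interval associated with beat i; the function names and the rounding of the window bounds are illustrative assumptions.

```python
import numpy as np

def t_wave_window(r_i, rr_i, fs):
    """Search window R(i) + 0.12*RR(i) to R(i) + 0.57*RR(i) + 60 ms, in samples."""
    start = r_i + int(round(0.12 * rr_i))
    stop = r_i + int(round(0.57 * rr_i + 0.060 * fs))
    return start, stop

def detect_t_wave(ecg, r_i, rr_i, fs, rv=0.1):
    """Position of the phase maximum inside the T wave search window."""
    start, stop = t_wave_window(r_i, rr_i, fs)
    stop = min(stop, len(ecg) - 1)  # clip at the end of the record (assumption)
    segment = np.asarray(ecg[start:stop + 1], dtype=float)
    phase = np.arctan(segment / rv)  # phasor transform with R_V = 0.1
    return start + int(np.argmax(phase))
```

For example, with a sampling frequency of 360 Hz, R(i) at sample 1000 and RR(i) of 300 samples, the window spans samples 1036 to 1193.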

(c) PVC detection
Recognition of PVC is a very important step for demarcating the area where a P wave may occur. If the current beat R(i) is marked as PVC, the P wave is not searched for before the QRS complex, because it is not present at all. The proposed PVC detection method is effective and has a low computational cost. It is based on only one simple feature extracted from the QRS, namely the area under the QRS (AUC(i)), calculated from the segment demarcated around the current R wave position as R(i) − 150 ms to R(i) + 150 ms. Before the AUC calculation, the signal is filtered using a high-pass Lynn's filter with a cut-off frequency of 0.67 Hz to eliminate baseline wander. The current beat is then considered a PVC if its AUC(i) is more than 1.3 times larger than the median AUC calculated from all previous beats. An example of the PVC detection procedure is shown in Fig. 2. In previous studies, we used a multi-feature approach for PVC detection 24,37,38 , where we combined AUC with other features. Here, we achieved promising results by using AUC only (see below).
In the next step, the number of beats labelled as PVC by the algorithm is calculated. If more than 75% of all beats in the ECG are labelled as PVCs, the PVC detection results are considered a mistake. Instead, the morphological changes of the beats are considered to be a sign of right or left bundle branch block, which is known to have ECG manifestations similar to those of PVC. As a result, each beat is labelled as normal or PVC, and this label is used in further steps.
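The AUC rule and the 75% sanity check can be sketched as follows. This is an illustration under stated assumptions rather than the published code: the area is taken as the sum of absolute sample values scaled by the sampling period, the input is assumed to be already high-pass filtered, and the first beat (which has no preceding beats) is labelled as normal.

```python
import numpy as np

def qrs_areas(ecg, r_peaks, fs, half_width_s=0.150):
    """Area under |ECG| from R(i) - 150 ms to R(i) + 150 ms for each beat."""
    ecg = np.asarray(ecg, dtype=float)
    half = int(round(half_width_s * fs))
    areas = []
    for r in r_peaks:
        lo, hi = max(0, r - half), min(len(ecg), r + half + 1)
        areas.append(np.sum(np.abs(ecg[lo:hi])) / fs)
    return np.array(areas)

def label_pvc(areas, ratio=1.3, max_pvc_fraction=0.75):
    """Mark beat i as PVC if AUC(i) > 1.3 * median of all previous AUCs.
    If more than 75% of beats end up marked, discard all PVC labels."""
    areas = np.asarray(areas, dtype=float)
    is_pvc = np.zeros(len(areas), dtype=bool)
    for i in range(1, len(areas)):
        is_pvc[i] = areas[i] > ratio * np.median(areas[:i])
    if len(areas) and is_pvc.mean() > max_pvc_fraction:
        is_pvc[:] = False  # treated as bundle branch block morphology instead
    return is_pvc
```

Because the median is taken over all previous beats, isolated wide beats do not shift the reference area much, which keeps the single-feature rule stable on records with occasional PVCs.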

(d) AFIB detection
During AFIB, P waves are not present in the ECG 39 and, thus, common P wave detection algorithms produce many false positive detections. To eliminate this problem, we supplemented our algorithm with a check for the presence of AFIB in the current beat R(i). If the beat is marked as AFIB, the algorithm does not search for the P wave at all.
The pilot version of the AFIB detection method was published in 40 . Here, we introduce a modified approach. It is based on the representation of heart rate dynamics via so-called symbolic dynamics (symbols and words) and Shannon entropy (SH). First, the heart rate sequence (hr(i)) is calculated from the RR intervals (RR(i)) and transformed into the symbol sequence (Sy(i)) 41 . A 3-symbol template is then used to examine the entropic properties of Sy(i), and another 3-symbol template is used to obtain the transformed sequence of words (wv(i)). The template length was set to only 3 samples to ensure a low computational demand of the sequence analysis. Second, SH(i) is computed from a segment of 59 consecutive word elements (beats) selected as wv(i − 29) to wv(i + 29). Finally, the beats with SH(i) higher than 0.737 (selected empirically) are marked as AFIB, since during AFIB the RR intervals are highly variable, resulting in a large SH 41,42 . The process of AFIB detection is illustrated in Fig. 3. The upper graph shows the lengths of the RR intervals, and the lower graph shows the corresponding SH values and the decision threshold for AFIB detection. The figure shows that the increased SH values (above the threshold) coincide with the presence of AFIB in the ECG (according to the ground truth annotations available in the database).
In an ECG with detected AFIB, the number of PVCs is then calculated within segments of 59 consecutive beats. If more than 30 PVCs (about 50% of the beats) are present in the segment, then the increased SH values are likely due to the PVCs, which are 'surrounded' by RR intervals of specific lengths (shortened before and extended after the PVC) that differ from the lengths of the RR intervals surrounding normal beats. In this case, the current beat R(i) is not considered AFIB. As a result, each beat is labelled as normal or AFIB, and this information is used in further analysis.
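The final decision rule of this step can be sketched as below. The sketch assumes that the entropy values SH(i) have already been computed from the word sequence (the symbolic transformation itself is not reproduced here) and that the 59-beat segment used for counting PVCs is centred on the current beat; both the function name and the centred segment are illustrative assumptions.

```python
import numpy as np

def label_afib(sh, is_pvc, sh_threshold=0.737, half_window=29, max_pvc=30):
    """Mark beat i as AFIB if SH(i) > 0.737, unless the surrounding
    59-beat segment contains more than 30 beats labelled as PVC."""
    sh = np.asarray(sh, dtype=float)
    is_pvc = np.asarray(is_pvc, dtype=bool)
    is_afib = sh > sh_threshold
    for i in np.flatnonzero(is_afib):
        lo, hi = max(0, i - half_window), min(len(sh), i + half_window + 1)
        if is_pvc[lo:hi].sum() > max_pvc:
            is_afib[i] = False  # high entropy attributed to PVCs, not AFIB
    return is_afib
```

Near the beginning and end of a record the segment is shorter than 59 beats in this sketch, so the PVC count is less likely to exceed the fixed limit of 30 there.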

(e) Pathology check
In this stage, the algorithm checks whether the pathologies from steps (c) and (d) were detected in the current beat and decides whether the P wave detection process continues. In particular, if the beat is marked as AFIB, P wave detection in this beat is terminated (see above). In beats with no AFIB, the presence of a detected PVC is checked. If the beat is marked as PVC, the detection process is also terminated (see above). Otherwise, the algorithm continues to step (f).

(f) Normal P wave detection
If the currently analyzed beat is labelled as neither PVC nor AFIB, the segment for P wave searching is selected from the ECG as R(i − 1) + 0.71 × RR(i) to R(i) − 0.07 × RR(i) − 60 ms and transformed by the PT with $R_V = 0.05$. The maximum peak of the calculated phase signal is then considered a P wave candidate (cP). An example of P wave searching is shown in Fig. 4. For the first beat of the signal, the segment for P wave searching is set as R(i) − 300 ms to R(i) − 80 ms.

(g) Dissociated P wave detection
Dissociated P waves can usually be found in the ECG of patients with AVB II. To detect these waves carefully, we proposed simple criteria. First, it is checked whether there is a dissociated P wave in the previous RR interval (RR(i − 1)). If not, then three further criteria are checked. If a dissociated P wave was found in the previous interval, then one criterion is checked. In both cases, if the criteria are met, a dissociated P wave may be present in the current beat and, thus, the position of this wave is further detected. If the above criteria are not met, a dissociated P wave is not present in the beat and the detection procedure is terminated. The dissociated P wave is localized in the segment demarcated as T(i − 1) + 200 ms to P(i) − 400 ms. The segment is transformed by the PT in the same way as in step (f), and the position of the P wave candidate cP is found by detecting the maximum peak within the segment. The detection of dissociated P waves is illustrated in Fig. 5.
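The pathology check of step (e) and the search-window rule of step (f) can be combined in a short sketch. It assumes that R positions are given in samples, that RR(i) is the interval between R(i − 1) and R(i), and that an empty or inverted window yields no candidate; the function names and these conventions are assumptions made for illustration.

```python
import numpy as np

def p_wave_window(r_peaks, i, fs):
    """Search window for the P wave preceding beat i, in samples."""
    r_i = r_peaks[i]
    if i == 0:  # first beat: R(i) - 300 ms to R(i) - 80 ms
        return r_i - int(round(0.300 * fs)), r_i - int(round(0.080 * fs))
    r_prev = r_peaks[i - 1]
    rr = r_i - r_prev  # RR(i), assumed to be R(i) - R(i-1)
    start = r_prev + int(round(0.71 * rr))
    stop = r_i - int(round(0.07 * rr + 0.060 * fs))
    return start, stop

def p_wave_candidate(ecg, r_peaks, i, fs, is_pvc=False, is_afib=False, rv=0.05):
    """Return the candidate position cP, or None if the beat is skipped."""
    if is_afib or is_pvc:  # step (e): no P wave search in these beats
        return None
    start, stop = p_wave_window(r_peaks, i, fs)
    start = max(start, 0)
    if stop <= start:
        return None
    segment = np.asarray(ecg[start:stop + 1], dtype=float)
    return start + int(np.argmax(np.arctan(segment / rv)))
```

For a sampling frequency of 250 Hz and R waves at samples 300 and 500, the window for the second beat spans samples 442 to 471, which ends 29 samples (116 ms) before the R wave.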

(h) P wave verification
In the last step, the P wave candidates are validated. First, the voltage level of the candidate cP(i) is assessed by the criterion $U_P(i) > 0.05 \times U_{QRS}(i)$. If the criterion is not met, the P wave is probably not present in the current beat, which may occur in the case of a nodal origin of the beat/rhythm or an idioventricular rhythm. Second, the position of the candidate cP(i) is verified to ensure that the candidate is not a part of the previous T wave, but the true P wave of the current beat. The corresponding criterion is cP(i) > T(i − 1), where cP(i) is the position of the current P wave candidate and T(i − 1) is the position of the previous T wave. If this criterion is not met, the P wave candidate is excluded from the analysis, as it likely represents the T wave of the previous beat instead of the P wave of the current beat. Consequently, the true P wave is most probably absent in the current beat or is hidden in the previous QRS complex or previous T wave, as happens in the case of supraventricular tachyarrhythmia, sinus tachyarrhythmia or atrial premature beat. If both criteria are met, the candidate position cP(i) is considered to be the position of the P wave.
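The two verification criteria translate directly into a small predicate. In this sketch the amplitudes of the candidate and of the QRS complex are supplied by the caller, since the way they are measured is not specified here; the function name and the treatment of a missing previous T wave are assumptions.

```python
def verify_p_candidate(cp_pos, p_amp, qrs_amp, prev_t_pos, min_ratio=0.05):
    """Accept the candidate only if U_P(i) > 0.05 * U_QRS(i)
    and cP(i) lies after the previous T wave T(i-1)."""
    if cp_pos is None:
        return False
    if not (p_amp > min_ratio * qrs_amp):
        return False  # too small: P wave probably absent in this beat
    if prev_t_pos is not None and not (cp_pos > prev_t_pos):
        return False  # candidate is likely the previous T wave
    return True
```

For example, a candidate at sample 455 with an amplitude of 0.1 mV is accepted for a 1 mV QRS complex and a previous T wave at sample 400, but rejected if its amplitude is 0.04 mV or if the previous T wave lies at sample 460.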
Testing databases. To test the proposed algorithm, physiological as well as pathological ECG records with P waves annotated by experts were needed. There are only three publicly available databases that contain correct manual annotations of P waves. All the databases can be found on Physionet 32 . In all databases, the first lead was used for algorithm testing. The first database is a part of the MIT-BIH Arrhythmia Database (MITDB) 32,43 with the P wave annotations published by our team for selected ECGs under the name MIT-BIH Arrhythmia Database P-Wave Annotations (MIT PDB) 24 . For this database, P wave annotations were also published by Elgendi et al. 44 . However, these annotations contain many mistakes and, thus, are not suitable for reliable testing of detection algorithms. The MITDB is a widely used database for the evaluation of QRS detectors and is the most cited ECG database 45 . It contains both physiological and pathological ECG records sampled at 360 Hz. For our study, we selected 12 physiological and pathological signals with P wave annotations available. In particular, the selected records no. 106, 119, 214, and 223 include PVCs and various types of ventricular arrhythmias: ventricular bigeminy (B), ventricular trigeminy (T), and idioventricular rhythm (IVR). Records no. 207 and 222 include nodal rhythm (NOD), and record no. 231 includes AVB II. Records no. 100, 101, 103, 117, and 122 do not include any significant pathology. Therefore, these records represent normal signals, which are used to verify the performance of the algorithm under physiological conditions. Altogether, the database contains 2281 P waves.
The second database is the QT database (QTDB) 32,46 . It consists of 105 15-min-long two-channel ECG records sampled at 250 Hz. In this study, the first channel was used. For all records and beats, the automatically found reference positions of QRS complexes are available. For some beats, the QTDB includes manual annotations of P wave peak, P wave onset, P wave offset, QRS complex onset, QRS complex offset, T wave peak and the T wave offset. All annotations are available for at least 30 beats per record in 79 out of the 105 recordings 46 . The performance of the proposed algorithms for P waves and T waves detection was tested against the manually annotated part of the QTDB (altogether 3622 beats), which mainly represents the physiological signals.
The third database is the Brno University of Technology ECG Signal Database with Annotations of P Wave (BUT PDB), recently published by our team 33 . It consists of 50 2-min-long, two-channel ECG records with 23 different types of pathologies and manually annotated P waves. The ECGs were selected from 3 existing databases of ECG signals: MITDB, MIT-BIH Supraventricular Arrhythmia Database (MITSVA) and Long Term AF Database (LTAF) 46 . The sampling frequency is 360 Hz for signals from MITDB and MITSVA and 128 Hz for signals from LTAF. Each record from BUT PDB contains an annotation of the dominant diagnosis (pathology) and the types of QRS complexes (taken over from the original databases). The available information about pathologies was manually checked. Since the original annotations were found to be correct, the labels were taken over from the original databases. The missing annotations (all signals from MITSVA) were supplemented by ECG experts. The BUT PDB consists of 7638 QRS complexes. For 2209 QRSs, no P wave is present (the case of atrial fibrillation, ventricular beats or nodal rhythm). Conversely, 141 P waves do not correspond to any QRS complex (mainly the case of 2nd or 3rd degree atrioventricular block and paced rhythm). Altogether, the BUT PDB includes 5429 P waves. The types of pathologies, their abbreviations and the number of signals in particular pathological groups are listed in Table 1. It should be noted that the BUT PDB contains all known pathologies that affect the presence and/or positions of P waves.

Results and discussion
The proposed detection algorithm was tested on physiological as well as pathological signals. Physiological signals are represented by the whole manually annotated part of QTDB 46 . Besides the results obtained with the proposed improved P wave detector, we also present the results of our previous algorithms. All the methods were tested on the same dataset. The first previously published algorithm is the basic P wave detector based on the phasor transform with no extra decision rules for pathological cases 26 . The second algorithm was specially designed for use under normal conditions (physiological cardiac rhythm) and during PVC or AVB II 31 . Here, we compare the results of all three detectors in order to objectively evaluate the impact of the procedures we proposed for improving the previous outputs. As mentioned above, these procedures are primarily focused on eliminating false positive P wave detections, which are common in many cardiac arrhythmias. Our results (see below) show that the number of false positives can be effectively reduced by involving the decision rules for accurate demarcation of the search area and information about the presence of arrhythmias, such as AFIB and PVC.
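For reference, the sketch below shows how sensitivity (Se) and positive predictivity (PP) are conventionally computed from the numbers of true positive (TP), false negative (FN) and false positive (FP) detections, and how per-record values can be averaged over a database. The function names are illustrative, and skipping records for which a measure is undefined (for example, a record with no annotated P waves) is an assumption of this sketch.

```python
import math

def se_pp(tp, fn, fp):
    """Se = TP / (TP + FN) and PP = TP / (TP + FP), both in percent."""
    se = 100.0 * tp / (tp + fn) if (tp + fn) > 0 else math.nan
    pp = 100.0 * tp / (tp + fp) if (tp + fp) > 0 else math.nan
    return se, pp

def averaged_se_pp(counts):
    """Average per-record Se and PP over (tp, fn, fp) tuples,
    ignoring records where a measure is undefined (assumption)."""
    pairs = [se_pp(*c) for c in counts]
    se_vals = [se for se, _ in pairs if not math.isnan(se)]
    pp_vals = [pp for _, pp in pairs if not math.isnan(pp)]
    mean = lambda v: sum(v) / len(v) if v else math.nan
    return mean(se_vals), mean(pp_vals)
```

Averaging per-record values gives every record the same weight regardless of its number of beats, so the result generally differs from Se and PP computed on counts pooled over the whole database.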
Detection of P waves in physiological conditions. First of all, we validated the efficiency of the proposed P wave detector under normal conditions by testing it on ECGs with no pathology (see above). The results obtained by the proposed method as well as by our two previously published detectors are summarized in Tables 2 and 3. For both test databases, the results of other teams are available. We included these data in the tables for comparison. In these studies, various methods were used to detect the P waves, including PP rhythm tracking 3 and phasor transform 23 .
Detection of P waves in pathological conditions. In Tables 4 and 5, the results of testing the detectors on signals with pathologies (see above) are shown. As can be seen in the tables, Se and PP were calculated for each signal separately and then averaged over the database. In the second columns of the tables, the pathologies prevailing in the particular record are noted. According to the detection results, the newly proposed method performs notably better than both previous approaches. In the case of MITDB, however, this advantage is not as prominent as for BUT PDB (compare the averaged Se and PP in Tables 4 and 5). This is because MITDB contains only a few types of pathologies, whereas BUT PDB includes highly variable data and, thus, reveals the limitations of the previous algorithm 31 on the one hand and highlights the benefits of the novel method on the other. In particular, the proposed improved algorithm achieved higher performance than the previous versions in most signals with all types of PVC (i.e. single PVC, bigeminy, trigeminy, ventricular pair, ventricular flutter, and fusion of normal and ventricular beat). Examples of P wave detection in ECGs with a single PVC (record no. 35), ventricular flutter (record no. 33) and ventricular trigeminy (record no. 14) are shown in Fig. 6. In all these signals, the proposed method was able to deal with the given pathology and to detect all P waves correctly. In contrast, the basic detector failed in all cases (see the false positive/negative detections in the figures).
Another significant improvement was observed when detecting P waves with the new detector in ECGs with AFIB, which is due to the special criteria added to the algorithm (see above). In contrast, the two previous versions are not equipped with mechanisms for AFIB identification and, thus, are not able to adjust the detection process to this pathological condition. As a result, many false positive detections can be seen in the output of these algorithms, as shown in Fig. 7. The ECG in the figure is entirely affected by AFIB, which manifests in absent P waves (as was correctly recognized by the proposed detector). In a few cases, however, the Se of the proposed detector was lower than that of the previous approaches (see Table 5). This can be explained by false positive detections of AFIB at the beginning or the end of the segments, due to the delay caused by computing SH from 59 consecutive beats. P waves were successfully detected in ECGs with right bundle branch block (RBBB) as well. This pathology is manifested in the ECG by changed QRS complexes (wide, of higher amplitude and aberrant morphology as compared to the normal, narrow QRS). Correct detection under this condition is possible due to the improved criteria for the search area. In particular, the area was shortened by shifting its right boundary to the left by 60 ms (see section Normal P wave detection). Use of this narrow search area instead of the previously proposed wide area 31 allows us to avoid situations where QRS complexes were detected instead of P waves (see Fig. 8).
In general, the novel algorithm reached more promising results than the previously published detectors in all pathological cases addressed in the study, including AVB II, nodal rhythm, all types of atrial and ventricular arrhythmias, bundle branch blocks, and pre-excitation.
Relatively poor results were obtained in ECGs with multiple concurrent pathologies (such as in records no. 9, 19, 33, 37 and 38 from BUT PDB). This is caused by false positive AFIB detections (and, consequently, false negative detections of P waves in the corresponding ECG segments) or by missed PVC detections due to a highly irregular rhythm originating from overlapping manifestations of multiple arrhythmias in the same segment. Our detector was not successful when tested on an ECG with AVB III (record no. 3 from BUT PDB), where P waves and QRS complexes appear in the ECG independently of each other.
There are only a few published studies reporting the performance of P wave detectors on the available databases. Therefore, the comparison of our results with the results of other teams is rather limited and can be provided only on the manually annotated signals from MITDB. On this database, Laguna et al. 30 achieved averaged (over all the signals with PVC, NOD and AFIB) Se = 71.13% and PP = 59.08% using the multilead detector. The PP rhythm tracking method proposed by Portet et al. 3 reached averaged Se = 61.89% and PP = 59.00% on the same ECGs.
Vitek et al. 47 applied wavelet transformation and decision rules and detected P waves with averaged Se = 90.79% and PP = 84.56%. Our detector, with averaged Se = 96.40% and PP = 91.56%, clearly outperforms the above approaches. Taking into account the results on all three databases, the proposed detector is a promising tool for the analysis of ECGs recorded in patients with many different arrhythmias.

Table 4. The performance of the P wave detection algorithms on pathological signals from the MITDB with annotations MIT PDB (Se-sensitivity; PP-positive predictivity).

 -the whole signal is AFIB, no P waves are present, A-atrial premature beat, AFIB-atrial fibrillation, AFL-atrial flutter, B-ventricular bigeminy, BI-atrioventricular block 1st degree, BII-atrioventricular block 2nd degree, BIII-atrioventricular block 3rd degree, E-ventricular escape beat, F-fusion of ventricular and normal beat, IVR-idioventricular rhythm, J-nodal beat, L-left bundle branch block beat, NA-sinus arrhythmia, NOD-nodal rhythm, P-paced rhythm, PREX-pre-excitation, R-right bundle branch block beat, SVTA-supraventricular tachyarrhythmia, T-

Limitations of the study
The main limitation of this study is that the proposed algorithm was not tested on ECGs with extensive noise and artefacts. In these situations, therefore, successful P wave detection cannot be guaranteed. The algorithm also seems to be inaccurate when detecting P waves in ECGs with junctional rhythm and AVB III. To provide a more comprehensive evaluation of detector performance, it should be tested on more ECGs. However, to the best of our knowledge, there are no other databases suitable for reliable testing of P wave detectors.

Conclusion
This work introduces a new advanced method for P wave detection in ECGs based on a combination of a simple phasor transform of the signal and an innovative set of decision rules. Including unique criteria in the algorithm significantly improved P wave detection during pathological events, which is still a challenging task. The criteria are based on deep knowledge of heart manifestations during both normal and pathological conditions, such as AFIB, PVC and RBBB. The main benefit of the criteria is the accurate definition of search areas based on information about the pathologies present in the current segment. As a result, the algorithm adjusts its parameters in order to eliminate false positive and false negative P wave detections. Under normal conditions, the algorithm achieves results similar to those of previously published methods, with Se = 98.56% and PP = 99.82% for ECGs from MIT PDB, and Se = 99.23% and PP = 99.12% for ECGs from QTDB. In ECGs with pathological manifestations, our algorithm clearly outperforms other approaches, as follows from the comprehensive testing on highly variable datasets from MIT PDB (Se = 96.40%, PP = 91.56%) and BUT PDB (Se = 93.07%, PP = 88.60%). It should be noted that the latter contains all the known pathologies affecting the presence and positions of P waves in the ECG.
By accurate automatic detection of P waves in ECGs, our method has the potential to improve the diagnostic yield of routine ECG examination and to simplify the daily work of cardiologists. The method may also improve the accuracy of cardiac pathology detection by wearable devices 48 . The proposed P wave detector represents a major step towards fully automated systems for ECG analysis and diagnosis of cardiac arrhythmias.