Reconfigurable Architecture for Multi-lead ECG Signal Compression with High-frequency Noise Reduction

An electrocardiogram (ECG) is a record of the heart’s electrical activity over a specified period, and it is the most popular noninvasive diagnostic test for identifying several cardiac diseases. It is an integral part of a typical eHealth system, where the ECG signals often need to be compressed for long-term data recording and remote transmission. A reconfigurable architecture, particularly one based on a Field Programmable Gate Array (FPGA), offers a high-speed parallel computation unit along with adaptable software features. Hence, this type of design is suitable for multi-channel signal processing units like ECGs, which usually require precise real-time computation. This paper presents a reconfigurable signal processing unit implemented on the ZedBoard, a development board for the Xilinx Zynq-7000 SoC. The compression algorithm is based on the Fast Fourier Transform. The implemented system can work in real time and achieve a maximum 90% compression rate without any significant signal distortion (i.e., less than 9% normalized percentage of root-mean-square deviation). This compression rate is 5% higher than the state-of-the-art hardware implementation. Additionally, this algorithm has an inherent capability of high-frequency noise reduction, which makes it unique in this field. The confirmatory analysis is done using six databases from the PhysioNet databank to compare and validate the effectiveness of the proposed system.

helps to reduce the required bandwidth for data transmission. It also saves space for storing the ECG data. This paper presents the design and implementation of an FPGA based reconfigurable system for ECG compression.
Various noise sources affect the ECG signal during the data acquisition 10 and transmission process 11 . High-frequency noise is one of the primary reasons for ECG signal distortion 12 , and it mostly consists of powerline interference (50-60 Hz) and electromyographic (EMG) noise (100-500 Hz) 13 . One of the added benefits of the proposed system is its built-in capability of high-frequency noise cancellation. Consequently, an additional high-frequency noise filtering mechanism in the eHealth system becomes redundant, and the overall system complexity is reduced.
The FPGA is essentially an integrated circuit (IC) which can be reconfigured for any assigned task, hence the name 'reconfigurable computing' 14 . It is widely used for ASIC (Application Specific Integrated Circuit) prototyping and hardware verification 15 . Because of their capability of parallel computation 16 , FPGAs are significantly faster in applications where simultaneous computation is needed for multiple processes 17 . In a typical electrophysiological monitoring unit, data acquisition is often performed by collecting signals from multiple channels. As mentioned earlier, a clinical ECG has a 12-lead standard. This means that processing the data requires 12 concurrent signal processing paths. Therefore, an FPGA is used as the computation unit in the proposed architecture.
The fundamental focus of an effective ECG compression is efficient transmission and data storage without losing critical diagnostic information. Many methods are found in the literature for the compression of the ECG signal. Based on the techniques, they can be classified as follows: (i) time domain compression, (ii) transform domain compression, (iii) compression by parameter extraction, and (iv) hybrid method. The original time-domain ECG signal features are scrutinized, and redundant data points are discarded in the time domain compression method 18 . As no significant data conversion or transformation is needed in this method, this scheme offers relatively fast processing 19 . The second compression method is the transform domain compression, and it operates by converting the original signal to frequency 20 or spatiotemporal 21 signal. In this method, the signal is processed and compressed after the transformation. Consequently, it takes longer processing time than the first method, though this method is capable of more efficient compression. Compression by parameter extraction is the third ECG compression method. This scheme requires a complex feature extraction from the given signal. Various learning methods are applied to determine the parameter characteristics, which are then used to compress the original signal and conserved thereafter for decompression 8,22 . Some contemporary feature extraction techniques include supervised dictionary learning 23 , object detection 24,25 , background information retrieval 26,27 , and deep learning 28 . Hybrid compression method combines the particulars from time-domain, transform-domain, and parameter extraction methods to create an efficient compression scheme 29,30 . The last two methods involve more computation than the first two methods. Hence, they require more processing time and resources.
Nonetheless, most of the above studies concentrate on the software level 31,32 . This paper focuses on the system-level implementation of a real-time ECG compression algorithm. Therefore, a fast signal processing method with a low hardware resource requirement was sought when selecting the ECG compression algorithm. The proposed system employs a transform domain compression method based on the Fast Fourier Transform (FFT). The system is implemented in hardware and demonstrates a significant improvement in compression efficiency without any vital signal deformation. Moreover, it has a built-in high-frequency noise cancellation ability, which distinguishes it from the other implemented systems. A comparative study of the proposed system with the other methods is presented in the "Result Analysis" section.

Methodology
The Fast Fourier Transform is a highly optimized form of the Discrete Fourier Transform (DFT), which takes a sequence of sampled data in the time domain and computes the frequency components of that data sequence. The DFT is defined by the following equation 33 :

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1 \tag{1}$$

Here, each single frequency component, $X(k)$, is calculated by considering every time domain sample, $x(n)$. Hence, $N^2$ additions and multiplications are needed to compute $N$ samples. To recover the time domain components from the frequency samples, the IDFT (Inverse DFT) is used, and it is defined 33 by:

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}, \quad n = 0, 1, \ldots, N-1 \tag{2}$$

Again, Eq. (2) requires $N^2$ additions and multiplications to process $N$ samples in the spatio-temporal domain. This large number of mathematical operations often creates a computational burden. Hence, the FFT and IFFT (Inverse FFT) algorithms were developed 34 . The FFT reduces the computational complexity of the DFT by factorizing its matrix into sparse arrays in which the majority of the elements are zero 35 . For $N$ points, the number of complex multiplications required by the DFT is $N^2$, whereas only $(N/2)\log_2 N$ complex multiplications are needed by the FFT algorithm 33 . As an example, for 128 points, the DFT and FFT involve 16,384 and 448 complex multiplications, respectively. Thus, the FFT improves the computational speed by a factor of 36.6 in this case 34 .
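The operation counts quoted above can be checked with a short calculation. The following Python sketch is only an illustration; the function names are hypothetical, and the FFT count assumes a radix-2 algorithm with a power-of-two length, which matches the $(N/2)\log_2 N$ figure in the text.

```python
def dft_complex_multiplications(n: int) -> int:
    """Complex multiplications of a direct N-point DFT: N^2."""
    return n * n


def fft_complex_multiplications(n: int) -> int:
    """Complex multiplications of a radix-2 FFT: (N/2) * log2(N)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    log2_n = n.bit_length() - 1  # exact log2 for a power of two
    return (n // 2) * log2_n


n = 128
dft_count = dft_complex_multiplications(n)
fft_count = fft_complex_multiplications(n)
speedup = dft_count / fft_count
print(n, dft_count, fft_count, round(speedup, 1))
```

For 128 points this reproduces the 16,384 and 448 multiplications and the speed-up factor of about 36.6 stated in the text.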
Using the FPGA as the signal processing hardware gives the unique capability to decompose the complete multichannel architecture into a series of structurally identical single channel core units.
Consequently, a composition of these elementary segments becomes the complete multichannel system, as shown in Figs. 1 and 2. The concept of parallel computation in the FPGA can also be understood from these figures.
Twelve signals from the ECG signal acquisition unit are fed into the FPGA-based signal processing core, and these signals are processed simultaneously to produce the desired real-time outputs.
The proposed compression algorithm works in three steps. At first, the original time domain ECG signal is subjected to FFT for transforming it into the frequency domain. Next, the first level compression is done utilizing the 'symmetric property' of this frequency response. Finally, additional compression is performed by discarding the high-frequency noise components. Figure 3 demonstrates a simple block diagram to clarify the compression steps.
A bit-accurate MATLAB program is developed, and a standard ECG signal from the MIT-BIH Normal Sinus Rhythm Database (collected from the PhysioBank 36 record: 16265, sampling frequency 128 Hz) is used to demonstrate the proposed method. As mentioned earlier, for simplicity, this primary model is designed for single-channel ECG compression and decompression. The complete multichannel scheme is implemented in hardware and described in the section 'FPGA Implementation'. Figure 4 shows an epoch of a single-channel signal from the aforementioned ECG, displaying a time window of 1 second.
As mentioned earlier, the sampling frequency is 128 Hz. Therefore, this epoch of 1 second corresponds to 128 data samples in the time domain. As the first step of compression, FFT is used to transform this time domain signal to the frequency domain. Here, 128-point FFT is applied to the signal of Fig. 4, and the outcome is 128 complex samples in the frequency domain. This frequency response in absolute value is shown in Fig. 5.
It is mathematically established that applying the FFT to a real signal results in a conjugate-symmetric complex signal (with real and imaginary parts) 37 . Since the input ECG is a real signal, the FFT output is complex and, as expected, symmetric in nature. The data of frequency domain sample numbers 65 through 128 are therefore redundant to store in memory because they can be regenerated by simply mirroring the rest of the data. Hence, the first level of compression is performed by discarding half of the FFT data.
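The symmetry property can be verified numerically. The sketch below is an illustration only: it uses a direct DFT written with the Python standard library and a synthetic real-valued test signal (an assumption, not an ECG record). It uses 0-based bin indices, so bin $k$ corresponds to sample number $k+1$ in the text. The mirrored half holds the complex conjugates, so the real parts repeat and the imaginary parts change sign.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT: X(k) = sum_n x(n) * exp(-j*2*pi*k*n/N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


# Synthetic real-valued signal (assumed for illustration).
N = 128
signal = [
    math.sin(2 * math.pi * 3 * n / N) + 0.5 * math.cos(2 * math.pi * 10 * n / N)
    for n in range(N)
]
spectrum = dft(signal)

# Largest deviation from X(N-k) == conj(X(k)) over k = 1 .. N-1.
max_error = max(
    abs(spectrum[N - k] - spectrum[k].conjugate()) for k in range(1, N)
)
print(max_error)
```

The printed deviation should be at the level of floating-point rounding, which is why the upper half of the spectrum does not need to be stored.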
At this point, it should be noted that the ECG frequency components are dominant in the lower frequency range (<30 Hz) 38 . Here, the sampling frequency is 128 Hz. The Nyquist theorem dictates that a sampled signal can represent frequencies only up to half of its sampling frequency 33 . Hence, the input signal can have frequency components up to 64 Hz. Because a 128-point FFT is applied to a 1-second epoch, the frequency resolution is 1 Hz, so the 0-64 Hz band is conveniently represented by the respective sample numbers. Therefore, for the second level of compression, the data of sample numbers 33 through 64 are dropped, as they are essentially the high-frequency (33-64 Hz) noise components. Thus, the essential lower frequency components (0-32 Hz) are retained while the high-frequency (HF) noise (>32 Hz) is discarded in the compressed signal. These two levels of compression are shown in Figs. 6 and 7, respectively.

To explain the compression ratio, consider the above example. Here, the input time domain signal has 128 real numbers. First, consider these numbers as stored in an input bin A. The FFT output is 128 complex numbers. Putting the real and imaginary parts in separate bins results in two bins, B and C, with 128 numbers in each of them, totaling 256 numbers in memory. This means that the resultant memory size is, in fact, double that of the input. However, after the first level of compression, the size of bins B and C is halved, as these bins now contain 64 numbers each. At this point, the total output memory contains 128 numbers, which is the same as the input bin. Finally, the second level of compression is executed by dropping the high-frequency noise components. This process further reduces the size of bins B and C, which now hold 32 numbers each. Therefore, the final output bins contain 64 numbers in total.
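The two compression levels and the bin bookkeeping described above can be sketched as follows. This is a minimal illustration rather than the hardware implementation: it uses a direct DFT in place of an FFT core, a synthetic input epoch, and hypothetical names (`compress`, `kept_bins`). Keeping the first 32 coefficients performs both levels at once, since the mirrored half and the high-frequency bins are simply not stored.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT, used here in place of an FFT core."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def compress(samples, kept_bins):
    """Keep the first `kept_bins` coefficients, split into real and imaginary bins."""
    spectrum = dft(samples)           # bin A (real input) -> complex spectrum
    kept = spectrum[:kept_bins]       # drops the mirrored half and the HF bins
    bin_b = [c.real for c in kept]    # real parts
    bin_c = [c.imag for c in kept]    # imaginary parts
    return bin_b, bin_c


# Synthetic 1-second epoch at 128 Hz (assumed for illustration).
N = 128
epoch = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]

bin_b, bin_c = compress(epoch, kept_bins=32)
stored_numbers = len(bin_b) + len(bin_c)
print(len(epoch), stored_numbers)
```

With 128 input numbers and 32 kept coefficients, bins B and C hold 32 numbers each, giving the 64 stored numbers of the example.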
The compression ratio (CR) is defined by 39

$$\mathrm{CR}\,(\%) = \frac{\text{size of the input stream} - \text{size of the output stream}}{\text{size of the input stream}} \times 100 \tag{3}$$

For this particular example, the sizes of the input and output streams are 128 and 64, respectively. In this case, the compression ratio is 50%. However, the CR can be easily adjusted by changing how many high-frequency components are discarded.
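As a worked instance of the compression ratio definition, the following Python sketch evaluates the example above. The function name is hypothetical, and stream sizes are counted in stored numbers, as in the bin description.

```python
def compression_ratio(input_size: int, output_size: int) -> float:
    """CR (%) = (input size - output size) / input size * 100."""
    if input_size <= 0:
        raise ValueError("input_size must be positive")
    return (input_size - output_size) / input_size * 100.0


cr_example = compression_ratio(128, 64)   # the 128-sample example epoch
cr_tighter = compression_ratio(128, 32)   # keeping 16 coefficients instead of 32
print(cr_example, cr_tighter)
```

Halving the number of kept coefficients from 32 to 16 would raise the CR from 50% to 75%, which illustrates how the ratio follows from the amount of high-frequency content discarded.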
The decompression algorithm is essentially the reverse of the compression method. Figure 8 demonstrates a simplified block diagram of the decompression process. Taking the compressed ECG signal from the preceding example, the main objective of decompression is to recover the data of frequency sample numbers 33 through 128. As mentioned earlier, the discarded data of sample numbers 33 through 64 are actually high-frequency noise components. Therefore, as the first step of decompression, the data corresponding to sample numbers 33 through 64 are substituted by zeros to ensure the HF noise reduction. The second step is mirroring the frequency components of sample numbers 1 through 64 to regenerate the data of sample numbers 65 through 128. Finally, the frequency domain decompressed signal is transformed into the time domain using the IFFT. Figures 9 and 10 show the decompression process and the restored time-domain ECG signal.
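The three decompression steps can be sketched in Python as below. This is an illustrative model under stated assumptions, not the hardware design: direct DFT and inverse DFT routines stand in for the FFT and IFFT cores, the test signal is synthetic and band-limited below 32 Hz, and the names (`decompress`, `real_bin`, `imag_bin`) are hypothetical. Mirroring is done with complex conjugation, as required for a real-valued output.

```python
import cmath
import math


def dft(x):
    """Direct DFT, standing in for the FFT core."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def idft(spectrum):
    """Direct inverse DFT, standing in for the IFFT core."""
    n_points = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_points)
            for k in range(n_points)) / n_points
        for n in range(n_points)
    ]


def decompress(real_bin, imag_bin, n_points):
    """Zero-fill the dropped bins, mirror the kept bins, then apply the inverse DFT."""
    kept = len(real_bin)
    if 2 * kept > n_points:
        raise ValueError("kept bins must not exceed half of n_points")
    spectrum = [0j] * n_points                    # step 1: dropped HF bins stay zero
    for k in range(kept):
        spectrum[k] = complex(real_bin[k], imag_bin[k])
    for k in range(1, kept):                      # step 2: conjugate mirroring
        spectrum[n_points - k] = spectrum[k].conjugate()
    return [value.real for value in idft(spectrum)]   # step 3: back to time domain


# Round trip on a synthetic signal with content only below 32 Hz (assumed).
N = 128
signal = [
    math.sin(2 * math.pi * 3 * n / N) + 0.5 * math.cos(2 * math.pi * 10 * n / N)
    for n in range(N)
]
kept_spectrum = dft(signal)[:32]
restored = decompress(
    [c.real for c in kept_spectrum], [c.imag for c in kept_spectrum], N
)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(max_error)
```

Because the test signal has no content above the kept band, the round trip should be exact up to rounding; a signal with high-frequency content would instead come back low-pass filtered.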

FPGA Implementation
ZedBoard is a cost-effective development board manufactured by Digilent. This board employs Xilinx Artix-7 FPGAs coupled with a dual-core ARM Cortex-A9 processor, which is specially optimized for digital signal processing (DSP) applications 40 . ZedBoard is used as the processing unit for implementing the proposed scheme at the hardware level.
System Generator is one of the design tools for the implementation of DSP algorithms in Xilinx devices, and it is used to program the ZedBoard in this research. System Generator works as a toolbox for Simulink, a graphical programming environment of MATLAB. A simple diagram of the practical hardware design is shown in Fig. 11, and a screenshot of the practical System Generator design is shown in Fig. 12.
It should be noted that this core processing unit consists of both a compression unit (blue shaded portion of Figs. 11 and 12) and a decompression unit (green shaded portion of Figs. 11 and 12). If necessary, the practical arrangement in the eHealth system can be designed either as a compression unit (for a transmission system) or as a decompression unit (for a reception system).
The compression unit is composed of three subunits: one FFT module and two compression subunits. The input ECG signal is fed into the FFT unit, which returns two FFT outputs as real and imaginary numbers. These real and imaginary parts are compressed in the following units. The outputs of these compression units are the desired compressed signal, which can be stored or transmitted for further use. In the design, these compressed signals are used as the input of the decompression unit. Here, the decompression unit also comprises three subunits. Two of them decompress the compressed frequency-domain ECG signal. The final subunit is the IFFT module, which converts the decompressed frequency-domain ECG signal to the desired time-domain ECG as the output. Figure 13 shows signals of different processing stages in the Xilinx Waveform Viewer.

Table 1 shows the hardware resources required for the implementation of a single-channel ECG processing unit (i.e., the core unit) on the ZedBoard. This ECG processing unit comprises both compression and decompression subsystems. Comparing the number of used resources with the available resources on the ZedBoard, it is possible to identify the maximum number of core units that can plausibly be implemented on a single ZedBoard. Here, the core unit utilizes 34% of the available DSPs, which is the largest percentage of available resource consumption. Hence, two ECG processing units (including both compression and decompression subsystems) can be implemented on one ZedBoard. Consequently, six ZedBoards will be required for the implementation of a twelve-channel system with compression-decompression capability. However, for an eHealth transmission or recording system, the compression capability alone is adequate. In this case, a twelve-channel ECG compression unit can be built using only one ZedBoard.
To build an eHealth reception or restoring system, only the decompression ability is required. Therefore, four ZedBoards will be needed to design a twelve-channel ECG reconstruction unit.
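The board count for the combined compression-decompression unit follows from simple arithmetic on the 34% DSP utilization figure. The sketch below is an illustration under two assumptions that are not stated in the text: the DSP slices are the only limiting resource, and utilization scales linearly with the number of core units. The function names are hypothetical.

```python
import math


def units_per_board(bottleneck_utilization_percent: float) -> int:
    """Whole core units that fit if one unit uses the given share of the bottleneck resource."""
    return math.floor(100 / bottleneck_utilization_percent)


def boards_needed(channels: int, units_on_one_board: int) -> int:
    """Boards required to host one core unit per channel."""
    return math.ceil(channels / units_on_one_board)


core_units = units_per_board(34)       # 34% of the DSPs per combined core unit
boards = boards_needed(12, core_units) # twelve-lead system
print(core_units, boards)
```

Under these assumptions, two combined core units fit on one board, and a twelve-channel system needs six boards, matching the figures above. The compression-only and decompression-only counts depend on the per-subsystem utilization in Table 1 and are not recomputed here.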
In this research, the prototype uses the ZedBoard for hardware implementation. This development board is a low-cost option with a relatively low-end FPGA as the processing unit. Though employing this board keeps the development budget low, it is also one of the limitations of the implemented system. High-end boards provide advanced FPGAs with more hardware resources. Therefore, a processing unit built on such a board can accommodate more core units than the proposed prototype. Furthermore, more complex algorithms can be adopted if adequate hardware resources become available. This constitutes a scope for future development.

Result Analysis
As mentioned earlier, the proposed system compresses the ECG signal in the frequency domain. This compression is done in two vital steps: discarding the symmetric data and removing the high-frequency components. The reconstruction of the discarded symmetric data does not involve any data loss, as it can be done by simple data mirroring. However, as the high-frequency components are considered noise and are permanently discarded during the compression process, the restored ECG signal does not inherit the high-frequency components. Though this process acts as a low-pass noise filter, the removal of high-frequency components makes the decompressed ECG somewhat different from the original signal.
For a preliminary graphical comparison, the raw input and the recovered output ECG are superimposed in Fig. 14. Visual inspection shows that these two ECG traces are practically indistinguishable. As ECG compression should aim to maximize compression efficiency without deteriorating the signal quality, both the compression efficiency and the signal quality are evaluated while analyzing the performance of the proposed system.
Compression efficiency indicates how much the processed data is compressed compared to the raw data. The most common parameter for evaluating the compression efficiency is the "Compression Ratio (CR)" 39 . CR is defined by Eq. (3) in the "Methodology" section, which can be rewritten in the form of Eq. (4) for further discussion.

$$\mathrm{CR}\,(\%) = \left(1 - \frac{\text{size of the output stream}}{\text{size of the input stream}}\right) \times 100 \tag{4}$$

It should be noted that in some literature, CR is also referred to as the Data-volume Saving (DS), as it essentially indicates the percentage of data being saved by the compression 41 .
The quality of the reconstructed ECG signal must be satisfactory to avoid misdiagnosis. There are several parameters to assess the quality of the decompressed ECG. Among them, the "Percentage Root mean square Difference (PRD)" is the most common and the "Normalized version of Percentage Root mean square Difference (PRDN)" is the most accurate 39 . PRD and PRDN are defined by Eqs. (5) and (6), respectively:

$$\mathrm{PRD}\,(\%) = 100 \times \sqrt{\frac{\sum_{n=0}^{N-1} \left[x(n) - \tilde{x}(n)\right]^2}{\sum_{n=0}^{N-1} x(n)^2}} \tag{5}$$

$$\mathrm{PRDN}\,(\%) = 100 \times \sqrt{\frac{\sum_{n=0}^{N-1} \left[x(n) - \tilde{x}(n)\right]^2}{\sum_{n=0}^{N-1} \left[x(n) - \bar{x}\right]^2}} \tag{6}$$

Here, $x(n)$ is the original signal, $\tilde{x}(n)$ is the reconstructed signal, and $\bar{x}$ is the mean of the original signal. According to 39,42 and 43 , if the value of PRDN is less than 9%, then the quality of the reconstructed signal is considered "very good".
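The two quality measures can be computed with a few lines of Python. The sketch below assumes the commonly used definitions: PRD divides the squared reconstruction error by the energy of the original signal, while PRDN divides it by the energy of the original signal after subtracting its mean. The function names and the four-sample data are purely illustrative.

```python
import math


def prd(original, reconstructed):
    """Percentage root-mean-square difference (assumed standard definition)."""
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(error_energy / signal_energy)


def prdn(original, reconstructed):
    """Normalized PRD: the mean of the original is removed in the denominator."""
    mean = sum(original) / len(original)
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    centered_energy = sum((x - mean) ** 2 for x in original)
    return 100.0 * math.sqrt(error_energy / centered_energy)


# Toy data with a non-zero baseline (illustrative only, not ECG samples).
original = [1.0, 2.0, 3.0, 4.0]
reconstructed = [1.0, 2.0, 3.0, 5.0]

prd_value = prd(original, reconstructed)
prdn_value = prdn(original, reconstructed)
print(round(prd_value, 2), round(prdn_value, 2))
```

For a signal with a non-zero baseline, PRDN comes out larger than PRD for the same error, because a DC offset inflates the PRD denominator. This is one reason a mean-independent measure is preferred when comparing records with different baselines.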
The acceptable ECG sampling frequency range for legitimate clinical use is 100-1000 Hz 44 . Therefore, to cover this frequency range, six databases are used to assess the performance of the proposed system. These datasets are downloaded from the PhysioBank 36 and presented in Table 2 with relevant references.
The compression ratio of the processed ECG can be controlled by simply adjusting how many high-frequency components are discarded. However, dropping too much data will, of course, distort the resultant signal relative to the original one. Figure 15 shows the relationship between the Compression Ratio and the Normalized Percentage Root mean square Difference for various sampling frequencies.
As expected, a higher CR gives a higher PRDN. This effect is more severe for lower sampling frequencies and much less pronounced for higher ones. This phenomenon can be explained by the Nyquist theorem, as mentioned in the "Methodology" section. ECG signals with a higher sampling frequency inherently contain a larger share of high-frequency noise components. Hence, it is possible to discard much of this noisy data during the compression stage without degrading the signal quality. This property is well suited to practical applications: for similar signals, a lower sampling frequency already consumes less data, whereas signals with a higher sampling frequency occupy more data, so it makes sense to compress them with greater compression efficiency. Figure 15 also indicates the 9% PRDN line, which is the boundary condition for being a "very good" compressed signal. Considering this 9% PRDN as the upper limit of signal distortion, the maximum allowable compression ratio can be found for different sampling frequencies, as shown in Fig. 16.
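The search for the maximum allowable compression ratio under the 9% PRDN limit can be sketched as a sweep over the number of kept frequency bins. The Python example below is a simplified model under stated assumptions: it uses a direct DFT, a synthetic signal made of two low-frequency tones and a weak 50 Hz component standing in for noise, and hypothetical function names. It does not reproduce the results of Figs. 15 and 16.

```python
import cmath
import math


def dft(x):
    """Direct DFT, standing in for the FFT core."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def reconstruct(spectrum, kept_bins):
    """Keep the first `kept_bins` bins, mirror them, and return the real inverse DFT."""
    n_points = len(spectrum)
    trimmed = [0j] * n_points
    for k in range(kept_bins):
        trimmed[k] = spectrum[k]
    for k in range(1, kept_bins):
        trimmed[n_points - k] = spectrum[k].conjugate()
    return [
        (sum(trimmed[k] * cmath.exp(2j * math.pi * k * n / n_points)
             for k in range(n_points)) / n_points).real
        for n in range(n_points)
    ]


def prdn(original, restored):
    """Normalized PRD in percent (assumed standard definition)."""
    mean = sum(original) / len(original)
    error_energy = sum((x - y) ** 2 for x, y in zip(original, restored))
    centered_energy = sum((x - mean) ** 2 for x in original)
    return 100.0 * math.sqrt(error_energy / centered_energy)


def max_cr_under_limit(samples, prdn_limit=9.0):
    """Return (kept_bins, CR %, PRDN %) for the fewest kept bins meeting the limit."""
    n_points = len(samples)
    spectrum = dft(samples)
    for kept_bins in range(1, n_points // 2 + 1):
        error = prdn(samples, reconstruct(spectrum, kept_bins))
        if error < prdn_limit:
            # Each kept bin stores one real and one imaginary number.
            cr = (n_points - 2 * kept_bins) / n_points * 100.0
            return kept_bins, cr, error
    return None


# Synthetic 1-second epoch at 128 Hz: tones at 2 Hz and 7 Hz plus weak 50 Hz content.
N = 128
samples = [
    math.sin(2 * math.pi * 2 * n / N)
    + 0.5 * math.sin(2 * math.pi * 7 * n / N)
    + 0.05 * math.sin(2 * math.pi * 50 * n / N)
    for n in range(N)
]
best_bins, best_cr, best_prdn = max_cr_under_limit(samples)
print(best_bins, best_cr, round(best_prdn, 2))
```

For this synthetic signal, the sweep stops as soon as the 7 Hz tone falls inside the kept band, because the discarded 50 Hz component alone contributes less than 9% PRDN. Real ECG records spread their energy over more bins, so the achievable ratio has to be measured per database, as done in the paper.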
At this point, it should be noted that, though discarding high-frequency components makes the compressed signal deviate from the original one, it acts as a high-frequency noise reduction process. This is an added benefit of the proposed compression system. Using FPGAs for hardware implementation plays another vital role in this research. As reconfigurable devices, the same FPGA units can be easily configured for different compression ratios and different sampling frequencies when required. This makes the eHealth system more flexible for global implementation.
To conclude the result analysis, a comparative study between the proposed system and the previous works is presented in Table 3. In spite of being the most critical quality measurement parameter, PRDN is not specified in some of the papers. However, PRD is measured in all cases. Therefore, PRD is also included as a quality measurement parameter for this study. For this table, the performance of the proposed system is measured at a 720 Hz sampling frequency using the ANSI/AAMI EC13 Test database.
A few of the previous works (and 20,29,30,45 ) can compress the ECG at a higher rate than the proposed one. However, their signal quality is lower than that of the proposed system, and these works are not implemented at the hardware level. Among the hardware-implemented works, the proposed method has the highest compression ratio (at least 5% improvement) while maintaining the accepted PRDN limit. It is the only implemented system which can reduce noise in the ECG while compressing it. The overall signal quality is actually improved during the compression process, as the high-frequency noise is reduced. This makes the new system unique. The higher data volume savings with better signal quality ensure economical telemedicine and electronic health record systems. Therefore, the proposed system is more suitable for upper-level applications like eHealth systems.

Table 2. Used databases for the performance evaluation.

Conclusion
This paper presents a novel reconfigurable system architecture for electrocardiographic signal compression. To the best of our knowledge, this is the first hardware-implemented ECG compression system which can reduce the embedded high-frequency noise from the original signal while compressing it. This feature enhances the overall ECG signal quality for further diagnosis. Because of its high compression performance and superior signal quality, this system is well suited for incorporation into a contemporary eHealth care system. Furthermore, similar methods and system designs can be utilized for compressing other signals, especially signals with distinct frequency dependence. The FPGA design can be used for prototyping a dedicated application specific integrated circuit in the future. This will make the system suitable for wearable wireless ECG monitoring systems.

Data availability
The datasets analyzed during the current study are available in the PhysioBank repository, www.physionet.org/physiobank/.