Abstract
An effective method for compression of ECG signals, which falls within the transform lossy compression category, is proposed. The transformation is realized by a fast wavelet transform. The effectiveness of the approach, in relation to the simplicity and speed of its implementation, is a consequence of the efficient storage of the outputs of the algorithm, which is realized in compressed Hierarchical Data Format. The compression performance is tested on the MIT-BIH Arrhythmia database, producing compression results which largely improve upon recently reported benchmarks on the same database. For a distortion corresponding to a percentage root-mean-square difference (PRD) of 0.53, in mean value, the achieved average compression ratio is 23.17 with a quality score of 43.93. For a mean value of PRD up to 1.71 the compression ratio increases up to 62.5. The compression of a 30 min record is realized in an average time of 0.14 s. The insignificant delay for the compression process, together with the high compression ratio achieved at low level distortion and the negligible time for the signal recovery, uphold the suitability of the technique for supporting distant clinical health care.
Introduction
The electrocardiogram, frequently called ECG, is a routine diagnostic test to assess the electrical and muscular functions of the heart. A trained person looking at an ECG record can for instance interpret the rate and rhythm of heartbeats; estimate the size of the heart, the health of its muscles and its electrical systems; check for effects or side effects of medications on the heart, or check heart abnormalities caused by other health conditions. At the present time, ambulatory ECG monitoring serves to detect and characterize abnormal cardiac functions during long hours of ordinary daily activities. Thereby the validated diagnostic role of ECG recording has been extended beyond the bedside^{1,2,3}.
The broad use of ECG records, in particular as a means of supporting clinical health care from a distance, enhances the significance of dedicated techniques for compressing this type of data. Compression of ECG signals may be realized without any loss in the signal reconstruction, which is referred to as lossless compression, or allowing some distortion which does not change the clinical information of the data. The latter is called lossy compression. This procedure can enclose an ECG signal within a file significantly smaller than that containing the uncompressed record.
The literature concerning both lossless^{4,5,6,7,8} and lossy compression^{9,10,11,12,13,14,15} of ECG records is vast. It includes emerging methodologies based on compressed sensing^{16,17,18,19}. This work focusses on lossy compression with good performance at low distortion recovery. Even if the approach falls within the standard transform compression category, it achieves stunning results. Fresh benchmarks on the MIT-BIH Arrhythmia database are produced for values of PRD as in recent publications^{11,12,14,15}.
The transformation step applies a Discrete Wavelet Transform (DWT). We recommend the fast Cohen-Daubechies-Feauveau 9/7 (CDF 9/7) DWT^{20}, but other wavelet transforms could also be applied. Techniques for ECG signal compression using a wavelet transform have been reported in numerous publications. For a review paper with extensive references see^{21}. The main difference introduced by our proposal lies in the compression method, in particular in what we refer to as the Organization and Storage stage. One of the findings of this work is that remarkable compression results are achievable even without the typical entropy coding step for saving the outputs of the algorithm. High compression is attained in a straightforward manner by saving in the Hierarchical Data Format (HDF)^{22}, more precisely in the compressed HDF5 version, which is supported by a number of commercial and non-commercial software platforms including MATLAB, Octave, Mathematica, and Python. HDF5 also implements a high-level Application Programming Interface (API) with C, C++, Fortran 90, and Java interfaces. As will be illustrated here, if implemented in software, adding an entropy coding process to the algorithm may improve compression further, but at the expense of processing time. Either way, the compression results for distortion corresponding to mean PRD in the range [0.48, 1.71] are shown to significantly improve recently reported benchmarks^{11,12,14,15} on the MIT-BIH Arrhythmia database. For PRD < 0.4 the technique becomes less effective.
Method
Before describing the method, let us introduce the notational convention. \({\mathbb{R}}\) is the set of real numbers. Boldface lower-case letters are used to represent one-dimensional arrays and standard mathematical fonts to indicate their components, e.g. \({\bf{c}}\in {{\mathbb{R}}}^{N}\) is an array of N real components c(i), i = 1, …, N, or equivalently c = (c(1), …, c(N)). Within the algorithms, operations on components will be indicated with a dot, e.g. c.^{2} = (c(1)^{2}, …, c(N)^{2}) and |c.| = (|c(1)|, …, |c(N)|). Moreover, t = cumsum(|c.|^{2}) is a vector of components \(t(n)={\sum }_{i=1}^{n}\,{|c(i)|}^{2},\,n=1,\,\ldots ,\,N\).
The proposed compression algorithm consists of three distinctive steps.

(1) Approximation Step. Applies a DWT to the signal, keeping the largest coefficients to produce an approximation of the signal up to the target quality.

(2) Quantization Step. Uses a scalar quantizer to convert the selected wavelet coefficients into integers.

(3) Organization and Storage Step. Organizes the outputs of steps (1) and (2) for economizing storage space.
At the Approximation Step a DWT is applied to convert the signal \({\bf{f}}\in {{\mathbb{R}}}^{N}\) into the vector \({\bf{w}}\in {{\mathbb{R}}}^{N}\) whose components are the wavelet coefficients (w(1), …, w(N)). For deciding on the number of nonzero coefficients to be involved in the approximation we consider two possibilities:

(a) The wavelet coefficients (w(1), …, w(N)) are sorted in ascending order of their absolute value (w(γ_{1}), …, w(γ_{N})), with |w(γ_{1})| ≤ ⋯ ≤ |w(γ_{N})|. The cumulative sums \(t(n)={\sum }_{i=1}^{n}\,{|w({\gamma }_{i})|}^{2},\,n=1,\,\ldots ,\,N\) are calculated to find all the values n such that t(n) ≥ tol^{2}. Let k + 1 be the smallest of these values. Then the indices γ_{i}, i = k + 1, …, N give the coefficients w(γ_{i}), i = k + 1, …, N of largest absolute value. Algorithm 1 summarizes the procedure.

(b) After the quantization step the nonzero coefficients and their corresponding indices are gathered together.
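The selection rule of option (a) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the MATLAB code of Algorithm 1: the function name `select_largest` is hypothetical, indices are 0-based, and the input is any list of wavelet coefficients.

```python
from itertools import accumulate


def select_largest(w, tol):
    """Return the (0-based) indices of the coefficients kept by option (a).

    The discarded coefficients are the smallest ones in absolute value whose
    cumulative squared magnitude stays below tol**2.
    """
    # gamma holds the indices sorted by ascending absolute value of w
    gamma = sorted(range(len(w)), key=lambda i: abs(w[i]))
    # t[n-1] is the sum of the n smallest squared magnitudes
    t = list(accumulate(w[i] ** 2 for i in gamma))
    # k counts the leading entries of t that are still below tol^2,
    # so position k (0-based) is the first one with t >= tol^2
    k = sum(1 for value in t if value < tol ** 2)
    return gamma[k:]


w = [3.0, -0.1, 0.2, -4.0, 0.05]
print(sorted(select_largest(w, 0.3)))
```

The sort dominates the cost of this step, which is the reason option (a) has average complexity O(N log N) rather than O(N).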
At the Quantization Step the selected wavelet coefficients c = (c(1), …, c(K)), with K = N − k and c(i − k) = w(γ_{i}), i = k + 1, …, N, are transformed into integers by a mid-tread uniform quantizer as follows:
$${c}^{{\rm{\Delta }}}(i)=\lfloor \frac{|c(i)|}{{\rm{\Delta }}}+\frac{1}{2}\rfloor ,\,i=1,\,\ldots ,\,K,$$(1)
where \(\lfloor x\rfloor \) indicates the largest integer number smaller than or equal to x and Δ is the quantization parameter. After quantization, the coefficients and indices are further reduced by the elimination of those coefficients which are mapped to zero by the quantizer. The above mentioned option (b) follows from this process. It comes into effect by skipping Algorithm 1. The signs of the coefficients are encoded separately using a binary alphabet (1 for + and 0 for −) in an array (s(1), …, s(K)).
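A minimal Python sketch of this step is given below. It assumes that the quantizer acts on the magnitude of each coefficient (which is consistent with the signs being stored separately) and uses the hypothetical name `quantize`; positions are 0-based.

```python
import math


def quantize(c, delta):
    """Mid-tread uniform quantization of the magnitudes of c.

    Returns the nonzero quantized magnitudes, their signs (1 for +, 0 for -)
    and the positions in c of the coefficients that survive quantization.
    """
    q = [math.floor(abs(x) / delta + 0.5) for x in c]
    s = [1 if x >= 0 else 0 for x in c]
    # Coefficients mapped to zero by the quantizer are eliminated
    keep = [i for i, value in enumerate(q) if value != 0]
    return [q[i] for i in keep], [s[i] for i in keep], keep


magnitudes, signs, positions = quantize([10.2, -3.9, 0.4, -0.6], 1.0)
print(magnitudes, signs, positions)
```

Applying `quantize` directly to the full array of wavelet coefficients, without any previous selection, corresponds to option (b).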
Since the indices \({\ell }_{i},\,i=1,\,\ldots ,\,K\) of the retained coefficients are large numbers, in order to store them in an effective manner at the Organization and Storage Step we proceed as follows. These indices are sorted in ascending order \({\ell }_{i}\to {\tilde{\ell }}_{i},\,i=1,\,\ldots ,\,K\), which guarantees that \({\tilde{\ell }}_{i} < {\tilde{\ell }}_{i+1},\,i=1,\,\ldots ,\,K-1\). This induces a reordering of the coefficients, \({{\bf{c}}}^{{\rm{\Delta }}}\to {\tilde{{\bf{c}}}}^{{\rm{\Delta }}}\), and of the corresponding signs, \({\bf{s}}\to \tilde{{\bf{s}}}\). The reordered indices are stored as smaller positive numbers by taking differences between two consecutive values. Defining \(\delta (i)={\tilde{\ell }}_{i}-{\tilde{\ell }}_{i-1},\,i=2,\,\ldots ,\,K\), the array \(\tilde{{\boldsymbol{\delta }}}=({\tilde{\ell }}_{1},\,\delta (2),\,\ldots ,\,\delta (K))\) stores the indices \({\tilde{\ell }}_{1},\,\ldots ,\,{\tilde{\ell }}_{K}\) with unique recovery. The size of the signal, N, the quantization parameter Δ, and the arrays \({\tilde{{\bf{c}}}}^{{\rm{\Delta }}}\), \(\tilde{{\bf{s}}}\), and \(\tilde{{\boldsymbol{\delta }}}\) are saved in HDF5 format. The HDF5 library operates using a chunked storage mechanism. The data array is split into equally sized chunks, each of which is stored separately in the file. Compression is applied to each individual chunk using gzip. The gzip method is based on the DEFLATE algorithm, which is a combination of LZ77^{23} and Huffman coding^{24}. Within MATLAB all this is implemented simply by using the function save to store the data.
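The effect of storing differences of sorted indices can be illustrated with the following Python sketch. It is not an HDF5 implementation: as an assumption made only for this illustration, the standard `zlib` module (which also implements DEFLATE) stands in for the gzip filter applied by HDF5, and the names `organize` and `deflate_size` are hypothetical.

```python
import struct
import zlib


def organize(indices, coeffs, signs):
    """Sort the indices and reorder coefficients and signs accordingly.

    Returns the array (first index, consecutive differences...) together
    with the reordered coefficients and signs.
    """
    if not indices:
        return [], [], []
    order = sorted(range(len(indices)), key=lambda j: indices[j])
    idx = [indices[j] for j in order]
    delta = [idx[0]] + [b - a for a, b in zip(idx, idx[1:])]
    return delta, [coeffs[j] for j in order], [signs[j] for j in order]


def deflate_size(values):
    """Bytes needed after packing as 32-bit unsigned integers and DEFLATE."""
    raw = struct.pack(f"<{len(values)}I", *values)
    return len(zlib.compress(raw, 9))


# 1000 consecutive indices: the difference array is mostly a stream of ones
indices = list(range(1000, 2000))
delta, _, _ = organize(indices, [1] * 1000, [1] * 1000)
print(deflate_size(indices), deflate_size(delta))
```

When the retained coefficients are concentrated in neighbouring positions, the difference array contains long runs of identical small numbers, which a dictionary-based compressor such as DEFLATE encodes very compactly.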
Algorithm 2 outlines a pseudo code of the above described compression procedure.
The fast wavelet transform has computational complexity O(N). Thus, if approach (a) is applied, the computational complexity of Algorithm 2 is dominated by the sort operation in Algorithm 1, with average computational complexity O(N log N). Otherwise the complexity is just O(N), because the number K of indices of nonzero coefficients to be sorted is in general much less than N. Nevertheless, as will be shown in Numerical Test III, in either case the compression of a 30 min record is achieved on a MATLAB platform in an average time of less than 0.2 s. While compression performance can be improved further by adding an entropy coding step before saving the arrays, if implemented in software such a step slows the process.
When selecting the number of wavelet coefficients for the approximation by method (a), the parameter tol is fixed as follows: assuming that the target PRD before quantization is PRD_{0}, we set \({\rm{tol}}={{\rm{PRD}}}_{{\rm{0}}}\Vert {\bf{f}}\Vert /100\). The value of PRD_{0} is fixed as 70–80% of the required PRD. The quantization parameter is tuned to achieve the required PRD.
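To make the O(N) cost of a fast wavelet transform concrete, the sketch below implements a multi-level transform with the Haar wavelet. The Haar wavelet is used here only because it fits in a few lines; it is a substitute for, not an implementation of, the CDF 9/7 transform used in this work. The function names are hypothetical and the signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math


def haar_dwt(x, levels):
    """Multi-level orthonormal Haar DWT (len(x) divisible by 2**levels)."""
    out = list(x)
    n = len(out)
    r = math.sqrt(2.0)
    for _ in range(levels):
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / r for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / r for i in range(half)]
        out[:n] = approx + detail  # approximation first, then details
        n = half                   # next level acts on the approximation only
    return out


def haar_idwt(w, levels):
    """Inverse of haar_dwt."""
    out = list(w)
    n = len(out) >> levels  # length of the coarsest approximation
    r = math.sqrt(2.0)
    for _ in range(levels):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / r, (a - d) / r]
        out[:2 * n] = merged
        n *= 2
    return out


x = [float(i % 7) for i in range(16)]
w = haar_dwt(x, 3)
print(max(abs(a - b) for a, b in zip(x, haar_idwt(w, 3))))
```

Each level processes half as many samples as the previous one, so the total work is bounded by N + N/2 + N/4 + … < 2N operations of constant cost.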
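As a small illustration of this rule, the following Python sketch computes tol from a signal and a target PRD. The function name `tolerance` and the factor 0.75 (one choice within the 70–80% range quoted above) are assumptions made for the example, as are the sample values.

```python
import math


def tolerance(f, prd0):
    """tol = PRD_0 * ||f|| / 100, where ||f|| is the 2-norm of the signal."""
    return prd0 * math.sqrt(sum(v * v for v in f)) / 100.0


target_prd = 0.53
prd0 = 0.75 * target_prd  # assumed value within 70-80% of the required PRD
f = [1024.0, 1030.0, 1019.0, 1025.0]  # hypothetical samples
print(tolerance(f, prd0))
```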
Signal recovery
At the Decoding Stage the signal is recovered by the following steps.

Read the number N, the quantization parameter Δ, and the arrays \({\tilde{{\bf{c}}}}^{{\rm{\Delta }}}\), \(\tilde{{\boldsymbol{\delta }}}\), and \(\tilde{{\bf{s}}}\) from the compressed file.

Recover the magnitude of the coefficients from their quantized version as
$${\tilde{{\bf{c}}}}^{{\rm{r}}}={\rm{\Delta }}{\tilde{{\bf{c}}}}^{{\rm{\Delta }}}.$$(2) 
Recover the indices \(\tilde{\ell }\) from the array \(\tilde{{\boldsymbol{\delta }}}\) as: \({\tilde{\ell }}_{1}=\tilde{\delta }(1)\) and \({\tilde{\ell }}_{i}=\tilde{\delta }(i)+{\tilde{\ell }}_{i-1},\,i=2,\,\ldots ,\,K.\)

Recover the signs of the wavelet coefficients as \({\tilde{{\bf{s}}}}^{{\rm{r}}}=2\tilde{{\bf{s}}}-1\)

Complete the full array of wavelet coefficients as w^{r}(i) = 0, i = 1, …, N and \({{\bf{w}}}^{{\rm{r}}}(\tilde{\ell })={\tilde{{\bf{s}}}}^{{\rm{r}}}\mathrm{.}{\tilde{{\bf{c}}}}^{{\rm{r}}}\)

Invert the wavelet transform to recover the approximated signal f^{r}.
As shown in Tables 5–7, the recovery process runs about 3 times faster than the compression procedure, which is already very fast.
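The decoding steps that precede the inverse wavelet transform can be sketched as follows. The sketch assumes that the stored indices are 1-based, as in the notation of this paper, and the function name `decode_coefficients` is hypothetical; the inverse DWT itself is not included.

```python
from itertools import accumulate


def decode_coefficients(n, delta_q, c_q, delta_idx, signs):
    """Rebuild the length-n array of wavelet coefficients.

    n         -- size of the signal
    delta_q   -- quantization parameter
    c_q       -- quantized magnitudes (reordered)
    delta_idx -- first index followed by consecutive differences
    signs     -- 1 for positive, 0 for negative
    """
    # A cumulative sum undoes the differences between consecutive indices
    indices = list(accumulate(delta_idx))
    w = [0.0] * n
    for pos, magnitude, s in zip(indices, c_q, signs):
        # (2 s - 1) maps the stored bit to +1 or -1
        w[pos - 1] = (2 * s - 1) * delta_q * magnitude
    return w


print(decode_coefficients(6, 0.5, [4, 2, 1], [2, 1, 3], [1, 0, 1]))
```

Every operation is a single pass over at most K entries, which is in line with the short recovery times reported.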
Results
We present here four numerical tests with different purposes. Except for the comparison in Test II, all the other tests use the full MIT-BIH Arrhythmia database^{25}, which contains 48 ECG records. Each of these records consists of N = 650000 11-bit samples at a frequency of 360 Hz. The algorithms are implemented using MATLAB on a notebook Core i7 3520M, 4 GB RAM.
Since the compression performance of lossy compression has to be considered in relation to the quality of the recovered signals, we introduce at this point the measures to evaluate the results of the proposed procedure.
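The quality of a recovered signal is assessed with respect to the PRD calculated as follows,
$${\rm{PRD}}=\frac{\Vert {\bf{f}}-{{\bf{f}}}^{{\rm{r}}}\Vert }{\Vert {\bf{f}}\Vert }\times 100 \% ,$$
where f is the original signal, f^{r} is the signal reconstructed from the compressed file and \(\Vert \cdot \Vert \) indicates the 2-norm. Since the PRD strongly depends on the baseline of the signal, the PRDN, as defined below, is also reported.
$${\rm{PRDN}}=\frac{\Vert {\bf{f}}-{{\bf{f}}}^{{\rm{r}}}\Vert }{\Vert {\bf{f}}-\overline{{\bf{f}}}\Vert }\times 100 \% ,$$
where \(\overline{{\bf{f}}}\) indicates the mean value of f.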
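When fixing a value of PRD, the compression performance is assessed by the Compression Ratio (CR) as given by
$${\rm{CR}}=\frac{{\rm{Size}}\,{\rm{of}}\,{\rm{the}}\,{\rm{uncompressed}}\,{\rm{file}}}{{\rm{Size}}\,{\rm{of}}\,{\rm{the}}\,{\rm{compressed}}\,{\rm{file}}}.$$
The quality score (QS), reflecting the trade-off between compression performance and reconstruction quality, is the ratio:
$${\rm{QS}}=\frac{{\rm{CR}}}{{\rm{PRD}}}.$$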
Since the PRD is a global quantity, in order to detect possible local changes in the visual quality of the recovered signal, we define the local PRD as follows. Each signal is partitioned into Q segments f_{q}, q = 1, …, Q of L samples. The local PRD with respect to every segment in the partition, which we indicate as prd(q), q = 1, …, Q, is calculated as
$${\rm{prd}}(q)=\frac{\Vert {{\bf{f}}}_{q}-{{\bf{f}}}_{q}^{{\rm{r}}}\Vert }{\Vert {{\bf{f}}}_{q}\Vert }\times 100 \% ,$$
where \({{\bf{f}}}_{q}^{{\rm{r}}}\) is the recovered portion of the signal corresponding to the segment q. For each record the mean value prd (\(\overline{{\rm{prd}}}\)) and corresponding standard deviation (std) are calculated as
$$\overline{{\rm{prd}}}=\frac{1}{Q}\sum _{q=1}^{Q}{\rm{prd}}(q)$$
and
$${\rm{std}}=\sqrt{\frac{1}{Q-1}\sum _{q=1}^{Q}{({\rm{prd}}(q)-\overline{{\rm{prd}}})}^{2}}.$$
The mean value prd with respect to all the records in the database is a double average \(\overline{\overline{{\rm{prd}}}}\).
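The quality measures introduced in this section can be sketched in Python as follows. The sketch assumes the standard definitions of PRD and PRDN as ratios of 2-norms expressed in percent, QS as CR divided by PRD, and the sample standard deviation (normalized by Q − 1, as computed by `statistics.stdev`) for the local prd; the function names are hypothetical, and a trailing incomplete segment is ignored.

```python
import math
import statistics


def norm(v):
    """2-norm of a list of numbers."""
    return math.sqrt(sum(x * x for x in v))


def prd(f, fr):
    """Percentage root-mean-square difference between f and its recovery fr."""
    return 100.0 * norm([a - b for a, b in zip(f, fr)]) / norm(f)


def prdn(f, fr):
    """PRD normalized by the signal after subtraction of its mean value."""
    mean = sum(f) / len(f)
    return 100.0 * norm([a - b for a, b in zip(f, fr)]) / norm([a - mean for a in f])


def quality_score(cr, prd_value):
    """Trade-off between compression ratio and distortion."""
    return cr / prd_value


def local_prd(f, fr, seg_len):
    """prd of each full segment of seg_len samples, with mean and std."""
    values = [prd(f[i:i + seg_len], fr[i:i + seg_len])
              for i in range(0, len(f) - seg_len + 1, seg_len)]
    return values, statistics.mean(values), statistics.stdev(values)


f = [1024.0, 1026.0, 1022.0, 1024.0]
fr = [1024.0, 1025.0, 1022.0, 1024.0]
print(prd(f, fr), prdn(f, fr))
```

In this toy example the same reconstruction error yields a PRD far smaller than the PRDN, because the samples carry a large baseline. This is the reason for reporting both quantities.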
When comparing two approaches on a database we reproduce the same mean value PRD. The quantification of the relative gain in CR of one particular approach, say approach 1, in relation to another, say approach 2, is given by the quantity:
$${\rm{Gain}}=\frac{{{\rm{CR}}}_{1}-{{\rm{CR}}}_{2}}{{{\rm{CR}}}_{2}}\times 100 \% .$$
The gain in QS has the equivalent definition.
Numerical test I
We start the tests by implementing the proposed approach using wavelet transforms corresponding to different wavelet families at different levels of decomposition. The comparison between different wavelet transforms is realized using approach (b), because within this option each value of PRD is uniquely determined by the quantization parameter Δ. Thus, the difference in CR is only due to the particular wavelet basis and the concomitant decomposition level. Table 1 shows the average CR (indicated as CR_{b}) and corresponding standard deviation (std) with respect to the whole data set and for three different values of PRD. For each PRD value CR_{b} is obtained by means of the following wavelet bases: db5 (Daubechies), coif4 (Coiflets), sym4 (Symlets) and cdf97 (Cohen-Daubechies-Feauveau). Each transform is applied at three different decomposition levels (lv).
As observed in Table 1, on the whole the best CR is achieved with the biorthogonal basis cdf97 for lv = 4. In what follows all the results are given using this basis for decomposition level lv = 4.
Next we produce the CR for every record in the database for a mean value PRD of 0.53.
Table 2 shows the results obtained by approach (a), where the CR and QS produced by this method are indicated as CR_{a} and QS_{a}, respectively. The PRD values for each of the records listed in the first column of Table 2 are given in the fourth column of that table. The second and third columns show the values of \(\overline{{\rm{prd}}}\) and the corresponding std for each record. The CR is given in the fifth column and the corresponding QS in the sixth column of the table. The mean value CR obtained by method (b) for the same mean value PRD = 0.53 is CR_{b} = 22.16.
Table 3 shows the variations of the CR_{a} with different values of the parameter PRD_{0} in method (a).
Numerical test II
Here comparisons are carried out with respect to results produced by the set partitioning in hierarchical trees (SPIHT) algorithm proposed in^{26}. Thus for this test we use the data set described in that publication. It consists of 10-min long segments from records 100, 101, 102, 103, 107, 108, 109, 111, 115, 117, 118, and 119. As indicated in the footnote on p. 853 of^{26}, the given values of PRD correspond to the subtraction of a baseline equal to 1024. This has generated confusion in the literature, as often the values of PRD in Table III of^{26} are unfairly reproduced for comparison with values of PRD obtained without subtraction of the 1024 baseline. The values of PRD with and without subtraction of that baseline, which are indicated as PRD_{B} and PRD respectively, are given in Table 4. As seen in this table, for the same approximation there is an enormous difference between the two metrics. A fair comparison with respect to the results in^{26} should either involve the figures in the second row of Table 4 or, as done in^{26}, specify that a 1024 baseline has been subtracted.
The figures in the 3rd row of Table 4 correspond to the CRs in^{26}. The 4th row shows the CRs resulting from method (b) of the proposed approach without entropy coding and the 5th row the results of adding a Huffman coding step before saving the compressed data in HDF5 format. The last two rows show the quantization parameters Δ which produce the required values of PRD_{B} and PRD.
Numerical test III
This numerical test aims at comparing our results with recently reported benchmarks on the full MIT-BIH Arrhythmia database for mean value PRD in the range [0.23, 1.71]. To the best of our knowledge the highest CRs reported so far for mean value PRD in the range [0.8, 1.30) are those in^{12}, and in the range (1.30, 1.71] those in^{14}. For PRD < 0.8 the comparison is realized with the results in^{11}, as shown in Table 7. Table 5 compares our results against the results in Table III of^{12} and Table 6 against Table 1 of^{14}. In both cases we reproduce the identical mean value of PRD. The differences are in the values of CR and QS. All the Gains given in Table 5 are relative to the results in^{12} while those given in Tables 6 and 7 are relative to the results in^{14} and^{11}, respectively.
As already remarked, and fully discussed in^{27}, when comparing results from different publications care should be taken to make sure that the comparison is actually on the identical database, without any difference in baseline. From the information given in the papers producing the results we are comparing with (the relation between the values of PRD and PRDN) we can be certain that we are working on the same database^{25}, which is described in^{28}.
The parameters for reproducing the required PRD with methods (a) and (b) are given in the last 3 rows of Tables 5–7. The previous 3 rows in each table give, in seconds, the average time to compress (t^{c}) and recover (t^{r}) a record. As can be observed, the compression times of approaches (a) and (b) are very similar. The given times were obtained as the average of 10 independent runs. Notice that the CRs in these tables do not include the additional entropy coding step.
Figure 1 gives the plot of CR vs PRD for the approaches being compared in this section.
Numerical test IV
Finally we would like to highlight the following two features of the proposed compression algorithm.

(1) One of the distinctive features stems from the significance of saving the outputs of the algorithm directly in compressed HDF5 format. In order to highlight this, we compare the size of the file saved in this way against the size of the file obtained by applying a commonly used entropy coding process, Huffman coding, before saving the data in HDF5 format. The implementation of Huffman coding is realized, as in Table 4, by the off-the-shelf MATLAB function Huff06 available on^{29}. In Table 8 CR_{a} and CR_{b} indicate, as before, the CR obtained when the outputs of methods (a) and (b) are directly saved in HDF5 format. \({{\rm{CR}}}_{{\rm{a}}}^{{\rm{Huff}}}\) and \({{\rm{CR}}}_{{\rm{b}}}^{{\rm{Huff}}}\) indicate the CR when Huffman coding is applied on the outputs of (a) and (b) before saving the data in HDF5 format. The rows right below the CRs give the corresponding compression times.

(2) The other distinctive feature of the method is the significance of the proposed Organization and Storage step. In order to illustrate this, we compare the results obtained by method (b) with those obtained using the conventional Run-Length (RL) algorithm^{30} instead of storing the indices of nonzero coefficients as proposed in this work. The CR corresponding to RL in HDF5 format is indicated in Table 8 as CR_{RL}. When Huffman coding is applied on RL before saving the outputs in compressed HDF5 format, the CR is indicated as \({{\rm{CR}}}_{{\rm{RL}}}^{{\rm{Huff}}}\).
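For readers unfamiliar with run-length coding of sparse arrays, a minimal Python sketch is given below. It shows one common variant, in which every nonzero value is stored together with the number of zeros preceding it; the exact RL variant used for the comparison in Table 8 is not specified here, so this should be read as an assumption made for illustration. The function names are hypothetical.

```python
def run_length_encode(q):
    """Encode a sparse integer array as (zero_run, value) pairs.

    Returns the list of pairs and the number of trailing zeros.
    """
    pairs, run = [], 0
    for value in q:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs, run


def run_length_decode(pairs, trailing):
    """Inverse of run_length_encode."""
    out = []
    for run, value in pairs:
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * trailing)
    return out


q = [5, 0, 0, -3, 0, 1, 0, 0]
pairs, trailing = run_length_encode(q)
print(pairs, trailing)
```

In contrast to this interleaved representation, the Organization and Storage step proposed in this work keeps the locations, the magnitudes and the signs in three separate arrays.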
Discussion
We notice that, while the results in Table 1 show some differences in CR when different wavelets are used for the DWT, it is clear from the table that the selection of the wavelet family is not the crucial factor for the success of the technique. The same is true for the decomposition level. That said, since the best results correspond to the cdf97 family at decomposition level 4, we have realized the other numerical tests with that wavelet basis.
We chose to produce full results for a mean value PRD of 0.53 (cf. Table 2) as this value represents a good compromise between compression performance and high visual similarity between the recovered signal and the raw data. Indeed, in^{15} the quality of the recovered signals giving rise to a mean value PRD of 0.53 is illustrated in relation to the high performance of automatic QRS complex detection. However, the compression ratio of their method is low. For the same mean value of PRD our CR is 5 times larger: 4.5^{15} vs 23.17 (Table 2). As observed in Table 2, the mean value of the local quantity prd is equivalent to the global value (PRD). Nevertheless, the prd may differ for some of the segments in a record. Figure 2 plots the prd for record 101 partitioned into Q = 325 segments of length L = 2000 sample points. Notice that there are a few segments corresponding to significantly larger values of prd than the others. Accordingly, with the aim of demonstrating the visual quality of the recovered signals, for each signal in the database we detect the segment \({q}^{\ast }\) of maximum distortion with respect to the prd as
$${q}^{\ast }=\mathop{{\rm{arg}}\,{\rm{max}}}\limits_{q=1,\ldots ,Q}\,{\rm{prd}}(q).$$
The left graphs of Fig. 3 correspond to the segments of maximum prd with respect to all the records in the database and segments of length L = 2000. These are: segment 25 of record 101, when applying the approximation approach (a) (top graph), and segment 175 of record 213 for approach (b) (bottom graph). The upper waveforms in all the graphs are the raw ECG data. The lower waveforms are the corresponding approximations, which have been shifted down for visual convenience. The bottom lines in all the graphs represent the absolute value of the difference between the raw data and their corresponding approximation. The right graphs of Fig. 3 have the same description as the left ones, but the segments correspond to values of prd close to the mean value prd for the corresponding record.
It is worth commenting that the difference in the results between approaches (a) and (b) is a consequence of the fact that the concomitant parameters are set to approximate the whole database at a fixed mean value PRD. In that sense, approach (a) provides us with some flexibility (there are two parameters to be fixed to match the required PRD) whereas for approach (b) the only parameter (Δ) is completely determined by the required PRD. As observed in Table 3, when setting the parameter PRD_{0} much smaller than the target PRD the approximation is only influenced by the quantization parameter Δ and methods (a) and (b) coincide. Conversely, when setting PRD_{0} too close to the target PRD the quantization parameter needs to be significantly reduced, which affects the compression results. For a target PRD ≥ 0.4 we recommend setting PRD_{0} as 70–80% of the required PRD.
For values of PRD < 0.4 the storage approach is not as effective as for larger values of PRD. This is noticeable in both Tables 4 and 8. Another feature that appears for PRD < 0.4 is that applying the entropy coding step, before saving the data in compressed HDF5 format, improves the CR much more than for larger values of PRD. This is because for PRD < 0.4 the approximation fits noise and small details, for which components in higher wavelet bands are required. Conversely, for larger values of PRD the adopted uniform quantization keeps wavelet coefficients in the first bands. As a result, through the proposed technique the location of the nonzero wavelet coefficients is encoded in an array which contains mainly a long stream of ones. For small values of PRD the array's length increases to include different numbers. This is why the addition of an entropy coding step, such as Huffman coding, which assigns shorter codes to the most frequent symbols, becomes more important. In any case, if the outputs are saved in HDF5 format, adding the Huffman coding step is beneficial. Nonetheless, since when implemented in software the improvement comes at the expense of computational time, for PRD > 0.4 this step can be avoided and the CR is still very high.
The comparison with the conventional RL algorithm, in Table 8, reinforces the suitability of the proposal for storing the location of nonzero coefficients. A similar storage strategy has been successfully used with other approximation techniques for compression of melodic music^{31} and X-Ray medical images^{32}. In this case the strategy is even more efficient, because the approximation is realized using a basis and on the whole signal, which intensifies the efficiency of the storage approach.
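The remark on Huffman coding can be illustrated with a short Python sketch that computes the code length assigned to each symbol of an array of index differences. The implementation below is a textbook construction based on a priority queue, not the Huff06 function used in the numerical tests, and the array of differences is a hypothetical example.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return a dict mapping each distinct symbol to its Huffman code length."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (count, tie-breaker, {symbol: depth in the code tree})
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes all their symbols one level deeper
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical array of index differences dominated by ones
delta = [1] * 90 + [2] * 6 + [3] * 3 + [17]
lengths = huffman_code_lengths(delta)
total_bits = sum(lengths[d] for d in delta)
print(lengths, total_bits)
```

When almost all the differences are equal to one, the array is already highly compressible by DEFLATE alone. As the variety of symbols grows, which is the situation described above for small values of PRD, a dedicated entropy coder has more to gain.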
Conclusions
An effective and efficient method for compressing ECG signals has been proposed. The proposal was tested on the MIT-BIH Arrhythmia database, which gave rise to benchmarks improving upon recently reported results. The main feature of the method is its simplicity and the fact that for values of PRD > 0.4 a dedicated entropy coding to save the outputs can be avoided by saving the outputs of the algorithm in compressed HDF5. This solution involves a time delay which is practically negligible in relation to the signal length: 0.14 s for compressing a 30 min record. Two approaches for reducing wavelet coefficients have been considered. Approach (b) arises from switching off in approach (a) the selection of the largest wavelet coefficients before quantization. It was shown that, when approximating a whole database to obtain a fixed mean value of PRD, approach (a) may render a higher mean value of CR when the target PRD is greater than 0.4.
The role of the proposed Organization and Storage strategy was highlighted by comparison with the conventional Run-Length algorithm. Whilst the latter produces smaller CRs, the results are still good in comparison with previously reported benchmarks. This outcome leads us to conclude that, using a wavelet transform on the whole signal, uniform quantization for all the wavelet bands works well in the design of a codec for lossy compression of ECG signals.
Note: The MATLAB codes for implementing the whole approach have been made available on a dedicated website^{29,33}.
Data Availability
The data used in this paper are available on https://physionet.org/physiobank/database/mitdb/ We have also placed the data, together with the software for implementing the proposed approach, on http://www.nonlinearapprox.info/examples/node012.htm.
References
1. Gibson, C. M. et al. Diagnostic and prognostic value of ambulatory ECG (Holter) monitoring in patients with coronary heart disease: a review. J Thromb Thrombolysis 23, 135–145 (2007).
2. Mittal, S., Movsowitz, C. & Steinberg, J. S. Ambulatory external electrocardiographic monitoring: focus on atrial fibrillation. J Am Coll Cardiol 58, 1741–1749 (2011).
3. Steinberg, J. S. et al. ISHNE-HRS expert consensus statement on ambulatory ECG and external cardiac monitoring/telemetry. Heart Rhythm 14, e55–e96 (2017).
4. Jalaleddine, S. M. S., Hutchens, C. G., Strattan, R. D. & Coberly, W. A. ECG data compression techniques – a unified approach. IEEE Transactions on Biomedical Engineering 37, 329–343 (1990).
5. Sriraam, N. & Eswaran, C. An Adaptive Error Modeling Scheme for the Lossless Compression of EEG Signals. IEEE Transactions on Information Technology in Biomedicine 12, 587–594 (2008).
6. Srinivasan, K., Dauwels, J. & Reddy, M. R. A two-dimensional approach for lossless EEG compression. Biomedical Signal Processing and Control 4, 387–394 (2011).
7. Mukhopadhyay, S. K., Mitra, S. & Mitra, M. A lossless ECG data compression technique using ASCII character encoding. Computers & Electrical Engineering 37, 486–497 (2011).
8. Hejrati, B., Fathi, A. & Abdali-Mohammadi, F. Efficient lossless multi-channel EEG compression based on channel clustering. Biomedical Signal Processing and Control 31, 295–300 (2017).
9. Miaou, S.-G., Yen, H.-L. & Lin, C.-L. Wavelet-based ECG compression using dynamic vector quantization with tree codevectors in single codebook. IEEE Transactions on Biomedical Engineering 49, 671–680 (2002).
10. Ku, C., Hung, K. & Wu, T. Wavelet-based ECG data compression system with linear quality control scheme. IEEE Transactions on Biomedical Engineering 57, 1399–1409 (2010).
11. Lee, S. J., Kim, J. & Lee, M. A Real-Time ECG Data Compression and Transmission Algorithm for an e-Health Device. IEEE Transactions on Biomedical Engineering 58, 2448–2455 (2011).
12. Ma, J. L., Zhang, T. T. & Dong, M. C. A Novel ECG Data Compression Method Using Adaptive Fourier Decomposition With Security Guarantee in e-Health Applications. IEEE Journal of Biomedical and Health Informatics 19, 986–994 (2015).
13. Fathi, A. & Farajikheirabadi, F. ECG compression method based on adaptive quantization of main wavelet packet subbands. Signal, Image and Video Processing 10, 1433–1440 (2016).
14. Tan, C., Zhang, L. & Wu, H. A Novel Blaschke Unwinding Adaptive Fourier Decomposition based Signal Compression Algorithm with Application on ECG Signals. IEEE Journal of Biomedical and Health Informatics, https://doi.org/10.1109/JBHI.2018.2817192 (22 March 2018).
15. Elgendi, M., Mohamed, A. & Ward, R. Efficient ECG Compression and QRS Detection for E-Health Applications. Scientific Reports 7, https://doi.org/10.1038/s41598-017-00540-x (2017).
16. Mamaghanian, H., Khaled, N., Atienza, D. & Vandergheynst, P. Compressed Sensing for Real-Time Energy-Efficient ECG Compression on Wireless Body Sensor Nodes. IEEE Transactions on Biomedical Engineering 58, 2456–2466 (2011).
17. Zhang, Z., Jung, T.-P., Makeig, S. & Rao, B. D. Compressed Sensing for Energy-Efficient Wireless Telemonitoring of Noninvasive Fetal ECG via Block Sparse Bayesian Learning. IEEE Transactions on Biomedical Engineering 60, 300–309 (2013).
18. Polanía, L. F., Carrillo, R. E., Blanco-Velasco, M. & Barner, K. E. Exploiting Prior Knowledge in Compressed Sensing Wireless ECG Systems. IEEE Journal of Biomedical and Health Informatics 19, 508–519 (2015).
19. Polanía, L. F. & Plaza, R. I. Compressed Sensing ECG using Restricted Boltzmann Machines. Biomedical Signal Processing and Control 45, 237–245 (2018).
20. Cohen, A., Daubechies, I. & Feauveau, J. C. Biorthogonal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics 45, 485–560 (1992).
21. Manikandan, M. S. & Dandapat, S. Wavelet-based electrocardiogram signal compression methods and their performances: A prospective review. Biomedical Signal Processing and Control 14, 73–107 (2014).
22. https://www.hdfgroup.org/ (Accessed Jan 2, 2019).
23. Ziv, J. & Lempel, A. A Universal Algorithm for Sequential Data Compression. IEEE Transactions on Information Theory 23, 337–343 (1977).
24. Huffman, D. A Method for the Construction of Minimum-Redundancy Codes. Proceedings of the IRE 40, 1098–1101 (1952).
25. https://physionet.org/physiobank/database/mitdb/ (Accessed Jan 2, 2019).
26. Lu, Z., Kim, D. Y. & Pearlman, W. A. Wavelet compression of ECG signals by the set partitioning in hierarchical trees algorithm. IEEE Transactions on Biomedical Engineering 47, 849–856 (2000).
27. Blanco-Velasco, M., Cruz-Roldán, F. & Godino-Llorente, J. I. On the use of PRD and CR parameters for ECG compression. Medical Engineering and Physics 27, 798–802 (2005).
28. Moody, G. B. & Mark, R. G. The impact of the MIT-BIH Arrhythmia Database. IEEE Eng in Med and Biol 20, 45–50 (2001).
29. http://www.ux.uis.no/~karlsk/proj99 (Accessed Jan 2, 2019).
30. Salomon, D. Data Compression. (Springer-Verlag London, 2007).
31. Rebollo-Neira, L. & Sanches, I. Simple scheme for compressing sparse representation of melodic music. Electronics Letters, https://doi.org/10.1049/el.2017.3908 (2017).
32. Rebollo-Neira, L. A competitive scheme for storing sparse representation of X-Ray medical images. Plos One, https://doi.org/10.1371/journal.pone.0201455 (2018).
33. http://www.nonlinearapprox.info/examples/node012.html (Accessed Jan 2, 2019).
Acknowledgements
Thanks are due to K. Skretting, for making available the Huff06 MATLAB function, which has been used for entropy coding, and to P. Getreuer for the waveletcdf97 MATLAB function, which has been used for implementation of the CDF 9/7 wavelet transform.
Contributions
The author is the only contributor to the paper.
Ethics declarations
Competing Interests
The author declares no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Rebollo-Neira, L. Effective high compression of ECG signals at low level distortion. Sci Rep 9, 4564 (2019). https://doi.org/10.1038/s41598-019-40350-x