Main

Telescreening for diabetic retinopathy (DR) can help compensate for the shortage of ophthalmologists assessing DR, especially in remote areas. Digital fundus photography is commonly used for DR screening.1 High-quality digital images can reach a size of 1.5 MB or greater. Compression techniques are required to speed up transmission of such images, yet information may be lost during compression. Standards have been defined for radiology and pathology; yet to date, resolution levels required for reliable diagnosis in ophthalmology have not been determined.

Basu et al2 evaluated the effect of classic Joint Photographic Experts Group (JPEG) compression (cJPEG) on the grading of DR lesions and found that compression ratios of 1 : 20 to 1 : 12 were acceptable. JPEG2000, a newer image compression algorithm purported to render higher quality images than cJPEG at higher compression ratios, has been evaluated for the compression of medical images in radiology.3, 4 The aim of our study was to compare the effects of cJPEG and JPEG2000 compression on the detection of DR lesions.

Methods

In total, 45 good-quality digital images including different pathologic lesions of DR were selected from our digital image bank. Good quality was defined as a well-centred, in-focus image without dark or bright peripheral halos that allowed unambiguous analysis of fine details. Images were obtained with a Topcon nonmydriatic retinal fundus camera (45° opening, TRC-NW6, Topcon Europe, Rotterdam, The Netherlands) connected to a tri-CCD colour video camera (Sony, DXC-950 P, Tokyo, Japan). Images were captured without pupil dilation, in true colour (24 bits) at a resolution of 800 × 600 pixels resulting in an uncompressed image size of 1.26 MB.

Photographs were selected to ensure adequate distribution of different DR lesions: eight photographs showed a normal fundus, 37 showed various numbers of haemorrhages/microaneurysms (HMA), 18 showed soft exudates (SE), 22 showed hard exudates (HE), seven showed new vessels on the disk (NVD) or elsewhere (NVE), and four showed intraretinal microvascular abnormalities (IRMA). Images were stored as uncompressed TIFF (Tagged Image File Format) files. This study of retrospective design adhered to the tenets of the Declaration of Helsinki.

Compression

The 45 TIFF images were first compressed to five different levels, using the cJPEG algorithm with Photoshop 5.0 (Adobe, San Jose, CA, USA), to 118, 58, 41, 27, and 24 KB in size (representing compression ratios of 1 : 11, 1 : 22, 1 : 31, 1 : 47, and 1 : 53, respectively). The TIFF images were then compressed by the JPEG2000 algorithm using the ImagePress JP2 plug-in (Pegasus, Tampa, FL, USA) for Photoshop at levels 1 : 11, 1 : 22, 1 : 31, and 1 : 47, resulting in image sizes of 118, 58, 41 and 27 KB, respectively. These file sizes were chosen as it became empirically clear that images at the low compression ratio (118 KB) and the high compression ratio (27 KB) were of excellent and poor quality, respectively, with both compression algorithms. The 24 KB level was applied only to cJPEG images, as it appeared much worse than the others, and the gain in storage/transmission size between 27 and 24 KB is negligible. All 405 compressed images were placed in a random order.
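The relationship between the file sizes and the compression ratios quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study; it assumes that 1 MB is counted as 1000 KB, a convention the text does not specify, so the computed ratios only approximate the reported ones.

```python
# File sizes as reported in the text.
# Assumption: 1 MB = 1000 KB; the text does not state which convention was used.
UNCOMPRESSED_KB = 1.26 * 1000

CJPEG_SIZES_KB = [118, 58, 41, 27, 24]
JPEG2000_SIZES_KB = [118, 58, 41, 27]


def compression_ratio(original_kb: float, compressed_kb: float) -> float:
    """Return N such that the compression ratio is 1 : N."""
    return original_kb / compressed_kb


# Ratio for every compressed size used in the study.
ratios = {size: compression_ratio(UNCOMPRESSED_KB, size) for size in CJPEG_SIZES_KB}

for size_kb, n in ratios.items():
    print(f"{size_kb:>4} KB -> about 1 : {n:.1f}")

# 45 source images, each compressed at 5 cJPEG levels and 4 JPEG2000 levels.
total_compressed = 45 * (len(CJPEG_SIZES_KB) + len(JPEG2000_SIZES_KB))
print("compressed images:", total_compressed)
```

The image count follows directly from the design: nine compressed versions of each of the 45 TIFF images.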

Image grading

Images were displayed on a 21-inch monitor (resolution: 1280 × 1024 × 24 bits, CRT Sony Trinitron Multiscan G500, Tokyo, Japan). To avoid intergrader variability, all images were graded jointly by two retinal specialists. No image processing was used. Each sign of DR was first graded on TIFF photographs. SE, HE, ‘macular oedema’ (ME, for which the presence of HE within one disc diameter of the foveola was used as a surrogate), NVD, NVE, and IRMA were assessed as absent (0), questionable (1) or present (2). HMA were graded as (0), absent; (1), questionable or less than 5; (2), 5 to 10; and (3), more than 10. TIFF images were graded twice at an interval of 1 month in order to assess intragrader variability. Just after the second TIFF grading, compressed images (n=405) were presented in random order and graded jointly by the two graders. Image quality was graded as being good (image degradation not apparent), acceptable (image degradation apparent, but still allowing subjectively reliable assessment), or poor (quality not sufficient to make a reliable assessment). Then each lesion of DR was graded on each image according to the above scale.

Statistics

TIFF photographs were the reference standard. However, some variability was observed between the two successive gradings due to intragrader variability. A consensus grade for each image was reached after discussion between the two graders, to establish the gold standard.

Lesion grades for each image at each compression level were then compared with the gold standard, and sensitivity, specificity, positive and negative predictive values were calculated. For sensitivity and specificity, the score for each lesion was dichotomized as present or absent. Questionable lesions were conventionally considered positive. Confidence intervals (95%) were calculated. A weighted kappa statistic was calculated for each compression level to evaluate the level of agreement between the gold standard and the grading scores at that level. Kappa statistic agreement was termed slight (0–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), and almost perfect (0.81 and more).5 Agreement between gold standard and grading scores for HMA was also calculated with HMA dichotomized as absent (grade 0) vs present (grades 1, 2, and 3), to study the effect of compression on the distinction between no lesions and few or many lesions. Kappa statistics were also used to assess intragrader agreement between the two gradings at a 1-month interval. The SPSS v11.1 software statistical package (SPSS, Chicago, IL, USA) for Windows was used.
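The statistics described above can be illustrated with a short sketch. It is not the SPSS procedure used in the study: the grade lists are hypothetical, the function names are illustrative, and linear disagreement weights are an assumption, since the text does not state which weighting scheme was applied. The label for negative kappa values is likewise an assumption, as the scale quoted in the text starts at 0.

```python
from typing import Dict, Sequence


def is_positive(grade: int) -> bool:
    """Questionable (1) and present (2 or more) both count as positive."""
    return grade >= 1


def screening_metrics(reference: Sequence[int], test: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV of test grades vs reference grades."""
    tp = fp = tn = fn = 0
    for ref, obs in zip(reference, test):
        if is_positive(ref) and is_positive(obs):
            tp += 1
        elif is_positive(obs):
            fp += 1
        elif is_positive(ref):
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def weighted_kappa(reference: Sequence[int], test: Sequence[int], n_categories: int) -> float:
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1) (assumed)."""
    n, k = len(reference), n_categories
    observed = [[0.0] * k for _ in range(k)]
    for ref, obs in zip(reference, test):
        observed[ref][obs] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i: int, j: int) -> float:
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - d_obs / d_exp if d_exp else float("nan")


def agreement_label(kappa: float) -> str:
    """Map kappa to the verbal scale quoted in the text."""
    if kappa < 0:
        return "poor"  # assumption: not part of the scale quoted in the text
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical HMA grades (0 to 3) for eight images; not data from the study.
gold_standard = [0, 0, 1, 2, 3, 3, 2, 0]
compressed = [0, 1, 1, 2, 2, 3, 2, 0]

metrics = screening_metrics(gold_standard, compressed)
kappa = weighted_kappa(gold_standard, compressed, n_categories=4)
print(metrics, round(kappa, 2))
```

Dichotomizing before computing sensitivity and specificity mirrors the convention stated in the text, whereas the weighted kappa uses the full ordinal scale, so a disagreement between grades 2 and 3 is penalized less than one between grades 0 and 3.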

Results

Table 1 summarizes image quality assessment for different compression levels. With cJPEG, image quality started to decrease at 41 KB and blocking effects were obvious in almost all 24 KB images. With JPEG2000, at sizes of 27 and 41 KB, slightly more ‘poor’ images were seen than for cJPEG, with ‘rice grain’ artefacts present.

Table 1 Values of subjective image quality assessment for 45 compressed images at different levels of cJPEG and JPEG 2000 compression

No intragrader variability was observed when grading for SE, HE, ME, NVD/NVE, and IRMAs on TIFF images. Kappa statistic for intragrader agreement for the grading of HMA on TIFF images was 0.87 (95% confidence interval, 0.74–0.99).

Sensitivity, specificity, positive and negative predictive values are given in Table 2. Table 3 summarizes agreement of lesion grade between TIFF and compressed images.

Table 2 Sensitivity, specificity, positive predictive value, and negative predictive value for 45 compressed images at different levels of cJPEG and JPEG 2000 compression
Table 3 Kappa values (±95% confidence intervals) for 45 compressed images at different levels of cJPEG and JPEG 2000 compression

Compression had no effect on the detection of NVD. Among NVE, two were subtle and smaller than 1/2 disc area in size; one of them was missed on both the 24 KB cJPEG and the 27 KB JPEG2000 images. JPEG2000 image sizes of 27, 41 and 58 KB led to identification of one false-positive NVE.

Good-to-excellent agreement was observed for detection of exudates between TIFF and compressed images. Only subtle isolated exudates were missed in 24–41 KB cJPEG and 27–58 KB JPEG2000 images. For SE at 41 and 58 KB, JPEG2000 showed lower sensitivity.

The lowest level of agreement was associated with IRMA, present on four images; there were many of them on two, and few small, isolated ones on the two others. In cJPEG, small IRMA were missed at 27 and 41 KB. At these same image sizes, JPEG2000 performed slightly better for IRMA.

Kappa statistics analysis for HMA grades showed fair-to-good agreement (greater than 0.5) between TIFF and either type of JPEG image, at any compression level. However, variability of HMA grades was due to both intragrader variability and compression. For the 118 KB size in both cJPEG and JPEG2000, variability was mostly due to intragrader variability (kappa values being greater than that for intragrader agreement). In the other cases, variability was mostly due to compression, as kappa values were smaller than that for intragrader agreement.

Kappa values were higher when considering absence vs presence of HMA than when considering global HMA grading, for all images except the 27 and 41 KB JPEG2000 images.

Discussion

Compression reduces image file size, allowing quicker transmission and using less storage space. The Joint Photographic Experts Group (JPEG) format is the most common image format and is widely used for medical imaging. It is a ‘lossy’ compression technique, meaning that some information and image quality are lost during compression. The cJPEG algorithm breaks the image into 8 × 8 pixel blocks and performs a discrete cosine transform on each block. An 8 × 8 block of spectral coefficients with most of the information concentrated in relatively few coefficients is obtained. Quantization is then performed, closely preserving low-frequency components and approximating high-frequency components. The amount of information discarded at this step determines the compression level. A coding process then compresses the remaining frequency coefficients.
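The transform and quantization steps described above can be sketched on a single 8 × 8 block. This is an illustrative sketch only, not the codec used in the study: the quantization step table is a hypothetical one that simply grows with frequency, not the table of the JPEG standard, and the colour conversion and entropy coding stages are omitted.

```python
import math
from typing import List

N = 8  # block size used by cJPEG
Block = List[List[float]]


def dct_2d(block: Block) -> Block:
    """Orthonormal two-dimensional DCT-II of an N x N block."""

    def alpha(k: int) -> float:
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def make_step_table(base: float) -> Block:
    """Hypothetical step sizes: larger steps for higher frequencies."""
    return [[base * (1 + u + v) for v in range(N)] for u in range(N)]


def quantize(coeffs: Block, steps: Block) -> List[List[int]]:
    """Divide each coefficient by its step and round; small values become 0."""
    return [[round(c / q) for c, q in zip(c_row, q_row)] for c_row, q_row in zip(coeffs, steps)]


def count_nonzero(quantized: List[List[int]]) -> int:
    return sum(1 for row in quantized for value in row if value != 0)


# A smooth, hypothetical block: intensity rises gently in both directions.
smooth_block = [[100.0 + 4 * x + 2 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(smooth_block)

fine = quantize(coeffs, make_step_table(base=2.0))     # mild compression
coarse = quantize(coeffs, make_step_table(base=16.0))  # strong compression

print("nonzero coefficients (fine):  ", count_nonzero(fine))
print("nonzero coefficients (coarse):", count_nonzero(coarse))
```

Larger quantization steps push more coefficients to zero, which is what makes the subsequent coding step effective and what eventually leaves only the constant component of each block, the origin of blocking artefacts.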

The JPEG2000 wavelet algorithm uses a different approach, dividing the image into a series of nonoverlapping rectangular blocks called tiles.6 Each tile component is decomposed using wavelet transforms into decomposition levels, each of which contains a number of subbands. These subbands contain information describing the horizontal and vertical characteristics of the original tile. They are computed using a one-dimensional filter applied in both directions. This gives four smaller image blocks: one with low resolution, one with high vertical and low horizontal resolution, one with low vertical and high horizontal resolution, and one with all high resolution. This application of one-dimensional filters in both directions is then repeated a number of times on the low-resolution image block (dyadic decomposition). After transformation, all coefficients are quantized, that is, reduced in precision. Following quantization, each subband is partitioned into packets and code-blocks,7 which are the fundamental entities used for the final step of entropy coding.
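The dyadic decomposition described above can be sketched with the simplest possible wavelet. The example below uses the Haar filter (pairwise averages and differences) purely as a stand-in, which is an assumption for illustration: JPEG2000 uses different, longer filters, and tiling, quantization and entropy coding are left out. Function and key names are illustrative.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]


def haar_1d(signal: List[float]) -> Tuple[List[float], List[float]]:
    """Split an even-length signal into low-pass (averages) and high-pass (differences)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def transpose(matrix: Image) -> Image:
    return [list(column) for column in zip(*matrix)]


def filter_rows(image: Image) -> Tuple[Image, Image]:
    """Apply the one-dimensional filter along every row."""
    pairs = [haar_1d(row) for row in image]
    return [low for low, _ in pairs], [high for _, high in pairs]


def haar_2d(image: Image) -> Dict[str, Image]:
    """One decomposition level: filter horizontally, then vertically."""
    row_low, row_high = filter_rows(image)
    # Filtering the columns is done by filtering the rows of the transpose.
    approx_t, vertical_t = filter_rows(transpose(row_low))
    horizontal_t, diagonal_t = filter_rows(transpose(row_high))
    return {
        "approximation": transpose(approx_t),       # low in both directions
        "horizontal_detail": transpose(horizontal_t),  # high horizontal, low vertical
        "vertical_detail": transpose(vertical_t),      # low horizontal, high vertical
        "diagonal_detail": transpose(diagonal_t),      # high in both directions
    }


def dyadic_decompose(image: Image, levels: int) -> Tuple[Image, List[Dict[str, Image]]]:
    """Repeat the decomposition on the low-resolution block only."""
    history = []
    approx = image
    for _ in range(levels):
        subbands = haar_2d(approx)
        history.append(subbands)
        approx = subbands["approximation"]
    return approx, history


# Hypothetical 4 x 4 tile whose intensity alternates from column to column.
tile = [[1.0, 3.0, 1.0, 3.0] for _ in range(4)]
level1 = haar_2d(tile)
final_approx, history = dyadic_decompose(tile, levels=2)
print(level1["approximation"], level1["horizontal_detail"], final_approx)
```

In this tile all variation is horizontal, so it ends up entirely in the horizontal-detail subband while the vertical and diagonal subbands are zero; each level halves the width and height of the low-resolution block.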

The effect of compression on retinal images has been investigated in a number of studies. Eikelboom et al8 studied the effect of various levels of both cJPEG and wavelet compression on the quality of digitized retinal images and, using different methods, concluded that a digital image 1.5 MB in size could be compressed to 29 KB without serious degradation in quality. Newsom et al9 demonstrated significant loss of sensitivity to the features of DR with cJPEG compression of 35 mm slides that had been digitized, concluding that this was due to the TFT screen they used. Only retinopathy level was considered, with no mention of individual lesion counts. Basu et al2 explored the effect of four different levels of cJPEG compression on 58 digitally acquired fundus images, finding that up to 1 : 20 compression ratios were acceptable. Stellingwerf et al10 compared uncompressed TIFF and compressed cJPEG digitally acquired fundus photographs with 35 mm retinal slides, finding that 1 : 30 compression decreased sensitivity from 0.86–0.92 to 0.72–0.74. Using large 2008 × 3040 pixel images, Baker found 1 : 55 and 1 : 113 compression ratios acceptable for DR screening.11

The aim of our study was to compare cJPEG and JPEG2000 compression algorithms in screening for DR lesions. We therefore chose individual images to determine effect of compression on specific lesions and not composite fundus images as ETDRS grading was not our aim.

In cJPEG compression, blocking artefacts started to be visible on 41 KB images (Figure 1) and became obvious on 27 KB images. Image degradation was first noticed at an earlier level of compression than that in Eikelboom's study.8 In JPEG2000 images, certain images were subjectively found to be blurred at all three levels of compression, without typical ‘blocking’ artefacts, yet artefacts specific to JPEG2000 (‘wavelet or rice-shaped’,12 Figure 1) were observed. Another feature visible on JPEG2000 images at high compression ratios was the ‘smoothing’ effect; as Eikelboom et al8 noted, wavelet images may be pleasant to look at, even if appearing somewhat fuzzy or ‘out of focus’.

Figure 1

Examples of the effect on a retinal image (detail) of the different JPEG and JPEG2000 compression ratios. Top middle: original 1.26 MB TIFF image. Left column, from top to bottom: classic JPEG compression to 118, 58, 41, 27 and 24 KB. Right column, from top to bottom: JPEG2000 wavelet compression to 118, 58, 41 and 27 KB. A small red dot (HMA, arrow) is seen in all images except the 24 KB JPEG image (bottom left), where it becomes a vertical line, part of a blocking artefact (large arrowhead). The vessels surrounding it, seen well on the TIFF and both 118 KB images, progressively fade away with both compression algorithms. Rice-grain artefacts become more prominent in the JPEG2000 images as image size decreases (small arrowheads in 41 and 27 KB images).

With the levels of compression that we used, we observed relatively few effects on the detection of gross anomalies, which were detected with good sensitivity. Eikelboom et al8 found that large anomalies could be detected on retinal images at a compression ratio over 1 : 300, using JPEG. In our study, NVD or NVE were detected at any level of JPEG compression when they were greater than 1/2 disc area in size. However, smaller new vessels were missed in one case each on 24 KB cJPEG and 27 KB JPEG2000 images.

HE were well detected at any level of compression with either algorithm. Although HE are small lesions, they are often grouped together in clusters or large circinate rings. Only small isolated HE were missed at higher levels of compression.

As expected, the effect of compression was more pronounced on small, subtle, low-contrast anomalies. Vanishing SE were missed at the lowest compression levels. The lowest level of agreement was associated with IRMA; small, isolated IRMA present on two images were missed on 41 KB images. This was the case in one JPEG2000 image.

An important point in this study was the effect of image compression on the detection of microaneurysms. They are the first ophthalmoscopic sign of early DR, and their detection is particularly critical when screening for DR; it is important to be able to count them at early stages of DR to follow the progression of the disease.13 The ability to detect and count them properly on compressed images will thus determine the highest compression level clinically acceptable to screen for DR.

The sensitivity to detect microaneurysms is not as good on digital photography as on conventional 35-mm photography. Although several authors have found good agreement between both techniques for DR grading,14, 15, 16 the poorest agreement between DR grades was recorded at level 21, characterized by the presence of few microaneurysms.17, 18, 19 This is due to the lower resolution of digital photography, which makes the images granular and increases the number of questionable lesions when grading for microaneurysms. Image compression is expected to increase this phenomenon, as well as the difficulty in detecting and counting microaneurysms. For both cJPEG and JPEG2000, the global level of agreement between the grading of 118 KB compressed images and that of TIFF images was almost perfect. Image compression did not lead to increased variability, compared to the variability observed when grading TIFF images twice. At intermediate compression ratios (41 and 58 KB), global performance assessed by the kappa statistic was quite similar for both algorithms. At 27 KB, JPEG2000 performed slightly better than cJPEG for distinction of different HMA levels. At these levels of compression, cJPEG compression resulted in a significant decrease of visibility of critical details due to blocking artefacts that affected microaneurysm grading. Thus, such levels of compression with cJPEG do not appear suitable for DR screening.

Blocking artefacts are an intrinsic limitation of the cJPEG algorithm, which splits the image into blocks of 8 × 8 pixels. Low-quality compressed images are inefficiently compressed because for each block, only the lowest frequency (constant) component remains, and these various components are encoded inefficiently. As the JPEG2000 standard is based on wavelet decompositions and works on the image as a whole, it does not present blocking effects. However, implementation issues have also imposed that this newer JPEG standard splits the image into ‘tiles’,6 so that the problem of artefacts (‘wavelet or rice-shaped’) still remains. On the other hand, at high compression ratios, JPEG2000 might be more efficient in retaining small details like microaneurysms than the current cJPEG standard, especially when these details are much smaller than the 8 × 8 block size. This effect is supported by our finding a higher kappa for 27 KB JPEG2000 vs cJPEG images for global HMA evaluation.

In spite of higher performance computers, more sophisticated software and high-speed internet-based data transmission, compression will still be a matter of concern for telemedicine in the future, as digital retinal screening cameras are also reaching higher definitions (above the 5 megapixel range to date). We found that cJPEG as well as JPEG2000 compression of a 1.26 MB fundus image to 118 KB does not affect accuracy when compared to uncompressed TIFF images. At higher compression ratios, JPEG2000 has proven superior to cJPEG in radiology;4 we also found it slightly better, although its results remained insufficient for clinical use.4 This remains to be tested in a clinical setting using larger image sizes.