Introduction

Digital imaging systems are increasingly being used in place of conventional photography in the screening and grading of retinopathy. Advances in technology mean that the resolution of digital colour fundus cameras approaches that of 35 mm film.1 In imaging diabetic retinopathy, such systems have been shown to have sensitivities and specificities approaching those of conventional film or Polaroid,2,3,4,5 and long-term running costs are lower.2 Image acquisition is faster with digital systems, and other advantages include instant image feedback and lower flash intensities.5 Detailed retinal images require large amounts of memory and are thus inefficient to transfer and store. Compression of such images into smaller file sizes can alleviate this problem.

There are two main video display types in current usage: thin-film transistor (TFT) and cathode-ray tube (CRT). TFT displays are smaller, consume less power, and are brighter than their CRT counterparts. Such displays are growing in popularity in the hospital setting. These advantages, together with their portability, mean that they would also be the obvious choice in the mobile retinal screening environment.

There is currently little data on the effect of computer display type on the quality of retinal images. Given this, we felt that it was important to compare these two screen types in the grading of diabetic retinopathy. Accurate screening for retinopathy according to Diabetes UK guidelines depends on the ability to detect ‘any retinopathy’ and ‘sight-threatening’ retinopathy.6 In order to investigate the effect of display type on image quality more closely, both compressed and uncompressed digital images of diabetic retinopathy were graded in this study as to the specific level of retinopathy using both TFT and CRT displays.

Methods

A total of 49 high-quality 35 mm transparencies (17 with no retinopathy, eight with background, five with preproliferative, and 19 with proliferative retinopathy) were selected in order to give a range of stages of retinopathy. Images were chosen mainly to determine the effect of screen type and image compression on the threshold for the detection of sight-threatening retinopathy. Sight-threatening retinopathy in this study was defined as the presence of neovascularisation, or features consistent with ‘preproliferative’ retinopathy.7 The images were digitised to tagged image file format (TIFF) files using the Umax ‘powerlook’ 3000 scanner at a resolution of 3000 dots per inch (dpi) as described in a previous study.8

Comparable grading performance for both 35 mm transparencies of diabetic retinopathy and their scanned counterparts has been reported previously.9 The images were converted into JPEG files at 0, 70, 80 and 90% compression using ‘Paintshop Pro’. These levels of compression were chosen so as to demonstrate the change in image quality across a range of image file sizes.8

The original 35 mm transparency was projected and graded by three retinal specialists to produce a ‘gold standard’ for each image. The retinal specialists were not involved in the study grading.

Image grading

All the images at their various compression levels (n=196) were randomised. Each image was presented independently to two masked specialist registrar observers under uniform lighting conditions, firstly on a 14 in, 32 bit true colour, super video graphics array (SVGA) TFT screen (1024 × 768 pixels, 0.28 mm dot pitch). In an attempt to prevent any memory effect in the grading, the same images were presented on a 17.5 in, 32 bit true colour CRT display (1024 × 768 pixels, 0.26 mm dot pitch) after a 2-month ‘washout period’. To investigate the effect of image magnification on grading performance, randomised uncompressed images (n=49), magnified to × 4 their original size, were also graded on each screen type at separate grading sessions.
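The randomised presentation list described above can be sketched as follows. This is only an illustration: the image identifiers, the function name and the seed are hypothetical, and the study does not state how the randomisation was actually carried out.

```python
import itertools
import random


def presentation_order(image_ids, compression_levels, seed):
    """Return every (image, compression level) pair in a shuffled order.

    The seed is a hypothetical parameter, used here only so that the
    order can be reproduced between runs.
    """
    items = list(itertools.product(image_ids, compression_levels))
    random.Random(seed).shuffle(items)
    return items


# 49 images at four compression levels give the 196 presentations
# reported in the text.
order = presentation_order(range(1, 50), (0, 70, 80, 90), seed=1)
```

Using a fresh seed for the second display would give a different order for the later grading session.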

Both uncompressed and compressed images were graded as ‘no retinopathy’, ‘background’, ‘pre-proliferative’ and ‘proliferative’ retinopathy, based on European Grading Protocols.7 Where there was disagreement between the graders, the images were regraded by a third grader.

The grade for each image at each compression level was compared to the gold standard for both the TFT and CRT images. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated both in relation to the specific grade of retinopathy and for the detection of ‘any retinopathy’ and ‘sight-threatening retinopathy’.
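As a minimal sketch of the calculation described above, the four-level grades can be collapsed at a chosen threshold and compared with the gold standard. The grade labels and function names are assumptions made for illustration. The thresholds follow the definitions in the text: ‘any retinopathy’ is background or worse, and ‘sight-threatening’ is preproliferative or worse.

```python
# Grades in order of increasing severity (labels are illustrative).
GRADES = ["none", "background", "preproliferative", "proliferative"]


def is_positive(grade, threshold):
    """True if `grade` is at least as severe as `threshold`."""
    return GRADES.index(grade) >= GRADES.index(threshold)


def screening_metrics(gold, test, threshold):
    """Sensitivity, specificity, PPV and NPV at one severity threshold.

    gold, test: equal-length lists of grade labels (gold standard and
    observer grade for the same images).
    """
    tp = fp = tn = fn = 0
    for g, t in zip(gold, test):
        g_pos = is_positive(g, threshold)
        t_pos = is_positive(t, threshold)
        if g_pos and t_pos:
            tp += 1
        elif g_pos:
            fn += 1
        elif t_pos:
            fp += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Calling `screening_metrics(gold, test, "background")` gives the figures for ‘any retinopathy’, and passing `"preproliferative"` gives those for ‘sight-threatening retinopathy’.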

Statistical methods

Receiver operating characteristic (ROC) curves were calculated to demonstrate the association between sensitivity and specificity for the detection of both ‘sight-threatening’ and ‘any retinopathy’ on both TFT and CRT displays. The ROC curves give a measure of the accuracy of the grading system. The accuracy is represented by the area under the curve (AUC). Perfect agreement with the gold standard would be indicated by AUC=1. The closer the curve converges towards the top left-hand corner, the more accurate the grading.
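One way to obtain an ROC curve and its AUC from ordinal grades is to sweep a cut-off over the observer's grade and integrate with the trapezoidal rule. The sketch below assumes this approach; the study does not state how its statistical software constructed the curves, and the function names are illustrative.

```python
def roc_points(truth, scores):
    """ROC points (false-positive rate, true-positive rate).

    truth: list of bools, True where the gold standard is positive.
    scores: list of ints, higher meaning a more severe observer grade.
    """
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative image")
    points = [(0.0, 0.0)]
    # Lowering the cut-off step by step moves the curve from (0, 0) to (1, 1).
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(truth, scores) if t and s >= cut)
        fp = sum(1 for t, s in zip(truth, scores) if not t and s >= cut)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

With grades that separate positives from negatives perfectly, the curve passes through the top left-hand corner and the area is 1. Grades that carry no information give the diagonal, with an area of 0.5.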

In considering the sensitivity and specificity for detection of the actual grade of retinopathy on both TFT and CRT displays, the Stuart–Maxwell test was used. This is an extension of McNemar's test, which is based on a 2 × 2 contingency table. The Stuart–Maxwell test is suitable for nonparametric, paired data where there are more than two ordered categories10,11 (ie grade of retinopathy).

The number of true and test positives and negatives was analysed in the form of a 4 × 4 table for each of the screen types. The null hypothesis of ‘no difference between the proportions within each group’ was tested. The null hypothesis was rejected at a P value of <0.05, indicating a difference between the proportions greater than would be expected by chance. Such a difference between the proportions would imply inaccurate grading, with more misclassifications, for that screen type.
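The Stuart–Maxwell statistic for a square table of paired grades can be sketched as below. This is an illustrative pure-Python implementation of the standard marginal homogeneity statistic, not the SAS or SPSS routine used in the study, and the function names are assumptions. The statistic is referred to a chi-squared distribution with k − 1 degrees of freedom (three for a 4 × 4 table) to obtain the P value.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def stuart_maxwell(table):
    """Return (chi-squared statistic, degrees of freedom).

    table[i][j] is the number of images with gold-standard grade i
    and observer grade j; the table must be square.
    """
    k = len(table)
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Differences between marginal totals; the last category is dropped
    # because the differences sum to zero.
    d = [row[i] - col[i] for i in range(k - 1)]
    s = [[row[i] + col[i] - 2 * table[i][i] if i == j
          else -(table[i][j] + table[j][i])
          for j in range(k - 1)]
         for i in range(k - 1)]
    x = solve_linear(s, d)
    chi2 = sum(di * xi for di, xi in zip(d, x))
    return chi2, k - 1
```

For a 2 × 2 table the statistic reduces to McNemar's, (b − c)²/(b + c), which gives a quick check on the implementation.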

Both the AUC data and the P-value derived from the Stuart–Maxwell test yield information only for the screen type in question and do not enable statistical inference about differences between the two screen types. Apparent differences may suggest trends in performance and are discussed as such. Statistical analysis was performed using SAS for Windows™ (v8) and SPSS for Windows™ (v11).

Results

The retinopathy grading results using both TFT and CRT displays are summarised in Table 1. In relation to the detection of sight-threatening retinopathy, sensitivity and specificity were slightly greater for CRT when compared to TFT with images from 0 to 80% compression. The ROC curves for ‘sight-threatening retinopathy’ (Figure 1) demonstrate that the AUC is greater for the CRT at all compression levels, implying greater grading accuracy.

Table 1 Sensitivity and specificity data for both TFT and CRT displays
Figure 1 ROC curves for ‘sight-threatening retinopathy’. The broken lines represent ROC curves for each level of image compression. ‘Comp’=level of image compression.

For the detection of any retinopathy, greater specificity was obtained using the TFT with 0.94 at 80% compression, compared to 0.81 at 80% for the CRT. This is reflected in the ROC curves for ‘any retinopathy’ (Figure 2). The AUC for the TFT at 80% compression is 0.935 compared to 0.869 for the same images viewed on CRT. However, with uncompressed images and images compressed to 90%, sensitivity was slightly greater on the CRT. At other compression levels, the sensitivities were the same.

Figure 2 ROC curves for ‘any retinopathy’.

In grading the specific level of retinopathy (‘grade of retinopathy’), the sensitivity using both displays is reduced, but remains slightly greater on the CRT even with highly compressed images (Table 1). For the images at 80% compression, sensitivity was 0.73 on the CRT and 0.63 on the TFT. Grading of images at compression levels of 70% or more yielded greater specificity on the TFT, with a specificity of 0.88 at 80% compression on the TFT and 0.76 for the same level of compression on the CRT.

The results of the Stuart–Maxwell test to investigate any association between the proportions are displayed in Table 1. It can be seen that for the CRT, the P-value remains >0.05 for all but the 90% images.

For the TFT, P-values for all images are lower than those for their CRT counterparts, suggesting greater grading inconsistency with the TFT.

When uncompressed images were magnified to × 4 their original size, greater sensitivity and specificity were achieved using the TFT rather than the CRT (Table 1). This is also reflected in the P-values for the × 4 data.

Discussion

In this study, we aimed to investigate the effect of video display properties on the grading of diabetic retinopathy. For screening purposes, both display types met the sensitivity (0.80) and specificity (0.95) requirements of the Diabetes UK guidelines for the detection of retinopathy.6 For the 90% compressed images viewed on both TFT and CRT, the Diabetes UK requirements for detection of sight-threatening retinopathy would not be met. For such highly compressed images, the image quality was judged to be very poor and unsuitable for any grading purposes.

Sensitivity in detecting both ‘any retinopathy’ and ‘sight-threatening’ retinopathy was comparable for both TFT and CRT displays. There was a trend towards slightly greater sensitivity using the CRT. When the specific grade of retinopathy was assessed, sensitivity was greater using the CRT for all but the magnified images. Applying the Stuart–Maxwell test, lower P-values for the TFT grading data would suggest that the grading using CRT is more consistent when considering the actual grade of retinopathy in question. Magnification of the images to × 4 their original size resulted in a trend towards better grading performance on the TFT rather than the CRT.

When considering these apparent differences in performance, there are a number of technical attributes related to screen design, which may help to explain these trends.

The CRT display typically has a finer (smaller) ‘dot pitch’ (0.26 vs 0.28 mm for the TFT), hence greater image resolution. This may enable greater accuracy of grading, by allowing fine image details to be resolved.
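As a rough illustration, a dot pitch in millimetres can be converted into dots per inch: the smaller the pitch, the more dots fit into a given length. The sketch assumes that the quoted pitches can be compared directly between the two display technologies, which is a simplification, and the function name is illustrative.

```python
MM_PER_INCH = 25.4


def dots_per_inch(dot_pitch_mm):
    """Dots per inch for a given dot pitch in millimetres."""
    return MM_PER_INCH / dot_pitch_mm


crt_dpi = dots_per_inch(0.26)  # CRT used in this study
tft_dpi = dots_per_inch(0.28)  # TFT used in this study
```

With the pitches quoted in the Methods, this gives roughly 98 dots per inch for the CRT and roughly 91 for the TFT.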

TFT displays are prone to artefacts such as ghosting and latent images,12 which can affect image quality. CRT displays, however, are prone to ‘veiling glare’ from the glass front, which can also reduce image quality.13 TFT displays tend to have a higher luminance (typically about 178 vs 118 cd/m2 for CRT),12 which may result in lower contrast-detection thresholds.14 Indeed, this may explain why the TFT display performed slightly better when images were magnified to × 4 their original size. It may also explain why the TFT display tended to have a higher specificity for all except the images of sight-threatening retinopathy. Specificity is a measure of the ability to reliably identify truly negative images, in other words images with either no retinopathy or retinopathy below the threshold in question.

Constable et al,15 looking at remote transfer of both compressed and uncompressed diabetic fundal images obtained with a portable fundus camera and viewed on a TFT screen, found that although transmission times were reduced by compression, even uncompressed images were inadequate for the diagnosis of diabetic retinopathy. Another group reported that real-time fundal images obtained using a direct ophthalmoscope attached to a digital camera and transmitted to a remote site also proved to be inadequate for screening for retinopathy.16 In the latter study, however, it is not clear what type of monitor was used. Owing to the remote nature of the screening in both these studies, portable imaging systems were used. Better results may have been obtained using more sophisticated retinal cameras.

Digital images can be enhanced, and manipulation of digital images with dedicated software has been used to increase grading performance.9 We did not investigate such manipulation of our images, which might have improved grading performance further. There is as yet little data on the clinical effect of software enhancement of compressed images.

In our study, we used preselected, high-quality images of diabetic retinopathy which were scanned from 35 mm transparencies; and although agreement between high-quality scanned images and originals is high,9 clearly, in the screening environment most digital diabetic images would be obtained directly by a digital camera. This could have an additional effect on image quality and ultimate grading performance. In considering the grade of retinopathy, we did not investigate maculopathy in our study and further studies into the effect of screen type on detection of maculopathy would be needed in order to apply these findings more precisely to the clinical setting. In addition, the selection of specific features, such as new vessels, in the study images means that they would not be truly representative of the diabetic population. Therefore, larger studies ‘in the field’ are needed, whereby images from the screening environment are compressed using a variety of different compression techniques and graded remotely on different display devices.

In conclusion, grading of diabetic retinopathy yielded comparable sensitivities for both display types with nonmagnified images. Specificity tended to be slightly higher for the TFT than for the CRT. Images seemed to be less affected by compression when viewed on the CRT, and there was a trend towards more accurate grading as to the specific level of retinopathy using this display.

When images were magnified, performance was slightly greater for the TFT in all categories, and this factor together with the greater specificity when using TFT may be a feature of the greater display luminance characteristic of this type of monitor.

Both display types were adequate for screening for retinopathy at compression levels equal to or less than 80% of the original image file size. Any differences in performance between the two displays would appear to be small. There may be some circumstances where one display type slightly outperforms the other, but from our study this would appear not to be of any clinical significance. Both display types would appear to offer similar image quality. When acquiring a digital imaging system for diabetic retinopathy screening, factors such as the cost, size and portability of the viewing system are probably the main determinants in choosing the type of monitor.