Introduction

Photographic screening in patients with diabetes has reduced the incidence of blindness in England.1, 2 The rising prevalence of diabetes and increased options for the treatment of diabetic retinopathy (DR) are causing significant problems with providing capacity for managing patients referred from diabetic eye screening.3, 4 The quality of clinical assessment after entering the hospital system may also be variable and is difficult to audit. Screening for DR moved from clinical assessment to photography, so could the hospital service do the same?

The prevalence of referable grade DR in the screened population is between 6 and 20%,5, 6, 7 although only about 10% of referred patients are treated. Treatment options have changed to include intravitreal injections, which require more frequent visits and monitoring than laser. The total number of people with diabetes globally is projected to rise from 171 million in 2000 to 366 million in 2030.8 In 2011–2012, 2.59 million people in England aged 12 years and over were identified with diabetes, and 2.36 million were offered screening, of whom 1.91 million received screening (http://diabeticeye.screening.nhs.uk/statistics (last accessed 30 May 2014)). This calls for efficient ways of managing diabetic eye clinics and of effectively triaging referred patients to separate those who actually need treatment from those requiring closer review. Clinicians currently decide the DR grade by slit lamp examination and it is assumed that they are right. One study found that there was only moderate agreement between the retinal findings described by clinicians using biomicroscopy and the results from graders analysing screening photographs. When the DR screening photographs were reanalysed, most of the errors were found to be by the clinician.9

Seven-field colour stereo photography has been set as a gold standard for the detection of DR but this is technically difficult, time consuming, and taxing for both patients and operators.10 In one study, which compares two-field to seven-field images, 31.6% of the seven-field stereo photosets were ungradeable by strict quality criteria and 15.3% by less strict criteria.10 Slit lamp biomicroscopy by an experienced ophthalmologist compares favourably with seven-field stereophotography. Two other imaging techniques may allow an equivalent or improved assessment of the retina: optical coherence tomography (OCT) and wide-field colour imaging. The Optos (Optos PLC, Dunfermline, Scotland, UK) system uses an ellipsoid mirror to produce images (called Optomap) with ∼200 internal degrees of view, providing an image of over 80% of the retina in a single image. OCT allows objective evaluation of diabetic maculopathy. OCT performs well compared with fundus stereo photography and outperforms biomicroscopy.11 It is now being used in conjunction with some screening programmes to improve diabetic maculopathy assessment and follow-up.12

We therefore decided to determine the method of assessment of DR that detects the maximum amount of diabetic pathology and to compare agreement between different methods of assessing DR grades using clinical examination, clinical examination with additional access to Optomap wide-field images and OCT, or assessment of the retinal images alone.

Methods

This was a multisite, prospective, clinic-based study conducted to evaluate the agreement in assessing severity of DR at the grade level between clinical retinal examination through dilated pupils, the 200-degree Optomap wide-field images with OCT scans, and a combination of clinical examination and imaging. Ethical approval was obtained from the Northern Regional Ethics Committee and the NHS Trust Research departments at three sites: Royal Victoria Infirmary, Newcastle upon Tyne, Sunderland Eye Infirmary, and Frimley Park Hospital. The patients were recruited from those referred by the DR Screening Service.

After informed consent, visual acuity recording and mydriasis, OCT scans and wide-field Optomap images were taken and an ophthalmologist performed a clinical examination at the slit lamp using 90 and 60 D lenses. Individual retinal lesions were recorded on a proforma leading to a retinopathy grade, which was then filed away by the research nurse. The same ophthalmologist then viewed the imaging in the clinic, filled another copy of the proforma, and gave a grade, combining imaging with clinical findings. The examinations were performed by 10 trained ophthalmologists employed in the diabetic eye clinics in three hospitals at Senior Registrar, Fellow and Consultant level, and so should reflect clinical practice rather than the expertise of an individual. The images were independently graded at a reading centre whose quality assurance processes would ensure good reliability. Special attention was given to confirm definite appearance of new vessels rather than haemorrhage or intraretinal microvascular abnormalities. In all the cases where there was a disagreement between the reading centre findings and the clinician findings, we looked at the images to double check whether we could see new vessels or not (SJT, VM). In doubtful cases, the lesion was not classified as new vessels.

Imaging was performed by certified medical photographers using:

  1) Optomap P2000: Scanning laser ophthalmoscope, with field up to 200°. Eye steering was used to obtain three images of each eye: centre, looking up, and looking down. Images were reviewed using proprietary image review software (Optos V2 Vantage Dx Review version 2.5.0.135; Optos PLC). Grading for each wide-field image involved viewing the colour composite, green-wavelength, and red-wavelength images using all the available adjustments.

  2) OCT: Spectral domain OCT was carried out. In this study, macular oedema was defined as OCT retinal thickness of 300 μm or greater, with associated focal changes such as cysts, within one disc diameter of the centre of the fovea (ie in any of the five central ETDRS map regions). The 3D acquisition mode was used with the camera centred on the macular fixation point. A 6 × 6 mm scan area with 128 lines of 512 A-scans per line was acquired.
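The macular oedema definition above can be read as a simple rule applied to each eye. The following is a minimal sketch of that rule as written in the Methods, not the grading software used in the study; the `OctFinding` record and its field names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Threshold stated in the text: retinal thickness of 300 micrometres or greater.
THICKNESS_THRESHOLD_UM = 300.0


@dataclass
class OctFinding:
    """Hypothetical per-eye summary of an OCT scan (field names are assumptions)."""
    thickness_um: float            # retinal thickness in micrometres
    has_focal_changes: bool        # associated focal changes such as cysts
    in_central_etdrs_region: bool  # within one disc diameter of the foveal centre


def is_macular_oedema(finding: OctFinding) -> bool:
    """Apply the definition: thickness >= 300 um, with focal changes,
    located in one of the five central ETDRS map regions."""
    return (
        finding.thickness_um >= THICKNESS_THRESHOLD_UM
        and finding.has_focal_changes
        and finding.in_central_etdrs_region
    )
```

All three conditions must hold, so an eye with cysts but a thickness below the threshold is not classed as macular oedema under this literal reading.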

Statistical analysis

The NSC (National Screening Committee) levels of DR severity assigned by clinical grading, wide-field imaging, and the combined grade were compared by cross-tabulation, and κ-values were calculated. Eyes classified as ungradable were excluded from the analysis. Guidelines for interpretation were based on Landis and Koch (ETDRS report 10: 0.0–0.2=slight agreement; 0.21–0.40=fair agreement; 0.41–0.60=moderate agreement; 0.61–0.80=substantial agreement; and 0.81–1.00=almost perfect agreement). Additionally, sensitivity, specificity, and positive and negative likelihood ratios were calculated after treating DR severity level as a binary variable with clinically important thresholds:

  1) R1 vs R2+R3, that is, ‘dischargeable’ patients vs those with sight-threatening DR.

  2) R1+R2 vs R3, that is, non-proliferative vs proliferative DR.

  3) Clinically significant macular oedema (CSMO) vs no diabetic maculopathy+non-CSMO maculopathy.

  4) No diabetic maculopathy vs any maculopathy.

R1 indicates background DR; R2, preproliferative DR; and R3, proliferative DR.
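The binary comparisons listed above can be sketched as follows. This is an illustrative Python outline rather than the SPSS procedure used in the study. The grade labels, the function name `diagnostic_metrics`, and the toy data are assumptions, and any grade outside the positive set (including eyes with no DR) is treated as negative.

```python
from typing import Dict, Sequence, Set

# Positive classes for the two retinopathy thresholds (label strings are assumptions).
STDR: Set[str] = {"R2", "R3"}  # sight-threatening DR
PDR: Set[str] = {"R3"}         # proliferative DR


def diagnostic_metrics(reference: Sequence[str], test: Sequence[str],
                       positive: Set[str]) -> Dict[str, float]:
    """Per-eye sensitivity, specificity and likelihood ratios of `test`
    against `reference` after collapsing grades to a binary variable."""
    if len(reference) != len(test):
        raise ValueError("reference and test must hold one grade per eye")
    tp = fp = fn = tn = 0
    for ref_grade, test_grade in zip(reference, test):
        ref_pos = ref_grade in positive
        test_pos = test_grade in positive
        if ref_pos and test_pos:
            tp += 1
        elif ref_pos:
            fn += 1
        elif test_pos:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference eye")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Likelihood ratios are undefined at the extremes; report infinity there.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {"sensitivity": sensitivity, "specificity": specificity,
            "lr_pos": lr_pos, "lr_neg": lr_neg}


# Toy grades for six eyes, invented for illustration only (not study data).
reference = ["R1", "R2", "R3", "R1", "R3", "R2"]
imaging = ["R1", "R3", "R3", "R2", "R2", "R1"]
stdr_result = diagnostic_metrics(reference, imaging, STDR)
pdr_result = diagnostic_metrics(reference, imaging, PDR)
```

The same function applies to the maculopathy thresholds if the positive set is replaced by the corresponding maculopathy grades.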

Such comparisons are useful if one technique is similar to another; however, if not, the issue is to determine the more sensitive technique. Therefore, we measured which technique detected the maximum number of eyes with proliferative DR (R3) and clinically significant macular oedema (CSMO). We also assessed the frequency with which the clinician changed their grade when they examined the images. Analyses were performed using SPSS version 19 (SPSS, Chicago, IL, USA). All analyses were carried out per eye and not per patient.
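The κ-values and their interpretation described in this section can be illustrated with a short sketch. It computes unweighted Cohen's κ from a square cross-tabulation and maps the value to the bands quoted in the text. It is written under the assumption that unweighted κ was used (the text does not say whether a weighted form was applied), the function names and toy table are hypothetical, and the 'poor' label for negative values comes from Landis and Koch rather than from the list above.

```python
from typing import List


def cohens_kappa(table: List[List[int]]) -> float:
    """Unweighted Cohen's kappa for a square cross-tabulation
    (rows: grades by method A, columns: grades by method B)."""
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("cross-tabulation must be square")
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("cross-tabulation is empty")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = sum(table[i][i] for i in range(k)) / n
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def landis_koch(kappa: float) -> str:
    """Interpretation bands as quoted in the text."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Toy 2 x 2 table, invented for illustration only (not study data).
toy_table = [[20, 5],
             [10, 15]]
toy_kappa = cohens_kappa(toy_table)
```

Here the observed agreement is 0.70 and the chance agreement is 0.50, so κ is 0.40.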

Results

Overall data

One thousand and twenty-three consecutive patients (2046 eyes) were recruited from three centres. Of these, 910 patients (1820 eyes) were referrals from screening and 113 eyes were from follow-up clinics. Previous laser treatment (18 PRP, 67 macular) was present in 85 eyes. The screening grade was available for 1580 eyes. There was technical failure in obtaining wide-field images in three patients. Wherever data were missing, the affected eyes were removed from that particular comparison. The images were deemed ungradable in 23 eyes, owing to either inadequate quality or media opacity.

The distribution of retinopathy severity is shown in Table 1a. All patients had DR requiring referral in at least one eye; hence, the cohort includes some fellow eyes with no DR.

Table 1a Frequency of DR grades (eyes)

Table 1b shows the cross-tabulation of screening to the clinical, combined, and imaging grades. κ-Statistic for agreement did not include ungradable eyes or missing data.

Table 1b Cross-tabulation of grades between screening, clinical examination, and imaging (1580 eyes)

Clinician vs imaging-only comparisons

Comparison between clinical examination and imaging (Table 2a) shows exact agreement of levels in 59.4% of eyes. The clinical examination was taken as the reference standard. There was fair agreement (κ=0.386) across all four levels of DR. Imaging alone detected new vessels in 73 eyes in which they were not seen by the clinician (3+22+48), and the clinician alone detected new vessels in 39 eyes in which they were not detected on imaging (2+8+29). Overall, imaging diagnosed more PDR and severe NPDR than did clinical examination. There were two eyes given a clinical grade of proliferative DR but an imaging grade of no retinopathy. These were rechecked and found to have collateral vessels. Therefore, 20 eyes were said to have new vessels on clinical examination, which were not seen on imaging. This either means that imaging missed them or demonstrates that clinical decisions can be difficult when deciding between intraretinal microvascular abnormalities (IRMA), haemorrhage, and new vessels, as suggested by the 17 cases in which the clinician changed their mind on reviewing the images. Also, on two-dimensional images, small flat new vessels can be difficult to distinguish from IRMA, and the graders could therefore have undergraded these images.

Table 2a Clinician grade × imaging-only grade cross-tabulation (κ=0.386)

The DR severity was changed to a binary variable to compare treatable (ie proliferative) DR with DR not needing treatment. The grades were also transformed to differentiate eyes with sight-threatening DR (STDR) from those with non-STDR that could be discharged to annual screening. Table 2b shows the sensitivity, specificity, agreement, and likelihood ratios comparing clinical examination to wide-field imaging at these cutoff levels. Wide-field imaging had a sensitivity of 73% and specificity of 96% for detecting proliferative DR. The sensitivity was 84% and specificity was 69% in differentiating STDR from non-STDR.

Table 2b Comparison at different thresholds between clinical and imaging grades

For maculopathy, the comparisons were performed at two cutoff levels, that is, the presence or absence of any maculopathy and, separately, the presence or absence of CSMO (the ETDRS definitions were used).

The clinical grading was taken as standard, although we found that including OCT makes imaging sensitive to the presence of macular oedema. The overall agreement over the three different grades of maculopathy (ie absent, exudates only, and CSMO) was 0.380, but imaging was 64% sensitive and 90% specific in detecting CSMO (Table 2b).

Clinical vs combined grade comparisons

The clinical examination was compared with the combined grade assigned by the clinician in conjunction with imaging (Table 3a). Exact agreement was seen in 80.7%. There was substantial agreement (κ=0.695) across all four levels of retinopathy. The combination grade detected 22 more eyes with new vessels. Interestingly, after viewing the wide-field images, the clinician altered the grade in 17 eyes that they had initially graded as R3. The variation in clinical judgment is also highlighted by the 29 mild non-proliferative cases that were changed to no retinopathy after including imaging.

Table 3a Clinician grade × combined grade cross-tabulation (κ=0.695)

When the combined grade was taken as the reference standard (Table 3b), we found the clinical examination to be 85% sensitive and 99% specific for proliferative retinopathy. Clinical examination was 75% sensitive and 97% specific for deciding on STDR vs patients with non-sight-threatening retinopathy that could be discharged to screening.

Table 3b Comparison at different thresholds between clinical and combined grades

Table 3b gives the comparisons between clinical examination alone and combined grades. Adding OCT to the assessment of maculopathy improves the sensitivity and specificity of detecting CSMO as well as any maculopathy (ie exudates only).

Combination vs imaging-alone grade comparisons

When compared with the combination grade, imaging showed exact agreement in 64.6% of eyes (Table 4a). There was moderate overall agreement on the four different levels of retinopathy (κ=0.459). The imaging grade picked up 60 more eyes with new vessels. This could be because of the difficulty in deciding between treated (and fibrosed) new vessels and active vessels. Eighteen patients had previously had PRP laser, and after removing these from analysis, there were 35 more eyes found to have new vessels with imaging only and 14 more with the combined grade, compared with clinical examination alone.

Table 4a Imaging-only grade × combined grade cross-tabulation (κ=0.459, moderate agreement)

The combined grade was used as the reference standard to compare with imaging only in Table 4b. Imaging was 79% sensitive and 97% specific in detecting proliferative DR. The decision on STDR vs non-sight-threatening DR was 81% sensitive and 75% specific. The specificity remains very high with imaging-alone grades regardless of whether clinical examination is included in the assessment.

Table 4b Comparison at different thresholds between imaging and combined grades

A comparison between screening and imaging was not carried out for maculopathy as the screening grades do not use OCT for referral. From Tables 2b, 3b, and 4b we see that imaging alone detects the maximum number of eyes with CSMO and the likelihood ratios are most favourable with a combination of wide-field imaging with OCT.

Discussion

The success of retinal photography in screening for treatable DR1, 2, 5 led to the establishment of a UK National Screening Programme. Subsequently, a decrease in the rate of blindness due to diabetes in the working age population has been demonstrated.2, 13 However, the workload generated for Ophthalmology departments has been considerable and this continues to increase in step with the increasing prevalence of diabetes. A more efficient means of triaging those requiring active treatment is required.

This study was designed to evaluate whether wide-field imaging and OCT could be used to improve the management of patients in diabetic eye clinics in the Hospital Eye Service (HES). The study population therefore comprised patients who were already known to have some form of retinopathy and had been referred as such. Previously wide-field imaging has been studied as a screening modality with a sensitivity of 94% and a specificity of 100% for detection of retinopathy.14 That study was performed in a setting similar to ours, that is, in diabetic eye clinics where higher levels of DR were expected. In another study, the patient population was a mixture from the general population in a screening service as well as from DR clinics. They reported a sensitivity of 83% and specificity of 89% for detecting referable DR.15 Our study had a sensitivity of 84% and a specificity of 69% for STDR, with the clinical findings as the reference standard. For detecting proliferative DR, the sensitivity was 73% and specificity rose to 96%. Our study has a negative bias as the study population are patients known to have DR and already within the HES.

Previous studies have shown that Optos colour imaging compares well with clinical examination and ETDRS colours.15, 16 However, these may have involved especially expert examiners, and the number of patients involved was relatively small, so the number of patients with new vessels was small. Our study was powered with enough patients to show a difference in the ability of the techniques at detecting pathology. Having 10 different ophthalmologists from three centres was less likely to bias the study than having only one or two who were either unusually good or unusually bad. We standardised the examinations in that the examiners had a specific proforma to record the findings with the different ETDRS findings and associated grades specified. The patients were also seen in research clinics, so the examiners had longer to examine the patient than may be possible in standard clinical practice.

The ophthalmologists involved were not trained as graders but were used to looking at Optomap images in their clinical practice. More pathology was found by the reading centre examining the images than by the ophthalmologists involved in the study. This suggests that imaging alone could be used as a way of assessing patients with DR and could be used to quality control examinations being performed. Better training of the ophthalmologists examining the patients and the images in this study might have meant that less pathology was missed.

The British Association of Retinal Screeners recommends a sensitivity of 80% and specificity of at least 95% for referable retinopathy.7 In our study, the cutoff points used were the presence of treatable retinopathy and the ability to distinguish between those who need some follow-up (ie sight-threatening) and those who could be discharged from the HES. More severe retinopathy, proliferative retinopathy, and macular oedema were found with imaging alone.

The English National Screening Committee recommends that any camera should have about 20 pixels per degree in both axes.17 Dividing the Optomap sensors’ pixel count by the external angles gives a resolution of about 14 pixels/degree in the horizontal axis and 20 pixels/degree in the vertical axis. However, this does not consider the differential distribution of pixels, which are densest in the centre for the Optomap. In this study, we used mydriasis and steering up and down, thus three pictures per eye, to increase the resolution and enable better grading.

With regards to diabetic maculopathy and clinically significant macular oedema, OCT is in routine use in several centres. However, the decision to treat with laser continues to be based on the ETDRS criteria.18 The ETDRS used contact lens biomicroscopy and this was deemed to agree very closely with stereoscopic photography.19 Currently, most units use non-contact biomicroscopy to evaluate macular oedema in the clinic, but OCT allows both objective and quantitative assessment. Subclinical foveal oedema is being recognised as an entity, especially if a lower threshold of retinal thickness is used. This study used the central thickness of 300 μm as the threshold. This number gave the highest specificity in previous studies but varies with different OCT machines.20 There were eyes with cysts and thickening on OCT, albeit the thickness remained below 300 μm. These were graded as CSMO for imaging only, but additional visual acuity data was used to guide grading and treatment decisions in the clinical grades. A review of the use of OCT in diabetic macular oedema reported sensitivities ranging from 0.67 to 1.00 and specificities from 0.77 to 0.96.11 The higher values were from studies using healthy controls,21 which would always increase performance indices for a diagnostic test. Our study gave a sensitivity of 64% and specificity of 90%, which increased to 94% when the combination of examination and imaging was used. The pooled likelihood ratios were 6.5 (positive) and 0.24 (negative) in the meta-analysis by Virgili et al.11

For CSMO, our study showed likelihood ratios of 6.5 (positive) and 0.39 (negative) for imaging only. A likelihood ratio of >5 (positive) and <0.2 (negative) is believed to be convincing evidence.11 For any maculopathy, the grades were less convincing (Tables 2b, 3b, and 4b). This is because there was a reliance on the presence of exudates, haemorrhages, and visual acuity, all of which made the grades very variable. The sensitivities and specificities were around 80%, which serves to underline the utility of the two imaging modalities.
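As a rough check, likelihood ratios follow directly from sensitivity and specificity by their standard definitions. The worked calculation below uses the rounded values reported earlier for imaging-only detection of CSMO (sensitivity 64%, specificity 90%). It is an illustration and not the authors' own computation.

```latex
\begin{align*}
LR^{+} &= \frac{\text{sensitivity}}{1-\text{specificity}} = \frac{0.64}{1-0.90} = 6.4,\\
LR^{-} &= \frac{1-\text{sensitivity}}{\text{specificity}} = \frac{1-0.64}{0.90} = 0.40.
\end{align*}
```

These values are close to the reported 6.5 and 0.39. The small differences are presumably due to rounding of the published sensitivity and specificity.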

In the paper by Sallam et al,9 agreement between screening grades and clinical assessment gave a κ-value of 0.4, with most of the disagreement being due to the clinician failing to detect small amounts of exudates, which were apparent on the images. However, small amounts of exudate may not relate to thickening, which is objectively measured on OCT. There was also a tendency for screening to overgrade and doctors to undergrade. It is also possible that in our study there was overcall by image grading, particularly for R2. The differentiation of small IRMA from haemorrhage and from small NVE can be difficult both on images and on clinical examination. To further enhance diabetic retinal assessment, fundus fluorescein angiography may be required. Compared with many previous papers, our study was large enough to assess the detection of new vessels; Sallam et al9 reported only nine cases.

In summary, more sight-threatening DR was found on assessing images alone than by clinical examination or by combining clinical examination with assessment of images in the clinical setting. As with clinical assessments, appropriate training and quality control are required to ensure standards of assessment. We believe that assessing wide-field imaging along with OCT is the best way to diagnose treatable DR. It could be used to assess referrals from DR screening to determine further management, to enhance the quality of assessment of DR in clinics when used in combination with examination, to audit the quality of clinical assessments being made in a clinic, and to follow up patients who have too much retinopathy to return to screening but do not actually need treatment.