Introduction

Spectral domain optical coherence tomography (SDOCT) and scanning laser polarimetry (SLP) are currently two commonly used imaging techniques for retinal nerve fiber layer (RNFL) evaluation in glaucoma. SDOCT is a recent technique that enables imaging of the ocular structures with higher resolution and a faster scan rate compared with the previous version of this technology (Stratus OCT; Carl Zeiss Meditec, Inc., Dublin, CA, USA).1, 2 GDx (Carl Zeiss Meditec, Inc.), a commonly used SLP device that measures the RNFL birefringence in vivo, is based on the principle that polarized light passing through the birefringent RNFL undergoes a measurable phase shift, known as retardation, which is linearly related to the RNFL tissue thickness.3 The recently introduced SLP protocol, called the enhanced corneal compensation (ECC), optimizes imaging by improving the signal-to-noise ratio compared with the previous version (GDx Variable Corneal Compensation). The ECC protocol introduces a predetermined birefringence bias to shift the measurement of the total retardation to a higher value region to remove noise and reduce atypical patterns. The amount of birefringence bias is then mathematically removed point by point from the total birefringence pattern of the image to improve the signal and obtain a retardation pattern of the RNFL with the least noise.4, 5, 6

Though numerous studies have reported good diagnostic ability of both SDOCT7, 8, 9, 10, 11 and GDx ECC12, 13 in glaucoma, there is limited literature on the head-to-head comparison of these imaging techniques in the same population.14, 15 Also, most of these studies have employed a case–control design including glaucoma patients (cases), defined based on the presence of repeatable characteristic glaucomatous visual field (VF) defects; and normal subjects (controls), usually recruited from the general population and having normal intra-ocular pressures (IOPs), healthy appearance of the optic nerve and normal VFs. However, in clinical practice, a diagnostic test is used to rule in or rule out disease in subjects suspected of having disease. We have previously demonstrated how the selection of a control group without any suspicious findings of disease affected the diagnostic accuracy of SDOCT in glaucoma.16

The purpose of the current study was to compare the abilities of RNFL parameters of SDOCT and GDx ECC in detecting glaucoma. The control group against which the glaucoma cohort was evaluated consisted of subjects referred to our institute from general ophthalmologists as glaucoma suspects based on the optic disc appearance. These subjects, however, were later judged as normals with physiological optic disc variations by glaucoma experts on masked evaluation of optic disc photographs (detailed below).

Materials and methods

This was an observational, cross-sectional study of consecutive subjects referred by general ophthalmologists to a tertiary eye care facility between September 2010 and November 2012 for a glaucoma evaluation. Informed consent was obtained from all participants and the Ethics Committee of L V Prasad Eye Institute approved the methodology. All methods adhered to the tenets of the Declaration of Helsinki for research involving human subjects.

Inclusion criteria were age ≥18 years, best corrected visual acuity of 20/40 or better, and refractive error within ±5.0 D sphere and ±3 D cylinder. Exclusion criteria were the presence of any media opacities that prevented good imaging and any retinal (including macular) or neurological diseases other than glaucoma which could confound the results of VF examination and RNFL measurements with SDOCT or SLP. All participants underwent a comprehensive ocular examination that included a detailed medical history, best corrected visual acuity measurement, slit-lamp biomicroscopy, Goldmann applanation tonometry, gonioscopy, dilated fundus examination, digital optic disc photography, standard automated perimetry (SAP), and RNFL imaging with SDOCT and SLP.

SAP was performed using a Humphrey Field analyzer, model 750 (Zeiss Humphrey Systems, Dublin, CA, USA), with the Swedish interactive threshold algorithm (SITA) standard 24-2 program. VFs with fixation losses, false positive, and false negative response rates of <20% were considered reliable. VFs were considered glaucomatous if the pattern standard deviation (SD) had a P-value of <5% and the glaucoma hemifield test result was outside normal limits.17
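The two VF rules stated above (reliability and glaucomatous classification) can be written as a short Python sketch. The class and field names are hypothetical and chosen only for illustration; rates and P-values are assumed to be expressed as fractions between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class VisualField:
    # All names are illustrative; rates and P-values are fractions (0.20 = 20%).
    fixation_loss_rate: float
    false_positive_rate: float
    false_negative_rate: float
    psd_p_value: float               # P-value attached to the pattern SD
    ght_outside_normal_limits: bool  # glaucoma hemifield test result


def is_reliable(vf: VisualField, limit: float = 0.20) -> bool:
    """Reliable if all three reliability indices are below 20%."""
    rates = (vf.fixation_loss_rate, vf.false_positive_rate, vf.false_negative_rate)
    return all(rate < limit for rate in rates)


def is_glaucomatous(vf: VisualField) -> bool:
    """Glaucomatous if pattern SD has P < 5% AND the GHT is outside normal limits."""
    return vf.psd_p_value < 0.05 and vf.ght_outside_normal_limits
```

Both conditions in `is_glaucomatous` must hold together, so a field with an abnormal pattern SD but a normal glaucoma hemifield test would not be labeled glaucomatous under this rule.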

Digital optic disc photographs were obtained by trained technicians (Visupac 4.2.2; Carl Zeiss Meditec Systems GmbH, Pirmasens, Germany). Photographs consisted of a 50 degree image centered on the optic disc, a similar image centered on the macula, a 30 degree image centered on the optic disc, and a 20 degree image centered on the disc. All these images also consisted of one colored and one red-free image each. The photographs were evaluated by two experts independently who were masked to the clinical examination results of the subjects and also the results of VF and imaging examinations. Experts classified the optic disc photographs into glaucomatous and non-glaucomatous based on the presence of focal or diffuse neuroretinal rim thinning, localized notching or nerve fiber layer defects. Discrepancies between the two experts were resolved by consensus.

SDOCT examination was performed with the RTVue (software version 5.1.0.90; Optovue Inc, Fremont, CA, USA). RTVue uses a scanning laser diode with a wavelength of 840±10 nm to provide images of ocular microstructures. The protocol used for RNFL imaging with RTVue in this study was the ONH (optic nerve head) scan. This protocol has been explained earlier.16, 18 The RNFL parameters generated by the ONH protocol and used in the study were temporal average (temporal 90 degree), superior average (superior 90 degree), nasal average (nasal 90 degree), inferior average (inferior 90 degree), and global average (over 360 degree). The RNFL parameters are also compared with the internal normative database of 1081 subjects of different ethnicities within the software and one of the three diagnostic categorizations is provided. An ‘outside normal limits’ categorization indicates that the value is less than the lower 99% confidence interval (CI) of the healthy, age-matched population. A ‘borderline’ result indicates that the value is between the 95% and 99% CI, and a ‘within normal limits’ result indicates that the value is within the 95% CI. Only well-centered images with a signal strength index (SSI) of ≥30 were used for analysis. Eyes in which the segmentation algorithm failed were excluded.

SLP examination was done using GDxPRO (version 1.1.1; Carl Zeiss Meditec, Inc.). GDxPRO uses a scanning laser diode of wavelength 785 nm to obtain birefringence measurements. The general principles of SLP and the algorithm used for ECC have been described in detail previously.3, 5, 12 The parameters generated on the printout and used in this study were temporal average (temporal 50 degree), superior average (superior 120 degree), nasal average (nasal 70 degree), inferior average (inferior 120 degree), TSNIT (temporal superior nasal inferior and temporal) average (over 360 degree), TSNIT SD, and nerve fiber indicator (NFI). TSNIT SD represents the SD of the values contained in the calculation circle. The higher the value, the greater the modulation of the double-hump pattern. The NFI is a support vector machine score based on several RNFL measures and assigns a number from 0 to 100 to each eye. The higher the NFI, the greater the likelihood of having glaucoma. An NFI value of <30 is considered as ‘within normal limits’, 30–50 is considered as ‘borderline’ and >50 is considered as ‘outside normal limits’. The superior, inferior, and TSNIT average and TSNIT SD are also compared with the internal normative database of 251 subjects within the software and one of the three diagnostic categorizations as with SDOCT is provided. Only well-focused, centered and illuminated images with a quality score of ≥7, a typical scan score (TSS) of >80, and a residual anterior segment retardation of ≤4 were included for analysis.
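The NFI cut-offs and the image acceptance criteria described above can be expressed as a minimal Python sketch. The function and argument names are hypothetical; only the thresholds come from the text, and the unit of residual anterior segment retardation is left as given there.

```python
def classify_nfi(nfi: float) -> str:
    """Map a nerve fiber indicator (0 to 100) to the three categories in the text."""
    if not 0 <= nfi <= 100:
        raise ValueError("NFI is defined on the range 0 to 100")
    if nfi < 30:
        return "within normal limits"
    if nfi <= 50:
        return "borderline"
    return "outside normal limits"


def gdx_scan_acceptable(quality_score: float, tss: float, residual_asr: float) -> bool:
    """Acceptance rule: quality score >= 7, typical scan score > 80, residual ASR <= 4."""
    return quality_score >= 7 and tss > 80 and residual_asr <= 4
```

Under these cut-offs, the boundary values 30 and 50 both fall in the borderline category.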

Participants had all examinations as well as RNFL imaging with SDOCT and SLP performed on the same day.

Statistical analysis

Descriptive statistics included mean and SD for normally distributed variables and median and inter-quartile range (IQR) for non-normally distributed variables. Receiver operating characteristic (ROC) curves were used to describe the ability of RNFL parameters of both SDOCT and GDx ECC to discriminate glaucomatous eyes from control eyes. Sensitivities at fixed specificities of 80 and 95% were determined for all the parameters. To obtain CIs for area under the ROC curves (AUCs) and sensitivities, a bootstrap re-sampling procedure was used (n=1000 re-samples). As measurements from both eyes of the same subject are likely to be correlated, the standard statistical methods for parameter estimation lead to underestimation of standard errors and to CIs that are too narrow.19 Therefore, the cluster of data for each study subject was considered as the unit of re-sampling and bias corrected standard errors were calculated during all estimations. This procedure has been used in the literature to adjust for the presence of multiple correlated measurements from the same unit.20, 21 ROC regression modeling technique was used to evaluate the influence of glaucoma severity on the sensitivities of the RNFL parameters of SDOCT and GDx ECC in diagnosing glaucoma.22, 23 Likelihood ratios (LRs) were reported for diagnostic categorization (outside normal limits, borderline, or within normal limits) after comparison with the instrument’s internal normative database. LR is the probability of a given test result in those with disease divided by the probability of the same test result in those without the disease.24, 25, 26 The LR for a given test result indicates how much that result will raise or lower the probability of disease. An LR of 1 or close to 1 would mean that the test provides no additional information about the post test probability of the disease. LRs higher than 10 or lower than 0.1 would be associated with large effects on post test probability, LRs from 5 to 10 or from 0.1 to 0.2 would be associated with moderate effects, and LRs from 2 to 5 or from 0.2 to 0.5 would be associated with small effects.24 The 95% CIs for LRs were calculated according to the method proposed by Simel et al.27
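As an illustration of the LR definition and the effect-size bands given above, the following Python sketch computes an LR for one test category from a 2×2 count and attaches a log-scale 95% CI of the kind described by Simel et al. The function names and the example counts are hypothetical, and the exact computation used in the study is not reproduced here. Two assumptions are made: the shared band edges (5 and 0.2) are assigned to the moderate band, and LRs between 0.5 and 2 are labeled "minimal".

```python
import math


def likelihood_ratio(a: int, n_disease: int, b: int, n_control: int, z: float = 1.96):
    """LR for one test category, with a log-scale confidence interval.

    a: diseased eyes with this result; b: control eyes with this result.
    Returns (lr, lower, upper). Zero cells are not handled in this sketch.
    """
    if a <= 0 or b <= 0:
        raise ValueError("this sketch requires non-zero counts in both groups")
    lr = (a * n_control) / (b * n_disease)  # (a/n_disease) / (b/n_control)
    se_log = math.sqrt(1 / a - 1 / n_disease + 1 / b - 1 / n_control)
    lower = math.exp(math.log(lr) - z * se_log)
    upper = math.exp(math.log(lr) + z * se_log)
    return lr, lower, upper


def effect_on_post_test_probability(lr: float) -> str:
    """Bands from the text; edge assignment and 'minimal' label are assumptions."""
    if lr > 10 or lr < 0.1:
        return "large"
    if lr >= 5 or lr <= 0.2:
        return "moderate"
    if lr >= 2 or lr <= 0.5:
        return "small"
    return "minimal"


# Hypothetical counts: 60 of 100 glaucoma eyes and 10 of 100 control eyes
# fall in the 'outside normal limits' category.
lr, lower, upper = likelihood_ratio(60, 100, 10, 100)
print(lr, lower, upper, effect_on_post_test_probability(lr))
```

With these made-up counts the LR is 6, which falls in the moderate band; the interval is computed on the log scale and is therefore asymmetric around the point estimate.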

Statistical analyses were performed using the commercial software (Stata ver. 12.1; StataCorp, College Station, TX, USA). A P-value of <0.05 was considered as statistically significant.
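For readers who wish to see how subject-level (clustered) re-sampling works, the following Python sketch may help. It is not the Stata code used in the study. It assumes NumPy is available, uses hypothetical function names, assumes scores are oriented so that higher values are more abnormal (for example, the negative of RNFL thickness), and reports a simple percentile interval instead of the bias corrected estimates mentioned above.

```python
import numpy as np


def auc(scores, labels):
    """AUC as P(score of a diseased eye > score of a control eye); ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def sensitivity_at_specificity(scores, labels, specificity):
    """Sensitivity at a cut-off set from the control distribution (approximate)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cutoff = np.quantile(scores[~labels], specificity)
    return float(np.mean(scores[labels] > cutoff))


def cluster_bootstrap_ci(scores, labels, subject_ids, stat=auc, n_boot=1000, seed=0):
    """Percentile 95% CI, re-sampling subjects (not eyes) with replacement."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    subject_ids = np.asarray(subject_ids)
    rng = np.random.default_rng(seed)
    subjects = np.unique(subject_ids)
    rows = {s: np.flatnonzero(subject_ids == s) for s in subjects}
    estimates = []
    for _ in range(n_boot):
        sampled = rng.choice(subjects, size=subjects.size, replace=True)
        # Every eye of a sampled subject enters the re-sample together.
        idx = np.concatenate([rows[s] for s in sampled])
        if labels[idx].all() or not labels[idx].any():
            continue  # skip re-samples lacking either group
        estimates.append(stat(scores[idx], labels[idx]))
    lower, upper = np.percentile(estimates, [2.5, 97.5])
    return float(lower), float(upper)
```

A sensitivity at 95% specificity can be bootstrapped the same way by passing `stat=lambda s, l: sensitivity_at_specificity(s, l, 0.95)`. Re-sampling whole subjects keeps the correlation between fellow eyes inside each re-sample, which is the reason the text gives for not treating eyes as independent units.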

Results

Three hundred and forty-three eyes of two hundred and fifty-five consecutive subjects referred for glaucoma evaluation to our center were analyzed. Figure 1 shows a flow diagram describing the eyes not meeting the inclusion criteria. For the final analysis, 106 eyes of 79 subjects with the optic disc and VF classification as ‘glaucoma’ formed the glaucoma group and 109 eyes of 86 subjects with optic disc and VF classification as ‘non-glaucoma’ formed the control group. The initial agreement between the two experts for optic disc classification on photographs into glaucoma and control groups was 89% (kappa=0.80, 95% CI: 0.72–0.89). The remaining optic disc photographs were classified by consensus. Table 1 shows the demographic and VF parameters of the two groups. Glaucoma patients had significantly smaller optic discs than the control subjects.

Figure 1

Flow chart showing the number of eyes evaluated, number of eyes excluded from the study, and the reasons for exclusion.

Table 1 Demographic and visual field characteristics of the participants

Table 2 shows the RNFL parameters of SDOCT in the two groups of participants. All RNFL parameters of SDOCT were significantly thinner in the glaucoma group compared with the control group. SSI was significantly better in the control group than in the glaucoma group. AUC and sensitivities at fixed specificities, after adjusting for the differences in the disc area and SSI between the two groups using covariate adjustment as proposed by Pepe,28 are also shown in Table 2. Average and inferior quadrant RNFL thickness parameters showed the best AUCs and sensitivities at fixed specificities of 95 and 80%.

Table 2 Mean values of SDOCT retinal nerve fiber layer thickness parameters in glaucoma and control eyes with areas under the receiver operating characteristic curves and sensitivities at fixed specificities

Table 3 shows the RNFL parameters of GDx ECC in the two groups of participants. Quality score, TSS, and residual anterior segment retardance were comparable between the groups. All RNFL parameters except the temporal quadrant RNFL retardation were significantly lower in the glaucoma group compared with the control group. NFI value was significantly higher in the glaucoma group. AUC and sensitivities at fixed specificities after adjusting for the differences in the disc area between the two groups are also shown in Table 3. Average and inferior quadrant RNFL parameters as well as NFI showed the best AUCs and sensitivities at fixed specificities.

Table 3 Mean values of GDx ECC retinal nerve fiber layer parameters in glaucoma and control eyes with areas under the receiver operating characteristic curves and sensitivities at fixed specificities

There were no statistically significant differences between the AUCs of corresponding RNFL parameters of SDOCT and GDx ECC (P>0.10 for all comparisons) except for the temporal quadrant thickness, the AUC of which was significantly better (P<0.05) with SDOCT compared with GDx ECC. Figure 2 shows the ROC curves of the superior, inferior, and average RNFL parameters of SDOCT and GDx ECC. Figure 3 shows the effect of glaucoma severity, based on the mean deviation of VFs, on the sensitivities at fixed specificities of 95 and 80% of the average RNFL parameters of SDOCT and GDx ECC. In early stages of glaucoma, sensitivity of SDOCT RNFL parameter appeared to be better than that of GDx ECC RNFL parameter. Sensitivities of the RNFL parameters of both devices increased significantly as the severity of glaucoma increased.

Figure 2

Receiver operating characteristic curves of the superior (a), inferior (b) and average (c) RNFL parameters of SDOCT and GDx ECC.

Figure 3

Effect of glaucoma severity (based on the mean deviation of VFs) on the sensitivities of the average RNFL parameters of SDOCT and GDx ECC at specificities of 95% (a) and 80% (b).

Table 4 shows the LRs associated with the diagnostic categorization of the SDOCT and GDx ECC parameters after comparison with the respective instrument’s internal normative database. Outside normal limit categories of SDOCT parameters were associated with moderate effects on the post test probability of disease while those of GDx ECC parameters were associated with small effects. Within normal limit categories of the RNFL parameters of both SDOCT and GDx ECC were associated with moderate effects while borderline categories were associated with small or no effects on the post test probability of glaucoma. Analyzing the effect of optic disc size on the probability of false positive and false negative results with SDOCT and GDx ECC separately, we found that the disc size did not affect the misinterpretation rates with either of the technologies (P>0.40 for all associations).

Table 4 Likelihood ratios (with 95% confidence interval) of the normative database classification of SDOCT and GDx ECC parameters to discriminate glaucoma from control eyes

Discussion

In this study comparing the abilities of RNFL parameters of SDOCT and GDx ECC in diagnosing perimetric glaucoma, we found that the AUCs of the best RNFL parameters of both devices were comparable. This is similar to the results of the study by Benítez-del-Castillo et al14 who also found that the AUCs of the best RNFL parameters of SDOCT and GDx ECC were comparable. However, the AUCs found in our study were lower than those in their study. This is probably because of the difference in the control groups of the two studies. The control group in the study by Benítez-del-Castillo et al consisted of healthy volunteers without any suspicious findings of glaucoma while the control group in our study consisted of subjects referred by general ophthalmologists as glaucoma suspects based on their optic disc appearance. Though found suspicious by the general ophthalmologists, their optic discs were judged by glaucoma experts to be non-glaucomatous and physiological variations of normal. The VFs of these subjects were within normal limits as well. We have recently demonstrated how a control cohort with no suspicious findings of glaucoma can affect the diagnostic ability of SDOCT parameters in glaucoma compared with a control cohort which is likely to be misinterpreted as glaucoma.16 For the same reason, AUCs of RNFL parameters reported in our study are also lower compared with the studies that have reported the same with SDOCT7, 10, 29 and GDx ECC5, 12, 13, 30 separately.

We also found that the sensitivities of SDOCT RNFL parameters at fixed specificities were comparable to those of GDx ECC RNFL parameters to diagnose glaucoma. Similar results have been reported by Benítez-del-Castillo et al14 and Garas et al.15 We also found that the sensitivities of both the devices increased significantly as the severity of glaucomatous VF damage increased. This is similar to the relationship reported between sensitivity to diagnose glaucoma and the severity of glaucomatous VF damage, using various imaging techniques.31, 32, 33, 34, 35 However, as noted in Figure 3, the sensitivities with SDOCT appeared to be better than those with GDx ECC in the early stages of glaucoma. Future studies are required to evaluate whether SDOCT is better than SLP in early (and preperimetric) stages of glaucoma.

In addition to sensitivity, specificity, and AUC, diagnostic tests can also be summarized in terms of LR. LR ranks higher in the hierarchy than the other measures, as it expresses the magnitude by which the probability of a diagnosis in a given patient is modified by the results of the test.32, 36, 37 In other words, the LR indicates how much a given diagnostic test result will change (increase or decrease) the pretest probability of the disease. We therefore evaluated the LRs associated with the diagnostic categorization of the RNFL parameter value, after comparison with the instrument’s normative database. The LRs associated with the outside normal limits category of SDOCT RNFL parameters were associated with moderate effects on the post test probability of disease while those of GDx ECC parameters were associated with small effects. Though a formal comparison of LRs was not done, it appeared that the ‘outside normal limits’ category of SDOCT was better in ‘ruling in’ glaucoma compared with that of GDx ECC. Within normal limit categories of the RNFL parameters of both SDOCT and GDx ECC were associated with moderate effects on the post test probability of glaucoma. This would mean that the ‘within normal limits’ category of both SDOCT and GDx ECC was similarly effective in ‘ruling out’ glaucoma. ‘Borderline’ categories of both the devices were associated with small or no effects on the post test probability of glaucoma, meaning that they were not useful in ‘ruling in’ or ‘ruling out’ glaucoma. It should, however, be noted that in clinical practice even small effects on post test probability may be relevant and useful, depending on the overall clinical picture and the pretest probability of disease.
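The way an LR modifies the pretest probability follows from the odds form of Bayes' theorem: post test odds equal pretest odds multiplied by the LR. A minimal Python sketch of this conversion is given below; the function name and the example numbers are hypothetical and are not results from this study.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post test probability using an LR."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical example: the same pretest probability of 50% under three LRs.
for lr in (10, 1, 0.1):
    print(lr, round(post_test_probability(0.5, lr), 3))
```

With a pretest probability of 50%, an LR of 10 raises the probability to about 91%, an LR of 1 leaves it unchanged, and an LR of 0.1 lowers it to about 9%. The same LR has a smaller absolute effect when the pretest probability is already very high or very low, which is why the relevance of small LRs depends on the clinical picture.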

In this study, we also found that the RNFL measurements with SDOCT were significantly higher than those with GDx ECC both in normal subjects and in glaucoma patients. This was also shown earlier using the previous versions of OCT and SLP.38 This disparity is due to the differences in the underlying principles of tissue thickness measurement between OCT and SLP. While OCT measures tissue thickness depending on the changes in the reflectivity of the diode laser at the anterior and posterior boundaries of RNFL, SLP measures the retardance of light when it passes through birefringent RNFL and converts this retardance to thickness using a conversion factor. These results also demonstrate that the measurements of different imaging instruments in glaucoma cannot be used interchangeably.

A limitation of the current study is the possible inclusion of a few preperimetric glaucoma cases in the control cohort, as the control group was selected from the group of subjects referred as glaucoma suspects based on their optic disc appearance by general ophthalmologists. These subjects were, however, diagnosed as normals based on the masked optic disc evaluation by glaucoma experts and the normal VFs. Therefore, in the true sense, the optic discs included in the control group, though referred as suspects for glaucoma, were not true suspects but were the ones that caused diagnostic uncertainty among general ophthalmologists. The significantly larger disc size in the control group suggests that these were eyes with large discs and large physiologic cups, mislabeled as ‘glaucoma suspects’ by general ophthalmologists. There was, however, no ambiguity in their classification by the glaucoma experts. This limitation is common to all diagnostic accuracy studies using a cross-sectional design because of the lack of a reference standard for diagnosing glaucoma. It is not possible to rule out the diagnosis of glaucoma completely in suspect eyes unless they are followed up for a reasonable period of time. Medeiros et al39 have therefore advocated the use of progressive optic disc change over follow-up examinations as the reference standard for glaucoma. The possibility that a few preperimetric glaucoma cases were included in the control cohort, affecting the comparison of diagnostic abilities between SDOCT and GDx ECC, should therefore be kept in mind.

In conclusion, though the AUCs and sensitivities at fixed specificities were comparable between the RNFL parameters of SDOCT and GDx ECC in diagnosing glaucoma, LRs indicated that the RNFL parameters of SDOCT were better in ‘ruling in’ glaucoma.