Introduction

Primary open-angle glaucoma is a multi-factorial optic neuropathy in which there is a characteristic acquired loss of optic nerve fibres.1 The detection of defects in the retinal nerve fibre layer (RNFL) is key for diagnosing glaucoma.2, 3 In recent years, different devices have been introduced to quantitatively measure peripapillary RNFL thickness. One of these techniques is scanning laser polarimetry (SLP) with variable corneal compensation (GDx-VCC; Carl Zeiss Meditec Inc., Dublin, CA, USA), which measures retardation of light that has double-passed the birefringent fibres of the RNFL. Light retardation correlates well with RNFL thickness.4, 5, 6, 7 The GDx-VCC allows for eye-specific compensation of the birefringent effect of the anterior segment (cornea and lens), thereby improving the diagnostic accuracy for glaucoma compared with previous polarimeter versions.8, 9, 10

The aim of this study was to optimize the sensitivity–specificity balance of SLP parameters using binary logistic regression analysis. Binomial (or binary) logistic regression analysis is useful for predicting the presence or absence of a characteristic or outcome based on values of a set of predictor variables, and is applicable to a broader range of research situations than discriminant analysis. This method can be used to find a linear combination of variables whose value is as similar as possible within groups and as different as possible between groups. The linear combination is called a linear discriminant function (LDF). In our study, we used this procedure to determine which RNFL parameters of the SLP were most useful for differentiating between normal eyes and eyes with glaucomatous visual field defects.

Methods

The design of the study adhered to the tenets of the Declaration of Helsinki for biomedical research, and followed all 25 items of the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines.11 The study protocol was approved by the ethics committee of Miguel Servet University Hospital, and informed written consent was obtained from all participants.

Subjects and measurement protocol

Participants had to meet the following inclusion criteria: best-corrected visual acuity of 20/40 or better, refractive error within ±5.00 dioptres equivalent sphere and ±2.00 dioptres astigmatism, transparent ocular media (nuclear colour/opalescence, cortical or posterior subcapsular lens opacity <1) according to the Lens Opacities Classification System III,12 and open anterior chamber angle. Subjects with previous intraocular surgery, diabetes or other diseases affecting the visual field, or current use of a medication that could affect visual field sensitivity were excluded.

A total of 321 eyes of 321 consecutive subjects were prospectively enrolled from January 2007 to June 2007. Eight subjects did not complete all the required tests, and 21 subjects were unable to perform at least one of the tests included in the study protocol (12 did not produce a reliable standard automated perimetry (SAP) test and the other 9 had poor-quality SLP scans, after three attempts in both cases); these 29 subjects were excluded from further analysis. Therefore, 292 eyes of 292 subjects of Caucasian origin were included in the statistical analysis. One eye from each subject was randomly chosen for the study, unless only one eye met the inclusion criteria.

Normal eyes were consecutively recruited from patients referred for refraction that underwent routine examination without abnormal ocular findings, hospital staff, and relatives of patients in our hospital. Patients with glaucoma were recruited consecutively from two outpatient clinics under the area of influence of our hospital.

All participants underwent a full ophthalmologic examination: clinical history, visual acuity, biomicroscopy of the anterior segment using a slit lamp, gonioscopy, Goldmann's applanation tonometry, central corneal ultrasonic pachymetry (model DGH 500, DGH Technology, Exton, PA, USA), and ophthalmoscopy of the posterior segment. At least two reliable SAP tests per eye were carried out using a Humphrey Field analyzer, model 750 (Zeiss Humphrey Systems, Dublin, CA, USA), with the SITA Standard 24-2 strategy. If fixation losses and false positive or false negative rates were >20%, the test was repeated. The second reliable perimetry test obtained was used in this study to minimize the learning effect.13, 14 Abnormal SAP results were considered as a reproducible glaucomatous visual field loss in the absence of any other abnormalities to explain the defect. A visual field loss was defined as the presence of a cluster of three points lower than P<5% or a cluster of two points lower than P<1% on a pattern deviation plot,15 and/or a pattern standard deviation significantly elevated beyond the 5% level and/or a Glaucoma Hemifield Test outside normal limits. The subjects completed the perimetry tests prior to any clinical examination or structural test. Each perimetry test was carried out on different days to avoid the fatigue effect.

The SLPs were carried out with the GDx-VCC (software version 5.4.1.35). After macular scanning, the axis and magnitude of anterior segment birefringence were estimated from the measured macular retardation profile.8, 9, 10 Retinal polarization images were then obtained and automatically compensated using the GDx-VCC software. At least one scan of acceptable quality (well focused and centred scans with a quality report 7) was obtained for each eye, and the placement of the optic disc margin was confirmed by a trained ophthalmologist. The GDx-VCC parameters investigated in this study were temporal-superior-nasal-inferior-temporal (TSNIT) average, superior average, inferior average, TSNIT standard deviation, nerve fibre indicator (NFI), superior ratio, inferior ratio, superior/nasal, maximum modulation, superior maximum, inferior maximum, image average, ellipse modulation, normalized superior area, normalized inferior area, temporal/nasal, inferior/nasal, total integral, superior integral, inferior integral, temporal average, and nasal average.

All the ophthalmic examinations were carried out within 1 month of the subject's date of enrolment into the study.

Classification into groups

Healthy eyes had an intraocular pressure (IOP) of less than 21 mm Hg, no history of increased IOP, and a normal SAP. Glaucomatous eyes had an IOP higher than 21 mm Hg (on at least three readings on different days) and abnormal SAP results, regardless of the appearance of the optic disc. The eyes were classified by two glaucoma specialists masked to patient identity and clinical history. Any disagreement was resolved by consensus. The total population was randomly divided into two samples: one population for obtaining the LDF (teaching set) and a second independent population for testing the LDF (validating set).

Statistical analysis

All statistical analyses were performed using SPSS (version 15.0; SPSS Inc., Chicago, IL, USA) and MedCalc (version 9.3.2.0; MedCalc Software, Belgium) statistical software. The teaching set was used for binary logistic regression analysis, a regression analysis that is used when the dependent variable is dichotomous (healthy or diseased) and the independent variables are of any type. The dependent variable was glaucoma (yes or no), and the predictive variables were all the studied SLP parameters except NFI, which is calculated using a support vector machine algorithm based on several RNFL measures. The relative importance of each independent variable was assessed by stepwise binary logistic regression analysis using the forward Wald method. The stepwise probability test determined the criteria by which variables were entered into and removed from the model. The LDF was a score calculated by taking the weighted sum of the predictor variables.

The significant SLP parameters were combined to generate a new variable (the LDF) in such a way that the measurable differences between the groups were maximized. The validating set was used to test and compare the diagnostic accuracy of our LDF with other parameters of the GDx-VCC. The receiver operating characteristic (ROC) curves were plotted for all of them and compared with the proposed LDF. Differences between the ROC curves were tested to compare the area under the ROC curves (AUCs) using the Hanley–McNeil method.16 The cutoff points were calculated by the MedCalc software as the points with the best sensitivity–specificity balance. Sensitivities at 85 and 95% (5% false positive rate) fixed specificities, and positive and negative likelihood ratios (LRs) were also calculated.
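The AUC comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the MedCalc implementation: it assumes that higher scores indicate disease, uses the textbook Hanley–McNeil standard error formula, and takes the correlation `r` between two AUC estimates as a value the analyst must supply (the function names are hypothetical).

```python
import math

def auc_mann_whitney(diseased_scores, healthy_scores):
    """AUC as the fraction of (diseased, healthy) pairs in which the
    diseased eye scores higher; ties count as half a win."""
    wins = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased_scores) * len(healthy_scores))

def hanley_mcneil_se(auc, n_diseased, n_healthy):
    """Standard error of an AUC from the Hanley-McNeil approximation."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_diseased - 1) * (q1 - auc ** 2)
        + (n_healthy - 1) * (q2 - auc ** 2)
    ) / (n_diseased * n_healthy)
    return math.sqrt(variance)

def compare_aucs(auc_a, se_a, auc_b, se_b, r=0.0):
    """z statistic and two-sided P value for the difference of two AUCs.
    r is the correlation between the two AUC estimates (assumed input;
    r = 0 treats the curves as coming from independent samples)."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2 - 2.0 * r * se_a * se_b)
    z = (auc_a - auc_b) / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# Sample sizes of the validating set (76 glaucomatous, 72 normal eyes)
se_ldf = hanley_mcneil_se(0.893, n_diseased=76, n_healthy=72)
```

Because both curves are measured on the same eyes, the correlation term matters: setting `r = 0` gives a conservative (larger) P value than a paired comparison would.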

Results

The teaching set comprised 144 eyes of 144 subjects: 71 normal eyes and 73 glaucomatous eyes (58 with primary open-angle glaucoma, 12 with pseudoexfoliative glaucoma, and 3 with pigmentary glaucoma). The mean age was 59.6±9.1 years for the normal group and 61.9±6.9 years for the glaucoma group (Table 1). The validating set comprised 72 normal eyes and 76 glaucomatous eyes (64 with primary open-angle glaucoma, 11 with pseudoexfoliative glaucoma, and 1 with pigmentary glaucoma). The mean age of the normal group was 58.6±9.6 years and the mean age of the glaucomatous group was 60.7±7.1 years. Age and central corneal thickness did not differ significantly (P>0.05) between the groups in either sample.

Table 1 Clinical characteristics of both populations included in the study

Table 2 shows the mean values and standard deviation of all parameters evaluated in the teaching and validating sets. The values of all studied SLP parameters, except for the temporal average, were different between the normal and glaucoma groups in both samples.

Table 2 Mean and standard deviation values of scanning laser polarimetry parameters in the teaching and validating sets

The stepwise procedure identified the SLP parameters that accounted for the greatest amount of error, then included the next best variable, and so on. At the first iteration, the normalized inferior area was selected. At the second iteration, the nasal average was added to the model, and at the third iteration, the maximum modulation was added to the model. Finally, our LDF was defined as follows: LDF=−12.20+(0.15 × nasal average)−(23.85 × normalized inferior area)+(1.18 × maximum modulation).
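As an illustration, the LDF above can be evaluated directly from the three GDx-VCC parameters. The sketch below uses the published coefficients; the function names and the input values are hypothetical. The direction of the cutoff is an assumption: the text reports the cutoff value but not which side is abnormal. Scores above the cutoff are treated as abnormal here because the coefficient of the normalized inferior area is negative and low values of that parameter are associated with RNFL loss.

```python
# Coefficients taken from the LDF reported in the text
INTERCEPT = -12.20
COEF_NASAL_AVERAGE = 0.15
COEF_NORMALIZED_INFERIOR_AREA = -23.85
COEF_MAXIMUM_MODULATION = 1.18

def ldf_score(nasal_average, normalized_inferior_area, maximum_modulation):
    """Weighted sum of the three selected SLP parameters."""
    return (
        INTERCEPT
        + COEF_NASAL_AVERAGE * nasal_average
        + COEF_NORMALIZED_INFERIOR_AREA * normalized_inferior_area
        + COEF_MAXIMUM_MODULATION * maximum_modulation
    )

def is_abnormal(score, cutoff=-0.33):
    """Flag a score as abnormal.
    ASSUMPTION: scores above the cutoff are abnormal; the source gives
    the teaching-set cutoff (-0.33) but does not state its direction."""
    return score > cutoff

# Hypothetical input values, chosen only to show the arithmetic
example_score = ldf_score(
    nasal_average=20.0,
    normalized_inferior_area=0.10,
    maximum_modulation=2.0,
)
example_flag = is_abnormal(example_score)
```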

In the teaching set, the highest sensitivity–specificity balance was observed for our LDF (86.3−83.1%; cutoff point, −0.33) and NFI (84.9−84.5%; cutoff point, >20). The AUC was 0.901 (standard error (SE), 0.030) for our LDF. The largest AUCs for the SLP-provided parameters were 0.893 (SE, 0.027) for NFI, 0.847 (SE, 0.035) for normalized inferior area, and 0.830 (SE, 0.036) for the TSNIT average. The AUCs of our LDF and NFI were higher than those of the TSNIT average (P=0.034 and 0.005, respectively).

In the validating set, our LDF, NFI, and normalized inferior area had the best pairs of sensitivity–specificity (Table 3): LDF, 86.8−79.1% (cutoff point, −0.32); NFI, 82.8−83.3% (cutoff point, >21); and normalized inferior area, 68.4−87.5% (cutoff point, −0.11). The image (15.16) and inferior averages (8.69) had the highest positive LRs, while our LDF (0.17) had the lowest negative LRs.

Table 3 In the validating set, areas under the receiver operating characteristic curves, best sensitivity–specificity balance and likelihood ratios of scanning laser polarimetry parameters to discriminate between normal and glaucoma subjects

The greatest AUCs (Table 3 and Figure 1) were 0.893 (SE, 0.026) for our LDF, followed by NFI (0.877; SE, 0.028), and the normalized inferior area (0.845; SE, 0.031). There were no significant differences between them (between our LDF and NFI, P=0.458; between our LDF and normalized inferior area, P=0.082; and between NFI and normalized inferior area, P=0.212), but the AUC of our LDF was different from the rest of GDx-VCC studied parameters. Our LDF, NFI, and normalized inferior area yielded sensitivities of 73.6, 73.6, and 71.0% respectively at a fixed specificity of 85%, whereas the sensitivities were 57.8, 48.6, and 40.7% respectively at a fixed specificity of 95%.
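Sensitivity at a fixed specificity can be computed by choosing the lowest threshold that keeps the required fraction of healthy eyes below it and then counting the diseased eyes above it. The following is a minimal sketch under the assumption that higher scores indicate disease; the function name and the data are hypothetical and are not the study measurements.

```python
def sensitivity_at_fixed_specificity(diseased_scores, healthy_scores, specificity):
    """Highest sensitivity among thresholds whose specificity is at least
    the target. An eye is flagged when its score is strictly above the
    threshold (ASSUMPTION: higher score means more likely diseased)."""
    best_sensitivity = 0.0
    candidate_thresholds = sorted(set(diseased_scores) | set(healthy_scores))
    for threshold in candidate_thresholds:
        true_negatives = sum(1 for h in healthy_scores if h <= threshold)
        achieved_specificity = true_negatives / len(healthy_scores)
        if achieved_specificity >= specificity - 1e-12:
            true_positives = sum(1 for d in diseased_scores if d > threshold)
            sensitivity = true_positives / len(diseased_scores)
            best_sensitivity = max(best_sensitivity, sensitivity)
    return best_sensitivity

# Hypothetical scores for 20 healthy and 10 diseased eyes
healthy = list(range(1, 21))
diseased = [15, 18, 19, 21, 22, 23, 24, 25, 30, 35]
sens_at_85 = sensitivity_at_fixed_specificity(diseased, healthy, 0.85)
sens_at_95 = sensitivity_at_fixed_specificity(diseased, healthy, 0.95)
```

Raising the required specificity moves the threshold upward, so sensitivity can only stay the same or fall, which is the pattern seen in the reported values.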

Figure 1

Receiver operating characteristic curves of the linear discriminant function (LDF) and nerve fibre indicator (NFI) for discriminating between healthy and glaucomatous eyes in the validating set. These parameters showed the largest areas under the receiver operating characteristic curve (AUC): 0.893 (95% confidence interval, 0.843–0.943) for our LDF and 0.877 (95% confidence interval, 0.823–0.931) for NFI.

When the abnormality criteria of both learning classifiers included in this study (LDF and NFI) were combined (Figure 2), SLP had 78.9% sensitivity and 98.6% specificity.

Figure 2

Diagnostic ability of the linear discriminant function (LDF) and the nerve fibre indicator (NFI) in the validating set. The GDx-VCC yielded 78.9% sensitivity and 98.6% specificity when both LDF and NFI abnormality criteria were applied.
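The effect of requiring both classifiers to be abnormal can be sketched as a logical AND of two per-eye flags. The example below is illustrative only: the function name and the small data set are hypothetical, and each flag is assumed to have already been derived from its own cutoff.

```python
def sensitivity_specificity(flags, has_glaucoma):
    """Sensitivity and specificity of boolean test results against
    the reference standard."""
    true_positives = sum(1 for f, g in zip(flags, has_glaucoma) if f and g)
    true_negatives = sum(1 for f, g in zip(flags, has_glaucoma) if not f and not g)
    n_diseased = sum(1 for g in has_glaucoma if g)
    n_healthy = len(has_glaucoma) - n_diseased
    return true_positives / n_diseased, true_negatives / n_healthy

# Hypothetical results for 4 glaucomatous and 4 healthy eyes
has_glaucoma = [True, True, True, True, False, False, False, False]
ldf_abnormal = [True, True, True, False, True, False, False, False]
nfi_abnormal = [True, True, False, True, False, True, False, False]

# Combined criterion: abnormal only when BOTH classifiers are abnormal
both_abnormal = [a and b for a, b in zip(ldf_abnormal, nfi_abnormal)]

ldf_result = sensitivity_specificity(ldf_abnormal, has_glaucoma)
nfi_result = sensitivity_specificity(nfi_abnormal, has_glaucoma)
combined_result = sensitivity_specificity(both_abnormal, has_glaucoma)
```

An AND rule can only remove positive calls, so specificity cannot decrease and sensitivity cannot increase relative to either classifier alone. How large each change is depends on how often the two classifiers disagree.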

Discussion

Numerous studies17, 18, 19, 20, 21, 22, 23, 24 have reported the sensitivity and specificity of different versions of SLP for discriminating between healthy and glaucomatous eyes. Very few studies25, 26, 27, 28 however have tried to combine SLP variables to obtain an LDF to increase the diagnostic accuracy of isolated GDx-VCC-provided parameters. To our knowledge, the present study is the only study aimed at calculating an LDF based on GDx-VCC parameters. Other authors25, 26, 27 have carried out similar statistical analyses using previous versions of SLP (Nerve Fiber Analyzer or NFA); the only study based on GDx-VCC28 investigated the ability of quantitative analysis of the shape of the RNFL thickness surface and did not include GDx-VCC parameters.

Several factors threaten the internal and external validity of a study of diagnostic accuracy, a fact that inspired the launch of the STARD initiative.11 The objective of the STARD initiative is to improve the quality of the reporting of studies of diagnostic accuracy. The design of the present study followed all 25 items of the STARD guidelines. A major strength of this study is that we validated our LDF using an independent sample.29

Our LDF yielded higher sensitivities at high specificities as compared to the GDx-VCC parameters included in the printout and the extended GDx-VCC parameters. Only NFI, which is a support vector machine learning classifier, was as effective as our LDF for diagnosing glaucoma. These results are consistent with those of previous studies25, 26, 27, 28 in which there was improved detection of glaucomatous visual field defects or RNFL defects using LDFs based on SLP variables. Moreover, we suggest that a combination of learning classifiers should be used to increase the diagnostic ability of SLP. In our study, when the abnormality criterion for SLP was based on both LDF and NFI, specificity increased from 80−83 to 99% with a low reduction of sensitivity, as compared with the diagnostic ability of each variable individually.

The AUC of the normalized inferior area did not differ from the AUC of our LDF, but had a worse sensitivity–specificity balance. The remaining isolated GDx-VCC parameters presented worse AUCs and diagnostic accuracy than did the LDF. Normalized inferior area examines the modulation in the inferior portion of the data ellipse. A low value represents low modulation and is associated with RNFL loss. Unfortunately, the normalized inferior area is not included in the standard printout, even though several studies20, 23 have reported that this parameter can provide a good ability to detect changes of the RNFL.

Different designs and severity of visual field losses make it difficult to compare results among different studies. Obviously, the severity of visual field loss has an important effect on imaging instrument sensitivity.30 More severe disease is associated with increased sensitivity; therefore, in populations of patients with moderate and severe visual field losses, a higher sensitivity–specificity balance for the LDF and SLP parameters might be expected.

Depending on the pre-test probability, positive or negative LRs indicate the extent to which a test result will increase or decrease, respectively, the probability of disease. An LR value31 close to 1 indicates insignificant effects, whereas LR values higher than 10 or lower than 0.1 often indicate large changes in the post-test odds of the disease. Our LDF had the lowest negative LR; thus, a normal LDF result is associated with a large change in the post-test probability of disease and a better ability to exclude the presence of glaucoma.
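The relationship between sensitivity, specificity, LRs, and post-test probability can be made concrete with a short calculation. The sketch below uses the sensitivity and specificity reported for the LDF in the validating set (86.8% and 79.1%); the pre-test probability of 30% is a hypothetical value chosen only for illustration, and the function names are not from the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a binary test."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the LR, convert back."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)

# LDF in the validating set: sensitivity 86.8%, specificity 79.1%
ldf_positive_lr, ldf_negative_lr = likelihood_ratios(0.868, 0.791)

# Hypothetical pre-test probability of 30%
probability_after_normal_ldf = post_test_probability(0.30, ldf_negative_lr)
probability_after_abnormal_ldf = post_test_probability(0.30, ldf_positive_lr)
```

The negative LR computed this way rounds to 0.17, the value reported for the LDF. With the assumed 30% pre-test probability, a normal LDF result lowers the probability of glaucoma to well under 10%.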

All participants were classified into groups regardless of the appearance of the optic disc because we chose visual field loss as the reference standard32 to avoid bias in the evaluation of the studied test. As we were measuring RNFL thickness, we selected the best indicator of glaucoma that is not dependent on RNFL or optic disc morphology.33 Therefore, we might have included pre-perimetric glaucoma subjects in the normal group, resulting in an underestimation of the diagnostic accuracy of the imaging test. On the other hand, in some cases SAP may detect glaucomatous visual field defects when the disease is clinically advanced.23, 34, 35, 36, 37 Thus, the glaucoma group might have had a more advanced stage of the disease38 than that represented by the perimetric indices, leading to a better diagnostic ability of the GDx-VCC.

The ethnic characteristics of the validation sample were similar to those of the teaching set, and this fact might have biased the findings towards our LDF when compared to other SLP parameters in the second population. Only glaucoma patients with high IOP were included, and therefore our results may not be applicable to subjects having low-tension glaucoma. The quality of the data obtained by the imaging devices is influenced by the media opacity, retinal pigment epithelium status, instrument variability, and positioning and centring of the images. These limitations must be taken into account in clinical practice. Also, other statistical analyses25, 26, 27, 28, 39 could provide alternative formulas that will increase the diagnostic performance of SLP.

RNFL thickness can vary widely among healthy subjects, limiting the usefulness of single parameters to differentiate between glaucoma and normal subjects. Our LDF combined the most useful SLP variables and increased the diagnostic accuracy of GDx-VCC for glaucoma. The results in the validating set confirmed those obtained in the teaching set.