Introduction

Early detection of glaucoma is crucial for prognosis and treatment, because it allows the progression of irreversible visual field loss to be prevented. Different setups have therefore been suggested to uncover glaucomatous damage at an early stage. The diagnosis of glaucoma is closely associated with morphological changes in the optic nerve head; thus, one diagnostic parameter might consider the morphology of the disc and the retinal nerve fibre layer. Measurements of the functional integrity of the visual system should deliver further information. To increase diagnostic value, the validity of morphological measurements and sensory testing has been compared1,2,3,4,5 and the relationship between procedures has been studied.6 It has been found that combinations of structural and functional tests might be helpful for the identification of glaucomatous damage.1,7,8,9 In recent years, new sensory tests as well as morphological techniques have been developed to detect glaucomatous defects at an early stage. It has been shown that quantitative measurements of the retinal nerve fibre layer10 and perimetry using frequency-doubling technique (FDT)11,12 may be helpful in glaucoma diagnosis.

The aim of this study is to assess the diagnostic value of FDT-perimetry and high-resolution spectral-domain optical coherence tomography (SDOCT) in glaucoma detection in patients of the Erlangen glaucoma registry, and to increase the diagnostic value by combining both methods. To judge validity in very early glaucoma, we examined ocular hypertensive (OHT) patients, preperimetric glaucoma patients, and early and moderate perimetric glaucoma patients in comparison to normals.

Materials and methods

Procedures

The study included 97 experienced healthy subjects and 233 patients of the Erlangen Glaucoma Registry (www.clinicaltrials.gov, NCT00494923). Normals and patients had annual examinations over a period of 2–18 years by slit lamp biomicroscopy, tonometry, funduscopy, gonioscopy, pachymetry, perimetry, and papillometry. Papillometric evaluations of patients were based on 15° colour photographs. The individual morphological and sensory data presented here were obtained within a 48-h period. For all patients, conventional stereo-diapositives taken under mydriatic conditions were available. In the healthy subjects group, 67% had conventional stereophotographs and 33% had non-stereoscopic digital photographs. Criteria for the diagnosis in all glaucomas were an open anterior chamber angle and glaucomatous changes of the optic nerve head, including an unusually small neuroretinal rim area in relation to the optic disc size and a cup-to-disc ratio that was higher vertically than horizontally.13 For diagnosis and optic disc classification according to the stages given by Jonas et al,14 all available optic disc photographs were examined and classified by two glaucoma specialists. If the two graders (CM and DB) could not agree on the classification of the photographs, a third glaucoma specialist (author AGJ) was included, leading to a majority vote. Clinical diagnosis included the patients’ total observation period (10.0±4.3 years).

All subjects underwent visual field testing with standard white-on-white perimetry using a computerized static projection perimeter (Octopus-G1, Interzeag, Switzerland). All patients had a three-phase protocol (full threshold) in order to calculate the corrected loss variance. Thirty-six per cent of the healthy subjects had the G1 TOP strategy. Subjects with a rate of false-positive or false-negative responses higher than 12% were not included in this study. Similar to what has been suggested earlier,15 a white-on-white visual field was classified as ‘non-normal’ when one of the following was present: (a) at least three adjacent test points in the superior or inferior hemifield having a probability of ≤5%, with one of these test points at a probability of ≤1%, or (b) at least two adjacent test points having a probability of ≤1%. These criteria had to be confirmed in at least the two most recent Octopus measurements at the same test locations.

All individuals included in the study (Table 1a) had clear optic media and a visual acuity of 20/40 or better. The mean age of the subjects did not differ significantly between the groups. Exclusion criteria were all eye diseases other than glaucoma, diabetes mellitus, and a myopic refractive error exceeding ±6.75 dioptres. In glaucoma patients, one eye of each patient was selected for the assessment of validity; this was always the eye with the more advanced defects. In healthy subjects and OHT patients, one randomly selected eye entered the study. The study followed the tenets of the Declaration of Helsinki for research involving human subjects, and informed consent, including agreement for data collection, was obtained from all participants of the study. An institutional review board approved the study.

Table 1a Demographic characteristics (mean±SD) and evaluation of optic disc photographs according to Jonas et al14

Healthy subjects

The study included healthy subjects from the Erlangen glaucoma registry. Findings in slit lamp inspection, tonometry without medication, and funduscopy were in the normal range. White-on-white perimetry was classified normal following the criteria as described above. Optic discs were inspected and classified as normal.

OHT group

Patients of this group had intraocular pressures above 22 mm Hg upon repeated applanation tonometry measurements. All OHT patients had normal white-on-white perimetry and normal appearing optic discs.

‘Preperimetric’ glaucoma patients

In the ‘preperimetric’ glaucoma group, patients showed glaucomatous abnormalities of the optic discs (diffuse or localized loss of neuroretinal rim). A total of 61 patients of this group showed loss of normal configuration of the neuroretinal rim in relation to disc size, but no apparent notches (optic disc stage 1).14 A total of 16 patients showed additionally notching of the neuroretinal rim (optic disc stage 2). Computerized visual field examinations with white-on-white perimetry were normal.

‘Early’ glaucoma patients

The ‘early’ glaucoma patient group included 42 patients with open-angle glaucoma (OAG) characterized by elevated intraocular pressure measurements higher than 21 mm Hg and 10 patients with normal-pressure glaucoma. All patients of this group had glaucomatous optic disc damage (Table 1a) and ‘non-normal’ white-on-white perimetry. The early stage of a glaucomatous visual field defect was defined as a mean deviation not worse than 5.0 dB and a corrected loss variance not worse than 24 dB². This classification of a glaucomatous visual field is based on stage 1 of the Glaucoma Staging System (Version 2, Brusini and Filacorda17).

‘Moderate’ glaucoma patients

The ‘moderate’ glaucoma patient group included 31 patients with OAG characterized by elevated intraocular pressure measurements higher than 21 mm Hg and 19 patients with normal-pressure glaucoma. All patients of this group had glaucomatous optic disc damage stage 1–3 (Tables 1a and 1b) and visual field losses in Octopus perimetry not classified as ‘early glaucoma’. The mean value of perimetric defects in this group was 5.8±2.1 dB and corrected loss variance was 60.2±37.9 dB². The classification followed the Glaucoma Staging System with 34 patients at stage 2 and 16 patients in the ‘localized defects’ section of stage 3. Patients with a perimetric mean defect exceeding 9.7 dB were not included.

Table 1b Results (mean, SD, and percentiles) from FDT-perimetry and SDOCT for healthy subjects and patient groups

SDOCT

An SDOCT (Spectralis, Heidelberg Engineering, Heidelberg, Germany) was used to examine the thickness of the retinal nerve fibre layer (RNFL) on a circle of 3.4 mm diameter around the optic disc. An online tracking system compensates for eye movements during 16 consecutive circular B-scans. Averaged B-scans were analyzed automatically to determine the RNFL thickness at 768 positions (Spectralis Software). In this Spectralis software all retinal vessels within the RNFL were considered to be part of the RNFL. The present statistical analysis is based on 32 equal-sized sectors. To take into account the dependency of the RNFL thickness on the age of the subjects, all thickness measurements in the 32 sectors were corrected according to the individual regression in each sector. For these sector-wise age normalizations, we used data from 184 eyes of 184 healthy subjects (reference group) available from the Erlangen glaucoma registry. This reference group was also used to establish confidence limits and was completely separate from the normal cohort of the present analysis: no subject of the reference group was used in the present statistical analysis. A detailed description of Spectralis-SDOCT results in normals has been published recently.18 For statistical evaluation of SDOCT results we used the mean RNFL thickness and an individual SDOCT-score considering the confidence limits for 97.5 and 99.5% of normal subjects in all 32 sectors. RNFL thickness within the 97.5% level: score = 0; between the 97.5 and 99.5% levels: score = 1; and outside the 99.5% level: score = 2 (see Figure 1). Considering all 32 sectors, the score ranges from 0 (all sectors within 97.5%) to 64 (all sectors outside 99.5%). When the normative database of the Spectralis was compared with our reference group, we found that our cut-points are identical to those achieved with the built-in normative database of Spectralis software version 4.
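The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the software used in the study: the function name `sdoct_score`, the argument names, and the treatment of values lying exactly on a limit are assumptions, and the thickness values and limits in the example are invented. The sketch assumes age-corrected sector thicknesses and per-sector lower limits derived from a reference group.

```python
from typing import Sequence

N_SECTORS = 32  # number of equal-sized sectors used in the analysis


def sdoct_score(thickness: Sequence[float],
                lower_limit_975: Sequence[float],
                lower_limit_995: Sequence[float]) -> int:
    """Sum per-sector scores: 0 within the 97.5% limit, 1 between the
    97.5% and 99.5% limits, 2 outside (below) the 99.5% limit.

    Assumption: the limits are lower bounds on age-corrected RNFL
    thickness, so the 99.5% limit lies below the 97.5% limit.
    """
    if not (len(thickness) == len(lower_limit_975)
            == len(lower_limit_995) == N_SECTORS):
        raise ValueError("expected one value per sector (32 sectors)")
    total = 0
    for t, lim975, lim995 in zip(thickness, lower_limit_975, lower_limit_995):
        if t < lim995:
            total += 2   # thinner than 99.5% of the reference group
        elif t < lim975:
            total += 1   # between the 97.5% and 99.5% limits
    return total


# Invented example reproducing the sector counts quoted for the glaucoma
# eye in Figure 1: 7 sectors normal, 13 borderline, 12 outside 99.5%.
limits_975 = [80.0] * N_SECTORS
limits_995 = [60.0] * N_SECTORS
example_eye = [90.0] * 7 + [70.0] * 13 + [50.0] * 12
print(sdoct_score(example_eye, limits_975, limits_995))  # 13*1 + 12*2 = 37
```

Because each sector contributes at most 2 points, the score is bounded by 2 × 32 = 64, in line with the range stated in the text.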

Figure 1

Determination of RNFL thickness for a normal (left) and glaucoma eye (right). Top row: confidence intervals and average thickness of healthy subjects and individual thickness results in 32 sectors. Bottom row: individual SDOCT B-scans. Grey lines in B-scans indicate segmentation of the peripapillary RNFL (arrows). For statistical evaluation we used an individual SDOCT-score considering the lower limits of the confidence interval of the healthy subjects in all 32 sectors (dotted area: 97.5% level, hatched: 99.5% level, thick line: mean thickness). Thus, the range of this SDOCT-score is 0–64. In the normal eye (left) all individual thickness values (light grey dots) are in the normal range. In the glaucoma patient (right) the score is 37 (7 segments within 97.5%, 13 segments between 97.5 and 99.5% and 12 segments outside 99.5%).

Evaluation of FDT-perimetry

The FDT-perimeter (Zeiss Humphrey Systems, Dublin, CA, USA) is a device that tests contrast sensitivity with pattern reversal stimuli in the central visual field. A low-spatial-frequency sine-wave grating pattern undergoing high-temporal-frequency (25 Hz) counter-phase flicker is presented in one of the target locations on a random basis. The technology and the paradigm were described in detail earlier.19 The screening procedure (C-20-5 or N-30-5) presents stimuli with a contrast that 95% of the normal population of the corresponding age group is able to detect. If the stimulus was detected, it was assumed that contrast sensitivity is within normal limits and no further testing was performed at that location. If the initial stimulus was missed, the same stimulus was presented at that location a second time. If it was missed again, the instrument presented a stimulus with a contrast detectable by 98% of the normal population, and if this was missed, a stimulus with a contrast detectable by 99% of the normative subjects was presented. This strategy allows a score ranging from zero (ie first presentation seen) to four (ie 99% level not seen) to be generated for each test location.20 Considering all 17 central fields, the score ranges from 0 to 68. In the software version N-30-5 of the present FDT perimeter, two more test locations are studied in a separate step of the test procedure. These additional tests are not considered in the present study. Two types of catch trials were generated to attract the subject's attention and to obtain an impression of the quality of fixation. Tests with more than one positive catch trial were not included. To minimize possible learning effects,21 only subjects with experience in FDT testing in one or more earlier examinations were included.
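The screening staircase and the resulting score can be illustrated with a short Python sketch. The function names and the data layout (one list of seen/missed responses per location) are hypothetical. The text fixes only the endpoints of the scale (0 and 4); the intermediate values 1 to 3, taken here as the number of presentations missed before the first detection, are an assumption of this sketch.

```python
from typing import Sequence

# Order of presentations at one location in the screening procedure.
LEVELS = ("95% first", "95% repeat", "98%", "99%")
N_LOCATIONS = 17  # central test fields considered in the score


def fdt_location_score(seen: Sequence[bool]) -> int:
    """Score one location from its ordered responses (True = stimulus seen).

    Testing stops at the first detection, so the score is the number of
    presentations missed before it: 0 if the first stimulus is seen,
    4 if even the 99% stimulus is missed.
    """
    if len(seen) > len(LEVELS):
        raise ValueError("at most four presentations per location")
    for missed_before, was_seen in enumerate(seen):
        if was_seen:
            if missed_before != len(seen) - 1:
                raise ValueError("no presentations expected after a detection")
            return missed_before
    if len(seen) != len(LEVELS):
        raise ValueError("staircase incomplete: no detection recorded")
    return len(LEVELS)


def fdt_score(locations: Sequence[Sequence[bool]]) -> int:
    """Total FDT-score over the 17 central fields (range 0 to 68)."""
    if len(locations) != N_LOCATIONS:
        raise ValueError("expected responses for 17 locations")
    return sum(fdt_location_score(responses) for responses in locations)


# Invented example: 15 locations seen at once, one seen on the repeat,
# one missed at every level.
example = [[True]] * 15 + [[False, True]] + [[False, False, False, False]]
print(fdt_score(example))  # 15*0 + 1 + 4 = 5
```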

Statistical methods

The description of the results in healthy subjects, OHT and the three glaucoma patient groups as given above included means, SDs and percentiles (5 and 95%). Receiver-operating-characteristic (ROC) curves were used to describe the validity of the different glaucoma classifiers (SPSS 15, SPSS Inc., Chicago, IL, USA). ROC analysis was made additionally for those of the early and moderate OAG patients with Jonas-stage 1 or 2. The areas under ROC-curves were compared statistically considering the total area.22 To judge the diagnostic performance of the procedures at a high specificity, we additionally calculated the significance of the ROC-difference and the sensitivities at the cut-points for a specificity of 95.9% (93 of 97 normals). Differences between pairs of curves at a selected false-positive fraction were evaluated with the ROCKIT software.23 For comparison of results between groups, confidence intervals are given. To take into account the error from multiple testing, all significance values were corrected using the method given by Benjamini and Hochberg.24 In addition to the validity of independent parameters of the individual instruments (FDT-score, SDOCT-score and RNFL thickness), the combination of the individual parameters FDT-score and SDOCT-score has been evaluated. The combined score was the sum of both individual scores (combined score = FDT-score + SDOCT-score).
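For readers unfamiliar with the Benjamini and Hochberg correction, the following Python sketch computes adjusted significance values with the standard step-up formulation. It is not the code used in the study; the function name is hypothetical and the P-values in the example are invented for illustration only.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    For the P-value with ascending rank i among m tests, the adjusted
    value is the minimum of p(j) * m / j over all ranks j >= i,
    capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending ranks
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so the running minimum
    # enforces monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented P-values for four comparisons.
raw = [0.01, 0.04, 0.03, 0.05]
print(benjamini_hochberg(raw))
```

A comparison is then called significant at the 5% level when its adjusted value does not exceed 0.05, which is why a raw P-value close to 0.05 can lose significance once several comparisons are considered together.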

Results

The different patient groups of the study were significantly distinguished by both diagnostic procedures (Table 2). Areas under ROC increase according to the severity of the disease from OHT through preperimetric OAG and early OAG to moderate OAG as defined by white-on-white perimetry (Table 2). When the significance of the difference between paired ROC curves was assessed for total areas, we found in the preperimetric patient group that the area of RNFL thickness was significantly larger than the areas of FDT- and SDOCT-scores. In the other patient groups the total areas of ROCs did not differ significantly.

Table 2 Area under ROC and confidence limitsa

For diagnostic purposes in glaucoma, the sensitivities at high specificities are more meaningful measures than the total area under the ROC curve. Therefore, we calculated sensitivities and the difference of the ROC curves at a high specificity. At a pre-given specificity of 95.9% (93 of 97 normals) the combined score (sum of FDT-score and SDOCT-score) revealed the best or at least equal performance compared with single parameters in all subgroups. Significant differences between combined score and single tests are indicated in Figure 2. The combined score was able to detect 22.2% OHT, 44.2% preperimetric OAG, 88.5% early OAG, and 98.0% moderate OAG (numbers given in Table 3). In a direct comparison of instrument-specific parameters (SDOCT-score and FDT-score), the SDOCT-score detected more glaucoma patients than the FDT-score in the preperimetric group (44.2% compared with 24.7%, Table 3). In contrast, in the present perimetric glaucoma groups, the FDT-score and the SDOCT-score performed similarly (Figures 2, 3 and 4, Table 3). When the ROC-curves of the two SDOCT parameters (mean RNFL thickness and SDOCT-score) were compared at high specificity, a possible diagnostic advantage of the present SDOCT-score became visible in patients with focal optic disc damage and in the moderate glaucoma group (P=0.05). This difference is not significant if a correction for multiple testing is performed.
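The idea of reading off the sensitivity at a pre-given specificity can be sketched in Python as follows. The study itself used SPSS and ROCKIT; the helper names below are hypothetical, the scores are invented toy data, and the convention that a score strictly above the cut-point counts as abnormal is an assumption. In the study setting, allowing 4 false positives among 97 normals corresponds to the specificity of 93/97 = 95.9%.

```python
from typing import List, Sequence


def combined_scores(fdt: Sequence[int], sdoct: Sequence[int]) -> List[int]:
    """Combined score per eye: FDT-score + SDOCT-score."""
    if len(fdt) != len(sdoct):
        raise ValueError("need one FDT-score and one SDOCT-score per eye")
    return [f + s for f, s in zip(fdt, sdoct)]


def cutoff_for_specificity(normal_scores: Sequence[int],
                           max_false_positives: int) -> int:
    """Cut-point such that at most `max_false_positives` normals lie
    strictly above it (ties can only raise the specificity further)."""
    if not 0 <= max_false_positives < len(normal_scores):
        raise ValueError("invalid number of allowed false positives")
    ordered = sorted(normal_scores, reverse=True)
    return ordered[max_false_positives]


def sensitivity(patient_scores: Sequence[int], cutoff: int) -> float:
    """Fraction of patients whose score lies strictly above the cut-point."""
    flagged = sum(1 for s in patient_scores if s > cutoff)
    return flagged / len(patient_scores)


# Invented toy data: 8 normals, 1 false positive allowed.
normals = combined_scores([0, 0, 1, 0, 2, 0, 3, 1], [0, 1, 0, 2, 0, 0, 4, 1])
patients = combined_scores([0, 4, 10, 1], [1, 3, 20, 1])
cut = cutoff_for_specificity(normals, max_false_positives=1)
print(cut, sensitivity(patients, cut))  # cut-point 2, sensitivity 0.5
```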

Figure 2

ROC-curves generated in patient groups for FDT-score, SDOCT-score, combined score (sum of FDT-score and SDOCT-score), and mean RNFL thickness. (a) Patients with OHT and OAG in preperimetric and perimetric groups. (b) Sub-groups of patients with concentric or focal disc damage according to the stages by Jonas. The sensitivity is plotted as a function of the specificity. Statistical comparisons of ROC curves have been performed at a specificity of 96%. The combined score is significantly superior or at least comparable to individual scores in the high specificity range. In preperimetric patients, the ROC-curve of the RNFL thickness is significantly above that of the FDT-score. The present SDOCT-score shows higher sensitivities in patients with focal disc damage than in patients with concentric disc damage. Asterisks indicate significant differences of ROC curves at the selected specificity after correction for multiple testing (six comparisons in six groups).

Table 3 Numeric results for data of Figure 3 and mean RNFL thickness: sensitivity in all sub-groups using results from FDT-score and SDOCT-score and combined score (in percent and number of eyes)

To judge individual results, all SDOCT-scores were plotted as a function of the FDT-scores (Figure 3). The scatterplot shows the distribution of the data in subgroups. Spearman rank correlation was significant (r=0.65, P<0.001) if all patients of this study were included, and weakly significant (r=0.41, P<0.05; corrected for multiple testing) for the moderate OAG group. In Figure 3, horizontal, vertical, and diagonal lines indicate the borders of the normal range for SDOCT-score, FDT-score, and combined score. Numbers of cases that were classified glaucomatous according to these lines are given in Table 1b. Using the normal range for single determinations as shown in Table 1b and Figure 3, 100 of all 233 patients had non-normal results for both tests, indicated by falling in the white area above and right of the straight lines. No member of the normal group was found in this area. Using the present combined score of FDT-score and SDOCT-score as indicated by the diagonal line in Figure 3, 141 of the patients were classified glaucomatous.

Figure 3

Graphical presentation of results. All patients are shown in an area spanned by the FDT-score and SDOCT-score. The horizontal (chain), vertical (dotted) and diagonal lines indicate the borders of the normal range for SDOCT-score, FDT-score, and combined score (95.9%). Symbols: OHT (+), preperimetric OAG ( × ), early OAG (□), and moderate OAG (•). Note that each symbol can represent more than one individual from the corresponding group. All patients falling in the white area above and right of the straight lines are correctly classified by both methods, the FDT-score as well as the SDOCT-score (numeric results are given in Table 1b). The patients falling in the grey area are additionally classified as having glaucoma by the combined score.

Discussion

The study compares the ability of a functional test, a morphometric test, and their combination to distinguish normal eyes from glaucoma eyes. The FDT findings in the present groups are in agreement with results of earlier studies indicating a high diagnostic value of this functional test in patients with losses in conventional white-on-white perimetry.20,25,26,27 Our results in the preperimetric glaucoma group are in line with the observation that a part of this group reveals losses in FDT-perimetry earlier than in conventional white-on-white perimetry.20,26 Generally, in comparison to other studies one has to keep in mind that only trained subjects and patients are included in our evaluation. Furthermore, a new SDOCT with automatic segmentation of the RNFL has been used for thickness measurements of the peripapillary RNFL. In contrast to the conventional time-domain OCTs with slow mechanical mirror systems, the spectral-domain OCT technique has some fundamental advantages: it allows much faster data acquisition28,29 and has an improved signal-to-noise ratio.30,31,32 In addition, the Spectralis OCT used in this study is equipped with an eye-tracking system33 to avoid movement artefacts, and B-scans can be averaged online to further improve the signal-to-noise ratio of B-scan images. Furthermore, the tracking system can be used to re-analyze patient eyes at exactly the same position. This is a helpful tool to reduce intra- and inter-individual variability in follow-up examinations. The diagnostic value of SDOCT was found to be higher than34 or comparable to35 that of time-domain OCT. Differences between studies comparing time-domain OCT (Stratus, Carl Zeiss Meditec, Jena, Germany) and spectral-domain OCTs (available from different manufacturers) might be due to differences in normative databases and in the algorithms used for RNFL segmentation in the different instruments. It was shown that the key factor determining whether results from different machines will be comparable is the algorithm used to segment RNFL thickness, not the type of OCT.36 A direct comparison of the validity of the time-domain OCT technique and the present Spectralis OCT in the same glaucoma patients is underway.37

In our study, not only mean RNFL thickness has been found to be a good classifier38,39 but also the SDOCT-score, which is based on the results of our reference group. For this SDOCT-score a sector size of 11.25° was used (360° divided into 32 sectors). Future studies might increase the diagnostic value by optimizing the sector size. Here, at high specificity in the control group, the sensitivity in the moderate OAG group was 94% for the present SDOCT-score in comparison to 86% for mean RNFL thickness (Table 3). Possibly, the present SDOCT-score is especially able to detect those patients who show localized RNFL defects with little effect on the general depression of the RNFL thickness. This observation can be explained by the high prevalence of optic discs with notching of the neuroretinal rim in our perimetric patient group, leading to a high frequency of focal RNFL defects (Table 1a: Jonas-stage 2). Figure 2b shows results of subgroups containing only patients with concentric (Figure 2-5) or focal damage (Figure 2-6). In the group with focal disc damage a possible diagnostic advantage of the SDOCT-score in comparison to the RNFL thickness can be seen for high specificities in Figure 2-6 (not significant after correction for multiple tests). The difference in discrimination performance between SDOCT parameters in patients with concentric and focal disc damage is furthermore expressed by the observation that the difference between SDOCT-score and combined score is significant in the concentric damage subgroup, whereas the difference between RNFL thickness and combined score is significant in the focal damage subgroup.

The present study additionally includes a group of OHT patients in order to study whether the tests might be able to uncover so far undetected structural or functional defects in those subjects. The patients of this OHT group show normal intraocular pressure under treatment and have regular visits in our glaucoma service. Thus, the chance is high that the majority of these patients might never develop glaucoma. Nevertheless, a re-examination with the present methods might be recommended especially in those patients with elevated combined score.

In judging the present results, one has to keep in mind that in practice not all losses indicated in FDT-perimetry are caused by glaucoma, and that the present SDOCT measurements are performed in a parapapillary ring without any information on the specific pathology of the optic disc (eg simple vs glaucomatous optic atrophy). Therefore, the present study has been restricted to selected groups of experienced subjects without anatomically altered optic discs except for glaucoma. The analysis of validity used an artificial staging of the different glaucoma groups by degree of disc atrophy, visual field damage and IOP. The practical consequence of the information gained is to know to what extent these two methods would miss or identify glaucoma in various stages when the patient presents for the first time. Taking the function–structure relationship of glaucoma into account, it is important to note that the combination of FDT and SDOCT performs equally well or even better than each single method alone. As a consequence, on the one hand an early glaucomatous case that would be missed by FDT may still be detected by SDOCT; on the other hand, in a progressed case, when function is more informative than a structural measurement, the advantage of FDT contributes to a higher sensitivity of the combined score.

Numerous studies have been performed to investigate the ability of OCT and FDT to unmask glaucomatous disc atrophy in comparison to other methods that claim to detect glaucomatous damage at an early stage of the disease. Both devices competed successfully with modern perimetric techniques10,19,20,40,42 and other methods to measure the RNFL.8,12,41 In comparing the present structural and functional methods, it is not surprising that RNFL thickness analysis outperforms FDT-perimetry in our preperimetric group, as this group is artificially defined by structural damage alone. In our early perimetric patients the diagnostic value of the FDT-score is comparable to the SDOCT-score, whereas the FDT-score is superior to RNFL-thickness results (Figure 2-4) in the moderate glaucoma group. One reason for this has to be sought in the fact that the diagnostic groups were defined by another sensory test, white-on-white automated visual field testing. As can be seen from Table 1b and Figure 3, the majority (90%, 45/50) of these moderate perimetric patients showed losses not only in SDOCT-score but also in FDT-score, whereas none of the healthy subjects is classified as glaucomatous by both tests at the same time (ie none of the normals is above the chain-line and on the right beyond the dashed line in Figure 3). In the preperimetric glaucoma group 14 of 77 patients (18.2%) were classified as glaucomatous by both tests. The classification rate is considerably higher if both methods are combined by a simple addition of the scores; in this preperimetric group the number of correctly classified patients increased from 14 to 34. Similarly, Shah et al8 could show that combined evaluation of FDT-perimetry and structural data can increase the number of correctly classified perimetric glaucoma patients. In contrast to this earlier study, in which different test combinations were compared after dichotomizing the results into positive or negative, we used a summed score from the two diagnostic devices, allowing the recalculation of a suitable specificity cut-off. In another study, Kim et al38 identified preperimetric patients with abnormal FDT-perimetry and RNFL thinning and stated that coincident FDT and OCT abnormalities could be a possible prognostic indicator for the development of visual field losses.

In summary, the study shows that the combined use of structural and functional testing can be helpful in glaucoma detection. FDT-perimetry and SDOCT are able to unmask a considerable portion of glaucoma patients with visual field losses and quite a number of patients with early forms of the disease. The FDT-score, RNFL thickness, and the newly introduced SDOCT-score, which reflects the fraction of individual RNFL thickness values below the normal range, can be used as additional classifiers for glaucoma diagnosis. Ongoing studies will show the validity of more sophisticated algorithms to include all information derived from both devices. FDT-perimetry is a rapid and inexpensive method for glaucoma screening. This study used the screening protocol of the first-generation FDT-perimeter. Future studies should include results of the FDT Matrix, which has a higher spatial resolution.43 Our results indicate that additional determination of morphometric integrity, in this case RNFL thickness assessed by SDOCT, can be justified especially in early diagnosis; in preperimetric patients the sensitivity of SDOCT is considerably higher than the sensitivity of the presently used FDT perimeter. In more advanced stages of glaucoma the diagnostic value of both devices seems to be comparable. Here, a non-normal SDOCT is an indicator that FDT-defects may be due to diminished retinal nerve fibre layer thickness.