Introduction

Glaucoma is a chronic and progressive optic neuropathy associated with retinal ganglion cell death or dysfunction and with visual field disorder1. Although glaucoma is an irreversible disease, visual field progression can be delayed or arrested by reducing intra-ocular pressure to an optimal level2. Thus, early assessment using structural and functional ophthalmic measurements is crucial to slow the progression of glaucoma.

Previous studies using conventional standard automated perimetry (SAP) have suggested that structural changes in early glaucoma, including changes to the optic disc3,4,5,6,7,8,9,10, the retinal nerve fibre layer (RNFL)3,4,5,6,7,8,9,10 and the macular retinal ganglion cells11,12,13, generally occur prior to visual field damage. However, structural damage from glaucoma does not always precede functional damage, and both structural and functional measurements are needed to reliably detect early glaucoma14, 15. Although conventional SAP is the clinically accepted method for the diagnosis and assessment of glaucoma, its ability to detect early glaucomatous visual field damage has been reported to be slightly inferior to that of structural measurements by optical coherence tomography (OCT)16, 17 and of functional measurements from non-conventional perimetry, such as short-wavelength automated perimetry (SWAP)18, 19, flicker perimetry (Flicker)20, 21, the Moorfields motion displacement test22, 23, flicker defined form perimetry24, 25, and frequency doubling technology (FDT)26, 27.

The Octopus 600 perimeter (Haag-Streit, Koeniz, Switzerland) is based on a thin film transistor LCD and was designed to perform both Pulsar perimetry (Pulsar)28, 29 and SAP30. The Pulsar stimulus was invented in 2000 by Gonzalez-Hernandez and coworkers28. Although a prototype device known as the Pulsar T30W test was used in a pilot study, Pulsar was not commercially available until the Octopus 600 perimeter was released in 2013. Pulsar is a flicker stimulus that displays a ring pattern with different contrast levels in counter phase. Pulsar has been shown to have high detectability for early glaucoma31,32,33,34. Its detectability of early glaucoma has been compared with that of FDT, rarebit perimetry, the Heidelberg Retina Tomograph II (HRT II), and scanning laser polarimetry (GDx VCC)31,32,33,34, but there has been no report comparing it to other non-conventional perimetric methods or to spectral-domain OCT (SD-OCT). Nomoto and coworkers20 reported that Flicker demonstrates a higher area under the receiver operating characteristic curve (AUC) than FDT or SWAP. Thus, there is a need to compare the detectability of Pulsar, Flicker, and SD-OCT. The aim of this study was therefore to assess the diagnostic capability of Pulsar for pre-perimetric glaucoma (PPG) and early glaucoma (EG) and to compare this capability with those of Flicker and SD-OCT.

Results

Three eyes of 3 glaucoma patients and 2 eyes of 2 normal participants were excluded due to failure to meet the inclusion criteria. Thus, 60 eyes of 60 glaucoma patients and 42 eyes of 42 normal participants were analysed. Of the 60 eyes of the 60 glaucoma patients, 25 eyes were classified as pre-perimetric glaucoma (PPG; 24 eyes with normal tension glaucoma and 1 eye with primary open angle glaucoma), and 35 eyes were classified as early glaucoma (EG; 20 eyes with normal tension glaucoma, 11 eyes with primary open angle glaucoma, and 4 eyes with secondary glaucoma). Table 1 shows demographic data of normal participants, PPG patients, and EG patients. Table 2 shows the results of each parameter from Pulsar, Flicker, and SD-OCT in normal participants, PPG patients, and EG patients. Although there was no difference in Flicker results between the control group and the PPG group, there were significant differences in the results from both Pulsar and SD-OCT between the control group and the PPG group.

Table 1 Demographics of normal participants and glaucoma patients.
Table 2 Comparison of each parameter measured with Pulsar, Flicker, and SD-OCT.

The best parameters for discriminating the control group from the PPG group by Pulsar, Flicker, and SD-OCT (OCT-disc and OCT-macular) were mean defect (AUC = 0.733), number of abnormal points with a p < 0.01 (AUC = 0.663), vertical cup-to-disc ratio (AUC = 0.842), and superior macular RNFL (mRNFL) thickness (AUC = 0.830), respectively. The AUC from the Flicker results was significantly lower than that from the OCT-disc results (p = 0.016), but there was no other significant difference among the three methods. The best parameters for discriminating the control group from the EG group by Pulsar, Flicker, OCT-disc, and OCT-macular were mean defect (AUC = 0.851), square loss variance (AUC = 0.869), inferior circumpapillary RNFL (cpRNFL) thickness (AUC = 0.907), and inferior mRNFL thickness (AUC = 0.861), respectively. There was no significant difference in AUC among the four methods. ROC curves are shown in Fig. 1. The AUC, best cut-off value, sensitivity and specificity, and sensitivity at both 80% and 90% specificity between the control group and the PPG group and between the control group and the EG group are shown in Tables 3 and 4, respectively.

Figure 1

Receiver operating characteristic curve of each device. Area under the curve (AUC) of the best parameters from Pulsar perimetry, Flicker perimetry, optical coherence tomography (OCT) disc measurement, and OCT macular measurement in pre-perimetric glaucoma (PPG) and early glaucoma (EG). The (*) shows significance at p = 0.016 by the DeLong test.

Table 3 Results of receiver operating characteristic analysis between control and pre-perimetric glaucoma eyes.
Table 4 Results of receiver operating characteristic analysis between control and early glaucoma eyes.

Comparisons of sensitivity and specificity values at best cut-off and of sensitivity values at 80% and 90% specificity are shown in Table 5. For the PPG group, sensitivity at best cut-off and at 90% specificity of Pulsar was significantly lower than that of OCT-disc (p < 0.013) and OCT-macular (p < 0.008). However, specificity at best cut-off of Pulsar was significantly higher than that of OCT-disc (p = 0.003) and OCT-macular (p = 0.001). The sensitivity and specificity of Pulsar were equal to or better than those of Flicker. For EG, sensitivity and specificity at best cut-off and sensitivity at 80% specificity of Pulsar were equal to those of Flicker, OCT-disc, and OCT-macular. However, sensitivity at 90% specificity of Pulsar was significantly lower than that of OCT-disc (p = 0.030).

Table 5 Statistical comparison of sensitivity and specificity values at best cut-off and sensitivity values at fixed specificity among each device.
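As an illustration of how a sensitivity at fixed specificity can be read off a set of measurements, the following is a minimal sketch and not the procedure implemented in the statistical software used in this study. The function name, the example values, and the convention that higher values are more abnormal (as for a mean defect) are assumptions made for the example.

```python
def sensitivity_at_specificity(controls, patients, target_specificity):
    """Highest sensitivity among cut-offs whose specificity meets the target.

    Assumption: higher values are more abnormal, and an eye is flagged
    as abnormal when its value is strictly greater than the cut-off.
    """
    # Candidate cut-offs: every observed value, plus one below the minimum
    # so that "flag every eye" is also considered.
    candidates = sorted(set(controls) | set(patients))
    candidates = [candidates[0] - 1.0] + candidates

    best = 0.0
    for cut in candidates:
        specificity = sum(v <= cut for v in controls) / len(controls)
        if specificity >= target_specificity:
            sensitivity = sum(v > cut for v in patients) / len(patients)
            best = max(best, sensitivity)
    return best


# Hypothetical values for illustration only (not study data).
controls = [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9]
patients = [0.8, 1.6, 2.0, 2.4, 2.8]

sens_80 = sensitivity_at_specificity(controls, patients, 0.80)
sens_90 = sensitivity_at_specificity(controls, patients, 0.90)
```

For parameters where lower values are more abnormal, such as RNFL thickness, the values can be negated before calling the function.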

Figure 2 shows a Venn diagram of Pulsar, Flicker, OCT-disc and OCT-macular parameters, showing the agreement between structural and functional measurements. Abnormality was based on the cut-off value of the best parameter. There were no patients with normal results for all devices. Six of the 25 PPG eyes (24%) and 17 of the 35 EG eyes (48.6%) showed abnormal results from all four methods. However, PPG and EG were detected with 100% sensitivity by functional or structural measurement. The agreement between structural and functional measurements expressed by kappa values ranged from −0.16 to 0.07 in the PPG group and from 0.01 to 0.25 in the EG group.

Figure 2

Venn diagram of the eyes detected as abnormality in each device. Abnormality is based on the cut-off value of the best parameter of Pulsar perimetry, Flicker perimetry, optical coherence tomography (OCT) disc measurement, and OCT macular measurement. The agreement values between structural and functional measurements are expressed by kappa coefficients under the Venn diagram.
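To make the kappa coefficient concrete, the sketch below computes Cohen's kappa for two binary classifications of the same eyes (1 = abnormal, 0 = normal). The function name and the example labels are hypothetical and are not taken from the study data. A value near or below zero, as in the ranges reported here, means the two methods agree no more often than chance would predict.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two binary (0/1) classifications of the same eyes."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both label lists must be non-empty and equally long")
    n = len(labels_a)

    # Observed proportion of eyes on which the two methods agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance from each method's abnormal rate.
    rate_a = sum(labels_a) / n
    rate_b = sum(labels_b) / n
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)

    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


# Hypothetical labels for eight eyes (illustration only).
structural = [1, 1, 1, 1, 0, 0, 1, 1]
functional = [1, 0, 0, 1, 1, 1, 0, 1]
kappa = cohen_kappa(structural, functional)
```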

Table 6 shows the statistical comparison of the test duration and reliability index for both the false positive (FP) and false negative (FN) response rates between Pulsar and Flicker perimetry. The reliability index of Flicker was significantly worse than that of Pulsar in the control group, the PPG group, and the EG group (p < 0.001). Test duration of Pulsar was significantly shorter than that of Flicker in the control group (p < 0.001), the PPG group (p < 0.001), and the EG group (p < 0.001).

Table 6 Statistical comparison of reliability indices and test duration between Pulsar and Flicker perimetry.

Discussion

In the current study, we found that the diagnostic capability of Pulsar was equal to Flicker and SD-OCT for both PPG and EG patients. However, the agreement of detectability of structural and functional measurements was poor, and structural measurements appeared to be more sensitive than functional measurements in PPG patients. In contrast, functional measurements using Pulsar were equal to structural measurement in EG patients.

To date, many studies have reported on the diagnostic capability for PPG using specific functional measurements with methods such as FDT (AUC = 0.666 to 0.802)20, 35,36,37, Flicker (AUC = 0.800)20, SWAP (AUC = 0.660 to 0.704)20, 37, and Pulsar (AUC = 0.733)31. Multiple studies have also investigated the diagnostic capability for PPG using structural measurements with methods such as OCT (AUC = 0.527 to 0.938)38,39,40, HRT (AUC = 0.740 to 0.914)39, 40, and GDx (AUC = 0.688 to 0.894)39. Although it is difficult to make a direct statistical comparison between the AUC results of the current study and those of previous studies because of differences in sample size and characteristics, we report an AUC for PPG of 0.733 using Pulsar, similar to the results of previous studies on specific functional measurements20, 31, 35,36,37. Although the AUC from Pulsar did not differ from other devices in the current study, the AUCs from specific functional measurements were slightly lower than those that have been reported by previous studies. When sensitivity values at the best cut-off or at fixed specificity were compared for each device, the structural measurements from OCT appeared more sensitive than the specific functional measurements reported by the current study and by a previous study31. This might be because PPG was determined by clinical structural assessment of the optic disc shape based on the general mechanism of pathogenesis1, 41, 42. The best parameter for discriminating the control group from the PPG patients was cup shape, not cpRNFL thickness. Indeed, the diagnostic capability of structural measurements of the optic nerve head is better than that of cpRNFL thickness for classifying PPG43, 44.

For EG, the diagnostic capabilities using AUC of both Pulsar and Flicker were equal to those determined from SD-OCT. This is in agreement with a previous study reporting that the diagnostic capability of Pulsar using AUC was equal to FDT, HRT II, and GDx VCC31. Functional measurements appear to be more sensitive for EG diagnosis than structural measurements in both a previous study31 and the current study; however, we report the opposite for PPG diagnosis. This might be due to the fact that all EG patients had abnormal SAP results corresponding with Anderson-Patella criteria45 in addition to a glaucomatous optic nerve head. Additionally, as glaucoma progresses from PPG, superior or inferior RNFL thinning occurs along with changes to the optic nerve head. Thus, it could be reasonably expected that functional measurements are more sensitive than structural measurements for EG diagnosis.

Although the sensitivity for PPG at best cut-off of Pulsar was lower than the sensitivity of OCT, specificity at best cut-off was higher than that of OCT. However, both sensitivity and specificity for EG at best cut-off with Pulsar were equal to those of OCT. The agreement between structural and specific functional measurements in both PPG and EG was poor. However, PPG and EG were detected with either structural or specific functional measurements with 100% sensitivity (Fig. 2). There was no method able to accurately detect glaucoma using only one parameter. Based on the current study, the combination of OCT for structural measurement and Pulsar for selective functional measurement should be recommended to reliably diagnose PPG and EG.

Reliability indices of FP and FN for Flicker were worse than those for Pulsar, and an especially high FP rate was demonstrated for Flicker in both PPG and EG. Flicker fusion frequency was measured at each test point with a fixed high contrast stimulus of 0 dB, while contrast sensitivity was measured with a fixed temporal frequency of 10 Hz for Pulsar. Eyes with PPG or EG, such as those investigated in this study, can still respond to Flicker stimuli, although with decreased sensitivity. Flicker is difficult for PPG or EG patients to respond to accurately, as the flickering target is close to threshold in regions of slightly decreased sensitivity. Threshold variability has been reported to increase even in SAP measurements in regions of slightly decreased sensitivity46.

The test duration of Pulsar was shorter than that of Flicker despite the use of the same tendency oriented perimetry (TOP) algorithm. This may be due to the difference in number of test points between Pulsar and Flicker. The 32 P test point of Pulsar is similar to the original Octopus 32 test point program with a 6-degree interval, but the 4 points at the superior and inferior were each removed. In contrast, Flicker was measured with the original Octopus 32 test point program. Another reason might be the difference in stimulus presentation time: a presentation time of 500 msec was applied for Pulsar, whereas a 1 sec presentation time was applied for Flicker.

Pulsar and Flicker each have several advantages and disadvantages. Pulsar may have the advantage of ease of use and less fatigue compared with Flicker because it demonstrated good reliability indices and a shorter test duration in the current study, and previous studies reported that Pulsar is not associated with a learning effect47, whereas Flicker is48. However, Pulsar was reported to be affected by intraocular straylight, as is SAP49. In contrast, Flicker was reported to be unaffected by ocular media opacity50. Thus, Pulsar may have the disadvantage of being less robust to media opacities than Flicker.

The main limitation of the current study was that the distribution of glaucoma types differed between the PPG and EG groups; in particular, almost all PPG patients had normal tension glaucoma. Further studies will therefore be required to confirm our results.

In conclusion, the diagnostic capability of Pulsar for PPG and EG was equal to that of Flicker and OCT. However, the agreement between structural and functional measurements for PPG and EG was poor. The structural measurements from OCT were more sensitive than the specific functional measurements from Pulsar for PPG, while specific functional measurements by Pulsar were more sensitive than the structural measurement by OCT for EG. Therefore, a combination of structural and functional measurements is recommended to reliably diagnose early glaucoma.

Methods

This prospective cross-sectional study was reviewed and approved by the Kitasato University Hospital Ethics Committee (no. B14-129). All study conduct adhered to the tenets of the Declaration of Helsinki, and all study subjects provided written informed consent. This study was registered in the UMIN Clinical Trials Registry (http://www.umin.ac.jp/) under the unique trial number UMIN000016055.

Study participants

This study included 63 eyes of 63 open angle glaucoma patients who visited the Kitasato University Hospital Glaucoma Service between November 2014 and May 2016 and who had previous SAP results from a Humphrey Field Analyzer (HFA) 24-2 or 30-2 Swedish Interactive Threshold Algorithm (SITA) Standard better than a mean deviation of −3 dB. Additionally, the control group consisted of 45 eyes of 45 normal volunteers from a population of Kitasato University Hospital medical staff and Kitasato University staff members who were recruited between January and May 2016 and who had undergone previous SAP measurements with an HFA 24-2 or 30-2 SITA Standard at least twice within one year. The diagnosis of glaucoma was determined via a fundus examination using slit-lamp indirect ophthalmoscopy with a 90-diopter lens by one of three glaucoma specialists (MK, KM, or NS) and based on previous SAP results. All glaucoma patients and normal participants underwent a comprehensive ophthalmic examination, including noncycloplegic refraction testing, visual acuity testing at 5 meters using a Landolt ring chart, intraocular pressure measurement, ocular axial length measurement, and slit-lamp and fundus examination by a glaucoma specialist (MK, KM, or NS). Glaucoma patients were included in this study if they had a corrected visual acuity of 20/20 or better, cylindrical power of −1.50 diopter or less, and spherical equivalent of −8.00 to +5.00 diopter. These criteria were also applied to normal participants, with the added criteria of an intraocular pressure of 21 mmHg or less, a normal optic disc appearance, and no ophthalmic disease other than refractive error.

After comprehensive ophthalmic examination, all glaucoma patients and normal participants underwent an initial SAP measurement. This SAP measurement was performed with an HFA (Carl Zeiss Meditec, Dublin, CA) 24-2 or 30-2 SITA Standard. SAP results were considered reliable if the fixation loss was <20%, the false positive rate was <15%, and the false negative rate was <33%.

Early glaucoma patients

After SAP measurement, glaucoma patients were classified as EG if they showed structural glaucoma changes such as rim thinning, notching, and nerve fibre layer thinning or defects, and if they showed abnormal SAP results corresponding with Anderson-Patella criteria45.

Pre-perimetric glaucoma patients

After SAP measurement, patients were classified as PPG if they showed the abovementioned structural glaucoma changes in the absence of abnormal SAP results. After SAP measurements, all patients and normal participants underwent Pulsar, Flicker and SD-OCT in random order.

Pulsar perimetry

Pulsar perimetry was performed using the Octopus 600 perimeter 32 P TOP algorithm. The 32 P test point is similar to the original Octopus 32 test point program with a 6-degree interval, but the 4 points at the superior and inferior were each removed because of limitations of the angle of field of the monitor. The stimulus consisted of a circular, sinusoidal, 5-degree diameter grating pattern that was presented for 500 msec. The stimulus underwent a counter phase pulse motion at 10 Hz, in which both spatial resolution (from 0.5 to 6.3 cycle/degree on a 12-step log scale) and contrast (from 3 to 100% on a 32-step log scale) were simultaneously modified. Threshold sensitivity is expressed in spatial resolution contrast units (src). Refractive error was corrected to distance by inserting trial lenses with the spherical equivalent correction into the eye piece. The presentation ratios of FP and FN catch trials were configured to 10% of the total number of stimuli presented for Octopus 600 testing reliability, which corresponds with those of the HFA SAP performed with the SITA protocol.

Flicker perimetry

Flicker perimetry was performed using the Octopus 311 perimeter (Haag-Streit, Koeniz, Switzerland) 32 TOP algorithm. The stimulus consisted of a Goldmann size III (0.43 degree) target with a luminance of 1527 cd/m² (4800 apostilbs) that was presented for 1 sec. The flicker targets were presented under a supra-threshold condition with a background luminance of 10 cd/m² (31.5 apostilbs), and critical flicker frequency values were evaluated at each test point. Threshold sensitivity is expressed in critical flicker frequency (Hz). The presentation ratios of FP and FN catch trials were configured to 10% of the total number of stimuli presented for Octopus 311 testing reliability, which corresponds with those of HFA SAP performed with the SITA protocol.
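The two luminance units quoted for the Flicker stimulus are related by a fixed factor: one apostilb equals 1/π cd/m². The short sketch below applies that conversion to the values given in the text; the function name is hypothetical. The results come out at roughly 4797 and 31.4 apostilbs, so the quoted figures of 4800 and 31.5 are presumably rounded nominal values.

```python
import math


def cd_per_m2_to_apostilb(luminance_cd_per_m2):
    """Convert luminance from cd/m^2 to apostilbs (1 asb = 1/pi cd/m^2)."""
    return luminance_cd_per_m2 * math.pi


stimulus_asb = cd_per_m2_to_apostilb(1527)    # stimulus luminance from the text
background_asb = cd_per_m2_to_apostilb(10)    # background luminance from the text
```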

SD-OCT

SD-OCT imaging was performed using 3D OCT-2000 version 8.1.1 (Topcon, Tokyo, Japan) in the 3D optic disc horizontal raster scan mode (OCT-disc) with a 512 × 128 scan resolution and 6 × 6 mm scan area and in the 3D macular vertical raster scan mode (OCT-macular) with a 512 × 128 scan resolution and 7 × 7 mm scan area. This device operates at a speed of 50,000 A-scans per second and has a depth and lateral resolution of 6 and 20 μm or less, respectively. It requires a pupil size of 2.5 mm or larger for imaging. Ocular magnification was corrected based on Littmann's formula51.

Outcome measures and exclusion criteria

The main outcome measures were the diagnostic capability of each device using the best cut-off parameter for discriminating between healthy and glaucomatous eyes and the agreement of detectability between structural and functional measurements. The secondary outcome measures were the comparison of reliability indices, including FP and FN, and the test duration between Pulsar and Flicker.

All examinations were performed within a three-month period. The results of the first examination were excluded to avoid learning effects. Right eye results were converted to left eye format for analysis. The exclusion criteria were as follows: fixation loss >20% and false positive rate >15% in HFA measurement; reliability factor >15%, which is the average of the FP and FN rates in Flicker and Pulsar; and image quality index <30 in SD-OCT.
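The exclusion rules can be written down as a small check, shown here as a sketch under stated assumptions and not as the screening procedure actually used. All rates are in percent. The function and argument names are hypothetical, and the code assumes that exceeding either HFA limit alone is enough to exclude an eye, which the wording above does not state explicitly.

```python
def reliability_factor(fp_rate, fn_rate):
    """Average of the false positive and false negative rates (percent)."""
    return (fp_rate + fn_rate) / 2.0


def exclusion_reasons(hfa_fixation_loss, hfa_false_positive,
                      pulsar_fp, pulsar_fn,
                      flicker_fp, flicker_fn,
                      oct_image_quality):
    """Return the list of exclusion criteria that an eye violates."""
    reasons = []
    # Assumption: either HFA index above its limit excludes the eye.
    if hfa_fixation_loss > 20 or hfa_false_positive > 15:
        reasons.append("HFA reliability")
    if reliability_factor(pulsar_fp, pulsar_fn) > 15:
        reasons.append("Pulsar reliability factor")
    if reliability_factor(flicker_fp, flicker_fn) > 15:
        reasons.append("Flicker reliability factor")
    if oct_image_quality < 30:
        reasons.append("SD-OCT image quality")
    return reasons


# Hypothetical eye: only the Flicker reliability factor (20%) is too high.
example = exclusion_reasons(5, 0, 0, 10, 20, 20, 45)
```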

Statistical analysis

Normality of the data distribution was evaluated using the Shapiro-Wilk test. Test results were compared using either paired t-tests or Wilcoxon signed-rank tests, and Bonferroni correction was applied for multiple testing. The best cut-off parameter for each device for discrimination between healthy and glaucomatous eyes was defined as the parameter with the highest AUC in receiver operating characteristic analysis. The detectability of the devices was compared using the AUCs of their best cut-off parameters with the DeLong method. Kappa statistics were calculated to evaluate agreement of detectability between structural and functional measurements. All data were analysed using commercially available SPSS version 22.0 (IBM Japan Ltd, Tokyo, Japan) and MedCalc version 16.1 (MedCalc Software, Ostend, Belgium).
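For readers less familiar with receiver operating characteristic analysis, the sketch below shows one way an AUC and a best cut-off can be computed from raw values. It is illustrative only: the analyses in this study were run in SPSS and MedCalc, and the text does not state how the best cut-off was chosen, so the use of the Youden index (sensitivity + specificity − 1) here is an assumption. The function names and example values are hypothetical, higher values are taken to be more abnormal, and the DeLong comparison of AUCs is not implemented.

```python
def auc(controls, patients):
    """AUC as the probability that a patient value exceeds a control value.

    Ties count as half, which matches the Mann-Whitney formulation.
    """
    wins = 0.0
    for p in patients:
        for c in controls:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patients) * len(controls))


def best_cutoff_youden(controls, patients):
    """Cut-off maximising the Youden index (an assumed criterion).

    An eye is flagged as abnormal when its value is strictly greater
    than the cut-off. Returns (cut-off, sensitivity, specificity).
    """
    best = None
    best_j = -1.0
    for cut in sorted(set(controls) | set(patients)):
        sensitivity = sum(v > cut for v in patients) / len(patients)
        specificity = sum(v <= cut for v in controls) / len(controls)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_j = j
            best = (cut, sensitivity, specificity)
    return best


# Hypothetical values for illustration only (not study data).
controls = [0.2, 0.5, 0.9, 1.4, 2.1]
patients = [1.5, 1.8, 2.5, 3.0, 3.6]

area = auc(controls, patients)
cutoff, sens, spec = best_cutoff_youden(controls, patients)
```

For thickness parameters, where lower values are more abnormal, negating the values before the calls gives the equivalent analysis.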