Introduction

The results of threshold visual field tests are often subjected to a number of analytical methods that can be broadly classified as point-wise or global. Global methods reduce the visual field data to a single number that can be compared with normative databases and monitored over time to aid the clinician in making decisions regarding the management of new and existing patients. Most of the visual field global indices we use today have been specifically designed for the detection and monitoring of glaucoma. They recognize and focus on specific characteristics of glaucomatous loss; changes in overall sensitivity (mean deviation (MD)), localized defects (pattern standard deviation (PSD)) and asymmetry between the superior and inferior hemifields (glaucoma hemifield test (GHT)).1, 2, 3

The frequent asymmetry between the superior and inferior hemifields of glaucomatous eyes was first recognized in the 1980s when several research groups4, 5, 6 described the onset and evolution of glaucomatous visual field defects. In 1985 Duggan et al7 reported that high discriminatory power between normal and primary open angle glaucoma (POAG) eyes could be achieved by comparing the vertical hemifields. The diagnostic potential of vertical asymmetries was further developed by Asman and Heijl,8, 9 who derived the GHT that is incorporated in the Humphrey Field Analyzer (HFA; Humphrey STATPAC; Allergan Humphrey, San Leandro, CA, USA). The GHT uses a subset of the 24-2 test pattern (22 test locations in each of the superior and inferior hemifields). The 22 test locations in each hemifield are divided into 5 sectors on the basis of the mapping of retinal locations to the optic nerve head, the inferior sectors being vertically mirrored copies of the superior sectors. The GHT assigns a score to each test point based upon values presented in the pattern deviation probability maps and then calculates a sum for each sector. The differences between mirrored sector pairs are then compared with a database of controls. If the difference for any sector pair is beyond the 0.5 or 99.5% normal limits the test is classed as ‘outside normal limits’. If none of the sector differences is beyond these limits but at least one is beyond the 3 or 97% limits the field is classified as ‘borderline’. If neither set of limits is exceeded the field is classified as ‘within normal limits’. To capture symmetrical losses within each sector pair the GHT also looks at the total score for each of the 10 sectors.
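The three-way classification described above can be sketched as follows. This is only an illustration of the decision rule, not the HFA implementation: the function name and the idea of passing each sector-pair difference as a percentile of the normative distribution are assumptions, because the actual normative limits are not publicly available. The sketch also omits the additional check on total sector scores.

```python
def classify_sector_percentiles(percentiles):
    """Classify a field from the normative percentile (0-100) of each
    sector-pair difference. Hypothetical helper, not the HFA algorithm."""
    # Any pair beyond the 0.5% / 99.5% limits marks the field as abnormal.
    if any(p < 0.5 or p > 99.5 for p in percentiles):
        return "outside normal limits"
    # Otherwise, any pair beyond the 3% / 97% limits marks it as borderline.
    if any(p < 3.0 or p > 97.0 for p in percentiles):
        return "borderline"
    return "within normal limits"


# Five sector pairs, one of which lies beyond the 97% limit (made-up values).
result = classify_sector_percentiles([40.0, 55.0, 98.2, 61.0, 30.0])
```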

GHT has found wide acceptance within the glaucoma faculty. It has been shown to provide earlier discrimination of visual field loss than other global indices (MD; PSD) especially when repeated measures show consistent asymmetry.10 Its high diagnostic performance has led to it being used in several major glaucoma trials including the Collaborative Initial Glaucoma Treatment Study11 and the Early Manifest Glaucoma Treatment Study.12

The GHT assumes a constant mapping of the retinal locations to the optic nerve head, which is now recognized as being incorrect.11 There is thus a danger that early loss could cut across the GHT sector boundaries and fail to show significant abnormality in either sector.

The use of five descriptive outputs from the GHT, rather than a continuous scale as is found with other global indices (MD and PSD), is less than ideal. A continuous output scale could include probability estimates and allow asymmetry to be tracked over time.

The computation of the GHT output is based upon the values presented in the pattern deviation probability map and is thus difficult to reproduce outside of the HFA software as the probability limits are not in the public domain.

The aim of this study is to describe three new superior–inferior asymmetry analyses for detecting visual field loss in POAG and to compare these with the GHT. Our objectives in developing these new indices are to avoid the use of sectors within each hemifield, to provide a continuous output scale with probability values and to use well-defined and easily reproduced computational methods that can be incorporated in third-party analytical software.

Patients and methods

All the data used in this study were obtained retrospectively from patients attending the Manchester Royal Eye Hospital between 2007 and 2012. Visual field data came from the 24-2 test pattern using the SITA strategy of the HFA. Visual field data were exported from the HFA to Microsoft Excel 2010 (Microsoft, Redmond, WA, USA) for analysis. The stage of visual field loss was quantified with Glaucoma Staging System II (GSS 2).13 Selection took no account of the reliability indices produced by the HFA software.

Patients

A total of 30 899 SITA 24-2 visual field records were initially extracted from the database. Eyes with GSS 2 stage ≤1 were then randomly selected and, following a record review to ensure they met the inclusion criteria, assigned to one of the following groups.

Controls (n=412 eyes from 296 patients)

All patients had normal visual fields in both eyes at two consecutive visits (GHT within normal limits and no clusters of three or more neighboring locations with an age-adjusted defect of ≥5 dB). Each included eye had a corrected visual acuity of 6/9 or better in patients <50 years of age or 6/12 or better in patients >50 years of age, intraocular pressures (IOP) of <22 mm Hg (Goldmann applanation), refractive error of between −5 and +5 diopters (spherical equivalent), no history of serious eye disease or trauma, a normal optic nerve head (clinical examination) and an absence of any abnormal ocular findings likely to affect the visual field.

Primary open angle glaucoma (n=247 eyes from 222 patients)

All POAG eyes had an IOP history of >21 mm Hg, open angles observed by gonioscopic examination, and a clinical diagnosis of POAG. In some cases the selected eye was the better of the two eyes and the diagnosis of POAG might have largely been made on the basis of the other eye's findings. These eyes were included to provide a significant challenge to asymmetry analyses.

New superior/inferior asymmetry analysis

The new asymmetry analyses use the same test locations as the GHT: a subset of the 24-2 test pattern (44 of 54 locations) that excludes locations at or near the blind spot (Figure 1). Each test location in the superior hemifield was paired with its vertically mirrored location in the inferior hemifield, resulting in 22 test pairs.

Figure 1

A total of 22 pairs of asymmetry points. The 44 test points (black and gray annulus) of the 24-2 test pattern used in the analysis. Each point in the superior hemifield (black annulus) was paired with a vertically reversed point in the inferior hemifield (gray annulus) to provide 22 test pairs.
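The pairing of superior and inferior locations can be sketched as below. The function name and the coordinate convention (degrees, with positive y in the superior hemifield) are assumptions made for illustration, and the demonstration coordinates are a small made-up subset rather than the 44 locations used in the study.

```python
def mirror_pairs(locations):
    """Pair each superior location (y > 0) with its mirror image (x, -y).

    `locations` is an iterable of (x, y) tuples in degrees. Hypothetical
    helper for illustration only.
    """
    available = set(locations)
    pairs = []
    for x, y in sorted(available):
        if y > 0:
            mirrored = (x, -y)
            if mirrored not in available:
                raise ValueError(f"no inferior partner for {(x, y)}")
            pairs.append(((x, y), mirrored))
    return pairs


# Illustrative subset of a 6-degree grid, not the full test pattern.
demo = [(-9, 3), (-3, 3), (3, 3), (-9, -3), (-3, -3), (3, -3)]
pairs = mirror_pairs(demo)
```

Applied to the 44 retained locations, a routine of this kind would be expected to return the 22 pairs shown in Figure 1.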

Figure 2 gives a flowchart of the analytical steps used in the new asymmetry analyses. The analysis is based on HFA defect values rather than a scoring of the values presented in the pattern deviation probability map, as used in the GHT, and does not pool data within any sectors. A bootstrapping procedure was used on the control sample (2000 samples of 412 eyes, randomly selected with replacement) from which the analysis empirically derived the normal limits for the hemifield mean difference (HMD) and the hemifield standard deviation of differences (HSD).

Figure 2

Analytical flowchart. Flowchart of the analytical steps taken in the hemifield analysis. NP is the number of pairs falling outside the 85% confidence limits for each of the 22 test pairs.
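A minimal sketch of how the three indices and the bootstrapped limits might be computed is given below. Several details are assumptions rather than statements of the method used here: the difference is taken as superior minus inferior, the limits are two-sided, HSD is the sample standard deviation of the 22 differences, and the bootstrap estimate is the mean of the percentile limits across resamples. All function and parameter names are hypothetical.

```python
import numpy as np


def hemifield_indices(superior, inferior, lower, upper):
    """Return (HMD, HSD, NP) for one eye.

    superior, inferior: defect values (dB) for the 22 mirrored pairs.
    lower, upper: normal limits for the pair differences (scalar or per pair).
    """
    diff = np.asarray(superior, float) - np.asarray(inferior, float)
    hmd = diff.mean()                 # hemifield mean difference
    hsd = diff.std(ddof=1)            # standard deviation of the differences
    n_pairs = int(np.sum((diff < lower) | (diff > upper)))  # pairs outside limits
    return hmd, hsd, n_pairs


def bootstrap_pair_limits(control_diffs, coverage=0.85, n_boot=2000, seed=0):
    """Estimate per-pair normal limits from control eyes by bootstrapping.

    control_diffs: array of shape (n_eyes, n_pairs) of pair differences.
    Each resample draws n_eyes eyes with replacement.
    """
    rng = np.random.default_rng(seed)
    control_diffs = np.asarray(control_diffs, float)
    n_eyes = control_diffs.shape[0]
    tail = 100.0 * (1.0 - coverage) / 2.0
    lows, highs = [], []
    for _ in range(n_boot):
        sample = control_diffs[rng.integers(0, n_eyes, size=n_eyes)]
        low, high = np.percentile(sample, [tail, 100.0 - tail], axis=0)
        lows.append(low)
        highs.append(high)
    return np.mean(lows, axis=0), np.mean(highs, axis=0)
```

The same resampling loop could be used to obtain limits for HMD and HSD by computing each index per control eye and taking percentiles of the resampled values.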

The sensitivity and specificity of various cut-off criteria were calculated from the control and POAG populations to derive ROC curves for HMD, HSD, and number of test pairs (NP). The sensitivity and specificity of the GHT analysis (values copied from charts) were also established, once with borderline values classified as ‘within normal limits’ and once with them classified as ‘outside normal limits’. The ROC curve areas were compared using a method derived by Hanley and McNeil.14
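The ROC area and its standard error can be sketched as follows. The sketch assumes the widely used Hanley and McNeil approximation for the standard error of an ROC area and a z statistic for comparing two areas; whether this matches the exact variant applied here is an assumption, and the correlation `r` between two areas measured on the same eyes has to be estimated separately and supplied by the user. The function names are hypothetical.

```python
import math


def auc_mann_whitney(control_scores, case_scores):
    """AUC as the probability that a case scores higher than a control.

    Ties contribute 0.5. Assumes higher scores indicate abnormality.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def hanley_mcneil_se(auc, n_cases, n_controls):
    """Standard error of an ROC area (Hanley and McNeil approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)


def compare_aucs(auc1, se1, auc2, se2, r=0.0):
    """z statistic for the difference between two ROC areas.

    r is the correlation between the two areas (0 for independent samples).
    """
    return (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)


se_example = hanley_mcneil_se(0.863, n_cases=247, n_controls=412)
```

With an area of 0.863, 247 POAG eyes and 412 control eyes, this approximation gives a standard error of roughly 0.016.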

A proportional Venn diagram was constructed to graphically represent the agreement for the presence of a defect in the POAG eyes. The nature of the disagreements was explored by visual inspection of the visual field charts.

Contingency tables (3 × 3) were constructed to further explore the agreement between GHT and HSD, and between GHT and NP, in both control and POAG eyes.
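A 3 × 3 agreement table of this kind can be built as sketched below. The class labels, function names, and example data are hypothetical and serve only to show how the percentage of complete agreement (the diagonal of the table) is obtained.

```python
CLASSES = ("within", "borderline", "outside")


def contingency_table(ght_labels, index_labels):
    """Count eyes by GHT class (rows) and new-index class (columns)."""
    table = [[0] * len(CLASSES) for _ in CLASSES]
    for ght, index in zip(ght_labels, index_labels):
        table[CLASSES.index(ght)][CLASSES.index(index)] += 1
    return table


def complete_agreement(table):
    """Percentage of eyes on the diagonal, i.e. given the same class by both."""
    total = sum(sum(row) for row in table)
    agree = sum(table[k][k] for k in range(len(table)))
    return 100.0 * agree / total


# Made-up classifications for four eyes.
ght = ["within", "within", "borderline", "outside"]
hsd = ["within", "borderline", "borderline", "outside"]
table = contingency_table(ght, hsd)
agreement = complete_agreement(table)
```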

Results

This study included POAG eyes with little or no visual field loss on the basis of GSS 2 (41 in stage 0; 25 in stage borderline; 181 in stage 1) and 412 control eyes (stage 0). Subjects with POAG were older than controls: mean age was 60.4±10.6 years for controls and 71.7±13.2 years for POAG, a statistically significant difference (P<0.01). The differences in mean MD and mean PSD between controls and POAG were also statistically significant. Mean MD was 0.31±0.85 dB in controls and −2.21±1.32 dB in POAG; mean PSD was 1.63±0.32 dB in controls and 3.96±3.12 dB in POAG. The mean number of pairs (maximum 22) falling outside various confidence limits derived from the control population was also calculated. The number of pairs falling outside these limits was, as expected, significantly larger (P<0.01) for the POAG eyes than for the controls.

Figure 3 gives the ROC curves for the new asymmetry indices along with the results from the GHT. The discriminatory power of NP and HSD, as measured by the area under the ROC curve (AUC), was better than that of GHT, although the difference was not statistically significant (P>0.05): NP=0.863 (95% CI 0.832–0.893); HSD=0.864 (95% CI 0.833–0.894); GHT=0.792 (95% CI 0.754–0.829). The AUC for HMD was lower (0.745; 95% CI 0.705–0.786). The AUC of NP used the 85% limit of the control group, which was selected on the basis of the data. This is likely to have led to an overestimation of diagnostic performance. Alternative probability limits (90% and 95%) led to a slightly lower performance (AUC 0.849 and 0.846, respectively). At 95% specificity, the sensitivity values for HMD, HSD, and NP were 39.2% (95% CI 36.4–42.1%), 55.5% (95% CI 52.5–58.4%), and 51.4% (95% CI 48.0–54.4%), respectively.

Figure 3

Area under the receiver operator characteristic (ROC) for new indices. ROC curves for: GHT when borderline values are classified as within and outside normal limits; NP is the number of pairs falling outside 85% confidence limits. The numbers along NP curve represent the number of pairs falling outside the 85% probability limits.

The Venn diagram in Figure 4a illustrates the good agreement for the presence of a defect in the POAG eyes between the indices HSD, NP, and GHT. HMD has not been included owing to limitations of proportional Venn diagrams and its relatively poor discriminatory performance. Complete agreement for presence of defect was found in 91 of the 115 eyes included in the diagram (79%). Five examples where the GHT and new asymmetry analyses give different outcomes are given in Table 1. The format of the new indices follows that of the MD and PSD indices, giving a measure of the asymmetry followed by a P-value if it falls <10%. The P-value categories are <10, <5, <2, <1, and <0.5%. Comments are included in the table.
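The reporting format described above, an index value followed by a probability category when it falls below 10%, can be sketched as follows. The function names and the way the empirical probability is passed in (as a percentage) are assumptions made for illustration.

```python
def p_category(p_percent):
    """Return the smallest reporting category containing p_percent.

    Returns an empty string when the probability is 10% or more.
    """
    for limit in (0.5, 1, 2, 5, 10):
        if p_percent < limit:
            return f"P<{limit}%"
    return ""


def format_index(name, value_db, p_percent):
    """Format an index in the style of MD and PSD, e.g. 'HSD 3.20 dB P<2%'."""
    label = f"{name} {value_db:.2f} dB"
    category = p_category(p_percent)
    return f"{label} {category}" if category else label


example = format_index("HSD", 3.2, 1.4)
```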

Figure 4

The agreement with GHT. (a) Venn diagram showing agreement for presence of a defect in POAG eyes. NP is the number of pairs falling outside 85% confidence limits. This figure only includes data where eyes fell within the normal or defect (no borderline) classification for all three indices (N=115). Diagram was reproduced using eulerAPE software.12 (b) The 3 × 3 contingency tables showing the percentage level of agreement between each of the two new indices (HSD and NP) and GHT, for the POAG and control sample populations. For HSD and NP, cutoffs were selected to match, as closely as possible, the two GHT values.

Table 1 Clinical examples of comparison between the glaucoma hemifield tests and the new asymmetry indices

The 3 × 3 contingency tables (Figure 4b) give a more complete analysis of agreement, including data on the control eyes and a borderline classification. Complete agreement was 78% (HSD vs GHT) and 82% (NP vs GHT) in the control sample, and 70% (HSD vs GHT) and 71% (NP vs GHT) in the POAG sample. For these tables cut-off criteria for HSD and NP were chosen to match, as closely as possible, the GHT values. The contingency tables again highlight the good agreement between GHT, HSD, and NP. Contingency tables for HMD have again not been included because of its relatively poor discriminatory power as evidenced in the ROC curves.

Discussion

The detection of early glaucomatous changes in the visual field is hampered by the large range of values within the normal population. Thus, although the global indices MD and PSD are valuable when looking for change in the visual field over time, their discriminatory power has never been very high. Global indices based on comparisons within an eye overcome between-subject variability and have a much greater discriminatory potential.

In this paper we have looked at the discriminatory power of three new hemifield indices and compared these to the well-established GHT analysis incorporated in the HFA. A total of 659 eyes, classified on the basis of clinical diagnosis (control, POAG), were used to derive estimates of discriminatory power. The POAG sample was limited to eyes with little or no visual field loss measured by MD and PSD (GSS 2 ≤1). In some cases the diagnosis of POAG seemed to be based more on the existence of POAG in the fellow eye. Including eyes where there is no apparent visual field/structural loss, but a high risk of loss, provides a diagnostic challenge to the new and established hemifield analyses. However, the inclusion of these eyes will tend to depress the reported sensitivity estimates.

The inclusion criteria of ‘GHT within normal limits’ for the control group may have had the opposite effect, reducing variance in the control group and hence improving the performance of the hemifield analyses. Although the reported sensitivity and specificity values are dependent upon the inclusion criteria this should not affect the relative performance of the indices as they are all based upon the same data sets.

In this study we did not exclude eyes on the basis of their reliability indices. This was for two reasons. First, we wished to establish performance on a representative clinical population where poor reliability is not exceptional and, second, the precision of these indices is poor,15 especially the ‘fixation’ index which is often compromised by an error in locating the blind spot at the onset of the test.

Two of the outputs from the GHT are ‘abnormally high sensitivity’ and ‘general reduction in sensitivity’. In this analysis we have ignored data that falls into these categories. Such classifications were rare comprising ~2% of eyes.

Our work looked at three different methods for quantifying the vertical asymmetry. The output from these indices has been designed to be familiar to perimetrists: HMD is similar to MD, HSD is similar to PSD, and the number of pairs outside normal limits is similar to probability plots.

The discrimination performance of HMD fell well below that of the other two indices, HSD and NP. This finding was not surprising given the relatively poor performance of MD at discriminating between normal eyes and those with early glaucomatous loss. The poor performance of MD is often ascribed to the local nature of early loss, where a few locations can show significant loss but their significance disappears when averaged with a large number of non-damaged locations. A similar mechanism may account for the relatively poor performance of HMD. In contrast, HSD can be significant when only a small number of locations differ by moderate amounts.

The choice of pattern deviation probability values in GHT was based upon the work of Asman and Heijl,9 who established from an ROC analysis that giving probability values a score that could be summed within each sector gave an improved performance over an analysis of defect values at sensitivity and specificity values above 90%. Our HSD and NP analysis does not involve summing scores within sectors and uses a bootstrapping procedure similar to that used by Asman and Heijl to provide empirical cutoffs for each pair of test points. It performs well at values above 90% and matches the performance of GHT at their cut-off values.

Recent work on the structure–function relationship16, 17 in glaucoma has highlighted how anatomical variations impinge upon the mapping of the visual field to the optic nerve head. Using fixed sectors for the pooling of visual field data introduces a risk that early defects may straddle sector boundaries and fail to reach significance within a sector. The first three cases in Table 1 are examples of where the GHT failed to detect an asymmetry that was evident to the new indices because the defect straddled the GHT sector boundaries.

The use of continuous outputs and probability levels in a format similar to MD and PSD allows better tracking over time than can be achieved with the five output classes of the GHT. Asman and Heijl9 point out that the GHT's performance, which included a borderline classification, was superior to earlier methods developed by Sommer et al1 that used a simple binary (defect/no defect) classification, as many defects cannot be classified as either normal or pathological with any certainty. The new algorithm’s continuous scales and probability estimates improve further on this issue, allowing a perimetrist to see movement within the sector boundaries of the GHT. The benefits of using a continuous scale are highlighted in the ROC analysis. The GHT analysis only allows two points on the ROC curve (when borderline cases are included in the ‘outside normal limits’ and ‘within normal limits’ categories). This is likely to result in an underestimate of what could be achieved with a continuous scale.

Importantly, the new indices are all based on simple well-defined methods that can easily be imported into third-party software. The authors are happy to supply data on the probability limits needed for such calculations.

In this study a bootstrapping procedure was used to try to mitigate the problem of over-estimating performance when testing a method on the sample from which it was derived. Replication in an independent data set is required to get an improved measure of the algorithms' performance.

In conclusion, our new hemifield analysis techniques have been shown to have good diagnostic performance in the early stages of glaucoma. GHT can fail to detect significant asymmetry, detected by HSD and NP, when an early defect crosses sector boundaries. The new indices use methods and outputs that are familiar to clinicians working with glaucoma patients and can easily be adapted for use in third-party software.