Introduction

Glaucoma is an optic neuropathy characterized by the gradual loss of retinal ganglion cells and their axons, which can lead to vision loss1. Clinical detection and monitoring of glaucoma involves the assessment of functional vision loss using visual field (VF) testing, and also measuring structural loss through optical coherence tomography (OCT)2. VF testing demands active participation from patients and presents several challenges such as lengthy test durations and high variability due to its subjective nature3. TEMPO, formally called IMOvifa, is a novel standard automated perimeter with binocular random testing4,5,6. Recent studies have suggested that binocular VF testing may effectively suppress eye movements and stabilize fixation, thus potentially enhancing the reliability of test results7. Moreover, this device also adjusts the stimulus presentation point by tracking eye movements6,8,9.

We hypothesized that this new technology could reduce testing duration and patient fatigue, minimize variability in test results, and improve the correlation between structural and functional data. The purpose of this study was to compare the TEMPO with the Humphrey field analyzer (HFA), the most widely used automated perimeter.

Results

740 eyes (including 68 healthy, 262 glaucoma suspects, and 410 glaucoma) of 370 participants (mean age, 67.6 years [95% CI 66.6–68.6]; 222 female [60.0%] and 148 male [40.0%]; and 219 White [59.2%], 58 Asian [15.7%], 20 African American [5.4%], 44 other or mixed race [11.9%], and 28 unknown or not reported race [7.6%]) were included. Demographic and baseline clinical characteristics of the participants are presented in Table 1.

Table 1 Demographic and baseline clinical characteristics of the participants.

The comparison for VF parameters and reliability indices between HFA and TEMPO is summarized in Table 2. No significant differences were seen in mean deviation (MD) and visual field index (VFI) between the two perimeters (P > 0.05). While significant differences were seen in pattern standard deviation (PSD) (4.1 [3.9, 4.4] dB for HFA and 4.7 [4.5, 5.0] dB for TEMPO; P < 0.001) and foveal threshold (33.3 [32.9, 33.7] dB for HFA and 30.8 [30.2, 31.3] dB for TEMPO; P < 0.001). Bland–Altman scatterplots showed reasonable agreement between the two perimeters. The mean difference (95% limits of agreement [LoA]) was − 0.2 (− 4.8, 4.3) dB for MD, − 0.6 (− 4.7, 3.4) dB for PSD, and 0.4 (− 12.7, 13.4) for VFI, respectively (Supplemental Fig. 1). Figure 1 illustrates the comparison of reliability indices for TEMPO and HFA; fixation loss (11.2 [10.0, 12.5] % for HFA and 8.9 [7.3, 10.5] % for TEMPO), false positive (4.1 [3.7, 4.6] % for HFA and 1.3 [1.1, 1.5] % for TEMPO), and false negative (4.3 [3.8, 4.8] % for HFA and 0.4 [0.3, 0.4] % for TEMPO). Measurement time was faster for TEMPO Ambient Interactive Zippy Estimated by Sequential Testing (AIZE)-Rapid compared to HFA SIFA-Fast (261 s vs. 429 s; P < 0.001). Point-by-point analysis of sensitivities for total deviation at each location comparing TEMPO and HFA was shown in Supplemental Fig. 3. The absolute mean differences observed were less than 2 dB across all test points.

Table 2 Comparison between TEMPO and Humphrey Field Analyzer categorized by diagnosis.
Figure 1
figure 1

Comparison of reliability indices between Humphrey Field Analyzer and TEMPO. Density plot represents the distribution of (A) fixation losses, (B) false positives, and (C) false negatives. Vertical lines indicate the manufacturer recommended reliable cutoff values.

A stronger non-linear (dB and μm) association between VF MS and circumpapillary retinal nerve fiber layer (RNFL) thickness was found for TEMPO (adjusted R2 = 0.25; Akaike information criteria [AIC] = 5235.5) compared to HFA (adjusted R2 = 0.22; AIC = 5263.9). A similar trend was confirmed for the linear (1/L and μm) relationship (adjusted R2 = 0.29; AIC = 5200.8 for TEMPO, and adjusted R2 = 0.22; AIC = 5262.7 for HFA, respectively). Moreover, TEMPO demonstrated higher structure–function relationships compared to HFA in all quadrants (Table 3). In the non-linear relationship, the inferior quadrant for RNFL had the highest association, followed by superior, temporal, and nasal quadrant for RNFL (adjusted R2 = 0.31, 0.29, 0.15, and 0.06 for TEMPO, respectively). In contrast, the linear relationship generally showed higher R2 values compared to non-linear relationship (adjusted R2 = 0.49, 0.43, 0.15, and 0.03 for TEMPO, respectively). Figure 2 illustrates the structure–function relationship between global VF MS from TEMPO and HFA, expressed in dB scale (A and B) and unlogged 1/L scale (C and D), and cpRNFL thickness.

Table 3 Comparison of topographic structure–functional relationship between TEMPO and Humphrey field analyzer.
Figure 2
figure 2

Scatterplots showing the associations between global visual field mean sensitivity from TEMPO and HFA, expressed in dB scale (A, B) and unlogged 1/L scale (C, D), and circumpapillary retinal nerve fiber layer thickness.

Supplemental Fig. 2 provides a summary of the usability findings. 73% of participants preferred TEMPO, while 17% preferred HFA. 83% of participants reported no difficulties with TEMPO. Furthermore, TEMPO received positive feedback in terms of screen readability, ease of concentration, and shorter test duration, as compared to HFA.

Discussion

In this study, we prospectively performed VF testing with both TEMPO AIZE-Rapid and HFA Swedish Interactive Threshold Algorithm (SITA)-Fast in a randomized order and identified a stronger structure–function relationship and better reliability indices with TEMPO compared to HFA. TEMPO reduced measurement time by approximately 40% without compromising perimetric performance. Even though the participants were inexperienced with TEMPO prior to the study, it was strongly preferred by patients.

Effective glaucoma management necessitates functional and structural exams, and correlating these changes ensures reliable tracking of disease progression10,11. Clinicians should optimize and balance considerations such as medical burden, patient preferences, and efficient detection of disease progression to prevent lifelong visual loss. Previous studies have shown that using a combination of structural and functional analyses enhances the ability to detect glaucoma and its progression12,13,14. The current cross-sectional study shows that TEMPO had a stronger structure–function relationship with Cirrus OCT compared to HFA, both globally and sectorally. Our findings support the study by Bowd et al., investigating structure–function relationships using Stratus OCT. They found stronger associations (R2 = 0.33–0.38) in the inferotemporal disc, followed by modest associations in the superotemporal disc area (R2 = 0.19–0.25), and weak associations in the temporal disc area (R2 = 0.02–0.03)15. The lower structure–function relationship in the temporal quadrants, compared to the superior/inferior quadrants, may be attributed to two factors: the higher variability caused by relatively fewer measurement points for visual field and the position of the optic nerve head in relation to the fovea16. Individual anatomical differences, such as variations in the shape, rotation, and tilt of the ONH, can affect the results of structure–function relationship10. Both VF and OCT measurements are prone to inter-subject and test–retest variability, which are major causes of discrepancy in structure–function relationship10,17. Short-term and long-term reproducibility for TEMPO need to be confirmed in future studies.

The average differences in MD, PSD, and VFI between TEMPO and HFA were all within 1 dB. While the mean difference between the measurements was minimal, we observed a trend where the difference increased with the severity of glaucoma (see Supplemental Fig. 1). This suggests the presence of a proportional bias between the two methods. However, the slope of this trend was not steep, indicating that the increase in difference was moderate relative to the increase in glaucoma severity. In the point-by-point analysis, TEMPO showed a higher sensitivity for total deviation at the points closer to the center. This could be due to the inhibitory responses that occur with the non-occluded eye with HFA5,18 or, possibly, due to differences in the algorithm used with AIZE19. However, the trend was consistent with previously reported data4.

HFA SITA-Fast and TEMPO AIZE-Rapid have several items in common regarding reliability indices, but there are some differences. First, false positives are calculated by both devices using reaction time. HFA SITA-Fast uses the percentage of stimuli responded to within a minimum reaction time with an adjustment for the average reaction time of the individual patient20. In contrast, TEMPO AIZE-Rapid uses the percentage of those that have a reaction time of less than 0.3 s. Second, false negatives are calculated as percentage of not responding by presenting a 9 dB bright target to the determined threshold in HFA SITA-Fast, while TEMPO AIZE-Rapid uses percentage of not responding by presenting ≥ 2 dB bright target in the process of threshold determination for all stimuli. Third, fixation loss is calculated as percentage in response to stimulus to blind spots, which is known as Heijl Krakau method, in HFA SITA-Fast. In contrast, TEMPO AIZE-Rapid uses the percentage of stimuli with Gaze tracking greater than 5 degrees. Although the Heijl-Krakau method was introduced in the 1970s and is considered the gold standard21, it has several disadvantages that include time-consuming catch trials, infrequent fixation check, and inaccuracy of fixation loss ratio when the blind spot location is dislocated. In principle, gaze tracking, which monitors the movement of the pupils, should be expected to measure fixation monitoring more accurately, and also is used in HFA SITA-Faster22. In the current study, TEMPO AIZE-Rapid demonstrated lower values than HFA SITA-Fast for all reliability indices. While it's not appropriate to directly compare these indices due to inherent differences in their approaches to reliability, the reduced testing time with TEMPO may be due to lower variability. We speculate that this could account for the improved structure–function relationships and also the overall higher reliability. In the current study, the manufacturer-recommended limits flagged 17.1% of HFA and 10.6% of TEMPO results as low reliability. Fixation losses were the main cause of low reliability for both perimeters (14.4% for HFA and 8.2% for TEMPO). Previous studies have also identified fixation losses as the primary cause of unreliable VF classifications23,24. Comparing these numbers is challenging due to methodological differences, but incorporating more accurate and reliable techniques into future device, while maintaining efficacy, is crucial.

This study has several limitations. First, it was based on participants from a tertiary care academic practice, which could introduce certain biases in socio-economic status, demographics, and severity of disease, which could differ from those of patients treated in other settings. This could potentially restrict the generalizability of our findings. Second, the current software is derived from a database that only consists of data from the Japanese population. Regardless of this limitation, HFA and TEMPO exhibited excellent agreement. Third, certain situations, such as conditions of binocular vision dysfunction like strabismus, anisometropia, nystagmus among others, can render binocular open-eye examinations unfeasible25. These conditions were not evaluated in our study. Last, all participants were using TEMPO for the first time, while they had previous experience with HFA. Although learning effects tend to increase the false positive rate for inexperienced examinees26, false positives were relatively low for TEMPO in our study.

In conclusion, TEMPO showed a stronger structure–function relationship with Cirrus OCT. Further studies are necessary to evaluate the potential of this binocular VF test for longitudinal monitoring.

Methods

Participants

Participants were recruited from patients at the Shiley Eye Institute, University of California, San Diego. The research protocol followed the tenets of the Declaration of Helsinki and was approved by the University of California, San Diego Institutional Review Board. All study participants provided written informed consent.

The participants' eyes were divided into three diagnostic groups: healthy, glaucoma suspect, and glaucoma. Healthy eyes were characterized by intraocular pressure (IOP) ≤ 21 mmHg, normal-appearing optic discs and neuroretinal rims, and normal VF test results defined as PSD within the 95% CI and Glaucoma Hemifield Test (GHT) results within normal limits using SITA 24–2 FAST. Glaucoma suspects were defined as eyes with IOP of ≥ 22 mmHg or glaucomatous-appearing optic discs (glaucomatous optic neuropathy) without repeatable glaucomatous VF damage. Glaucomatous optic neuropathy was defined as excavation, the presence of focal thinning, notching of the neuroretinal rim, or localized or diffuse atrophy of the RNFL by attending physicians based on ophthalmoscopic examination or fundus photographs. Glaucoma was defined as eyes showing at least two reliable (fixation losses and false negatives ≤ 33% and ≤ 15% false positives) and repeatable abnormal (GHT outside normal limits or PSD outside 95% normal limits) VF results using the 24–2 SITA-FAST with similar glaucomatous defect patterns on consecutive testing as evaluated by study investigators. Glaucoma included all types, such as primary open-angle glaucoma, primary angle closure glaucoma, and secondary glaucoma. Eyes were excluded if they had any other ocular or systemic conditions, apart from glaucoma, that could affect VF test results, such as age-related macular degeneration.

Visual field testing

TEMPO (CREWT Medical Systems, Tokyo, Japan) is the commercial name of the product. It is distinct from the portable head-mounted perimeter device known as imo8. This device has two optical systems and pupil-monitoring systems for each eye, allowing independent target presentations and pupil monitoring8. It enables separate testing of each eye and can randomly present test indicators to either eye, with both eyes open, without the examinee knowing which eye is being tested (binocular random single-eye test). AIZE employs Bayesian inference and maximum likelihood methods to determine the threshold, and reduces test time by around 70% compared to the 4–2 dB bracketing method8. AIZE-Rapid maintains the AIZE test method but enhances the representation of interaction with adjacent measurement points4,19.

For the current study, all patients underwent HFA 24–2 SITA-Fast and TEMPO 24–2 AIZE-Rapid on the same day in a randomized order using Goldmann size III (0.431° visual angle) stimuli. Since the purpose of this study was to compare HFA and TEMPO, no exclusions were made at specific cutoff values for reliability indices (fixation losses, false negatives and false positives) for both devices. The binocular random testing mode was selected for testing with TEMPO. In addition, to evaluate usability for patients, there was a questionnaire, as follows: (1) Which device do you prefer?, (2) Did you have any difficulty with the simultaneous examination of both eyes using the novel device?, (3) Was the screen easy to see?, (4) Was it easy to concentrate?, (5) Was the test time short?. These questions were assessed using a 5-point Likert scale.

Structure–function relationship

RNFL was measured using Cirrus spectrum-domain OCT (Carl Zeiss Meditec, Inc, Dublin, CA) Optic Disc Cube 200 × 200 protocol scans. A 3.46 mm diameter circle was automatically placed around the optic disc, providing RNFL thickness globally and in superior, inferior, temporal, and nasal sectors. Coefficient of determination for VF and OCT parameters was calculated and compared using AIC. Age adjusted R2 values and AIC were used to compare the models for goodness of fit. Higher R2 and smaller AIC mean better fit. Given the logarithmic nature of the dB scale, we explored structure–function relationships using both nonlinear and linear models, with dB and 1/L as variables for VF in the regression, respectively. To compare the strength of structure–function relationship between HFA and TEMPO, the absolute value of the residuals from each model were calculated and compared using mixed effects model15. In this model, participants were treated as random effects to account for intra-individual variations. Structure–function relationships were investigated for global (VF mean sensitivity [MS] and circumpapillary RNFL) and sectoral parameters (sectoral VF MS and quadrant RNFL) based on simplified map proposed by Garway-Heath et al. (Fig. 3)27,28. MS was calculated in dB by converting the threshold sensitivity of each test point to a linear scale and then averaging them to obtain the MS values.

Figure 3
figure 3

A topographic map relating visual field test stimulus locations (left) to OCT-measured retinal nerve fiber layer thickness (right) is shown for a right eye.

Statistical analysis

Patient and eye characteristics were reported as mean (95% CI) for continuous data and count (percentage) for categorical data. MD, PSD, foveal threshold (FT), and VFI were compared using mixed-effects model between the two perimeters. Reliability indices were illustrated in a kernel density estimate plot to compare the values from two perimeters. Kernel density estimate is a non-parametric way to estimate the probability density function of a random variable. In other words, it provides a smoothed version of a histogram, giving a continuous curve. The Bland-Altmann plot assessed the LoA between the two perimeters. Measurement time for performing VF for both eyes was recorded for each device. It only accounted for the actual examination time and excluded the setup time needed for testing the second eye with HFA. All statistical analyses were performed with Stata software (version 15; StataCorp, College Station, Texas). Statistical significance for tests was set at P ≤ 0.05.