Introduction

Intraocular pressure (IOP) measurement is one of the main tests used to make decisions in glaucoma despite IOP no longer forming part of the disease definition. Since reduction of IOP is the only evidence-based therapy for glaucoma, accurate measurement of IOP is important to assess response to treatment and monitor the risk of progression. True IOP is rarely measured directly because intracameral manometry is an invasive procedure. Goldmann applanation tonometry (GAT) has been regarded as the reference standard to measure IOP since the mid 1950s. Ready access to slit lamps, low cost and ease of use for trained professionals may explain its appeal. However, disadvantages of GAT include the expertise required to perform the procedure, the subjective nature of determining the result and the potential risk of infection due to tonometer contact with the cornea. In contrast, non-contact tonometry has the advantages of having a lower risk of infection, being automated and user friendly.

Many studies have focused on the influence of central corneal thickness (CCT) on IOP measurement in normal and diseased eyes, and it is now widely accepted that, with all common types of tonometer, IOP is overestimated in thick corneas and underestimated in thin corneas. Formulas and nomograms have been developed to correct for the effect of CCT on IOP measured by GAT; however, the validity of these equations is contested and their routine use is not recommended [1, 2].

Reichert’s Ocular Response Analyser (ORA; Reichert, Inc., Depew, NY) is an automated non-contact tonometer. ORA generates two measurements of IOP: Goldmann correlated IOP (ORA IOPg) and corneal compensated IOP (ORA IOPcc) and two measurements of corneal biomechanics: corneal hysteresis (CH) and corneal resistance factor (CRF).

A systematic review of the agreement between GAT and ORA IOPcc by Cook et al. reported that the pooled mean difference from 12 included studies was 1.5 mmHg (95% CI 0.9 mmHg, 2.2 mmHg) and the pooled predicted 95% limits of agreement (95% LoA) were −3.9–7.0 mmHg [2,3,4,5,6,7,8,9,10,11,12,13,14]. The mean differences reported from the included studies ranged from −0.07 mmHg (95% CI −0.51, 0.37) to 3.60 mmHg (95% CI 2.75, 4.45). The literature review pointed to considerable heterogeneity in these measures and high risk of bias across studies. It included papers through February 2010 but further studies have since been conducted and the systematic review did not report the agreement between GAT and ORA IOPg. Cook et al. also highlighted the paucity of studies investigating patient preferences for methods of IOP measurement.

The primary aim of this study was to evaluate the agreement between GAT and ORA measurements and to assess which covariates influence agreement between these methods of tonometry. We also wished to investigate participant preferences for IOP measurement using GAT and ORA.

Methods

Guidelines for reporting reliability and agreement studies were followed in this study [15]. Both eyes of participants in the ‘Glaucoma within the Northern Ireland Cohort for the Longitudinal Study of Ageing’ (GwNICOLA) were eligible for inclusion. GwNICOLA was a sub-study nested within the Northern Ireland Cohort for the Longitudinal Study of Ageing (NICOLA). The sample size was not specified a priori. NICOLA is an ongoing longitudinal cohort study of ageing in a sample of the Northern Ireland population aged 50 years or older. The “eligible population” for NICOLA was defined as people aged 50 years or older (born on or before September 30, 1962) and living in private residential accommodation in Northern Ireland. The sampling strategy identified addresses within postcode-based geographic regions (geographic stratification). A fixed interval (systematic) sample was drawn from each postcode-based geographical stratum. NICOLA consisted of a computerised assisted personal interview, a self-completion questionnaire and a health assessment performed at a later date. The health assessment included ophthalmic tests such as optic disc stereophotography and IOP measurement with ORA tonometry. Data collection for NICOLA began in February 2014 and ended in April 2018.

NICOLA participants of 50 years of age or older who attended the NICOLA health assessment through January 7, 2017 and had a vertical cup to disc ratio (VCDR) ≥0.7 and/or VCDR asymmetry ≥0.2 and/or vertical neuroretinal rim ratio ≤0.1 on optic disc stereophotography and/or IOP ≥25 mmHg on ORA tonometry were eligible and invited for GwNICOLA. All GwNICOLA examinations took place in the Northern Ireland Clinical Research Facility in Belfast City Hospital between March and December 2017.

At GwNICOLA, participants underwent GAT by a glaucoma expert ophthalmologist (AAB) masked to ORA results according to standard operating procedures. Sequential GAT readings were not masked to the examiner (i.e., the tonometer dial was not covered). Two diastolic GAT measurements were recorded and if there was a difference greater than 2 mmHg between readings a third reading was taken [16]. The GAT IOP was recorded as the mean GAT if two readings were taken and the median GAT if three readings were taken [16].
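The GAT recording rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (mean of two readings, or the median of three when the first two differ by more than 2 mmHg); the function name `gat_iop` and the parameter `max_gap` are hypothetical and do not come from the study's operating procedures.

```python
from statistics import mean, median


def gat_iop(readings, max_gap=2.0):
    """Summarise sequential GAT readings (mmHg) for one eye.

    Two readings: return their mean. Three readings: return the median.
    A pair that differs by more than max_gap is rejected, because the
    rule described in the text calls for a third reading in that case.
    """
    if len(readings) == 2:
        if abs(readings[0] - readings[1]) > max_gap:
            raise ValueError("readings differ by more than max_gap; a third reading is needed")
        return mean(readings)
    if len(readings) == 3:
        return median(readings)
    raise ValueError("expected two or three readings")


# Hypothetical readings, for illustration only.
print(gat_iop([16, 17]))      # mean of two readings
print(gat_iop([15, 18, 16]))  # median of three readings
```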

Subjects underwent ORA tonometry by a masked trained researcher (PMcC) according to standard operating procedures. Three “good quality” measurements with a waveform score (WFS) >4 were captured and mean results of IOP measurements and corneal biomechanical properties were calculated [17, 18]. The mean of three measurements was used in an attempt to mitigate the added variability introduced by the ocular pulse amplitude (OPA) when capturing the IOP during different phases of the cardiac cycle at each measurement [19]. Once three “good quality” measurements were captured, no further measurements were taken for that eye, and no more than five consecutive attempts were made for each eye. The ORA measurement with the best signal values (BSV) was also recorded. BSV is the ORA measurement with the highest WFS among the series of measurements per eye. Eyes with fewer than three ORA measurements with a WFS >4 were excluded.
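As a rough illustration of the ORA capture rule described above, the following Python sketch filters attempts by WFS, stops after three good-quality readings or five attempts, averages the good readings and picks the BSV reading. The dictionary keys (`wfs`, `iopg`, `iopcc`), the constants and the function name are hypothetical, and the device software may handle these steps differently.

```python
from statistics import mean

WFS_THRESHOLD = 4   # "good quality" means a waveform score > 4
REQUIRED_GOOD = 3   # three good-quality readings are needed per eye
MAX_ATTEMPTS = 5    # no more than five consecutive attempts per eye


def summarise_ora(attempts):
    """Summarise ORA attempts for one eye, given in capture order.

    Each attempt is a dict with hypothetical keys 'wfs', 'iopg', 'iopcc'.
    Returns None when the eye would be excluded (fewer than three
    good-quality readings within five attempts).
    """
    good = []
    for reading in attempts[:MAX_ATTEMPTS]:
        if reading["wfs"] > WFS_THRESHOLD:
            good.append(reading)
        if len(good) == REQUIRED_GOOD:
            break  # stop once three good-quality readings are captured
    if len(good) < REQUIRED_GOOD:
        return None
    # Any reading with WFS > 4 outranks every rejected one, so the highest
    # WFS in the series is always found among the good readings.
    best = max(good, key=lambda r: r["wfs"])
    return {
        "iopg_mean": mean(r["iopg"] for r in good),
        "iopcc_mean": mean(r["iopcc"] for r in good),
        "bsv": best,
    }


# Hypothetical attempts, for illustration only.
example = [
    {"wfs": 3.1, "iopg": 15.0, "iopcc": 16.0},
    {"wfs": 6.2, "iopg": 14.0, "iopcc": 15.0},
    {"wfs": 7.5, "iopg": 16.0, "iopcc": 17.0},
    {"wfs": 5.0, "iopg": 15.0, "iopcc": 16.0},
]
print(summarise_ora(example))
```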

GAT and ORA were performed during the same session ~10 min apart. The order of GAT and ORA was randomly allocated using a sealed envelope method. Non-contact anterior segment tomography (Pentacam HR; Oculus, Wetzlar, Germany) was used to measure CCT and anterior chamber volume (ACV) and non-contact optical biometry (LENSTAR LS900; Haag-Streit, Koeniz, Switzerland) was used to measure axial length (AL). GAT and ORA were calibrated according to the manufacturer’s instructions.

A questionnaire regarding the preferred method of IOP measurement was given as soon as both procedures had been performed. The preference study was performed on a convenience sample of participants from GwNICOLA.

For NICOLA, ethical approval for the study was obtained from the School of Medicine, Dentistry and Biomedical Sciences Ethics Committee, Queen’s University Belfast. The GwNICOLA sub-study was approved by Northern Ireland’s Health and Social Care Research Ethics Committee A (REC reference: 16/NI/0247). All participants of the NICOLA and GwNICOLA studies gave written, informed consent. The NICOLA and GwNICOLA studies were conducted according to “Good Clinical Practice” guidance and the tenets of the Declaration of Helsinki.

Statistical analysis

Summary statistics of baseline continuous variables and frequencies of categorical characteristics for GwNICOLA participants and for participants included and excluded from the analysis are reported below. Comparisons were made using independent sample t tests and chi square test for continuous variables and categorical variables, respectively. Summary statistics are reported for included and excluded eyes and compared using generalised estimating equations (GEE) to account for intra-individual correlations between eyes. Ocular comorbidities including history of intraocular surgery and trauma, which may influence IOP measurement, are also reported.

Inspection of histograms and Kolmogorov–Smirnov tests were used to check for the normality of the distribution of differences in the measurements from the two methods: GAT–ORA IOPg and GAT–ORA IOPcc. Mean bias and 95% LoA were estimated using GAT and ORA IOP measurements. A sensitivity analysis was performed to measure mean bias and 95% LoA when using ORA IOP BSV rather than mean of three measurements.
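For readers less familiar with agreement statistics, mean bias and 95% LoA can be sketched as follows. The sketch assumes the conventional Bland–Altman definition (mean difference ± 1.96 × the standard deviation of the differences), which the text does not spell out, and it treats every eye as independent, so it does not reproduce any adjustment for the correlation between two eyes of one participant. The data are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(gat, ora, z=1.96):
    """Return (mean bias, lower LoA, upper LoA) for paired readings in mmHg.

    Differences are taken as GAT minus ORA, so a positive bias means
    that ORA reads lower than GAT.
    """
    diffs = [g - o for g, o in zip(gat, ora)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


# Hypothetical paired readings, for illustration only.
gat = [14, 16, 18, 20, 22]
ora = [13, 17, 16, 19, 22]
bias, lower, upper = bland_altman(gat, ora)
print(f"bias={bias:.2f}, 95% LoA={lower:.2f} to {upper:.2f}")
```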

The percentage of eyes with an absolute difference >2 mmHg (which is considered to be a clinically significant difference) was calculated [2]. Bland–Altman plot analysis was used to demonstrate proportional bias where methods did not agree equally through the range of measurements. Proportional bias was investigated using univariate linear regression of the values of the difference between each method on the average of the two methods.
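The two quantities described above can be illustrated with a short Python sketch: the percentage of eyes whose absolute difference exceeds 2 mmHg, and the slope from an ordinary least-squares regression of the difference on the average of the two methods (a slope near zero suggests no proportional bias). Function names and data are hypothetical, and the significance test on the slope that a statistics package would report is not included.

```python
from statistics import mean


def pct_exceeding(gat, ora, threshold=2.0):
    """Percentage of eyes with |GAT - ORA| greater than threshold (mmHg)."""
    diffs = [g - o for g, o in zip(gat, ora)]
    return 100.0 * sum(abs(d) > threshold for d in diffs) / len(diffs)


def proportional_bias_fit(gat, ora):
    """Least-squares fit of the difference on the average of the two methods.

    Returns (slope, intercept).
    """
    diffs = [g - o for g, o in zip(gat, ora)]
    avgs = [(g + o) / 2 for g, o in zip(gat, ora)]
    x_bar, y_bar = mean(avgs), mean(diffs)
    sxx = sum((x - x_bar) ** 2 for x in avgs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(avgs, diffs))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical paired readings, for illustration only.
gat = [14, 16, 18, 20, 22]
ora = [13, 17, 15, 19, 22]
print(pct_exceeding(gat, ora))
print(proportional_bias_fit(gat, ora))
```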

GEE were then used to investigate how the two dependent variables, GAT–ORA IOPg and GAT–ORA IOPcc, varied according to selected independent variables. The following parameters were considered as independent variables in the univariate analysis using GEE: age, sex, glaucoma diagnosis (per eye), ocular comorbidity, the order of GAT and ORA examination, CCT, ACV, CH, CRF and AL.

Fisher’s exact test was used to test for significant differences in the proportions of participants who underwent GAT before ORA and ORA before GAT who had preferences for GAT, ORA or no preference. Statistical analysis was performed using a software program (IBM SPSS Statistics for Windows, Version 24.0. Armonk, NY: IBM Corp).

Results

Baseline characteristics of participants of GwNICOLA and comparisons between participants included and excluded from the agreement analysis are presented in Table 1. Of the eyes of the 128 participants who attended the GwNICOLA study, 228 eyes from 120 participants were included after exclusion of eyes with fewer than three ORA measurements with WFS >4. Baseline characteristics of eyes included in the analysis and differences between included and excluded eyes are presented in Table 2.

Table 1 Baseline characteristics of participants of GwNICOLA and comparison between participants included and excluded from the agreement analysis
Table 2 Comparison of characteristics of included and excluded eyes

The mean age of participants included in the analysis was 68.0 years and 52.5% of included participants were female. There was no significant difference between the age of included and excluded participants. There were 29 right eyes with ≥1 comorbidity and 32 left eyes with ≥1 comorbidity. Comorbidities are described in Table 3.

Table 3 Ocular comorbidities in included right and left eyes

Of the 120 participants included, 59 were randomised to undergo GAT before ORA and 61 were randomised to undergo ORA before GAT. All ORA readings were calculated from the mean of three measurements and all GAT readings were calculated from the mean of two measurements.

The results of the agreement between measurements of IOP using GAT and ORA are shown in Table 4. Bland–Altman plots of the results are displayed in Fig. 1. ORA IOPg significantly underestimated GAT (p < 0.001), whereas the mean difference between GAT and ORA IOPcc was not statistically significant (p = 0.20). There was no statistically significant proportional bias between mean IOP and GAT–ORA IOPg or between mean IOP and GAT–ORA IOPcc. The percentages of eyes with an absolute difference greater than 2 mmHg between GAT and ORA IOPg and between GAT and ORA IOPcc were 44.7% and 40.8%, respectively. Sensitivity analysis showed that GAT–ORA IOPg (BSV) mean bias was 0.95 mmHg (95% CI 0.62, 1.29; p < 0.001) with 95% LoA from −4.03 to 5.93 mmHg, and that GAT–ORA IOPcc (BSV) mean bias was −0.16 mmHg (95% CI −0.45, 0.23; p = 0.53) with 95% LoA from −5.30 to 4.98 mmHg.

Table 4 Agreement analysis: intraocular pressure (IOP) results from ocular response analyser (ORA) and Goldmann applanation tonometry (GAT)
Fig. 1 Bland–Altman plot between GAT and ORA IOPg (left) and GAT and ORA IOPcc (right)

Using GEE to combine the analysis of right and left eyes whilst accounting for intra-individual correlation in univariate analysis, the following parameters were statistically significantly associated with GAT–ORA IOPg: CRF (β −0.363, p < 0.001) and CCT (β −0.020, p < 0.001). Using GEE, there were no statistically significant associations between GAT–ORA IOPg and age (p = 0.32), sex (p = 0.27), order of measurement (p = 0.12), diagnosis of glaucoma (p = 0.29), ocular comorbidity (p = 0.65), AL (p = 0.07), CH (p = 0.16) or ACV (p = 0.66). There was significant multicollinearity between CCT and CRF; therefore, they were not included in the same multivariate model.

Using GEE to combine the analysis of right and left eyes whilst accounting for intra-individual correlation in univariate analysis, the following parameters were statistically significantly associated with GAT–ORA IOPcc: CRF (β 0.484, p < 0.001) and CH (β 0.972, p < 0.001). Using GEE, there were no statistically significant associations between GAT–ORA IOPcc and age (p = 0.14), sex (p = 0.61), order of measurement (p = 0.11), diagnosis of glaucoma (p = 0.89), ocular comorbidity (p = 0.19), CCT (p = 0.09), ACV (p = 0.74) or AL (p = 0.07). There was significant multicollinearity between CH and CRF; therefore, they were not included in the same multivariate model.

There were 69 participants who completed the questionnaire, which recorded if the participant preferred GAT, ORA or had no preference. Most people (n = 40; 57.9%) reported no preference, but among those who expressed a preference (n = 29) the majority chose ORA (n = 22). There were no statistically significant associations between the order of ORA and GAT measurement and participant preferences (Fisher’s exact test p = 0.19).

Discussion

In the current study we investigated the agreement in IOP measurements with GAT and ORA. We also investigated participant preference for IOP measurement method.

The ORA IOP measurement is taken within milliseconds and has been reported to be significantly correlated with OPA whereby IOP is variable depending on which phase of the cardiac cycle the measurement is taken [14]. Therefore it has been suggested that when the three ORA IOP measurements are averaged, factors other than OPA must be important in ORA IOP variability [14]. Our sensitivity analysis showed that the use of the mean of three ORA measurements provided less mean bias and narrower 95% LoA between ORA IOP and GAT than the use of single ORA IOP BSV measurements.

We report that ORA IOPg systematically underestimated GAT by 0.83 mmHg (p < 0.001) and ORA IOPcc overestimated GAT by 0.23 mmHg (p = 0.20). We found no statistically significant mean bias for ORA IOPcc compared with GAT, which suggests that, on average, ORA IOPcc agrees more closely with GAT than ORA IOPg does in this population. There was no evidence of proportional bias between mean IOP and GAT–ORA IOPcc or between mean IOP and GAT–ORA IOPg. Although ORA IOPcc showed no significant mean difference compared with GAT, the 95% LoA were wider for GAT–ORA IOPcc than for GAT–ORA IOPg. However, the percentage of differences between ORA IOPcc and GAT that were >2 mmHg was lower than the percentage of differences between ORA IOPg and GAT that were >2 mmHg. This means that, although GAT–ORA IOPcc has wider 95% LoA, it has no significant mean bias and a smaller proportion of differences >2 mmHg, which may be the more clinically relevant outcome because a difference >2 mmHg would be considered clinically significant [2]. It has been recommended that inter-operator measurements for GAT should be within ±4 mmHg of the mean bias in 95% of eyes under ideal circumstances, and in clinical practice these figures may be considerably higher [20]; however, inter-observer variability was not analysed for GAT or ORA in this study. Our results show that variability between GAT and ORA IOPg was ±4.53 mmHg of the mean bias in 95% of eyes and that variability between GAT and ORA IOPcc was ±5.05 mmHg of the mean bias in 95% of eyes. These findings suggest that GAT and ORA IOP measurements may not be interchangeable; however, the differences are relatively close to ±4 mmHg of the mean bias in 95% of eyes, and the inclusion of comorbidities in this study population may be more representative of clinical practice than ideal circumstances. Furthermore, both GAT and ORA require corneal applanation to measure IOP; therefore, without intracameral IOP measurement, we cannot determine whether GAT or ORA provides the better measure of true IOP. It could be that ORA is a superior measure even though GAT is the traditional gold standard.
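The "± mmHg of the mean bias in 95% of eyes" figures quoted in this section can be related back to limits of agreement with simple arithmetic. The sketch below assumes that differences are expressed as GAT minus ORA (so the ORA IOPcc bias is entered as −0.23 mmHg) and that the quoted ± values are the LoA half-widths; under those assumptions the limits come out at roughly −3.70 to 5.36 mmHg for ORA IOPg and −5.28 to 4.82 mmHg for ORA IOPcc. These are derived values, and rounding in the published table may differ slightly.

```python
def limits_of_agreement(mean_bias, half_width):
    """Return (lower, upper) 95% LoA in mmHg from a bias and a LoA half-width."""
    return round(mean_bias - half_width, 2), round(mean_bias + half_width, 2)


# Values quoted in the text; the sign convention (GAT minus ORA) is an assumption.
print("GAT-ORA IOPg: ", limits_of_agreement(0.83, 4.53))
print("GAT-ORA IOPcc:", limits_of_agreement(-0.23, 5.05))
```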

A diagnosis of glaucoma did not have a statistically significant effect on the agreement between GAT and ORA measurements of IOP. The order of measurement by the two measurement devices did not have a statistically significant effect on the agreement between GAT and ORA IOPg and GAT and ORA IOPcc. This suggests that there are no significant effects caused by the application of drops and the tonographic applanation during GAT.

Our results differ from those of the systematic review by Cook et al. [2] who found a statistically significant mean difference of 1.5 mmHg between GAT and ORA IOPcc. A number of studies that report the agreement between GAT and ORA measurements of IOP were not included in the systematic review by Cook et al. [21,22,23,24,25,26,27,28]. The mean difference between GAT and ORA IOPcc ranged between 0.8 and 8.3 mmHg and the mean difference between GAT and ORA IOPg ranged between 0.86 and 7.2 mmHg in these studies [21, 22]. Direct comparisons between results of these studies and our study are difficult due to differences in study populations and designs.

We showed that CRF and CCT were associated with the GAT–ORA IOPg difference and that CRF and CH were statistically significantly associated with the GAT–ORA IOPcc difference. Renier et al. reported that the difference between GAT and ORA IOPg was significantly associated with CRF and the difference between GAT and ORA IOPcc was significantly associated with CH [28].

Cook et al. reported that 46% of ORA IOPcc measurements were within 2 mmHg of GAT, compared with ~60% in our study. The lower percentage of GAT–ORA IOPcc differences >2 mmHg in this study may be related to the high level of expertise in the measurement of GAT. We also found that ~55% of ORA IOPg measurements were within 2 mmHg of GAT.

Our preference study indicated that participants tended to have no preference or preferred ORA over GAT irrespective of the order of measurement. This suggests that the application of anaesthetic drops did not influence the preference for ORA over GAT. A study by Vandewalle et al. reported that no patients reported more than moderate discomfort with any of the procedures in an agreement study, which included GAT and ORA [13].

This study had a number of strengths and limitations. Among the strengths we would highlight the robust methods used to prevent risk of bias, including randomisation of the order of testing, and masking of investigators. The study included participants referred from a population-based study who were suspected of having glaucoma. The results could possibly be similar to the UK populations referred from the community to secondary care for definitive diagnosis of glaucoma; however, generalisability is limited beyond this subset of subjects.

Repeatability of ORA and GAT could not be meaningfully compared in this study because sequential GAT readings were not masked (i.e., the tonometer dial was not hidden from the examiner), and it is therefore likely that any estimate of GAT repeatability would be highly biased. Unmasked GAT examination was considered to be more representative of clinical practice. If ORA IOP were more repeatable than GAT, it might be of greater value despite the differences in measurements obtained with different devices.

Wang et al. reported that, in healthy eyes, GAT had less variability than ORA IOPg when GAT and ORA IOP were obtained by different observers, but ORA IOPg had less variability than GAT when GAT and ORA IOP were obtained by the same observer [19]. Wang et al. also found that ORA IOPcc was more variable than ORA IOPg. ORA IOPcc is recognised to have greater variability than ORA IOPg because ORA IOPcc is calculated using an algorithm including ORA IOPg and corneal biomechanics, each with their own variability. However, the study by Wang et al. may not be generalisable to the UK populations referred from the community to secondary care due to its age range, IOP range and the comorbidity profile [19].

In conclusion, although systematic bias between GAT and ORA IOP readings was of negligible clinical significance, the wide 95% LoA and the high percentage of differences greater than 2 mmHg suggest that GAT and ORA IOP measurements may not be interchangeable in clinical practice. Most participants expressed no preference between the two methods.

Summary

What was known before

  • Mean difference between GAT and ORA IOPcc was reported to be 1.5 mmHg and statistically significant in previous pooled analysis. Mean difference between GAT and ORA IOPg was not reported in previous pooled analysis. There has been a paucity of research into patient preferences for methods of tonometry.

What this study adds

  • There was no statistically significant mean bias for ORA IOPcc compared with GAT, but ORA IOPg significantly underestimated GAT by 0.83 mmHg. The relatively high percentage of differences between GAT and ORA IOPcc and between GAT and ORA IOPg that were >2 mmHg suggests that the methods are not interchangeable. Participants showed no clear preference for either method of tonometry.