Introduction

In recent years, emphasis on patient-centered care has led to increased use of patient-reported outcomes (PROs) in health research. A PRO is “any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else” [1]. Patient-reported outcome measures (PROMs) are instruments that measure PROs [2]. Trialists have used vision-related PROMs, such as the National Eye Institute-Visual Function Questionnaire-25 (NEI-VFQ-25), as either primary or secondary endpoints [3,4,5]. Although PROMs provide important information about the efficacy of interventions and the impact of disease on patient health, interpretation of PROM data has inherent challenges. In this editorial, we discuss the NEI-VFQ-25, an example of a commonly used PROM in ophthalmology trials, and the concept of minimal important difference (MID) that is frequently used to enhance the interpretability of PROM data.

The National Eye Institute-Visual Function Questionnaire (NEI-VFQ)

The NEI-VFQ-25 measures the impact of visual impairment on health-related quality of life [6,7,8]. Each item of the PROM measures impairment, activity limitation, or participation restriction, following the World Health Organization’s International Classification of Functioning, Disability and Health framework for measuring the health-related impact of disease [6]. Mangione et al. developed the original 51-item NEI-VFQ in a cohort of patients with age-related cataracts, age-related macular degeneration, diabetic retinopathy, primary open-angle glaucoma, cytomegalovirus retinitis, or low vision from any cause [7, 8]. Subsequently, the group shortened the NEI-VFQ to 25-item, 9-item, and 8-item versions of the PROM [9,10,11].

The NEI-VFQ-25 has three domains—general health and vision (4 items), difficulty with activities (12 items), and responses to vision problems (9 items)—that include items related to near vision, distance vision, driving, peripheral vision, ocular pain, role limitations, dependency, social function, and mental health. Scores for the NEI-VFQ-25 range from 0 to 100, where 100 represents the best possible score. The NEI-VFQ-25, a reliable and validated instrument, can be interviewer- or self-administered, and is translated into over 30 languages [12].

The challenge of interpreting PRO data

Consider a clinical trial that compares a new anti-VEGF agent for neovascular age-related macular degeneration with a currently approved anti-VEGF agent. The objective of this trial is to demonstrate non-inferiority of the new agent to the current standard of care in terms of visual acuity in the study eye. Trialists administer the NEI-VFQ-25 at baseline and at the primary endpoint. What if the trial reports non-inferiority in visual acuity at one year and a mean difference of 6.3 points (p < 0.05) in NEI-VFQ-25 scores between trial arms? How should clinicians interpret these results?

Although the mean difference in PROM score is statistically significant, it is unclear whether a difference of 6.3 points on the NEI-VFQ-25 is trivial, or if this difference is meaningful to patients. Given this challenge, an MID can be utilized to facilitate the interpretation of NEI-VFQ-25 scores.
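As a minimal sketch of how an MID could support this interpretation, the Python snippet below compares the confidence interval around a between-arm difference with an MID. The 6.3-point difference comes from the scenario above. The confidence interval and the MID of 5.0 points are hypothetical placeholders chosen only for illustration, not published values, and the three-way classification is one simple convention that assumes higher scores favour the new treatment.

```python
def interpret_difference(ci_low, ci_high, mid):
    """Classify a between-arm difference by where its confidence
    interval sits relative to the MID (higher score = better)."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if mid <= 0:
        raise ValueError("mid must be positive")
    if ci_low >= mid:
        # The whole interval is at or above the MID.
        return "likely important to patients"
    if ci_high < mid:
        # The whole interval is below the MID.
        return "likely smaller than the MID"
    # The interval contains the MID, so either reading is plausible.
    return "inconclusive: interval spans the MID"


# Scenario value from the text.
mean_difference = 6.3
# Hypothetical values, invented for illustration only.
hypothetical_ci = (1.1, 11.5)
hypothetical_mid = 5.0

verdict = interpret_difference(*hypothetical_ci, hypothetical_mid)
print(f"Difference {mean_difference} points: {verdict}")
```

Under these assumed values the point estimate exceeds the MID but the interval spans it, so statistical significance alone would not settle whether the difference matters to patients.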

What is an MID and how do we assess its credibility?

In 1989, Jaeschke et al. coined the term MID, defined as the smallest difference in outcome that patients perceive as important [13]. Two approaches are widely used to estimate MIDs: distribution-based and anchor-based methods [14]. To calculate an MID using a distribution-based method (e.g., 0.5 standard deviation method [15]), researchers rely on statistical characteristics of the study sample, the measurement properties of the PROM, or the statistical significance of the observed change in PROM score. As such, MIDs estimated using distribution-based methods do not necessarily represent differences in PROM scores that are meaningful to patients. Rather, distribution-based MIDs identify a difference in PROM score that is unlikely to be due to random measurement error [16]. In contrast, when using the anchor-based approach to calculate MIDs, researchers relate changes in PROM score to another meaningful external criterion (or anchor) such as a transition item (e.g., How is your vision now compared with how it was before your cataract surgery: much worse, a little worse, same, a little better, or much better? [17]). Thus, when using anchor-based methods to estimate an MID, researchers ascribe meaning to observed changes in PROM score.
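The contrast between the two approaches can be sketched in a few lines of Python. The patient records below are invented for illustration only. The anchor-based estimate here uses the mean change among patients who rated themselves "a little better", which is just one of several anchor-based calculations and is an assumption of this sketch rather than a method prescribed above.

```python
from statistics import mean, stdev

# Hypothetical records, invented for illustration only:
# (baseline PROM score, follow-up PROM score, transition rating)
patients = [
    (60.0, 61.0, "same"),
    (55.0, 60.0, "a little better"),
    (70.0, 77.0, "a little better"),
    (48.0, 54.0, "a little better"),
    (65.0, 80.0, "much better"),
    (72.0, 70.0, "same"),
    (58.0, 52.0, "a little worse"),
    (50.0, 68.0, "much better"),
]


def distribution_based_mid(baseline_scores, fraction=0.5):
    """0.5 SD method: a fraction of the sample standard deviation.
    Uses only the spread of scores, not patients' judgments."""
    return fraction * stdev(baseline_scores)


def anchor_based_mid(records, anchor_category="a little better"):
    """Mean change in PROM score among patients who chose the
    smallest 'improved' category on the transition item."""
    changes = [post - pre for pre, post, rating in records
               if rating == anchor_category]
    if not changes:
        raise ValueError(f"no patients rated {anchor_category!r}")
    return mean(changes)


baseline = [pre for pre, _, _ in patients]
print(f"Distribution-based MID: {distribution_based_mid(baseline):.1f}")
print(f"Anchor-based MID: {anchor_based_mid(patients):.1f}")
```

The two numbers need not agree: the first reflects only the variability of the sample, whereas the second is tied to what patients themselves reported as a small improvement.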

Although anchor-based MIDs are preferred to distribution-based MIDs, some anchor-based MID estimates may be more (or less) trustworthy than other MID estimates. The trustworthiness of an anchor-based MID is limited by the choice of anchor—its validity, reliability, and relationship with the PROM—and the statistical method used to calculate the MID [16, 18].

There are five key criteria that researchers may use to assess the credibility of an anchor-based MID [18]:

  1. Did the patient respond directly to the PROM and the anchor?

  2. Is the anchor easily understandable and relevant for patients?

  3. Has the anchor shown a good correlation with the PROM?

  4. Is the MID precise?

  5. Does the threshold or difference between groups on the anchor used to estimate the MID reflect a small but important difference?

It is imperative that researchers, clinicians, and decision makers using PROMs be able to identify MIDs with high credibility; use of an untrustworthy MID for sample size estimation may undermine the results of an otherwise well-designed randomized controlled trial (RCT) or lead to the flawed interpretation of PROM data in trials and systematic reviews.
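To make the sample size concern concrete, the sketch below uses the standard normal-approximation formula for a two-arm parallel trial comparing means, n per arm = 2(z_{1-α/2} + z_{1-β})² σ² / δ², with the target difference δ set to the MID. The standard deviation of 15 points and the MIDs of 5 and 10 points are hypothetical values chosen for illustration, not estimates for the NEI-VFQ-25.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(sd, mid, alpha=0.05, power=0.80):
    """Participants per arm needed to detect a difference equal to
    the MID, using a two-sided test and a normal approximation."""
    if sd <= 0 or mid <= 0:
        raise ValueError("sd and mid must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / mid ** 2)


# Hypothetical inputs, invented for illustration only.
assumed_sd = 15.0
for assumed_mid in (5.0, 10.0):
    n = n_per_arm(assumed_sd, assumed_mid)
    print(f"MID = {assumed_mid}: {n} participants per arm")
```

Because the required sample size scales with 1/MID², doubling the assumed MID cuts the sample to roughly a quarter. A trial planned around an MID that is too large would therefore be underpowered to detect the smaller difference that patients actually consider important.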

MIDs for the NEI-VFQ-25

We queried PROMID (www.promid.org), a living inventory of anchor-based MIDs, for PROMs evaluated in patients with ophthalmic conditions [19]. As of September 2021, researchers reported 128 anchor-based MID estimates, including MIDs for the Visual Function Index (28%; 36/128) and the NEI-VFQ-25 (45%; 58/128). Fifty-eight MIDs estimated for the NEI-VFQ-25 are reported in four primary publications: three studies in patients with glaucoma [3, 20, 21], and the fourth study in patients with intermediate or posterior uveitis [22].

Overall, available MIDs for the NEI-VFQ-25 are of low credibility. These estimates are greatly limited by the researchers’ choice of anchor and threshold used to estimate an MID, failure to report the correlation between the PROM and the anchor, and imprecision of MID estimates. Except for Burr et al., who used the EQ-5D-3L as an anchor [21], researchers typically estimated MIDs using the best-corrected visual acuity score [3, 20, 22] or the mean deviation from the visual field [3, 20] as an anchor. Thus, patients did not directly respond to the anchor, nor was the anchor judged to be intuitively understandable for patients. In two studies (50%; 29/58 MIDs) researchers failed to report the correlation between the PROM and the anchor, while the third group of researchers reported a correlation of ≥0.3 (43%; 25/58 MIDs). Such insufficient reporting limits assessors’ ability to make judgments about the relationship between the PROM and anchor, which is critical for establishing credibility. In most cases, authors reported measures of precision associated with the MID, albeit with marked variability in point estimates. Finally, despite the goal to determine an MID, the thresholds on the anchor used to estimate the MID did not actually reflect a difference that is small in magnitude yet important to patients.

To calculate high-credibility MIDs for the NEI-VFQ-25, we recommend that researchers use an anchor that is easily understandable and relevant to patients, and ensure the PROM and anchor measure closely related concepts. To demonstrate the validity of the anchor, the correlation between the PROM and anchor should be at least 0.5 [23]. Researchers should also select a threshold on the anchor that reflects an MID, and ideally, provide empirical evidence to justify their choice of threshold.
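The correlation check can be sketched as follows. The 0.5 threshold is the one recommended above. The data, the numeric coding of the transition ratings (-2 for "much worse" through 2 for "much better"), and the use of a Pearson coefficient are assumptions of this illustration; a rank-based coefficient may suit an ordinal anchor better.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def anchor_is_adequate(anchor_codes, prom_changes, threshold=0.5):
    """True if the anchor correlates with PROM change at or above
    the threshold."""
    return pearson_r(anchor_codes, prom_changes) >= threshold


# Hypothetical data, invented for illustration only.
anchor_codes = [-1, 0, 0, 1, 1, 1, 2, 2]     # coded transition ratings
prom_changes = [-6, 1, -2, 5, 7, 6, 15, 18]  # change in PROM score

r = pearson_r(anchor_codes, prom_changes)
print(f"r = {r:.2f}, adequate: {anchor_is_adequate(anchor_codes, prom_changes)}")
```

Reporting the coefficient itself, and not only whether it clears the threshold, lets readers judge the relationship between the PROM and the anchor for themselves.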

Conclusion

Researchers considering the use of PROs in evidence-based decision making should ensure that PROMs have adequate measurement properties and are appropriate for the clinical context in which they are used. MIDs are valuable for interpreting PRO-based evidence from individual trials and meta-analyses, and for sample size estimation. Despite the widespread use and reporting of vision-related PROMs such as the NEI-VFQ-25, these data should be interpreted with caution because high-credibility MIDs are lacking. Researchers estimating MIDs for the NEI-VFQ-25 should select easily interpretable, patient-important anchors with a strong correlation with the PROM, and ensure that the threshold selected to estimate an MID is appropriate.

Copyright © 2018 McMaster University, Hamilton, Ontario, Canada.

The Minimal Important Difference Credibility Assessment Tool and Minimal Important Difference Inventory, authored by Dr Tahira Devji et al., is the copyright of McMaster University (Copyright © 2018, McMaster University, Hamilton, Ontario, Canada). The Minimal Important Difference Credibility Assessment Tool and Minimal Important Difference Inventory have been provided under license from McMaster University and must not be copied, distributed or used in any way without the prior written consent of McMaster University.

Contact the McMaster Industry Liaison Office at McMaster University, email: milo@mcmaster.ca for licensing details.