Introduction

A corneal graft is one of the most frequent transplantation procedures performed worldwide. There were 4504 corneal transplants performed in the United Kingdom (UK) in the year 2019–2020, of which 35% were for Fuchs endothelial dystrophy (FED), 17% were re-grafts, 13% were for pseudophakic bullous keratopathy (PBK) and 13% were for keratoconus (KC) [1]. Endothelial keratoplasty (EK) rather than penetrating keratoplasty (PK) was performed in 97% of eyes with FED. The EK techniques are sub-divided into Descemet Stripping Endothelial Keratoplasty (DSEK) and Descemet Membrane Endothelial Keratoplasty (DMEK). Several factors influence the risk of graft failure, such as perioperative surgical complications, glaucoma, and allograft rejection, which is in turn affected by the number of prior transplants and corneal vascularisation [2,3,4,5]. Additional surgeon-specific risks for transplant failure are not well described but include surgeon experience, case workload, clinical support, and surgical competence [4].

The audit of transplant outcomes in the UK is an essential component of medical revalidation, and the NHS Standard Contract also mandates participation in clinical audits [6]. Regulatory authorities require eye banks to maintain records to ensure the traceability of all ocular tissues used for transplantation. Surgeons undertaking corneal transplantation must complete a standardised data collection form at the time of surgery; they are also strongly encouraged to return follow-up forms at 1, 2, and 5 years post-operatively. Transplant details include the primary corneal diagnosis, transplant type, previous ocular surgery, potential risk factors for transplant failure, transplanting centre and consultant. Follow-up data reports on critical events such as allograft rejection episodes, graft failure, visual acuity, and patient death. Corneal transplant outcomes are analysed for the three most frequent indications (FED, KC and PBK) and reported annually to each surgeon and transplant centre in the UK. A reference standard or benchmark is required to effectively audit outcomes, such as a national rate or an agreed clinical standard.

In addition to visual outcome, the rates of graft rejection and failure at a given time point post-transplant are used to quantify and compare the success of corneal transplantation. The standard analysis of these rates uses a time-to-event methodology, also known as survival analysis, which allows patients with incomplete follow-up data to be incorporated into survival rates. Kaplan–Meier estimates can be used for univariable rate comparisons, which are most effective when comparing a small number of independent groups, e.g., different graft types. When comparing numerous groups, e.g., multiple surgeons, Kaplan–Meier plots are not easy to interpret as multiple lines may overlap, and it is not possible to compare individual data to pooled national rates due to a lack of independence between the groups. In contrast, a funnel plot can display transplant activity and graft survival rates for each surgeon and compare these to the national rate and its associated 95% and 99.8% confidence intervals. An unadjusted funnel plot reports the observed graft failure rate for each surgeon, but a disadvantage of this approach is that a surgeon with a low-risk case mix is likely to have a graft failure rate that is unfairly favourable when compared to the national average. Conversely, a surgeon with a high-risk case mix may be unfairly penalised. Using a risk-adjusted funnel plot can mitigate the effect of an unequal case mix. Based on each surgeon’s case mix, the expected survival is compared with their observed survival, and the survival rate reported on the funnel plot is adjusted accordingly. Visualisation of this data facilitates the interpretation of information for clinical audits and aids decision making [7, 8]. To illustrate the two methods, we compared Kaplan–Meier survival curves and funnel plots as tools for monitoring surgeon outcomes following the first corneal transplantation for FED.

Materials and methods

Data collection and cohort selection

Data were collected by NHS Blood and Transplant (NHSBT) for the UK Transplant Registry. To reduce potential variation in surgeon case mix, we only included patients with FED who received a first corneal transplant (PK or EK) in either eye performed in the UK between January 2012 and December 2017. A transplant was recorded according to the consultant who was allocated the tissue, although surgery could have been performed by a trainee under their supervision. We did not require that a surgeon had performed surgery throughout the 6-year audit period. All data were anonymised. In addition to the real data set, we also sampled this data to generate a simulated set of transplant data for a hypothetical surgeon. We used this simulated data to illustrate the two methods below. We performed a complete case analysis; hence, transplants with no follow-up data or with missing risk factor data were excluded. The Ethics Committee of Moorfields Eye Hospital classified the study as an audit and exempt from review. The research methods and the analysis plan adhered to the tenets of the Declaration of Helsinki and UK law in relation to data collection and medical research.

Primary outcome

We analysed outcomes at 2 years post-transplant. The primary outcome measure was graft failure, caused by primary graft failure (no evidence of graft function post-transplant), irreversible rejection, infection, endothelial decompensation, a subsequent record in the database of a replacement graft in the same eye, or an unknown cause of failure. Individuals who died with a functioning graft were censored at the time of death, and patients with a functioning graft at the end of the observation period were censored at the last known follow-up.

Statistical methods

Kaplan–Meier plot

We used a Kaplan–Meier plot to show the change in graft survival over time from transplant until 2 years of follow-up. We did not use the log-rank test because it assumes independent groups, whereas each individual surgeon’s data contribute to the pooled national data.

Funnel plot

A funnel plot shows individual survival rates plotted against the number of transplants performed (n) at a specified time point after transplantation. The funnel shape of the confidence limits reflects the increasing statistical precision as the number of transplants increases along the horizontal axis. In our example, we superimposed this with the national survival rate at 2 years post-transplant (estimated using the Kaplan–Meier method) and the 95 and 99.8% confidence limits around the national rate (calculated using the score method for the binomial distribution) [9]. We used the score method rather than the more commonly used normal approximation because the normal approximation is unsuitable for rates of 0 or 100% or a small n.
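The funnel-shaped limits can be sketched as follows. This is a hedged illustration that assumes the "score method" corresponds to the Wilson score interval evaluated at the national rate for each value of n; the function name `score_limits` is hypothetical, and the inputs (a national rate of 88% and n = 69) are taken from the Results purely as example values.

```python
from math import sqrt
from statistics import NormalDist


def score_limits(rate, n, level):
    """Wilson score limits around `rate` for a sample of size `n`.

    rate  : reference proportion (e.g. national 2-year graft survival)
    n     : number of transplants
    level : two-sided coverage, e.g. 0.95 or 0.998
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    denom = 1.0 + z * z / n
    centre = (rate + z * z / (2 * n)) / denom
    half = z * sqrt(rate * (1 - rate) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


national_rate = 0.88  # example value taken from the Results
for n in (10, 30, 69, 126):
    lo95, hi95 = score_limits(national_rate, n, 0.95)
    lo998, hi998 = score_limits(national_rate, n, 0.998)
    print(n, round(lo95, 3), round(hi95, 3), round(lo998, 3), round(hi998, 3))
```

Plotting these limits against n gives the funnel: the interval narrows as n grows, and, unlike the normal approximation, the limits stay within 0 and 100% for small n.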

A funnel plot assumes that the mix of patients for each surgeon is the same as the national case mix. We compared two different methods to address this assumption:

  1. We estimated unadjusted graft survival rates using the Kaplan–Meier method. We refer to this as the unadjusted funnel plot.

  2. We calculated risk-adjusted graft survival rates from a Cox regression model that included risk factors commonly associated with graft failure for FED patients (Table 1). The ratio of the observed to the expected failures was then multiplied by the national failure rate. This value was then subtracted from 100% to give the risk-adjusted graft survival rate. We refer to this as the risk-adjusted funnel plot.
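The arithmetic of the risk adjustment described above can be sketched as follows. The function name `risk_adjusted_survival` and the surgeon's figures are hypothetical, and the expected number of failures is taken as a given input here; how it is obtained from the Cox model is not reproduced in this sketch.

```python
def risk_adjusted_survival(observed_failures, expected_failures, national_failure_rate):
    """Risk-adjusted survival = 1 - (observed / expected) * national failure rate.

    observed_failures     : graft failures recorded for the surgeon
    expected_failures     : failures expected given the surgeon's case mix
                            (assumed to come from a fitted risk model)
    national_failure_rate : national failure proportion at the same time point
    """
    if expected_failures <= 0:
        raise ValueError("expected_failures must be positive")
    ratio = observed_failures / expected_failures
    return 1.0 - ratio * national_failure_rate


# Hypothetical surgeon: 20 observed failures against 8 expected,
# with a national failure rate of 12% (i.e. 88% national survival).
print(round(risk_adjusted_survival(20, 8.0, 0.12), 2))
```

When observed and expected failures are equal, the ratio is 1 and the risk-adjusted rate equals the national survival rate; a ratio above 1 moves the surgeon below the national rate.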

Table 1 Clinical factors used to estimate the 2-year risk-adjusted graft survival rates for 3616 Fuchs endothelial dystrophy patients who received their first corneal transplant from 1 January 2012 to 31 December 2017.

We defined an outlier as a graft survival rate outside the 99.8% confidence limits of the national rate. Due to the sensitivity of being identified as an underperforming outlier, the graft survival rates for outlying individuals, and the number of procedures, were slightly altered without affecting the study conclusions.

Results

Between 2012 and 2017, there were 3616 first corneal transplants for FED with follow-up and complete risk factor data at 2 years, performed or overseen by 196 consultants. We excluded from the analysis 337 (8%) transplants without follow-up data and 232 (6%) transplants with missing risk factor data (Fig. 1). The return of transplant record forms was 100%, with 89% for 1-year follow-up and 86% for 2-year follow-up. The number of procedures allocated to an individual surgeon ranged from 1 to 126. The data set included 2823 (78%) DSEK, 569 (16%) DMEK and 224 (6%) PK procedures. The mean number of grafts performed was 18 for DSEK (95% CI: 15–21), 8 for DMEK (95% CI: 6–10) and 3 for PK (95% CI: 2–4).

Fig. 1: Flow chart of data exclusion pathway.

The total number of corneal transplants analysed in this study after excluding transplants with missing follow-up or risk factor data.

Kaplan–Meier plots (Fig. 2) show the estimated graft survival and 95% confidence limits versus time since transplant. The national graft survival rate was 88% at 2 years, compared with 70% for an individual surgeon (transplants, n = 69). A visual comparison of the confidence limits for these two curves suggests that graft survival for the surgeon is below the national survival rate. The unadjusted funnel plot compares all surgeons to the national rate 2 years post-transplant (Fig. 3). Of the 108 surgeons who performed ≥10 grafts, two (2%) had a graft survival rate below the 99.8% lower confidence limit. One of these two outlying surgeons was the individual described in Fig. 2, confirming the difference in graft survival suggested by the Kaplan–Meier plot.

Fig. 2: Two-year graft survival for 3616 Fuchs endothelial dystrophy patients who received their first corneal transplant from 1 January 2012 to 31 December 2017.

The left Kaplan–Meier plot shows the national pooled data, and the right compares these with a simulated individual surgeon with relatively low graft survival.

Fig. 3: Two-year graft survival for 3616 Fuchs endothelial dystrophy patients who received their first corneal transplant from 1 January 2012 to 31 December 2017.

Funnel plot A shows all surgeons’ unadjusted graft survival rates. Funnel plot B shows the risk-adjusted graft survival rates accounting for variation in surgeon case mix. We derived the national rate from Fig. 2. The simulated individual surgeon from Fig. 2 (star) had a risk-adjusted graft survival rate of 66%, slightly lower than their unadjusted rate of 70%.

Table 1 shows the Cox regression model estimates used to calculate the risk-adjusted graft survival rates at 2 years for each surgeon. After risk-adjustment, the graft survival rate of 70 (36%) surgeons stayed the same, 74 (38%) decreased and 52 (27%) increased. The median change from the unadjusted to the risk-adjusted graft survival rate was 0 percentage points (IQR: −2 to 0). Those with the greatest change in graft survival performed fewer transplants. After risk-adjustment, four surgeons were identified as outliers, suggesting that, for these surgeons, the transplant survival rate was worse than expected even after accounting for the surgeon case mix (Fig. 3). As before, the outlying surgeons included the individual described in Fig. 2.

Discussion

Reporting outcomes is a good medical practice that provides evidence that a clinical service meets benchmark standards [10]. Since 1999 the NHSBT UK Transplant Registry has collected data on over 100,000 corneal transplants. Centre-specific reports have been generated since 2010. These give surgeons an overview of their centre’s activity and outcomes, including graft survival, rejection rates and visual acuity. Initially, Kaplan–Meier survival curves were used to compare a single centre and all other centres, but this gave no information on the distribution of individual surgeon rates. In 2019, performance reports for individual surgeons were introduced that included funnel plots of 2-year graft survival outcomes. In this paper, we have demonstrated that funnel plots facilitate a comparison of the survival rates for individual surgeons versus the national rate, and they allow easier identification of outliers than Kaplan–Meier plots (Table 2). We have also shown that risk-adjustment can account for surgeon case mix by incorporating patient and transplant factors that may influence graft survival. While our unadjusted funnel plot does not distinguish the type of graft (PK or EK), the risk-adjusted model does (Table 1), although surgeon preference may determine the choice of procedure.

Table 2 Comparison between Kaplan–Meier and funnel plots for presentation of outcome data.

The task of improving health care by reporting outcomes is not straightforward. Identifying outliers based solely on statistical probability without regard for clinical circumstances may not be a reliable indicator of poor performance. The Health Quality Improvement Partnership has published quality measures relevant to the individual consultant, team, and surgical unit in the UK [11]. They have developed a protocol to manage instances where results are significantly lower than the national rate, with a definition of outliers based on a two-sided statistical approach with a 95% threshold (equivalent to a p value of 0.05) for ‘alert’ and a 99.8% (equal to a p value of 0.002) for ‘alarm’ [12]. NHSBT, supported by The Royal College of Ophthalmologists (UK), have adopted a similar approach to manage corneal transplant outliers [13]. Importantly, corneal surgeons have an opportunity to review their data annually and submit additional forms, particularly if they have been identified as an outlier. Upon reviewing the data, an outlying surgeon may also identify patients with risk factors for graft failure, a learning effect of adopting a new procedure, or an effect due to surgical supervision of trainees. Following data validation, additional clinical or statistical advice may be recommended, and this may also be an opportunity for surgeons with excellent outcomes to share techniques.

As a first iteration, we have used unadjusted funnel plots to monitor outcomes for surgeons who perform ≥10 first grafts for each indication (KC, FED and PBK) in a 6-year cycle, with results reported separately. We have only reported here the results of first grafts for FED, which excludes second eye grafts and re-grafts, and thus represents only a proportion of the total number of grafts performed by the surgeon in that period. This may explain the relatively low number of procedures for some surgeons. Notably, the threshold of ≥10 first grafts was determined from a visual inspection of the funnel plots rather than clinical evidence. Applying a threshold may avoid identifying surgeons with inconclusive results, but it also excludes 88 (45%) surgeons from this audit, some of whom may be low volume, poorly performing surgeons [10]. An alternative strategy to increase the number of surgeons above the audit threshold would be to consider all grafts performed by a surgeon and include the initial pathology as part of the risk-adjustment. Our aim is to apply risk-adjusted models to each surgeon’s graft survival rate so that the funnel plots will account for patient mix across the three main indications. We also hope to explore the performance of low volume surgeons in more detail to determine whether they should be audited separately. Finally, cases with missing risk factor data were excluded from this study, but in future applications of funnel plots we will evaluate multiple imputation methods to handle missing data, with the aim of reducing bias and improving the precision of graft survival estimates [14].

The role of surgical volume or surgeon experience in graft survival has been reported previously, although the evidence is limited to grouped surgeon data and first PK grafts [4, 15,16,17]. In KC, FED and PBK, there was no difference in transplant survival at 5 years between high volume (>30 transplants per year) and low volume surgeons (<10 transplants) [15]. Similarly, there was no difference in graft survival between centres performing more or less than 10 PK procedures per year [17]. Low-activity surgeons are unlikely to be a homogeneous group because low activity can occur for several reasons, including early-career corneal surgeons, maternity leave, or retirement during the 6-year audit cycle. Surgeons with low activity tend to have greater uncertainty surrounding their graft survival estimate, demonstrated by a wide confidence interval. Although it is not a formal statistical test, overlapping confidence intervals for an outlying surgeon’s rate and the national rate would indicate uncertainty about the true difference between the two. However, a limitation of the funnel plot is that it does not show the 95% confidence interval for an individual surgeon’s graft survival rate.

An inherent weakness of any registry study is that the accuracy of the data depends on surgeon engagement for the completeness and quality of the information collected, with a risk that poorly performing surgeons may not submit returns. Our registry has the advantage that all transplants undertaken in the UK are registered, although disparities remain between surgeons in returning follow-up forms. With both the Kaplan–Meier method and funnel plot, reporting data at 1 or 2 years also means an inherent lag in the performance monitoring process that may allow poor practice to remain undetected for a period. Whilst we have chosen to use funnel plots to monitor corneal transplant outcomes, other methods for monitoring outcomes exist, e.g., for risk of death following liver transplantation, where outlying survival rates are primarily identified using cumulative sum control chart (CUSUM) methods which monitor survival in real-time [18]. This methodology requires reporting data for graft failure or death at the first follow-up or 3 months.

In conclusion, funnel plots offer more information and context for statistical analysis than raw data or Kaplan–Meier plots. Risk-adjustment may help to account for surgeon case mix, but it is essential that the basis for the statistical adjustment is transparent and fully explained for the clinician. We have demonstrated that visualisation of data using funnel plots can facilitate the audit of corneal transplant outcomes, although there is uncertainty about the graft survival estimates of low-activity surgeons. Finally, funnel plots have previously been used to audit cataract surgery [19], and it may be feasible to use funnel plots for outcome analysis for other ophthalmic procedures in which post-surgical events, such as disease recurrence or procedure failure, can be defined.

Summary

What was known before

  • Reporting graft outcomes is a good medical practice that provides evidence that a clinical service meets benchmark standards.

  • Previous reports have used Kaplan–Meier survival analysis.

  • Survival analysis is difficult to interpret when there are multiple participating surgeons.

What this study adds

  • We have compared Kaplan–Meier and funnel plots to demonstrate graft outcomes.

  • Risk-adjusted funnel plots have advantages for the identification of surgical outliers.

  • Identifying low volume poorly performing surgeons is still problematic.