Introduction

Total hip replacement (THR), total knee replacement (TKR) and unicompartmental knee replacement (UKR) cause short-term increases in mortality persisting for 90 days in hips1 and 45 days in knees2. Observational studies have identified surgical-related factors associated with decreased mortality. For hip replacement, these included posterior approach, thromboprophylaxis, spinal anaesthetic1 and resurfacing hip replacement3. For knee replacement, unicompartmental knee replacement was associated with decreased mortality2,4.

Surgical-related exposures may be confounded by indication despite statistical adjustment. Randomised controlled trials are the “gold standard” for inferring causality but are unfeasible when the primary outcome is rare. Patients undergoing joint replacement are pre-selected as fit enough to undergo surgery and hence are healthier than the general population. The overall mortality in this population is lower than expected from general population mortality rates, but over time this reduced risk diminishes5. Knowledge of the cause of death is helpful, as health selection would be associated with reduced mortality from respiratory causes and smoking-related cancers, causes unlikely to be influenced by any perioperative surgery-related factor.

Our aim was to determine whether previously identified “protective” factors may be due to selection, by examining cause-specific linked mortality data with extended follow-up. We hypothesised that (i) truly causal interventions which only influence the perioperative period would show an acute short-term reduction in mortality; conditional on survival to the end of this period, the subsequent mortality patterns should then be the same for both exposed and unexposed groups. We defined the perioperative period as 90 days, as this is the timepoint at which we have previously shown that the increased risk of perioperative mortality associated with the surgical intervention returns to baseline1. (ii) If healthier participants were selected for the intervention, this group would have persistently lower mortality at all time periods that was unlikely to be explicable by the intervention, though over time mortality risks would converge towards those seen in the general population due to attenuation. (iii) There may be a causal benefit of the intervention together with selection of which patients receive it. In this case, it may not be possible to differentiate causality from selection.

Methods

Patients and Data Sources

We analysed 539,372 linkable primary THRs and 589,028 linkable primary TKRs in the National Joint Registry of England and Wales (NJR), undertaken April 2003 to December 2012 inclusive and reported in the 10th Annual Report6. Dates of death were obtained from NHS Personal Demographics Service on 23rd February 2013. We excluded 261 THRs and 243 TKRs because the NHS number was untraceable, consent had been withdrawn or age or gender was ambiguous, and a further 6,182 THRs (3,091 patients) and 15,142 TKRs (7,571 patients) with simultaneous bilateral operations. 479,191 THRs and 550,787 TKRs had osteoarthritis (OA) as the only indication for surgery; these two groups were analysed separately.

Patients often had other hip or knee procedures recorded, making it difficult to describe mortality associated with one incident procedure. Furthermore, when left and right joints were replaced at different times, any subsequent death would be included twice. After exploring several strategies, analysis was based on the first primary hip or knee procedure reported in the NJR; 424,156 first THRs and 469,989 first TKRs, of which 33,759 and 36,003 respectively died on/before our censoring date (31st December 2012), as described previously7. For those who died, we obtained the main causes of death (ICD 10) from the Office for National Statistics (ONS) via NJR linkage to the patient’s Hospital Episode Statistics (HES) identifier, thereby excluding patients in the NJR with no inpatient HES records up to the end of November 2012. The latter included patients with no NHS-funded procedures recorded in the NJR, or with procedures performed only in Wales. 332,734 of the patients undergoing hip replacement (26,766 deaths) and 384,291 of the patients undergoing knee replacement (29,802 deaths) could be linked7.

Exposures

The exposures were the surgery-related interventions found to be important in our earlier short-term mortality publications: posterior surgical approach, thromboprophylaxis, spinal anaesthetic and resurfacing hip replacement (for hip replacement)1 and unicompartmental knee replacement (for knee replacement)2.

Other covariates

We also modelled age, gender, year of primary operation (2003–2005, 2006–2008, 2009–2012) and ASA grade. Where available, each procedure was linked to the patient’s HES inpatient records over a 5-year period prior to the operation date which were used to compute area deprivation quintiles8,9,10 and Charlson co-morbidity1,2. Comorbidity analysis was restricted to primary operations performed prior to the end of November 2012 as HES records beyond this date were not available. BMI was analysed separately as data were incomplete and not recorded in the early phase of NJR. Variable coding and frequency counts are shown in Supplementary Material Tables S1 and S2.

Statistical Methods

Two complementary approaches were adopted: (i) an internal and (ii) an external comparison. The internal comparison used proportional hazards regression models to model the time to death from any cause in the presence of censoring (see Supplementary Material Text S1). To capture changing hazard ratios over time, we used flexible parametric survival modelling (FPM)11,12 implemented in Stata13,14. We first sought parsimonious models with just gender and (continuous) age at operation (as four restricted cubic splines)12. We assessed time-varying effects of gender and age by adding appropriate terms to the model to represent these effects, assessing the changes with likelihood ratio tests and examination of Akaike and Bayesian information criteria (AIC and BIC respectively), giving preference to the former. We then added other risk-factor variables to the model, using a series of 0/1 indicators for each (Supplementary Material Tables S1 and S2). The effect of each risk factor on the HRs, and whether that effect changed with time, was assessed with adjustment for age and gender. A final multivariable model was constructed of all surgical risk factors, plus age, gender, ASA and year of surgery. Adjusted HRs for each risk factor were plotted. Further models additionally adjusted for comorbidity and quintiles of area deprivation.
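As a reading aid, the general form of a flexible parametric model with time-varying effects can be sketched as follows. This is the standard Royston-Parmar formulation on the log cumulative hazard scale and is an assumption about the model structure; the fitted specification is the one described in Supplementary Material Text S1. The symbols (H, s, gamma, beta, delta, D) are introduced here for illustration only.

```latex
% H(t | x): cumulative hazard at time t for covariate vector x
% s{ln t; gamma}: restricted cubic spline in log time (baseline)
% s_j{ln t; delta_j}: spline interaction giving covariate x_j a time-varying effect
\ln H(t \mid \mathbf{x})
  = s\{\ln t;\, \boldsymbol{\gamma}\}
  + \mathbf{x}^{\top}\boldsymbol{\beta}
  + \sum_{j=1}^{D} s_j\{\ln t;\, \boldsymbol{\delta}_j\}\, x_j
```

Without the final sum the model is a proportional hazards model and exp(beta_j) is a constant hazard ratio; with it, the hazard ratio for covariate x_j is allowed to change over follow-up, which is what the plotted HR curves display.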

The external comparison compared the observed mortality patterns for exposures in relation to expected mortality using national rates. In contrast to the internal comparison above, this analysis looked at one risk factor at a time, conditioning on contemporaneous age group and sex. To test for differing patterns of mortality over time we calculated Standardised Mortality Ratios (SMRs) by time interval (0–90 days, 90 days–1 year, 1–3 years, 3–5 years, 5–7 years and 7 years+) from the primary operation by dividing the observed numbers of deaths (O) in each of these intervals by the expected numbers (E). The latter were calculated from ONS age/sex mid-year populations15 and relevant deaths16 for England and Wales. We did this with respect to (i) all-cause deaths, (ii) respiratory system deaths (all ICD-10 ‘J’ codes) and (iii) deaths from cancers that were believed to be related to smoking (see Supplementary Material Text S2). We also explored cardiovascular deaths, but these results are not shown for simplicity. These cause-specific analyses could only be performed for the subsets of patients with associated HES records (332,724 hips; 384,291 knees). We plotted the temporal changes in (logged) SMR with time (see Supplementary Material Fig. S1), exploring the observed patterns with Poisson regression models, with the expected numbers of deaths as the offsets. Excluding deaths in the 90-day period post operation, the remaining intervals were coded: 90 days to 1 year = 0, 1 to 3 years = 1, 3 to 5 years = 2, 5 to 7 years = 3, 7 years plus = 4. The effect of the risk factor was assessed by fitting a group effect allowing for a common slope. Temporal divergence/convergence was tested by including a group by time period interaction. We hypothesised that if the groups shared the same slope, whether or not their intercepts differed, this would be suggestive of a causal effect. If there was a difference in the slopes, regardless of whether or not there was a difference in the intercepts, this would represent a selection effect and would most likely be observed as an improved SMR in the early period which abated with time as the benefit of the selection effect was reduced.
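The SMR calculation described above can be illustrated with a minimal sketch. The function name, the use of Byar's approximation for the 95% limits and all counts below are assumptions made for illustration; the text does not state how the confidence intervals were computed, and the numbers are hypothetical rather than study data.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def smr_with_ci(observed, expected, z=Z_95):
    """Return (SMR, lower, upper) for observed/expected deaths.

    Limits use Byar's approximation to the exact Poisson interval
    (an assumed method, not taken from the paper).
    """
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    smr = observed / expected
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (
            1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
        ) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3 / expected
    return smr, lower, upper


# Period coding for the post-90-day trend analysis, as given in the text.
PERIOD_CODES = {
    "90 days-1 year": 0,
    "1-3 years": 1,
    "3-5 years": 2,
    "5-7 years": 3,
    "7 years+": 4,
}

# Hypothetical (observed, expected) deaths for one exposure group.
example_counts = {
    "90 days-1 year": (30, 100.0),
    "1-3 years": (120, 300.0),
    "3-5 years": (150, 300.0),
}

for interval, (obs, exp) in example_counts.items():
    smr, lo, hi = smr_with_ci(obs, exp)
    code = PERIOD_CODES[interval]
    print(f"{interval} (code {code}): SMR {smr:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

In the same notation, the trend model in the text corresponds to log(mean O) = log(E) + b0 + b1*group + b2*period, where log(E) is the offset and b1 is the group effect under a common slope b2; adding a term b3*group*period gives the test for divergence or convergence between groups.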

Consent

Patient consent to participate was obtained at the point of data capture by the NJR. As this was an analysis of anonymised, routinely collected data, no separate ethical approval was required.

Results

In modelling the baseline hazard for the internal model, 4 and 3 ‘knots’ (i.e. degrees of freedom (df) = 5 and 4) sufficiently captured the hip and knee patterns seen in our earlier analyses1,2; i.e. short term increases that subsided by 90 and 45-days, respectively, thereafter increasing with time, reflecting normal mortality.

The effects of both age and gender on the hazard rate were time-varying; Supplementary Material Fig. S2 details how they were modelled, the magnitude of their effects and illustrates that the estimated cumulative mortality fitted the observed mortality well. Supplementary Material Fig. S3 details the results from adding the other surgical covariates. Supplementary Material Figs S4 and S5 document additional analyses including bearing surface and grouped BMI.

Posterior approach

The internal comparison for all-cause mortality showed a significant time-varying effect of the posterior approach compared with ‘other’ approaches (p < 0.001, likelihood ratio test); there was an attenuation of the protective hazard ratio by 2 years (Fig. 1). The external comparison for all-cause mortality found a group difference; the biggest difference was in the earliest period, after which the SMRs were fairly similar for both groups but slightly lower for the posterior approach (Table 1). There was no evidence of selection, as smoking-related cancer mortality was almost identical across the exposure groups but far less than expected due to the population being selected for an elective operation. Respiratory mortality was lower in the posterior approach group (p = 0.05).

Figure 1

Hazard rate ratios (with pointwise 95% CIs) associated with a posterior approach for hips (adjusting for time-varying effects of age, gender, year of operation, ASA, mechanical and chemical thromboprophylaxis, anaesthetic used and implant type).

Table 1 Comparison of SMRs between posterior approach and other surgical approaches for hips. (Ratios of observed (O) to expected (E) numbers of deaths obtained from national rates [with 95% CI shown in parentheses]).

Mechanical thromboprophylaxis

Internal modelling showed an initial relative decrease in risk of death when mechanical thromboprophylaxis was used; this effect diminished with time (p = 0.01) over a long period, which was more suggestive of selection than of a causal effect (Fig. 2a). The external comparison found evidence of a group effect for mechanical thromboprophylaxis (p < 0.001); this was most marked in the perioperative period, but those who received it had lower mortality at all time periods, though the time interaction was consistent with chance. There was a weak suggestion that those who received it had lower smoking-related mortality, which was more marked for respiratory system mortality (p = 0.07) (Table 2).

Figure 2

(a) and (b) Hazard rate ratios (with pointwise 95% CIs) associated with (a) mechanical and (b) chemical thromboprophylaxis for hips (adjusting for time-varying effects of age, gender, year of operation, ASA, surgical approach, anaesthetic used and implant type).

Table 2 Comparison of SMRs between those given or not given any mechanical thromboprophylaxis (i.e. ‘Yes’ vs. ‘No’) for hips. (Ratios of observed (O) to expected (E) numbers of deaths obtained from national rates [with 95% CI shown in parentheses]).

Chemical thromboprophylaxis

In the internal comparison, compared with a referent group of ‘no chemoprophylaxis’ (red), ‘aspirin only’ (black) was associated with a reduction in mortality that persisted (Fig. 2b). ‘Heparin + aspirin’ (brown dashed) was associated with an initial marked reduction that slowly receded; ‘Heparin only’ (blue) was associated with a less marked reduction that also slowly receded; a similar but again less marked effect was seen with ‘other chemoprophylaxis/other combinations’ (green).

For the external comparison, chemical thromboprophylaxis was grouped into aspirin alone, other combinations and none. The external comparison showed slightly different results: both aspirin and other combinations had lower 90-day mortality compared to none, but after this period all-cause mortality was slightly lower for aspirin (group effect p = 0.009) than for those with none, with no evidence of a time interaction (Table 3). Similarly, there was little difference in smoking-related cancer mortality for any of the groups, suggesting little or no selection. Interestingly, there was a persistent reduction in circulatory disease deaths, as aspirin had a lower SMR for all time periods than no aspirin (data not shown; group difference p = 0.007), which would be consistent with a longer-term cardioprotective effect.

Table 3 Comparison of SMRs between subgroups of chemical thromboprophylaxis for hips (‘aspirin only’, ‘others/other combinations’ (see table footnote) and ‘none’). (Ratios of observed (O) to expected (E) numbers of deaths obtained from national rates [with 95% CI shown in parentheses]).

Type of anaesthesia

The internal comparison found that compared with general anaesthetic (GA), spinal anaesthetic was more advantageous early on but this effect diminished with time, whilst spinal plus GA seemed to show persistent reduced mortality (Fig. 3). For the external comparison, anaesthesia was grouped into spinal only, GA only, spinal plus GA and other combinations. The most marked difference was for spinal plus GA versus GA where the former group had lower mortality at all time points (Table 4). This was also seen for smoking related cancers and there was evidence of a group by time interaction (p = 0.03) consistent with selection. Spinal alone compared to GA was more suggestive of a causal effect as the mortality benefits were only seen in the perioperative period and there was no suggestion of reduced smoking-related cancer mortality. Respiratory deaths were, if anything, higher in this group compared to the GA group suggesting that there may be more, rather than less, comorbidity.

Figure 3

Hazard rate ratios associated with anaesthetic for hips (adjusting for time-varying effects of age, gender, year of operation, ASA, surgical approach, mechanical prophylaxis, chemical prophylaxis and implant type). The referent group is ‘GA only’ (red line); other groups are as shown in the legend but 95% CIs have been omitted for simplification.

Table 4 Comparison of SMRs between specific anaesthetic groups for hips (‘Spinal only’, ‘GA only’, ‘Spinal + GA only’ and ‘others/other combination’ (see table footnote)). (Ratios of observed (O) to expected (E) numbers of deaths obtained from national rates [with 95% CI shown in parentheses]).

Type of hip prosthesis

The internal modelling found that the resurfacing hazard ratios remained lower than the cemented referent group over time (Fig. 4). A similar time-invariant pattern was seen with uncemented hips. In the external comparison, resurfacing was associated with reduced all cause (p-value for interaction = 0.01) and smoking-related cancer mortality that was seen over the follow-up period consistent with a selection effect (Table 5).

Figure 4

Hazard rate ratios associated with hip implant type (adjusting for time-varying effects of age, gender, year of operation, ASA, surgical approach, anaesthetic used, mechanical prophylaxis and chemical prophylaxis).

Table 5 Comparison of SMRs between hip resurfacings, uncemented total hip replacements and other types of hip implants (i.e. cemented, hybrid and reverse hybrid). (Ratios of observed (O) to expected (E) numbers of deaths obtained from national rates [with 95% CI shown in parentheses]).

Supplementary Material Fig. S4 shows results from the internal model which take into account bearing surface. Supplementary Material Fig. S5 shows results for grouped BMI.

Type of knee prosthesis

The internal comparison showed a survival benefit with the use of unicondylar knee replacements that waned over time (Fig. 5). The external comparison found a group effect (p < 0.001) though all-cause mortality was reduced across all the time periods and was not restricted to the early period. There was no evidence of reduced smoking-related cancers but respiratory deaths were reduced in the intervention group (SMR 0.30, 95% CI 0.24, 0.36 versus 0.38, 95% CI 0.36, 0.39, p = 0.01) (Table 6).

Figure 5

Hazard rate ratios for mortality after knee replacement associated with type of knee (adjusting for time-varying effects of age, gender, year group and ASA; n = 469,952). The referent group ‘cemented’ is shown as a red line; solid black, brown, green and blue denote ‘uncemented’, ‘hybrid’, ‘patellofemoral’ and ‘unicondylar’ respectively; 95% CIs shown as dashed lines in the same colours.

Table 6 Comparison of SMRs between unicondylar knees and other knee implant types. (Ratios of observed (O) to expected (E) numbers of deaths obtained from national rates [with 95% CI shown in parentheses]).

Discussion

We have used the extended mortality follow-up from the NJR to explore whether some of the previously reported “protective” surgical-related exposures are truly causally related or the result of confounding. The reduced mortality seen for patients undergoing joint replacement was markedly attenuated, though not totally abolished, over time, highlighting a healthy selection effect, a finding consistent with previous studies17,18,19,20,21.

Of seven interventions studied, we believe three interventions (hip resurfacing, combined spinal and general anaesthetic, unicondylar knee implants) showed patterns more consistent with confounding by indication, with healthier patients being more likely to have received the intervention. The evidence for this seemed strongest for hip resurfacing, where patients showed persistently reduced mortality from all causes and from smoking-related cancers. This type of hip replacement has specifically been marketed for the more active patient, reflected in the demographics observed6. Possibly earlier/better mobilisation might contribute to the decreased mortality22; however, this is unlikely to explain the observed difference in smoking-related deaths. The results for the combined spinal and GA anaesthesia group were similar: the patients had better all-cause mortality and a time interaction for smoking-related cancers. Similar results were seen for unicondylar knee replacement; although there was no difference for smoking-related cancers, the reduction in respiratory deaths suggested a selection effect.

Mechanical thromboprophylaxis was less consistent with an apparent genuine intervention-related reduction. There was little evidence that those receiving the intervention were otherwise healthier and the internal comparison found a slow waning effect of the mortality risk over time, so we remain unsure. Chemoprophylaxis with aspirin was interesting as aspirin was associated with a reduction in all-cause mortality which was sustained over time and this was mirrored in fewer circulatory disease deaths. We have assumed that any causal perioperative intervention would only exert its effect in the early postoperative period. However, it is possible that previously untreated patients with heart disease who are given aspirin perioperatively keep taking it, leading to decreased risk of mortality from cardiac events23,24. The mortality patterns for a posterior approach were ambiguous, suggesting a selection effect but also possibly a causal benefit with only short-term mortality gains. There was no evidence of lower smoking-related mortality but weak evidence of reduced respiratory deaths in the posterior approach group. This may be because the posterior approach was associated with less bleeding, tissue damage25 and better early mobilisation26, hence reduced risk of complications such as thrombosis and therefore a reduction in early deaths. In the longer term, preservation of the abductor muscles in the posterior group compared to the most common alternative (lateral/anterolateral approach) may lead to improved mobility27 and a slow attenuation and protection from respiratory deaths. There is unlikely to be a selection effect as the vast majority of hip replacements are performed by surgeons who use either the posterior or lateral/anterolateral approach with no patient-level selection. Some selection is observed amongst surgeons who perform minimally invasive or anterior approaches for patients with lower body mass index but such procedures account for only 4% of the cases recorded by the NJR28. Most convincing were the results for those receiving spinal anaesthetic. Here only perioperative mortality was reduced and, if anything, respiratory mortality was higher in the spinal group, suggesting that patients with worse lung function may have been selected for this type of anaesthesia. Spinal anaesthesia has been demonstrated to be associated with lower risk of complications such as surgical site infection, pulmonary complication, blood transfusion, thromboembolic events, prolonged length of stay and intensive care unit admission when compared to general anaesthesia29. Avoidance of these complications may therefore reduce the risk of mortality. Smaller evidence synthesis studies have only demonstrated a reduced length of stay in spinal compared to general anaesthesia30 but this may reflect the small overall sample size and the need for follow-up beyond hospital discharge to assess the effect of the type of anaesthesia on mortality; as recommended by Johnson et al.30, our analysis includes consideration of intermediate and long-term outcome.

The NJR is the largest joint replacement register in the world6. Linkage with other comprehensive national databases allows good data coverage but the data are observational and causality cannot usually be proven. However, we feel that triangulating findings across different analytical strategies points more strongly to causation and may be an analytical strategy for trying to infer causality with other large sets of observational data when it is not feasible to undertake randomised controlled trials.

Data collection occurs at the point of surgical treatment and as such, variables that may change post-operatively, such as type of thromboprophylaxis, cannot be reliably captured. Whilst the NJR does have a range of variables that attempt to adjust for case-mix/comorbidity, residual confounding always remains a possibility due to lack of good indicators of disease severity or early stage conditions, such as mild cognitive impairment or early dementia. Replication of our results in other national registries would be helpful, especially if the patient selection factors differ from the UK, for example in predominantly private health care systems.

We present further data supporting the potential causal role for aspirin chemothromboprophylaxis, posterior approach and spinal anaesthetic in THR in decreasing post-operative mortality and we recommend that patients undergoing primary THR are treated with these modalities to reduce the risk of mortality. We believe that the apparent “protective” effects of hip resurfacing, combined spinal and GA, unicondylar knee replacement and mechanical thromboprophylaxis are more likely to be due to selection, though the last two are more difficult to interpret and may have a combination of both selection and causal effects, requiring further investigation.