Introduction

Non-steroidal anti-inflammatory drugs (NSAIDs) are widely used for their pain-relieving effects across a variety of ailments, including osteoarthritis (OA)1. However, they are associated with significant risks, contributing to 30% of hospital admissions for adverse drug reactions2. These risks include serious gastrointestinal complications3, heightened risk of cardiovascular disease4, and renal failure5. More specifically, research indicates that 13–15% of NSAID users experience upper gastrointestinal adverse effects6, and a quarter of peptic ulcer cases may result from NSAID use alone7. Moreover, NSAIDs are associated with a 25% increased risk of cardiovascular events8. Consequently, international guidelines discourage NSAID use in certain individuals, particularly those with comorbidities or cardiovascular disease, and recommend short-term use only9,10,11,12.

Despite the above-mentioned guidelines, long-term use of NSAIDs remains prevalent. A recent study revealed that patients with hip and knee OA were prescribed NSAIDs for an average duration of approximately 16 months over three years of observation13. Moreover, the prevalence of long-term use of NSAIDs is on the rise. In 2010, 29 million (12.1% of) adults in the United States of America (USA) reported long-term use of NSAIDs (defined as a usage duration of over three months)14, representing a 41% increase since 2005. Patients with OA account for a significant portion of these users, with nearly 65% of patients with OA and chronic low back pain in the USA prescribed NSAIDs for chronic pain management15.

Despite the prevalence of long-term NSAID use for pain management, existing research provides an incomplete picture of their effects on knee osteoarthritis (KOA) symptoms and structural changes. Prior systematic reviews and meta-analyses have predominantly focused on the short-term effects of NSAIDs, covering periods of use ranging from 2 to 54 weeks (12 months)16,17,18,19,20,21,22,23,24. These findings suggest that NSAIDs may alleviate or improve KOA symptoms during this short-term period. However, questions regarding the potential long-term impact, extending beyond 54 weeks, remain unanswered by these previous meta-analyses16,17,18,19,20,21,22,23,24. A limited number of studies25,26,27 covered in these previous meta-analyses16,17,18,19,20,21,22,23,24 have explored longer-term use, specifically, durations between 80 and 144 months, but their findings have been inconsistent25,26,27. Thus, the impact of long-term use of NSAIDs on symptoms and structural changes in KOA is unclear.

These previous meta-analyses were limited in their ability to explore the long-term effects of NSAID use on KOA symptoms and structural changes, not only due to the limited number of long-term studies available for inclusion but also due to their reliance on aggregated data. Aggregated data meta-analyses summarize information from a number of studies, with those studies often employing varied statistical methods, varied exposure measures, and varied outcome measures. This approach can obscure the true association between exposure and outcomes, compromising the accuracy of the conclusions28. Conversely, analyzing individual-level data from multiple cohorts within a single, unified analysis can more accurately depict the relationship between exposure and outcome measures. In light of this, we used individual-level data from three major OA cohorts (the Osteoarthritis Initiative (OAI), the Multicenter Osteoarthritis Study (MOST), and the Cohort Hip and Cohort Knee (CHECK) study) to examine the association between long-term use of NSAIDs over 4-to-5 years and progression of symptoms and structural changes in KOA, as well as total knee replacement (TKR).

Methods

Study design, setting, and participants

We obtained publicly available data from three independent cohorts: the OAI29; the MOST30; and the CHECK study31. These cohorts primarily comprised participants with KOA or at risk of developing it, and the CHECK study additionally included some participants with or at risk of hip OA. The OAI enrolled participants at four clinical centers in the USA between 2004 and 2006. The MOST enrolled participants at two clinical centers in the USA between 2003 and 2005. The CHECK enrolled participants at ten clinical centers in the Netherlands between 2002 and 2005. There were 4,796, 3,026, and 1,002 adults in the OAI, MOST, and CHECK study cohorts, respectively. This study did not require ethical approval as it relies on already published data and does not involve a new collection of data from, or direct interaction with, human subjects. The OAI, MOST, and CHECK studies had their own appropriate ethical approvals at the time of original data collection.

We used symptomatic data (specifically, pain, disability, and stiffness) and radiographic (X-ray) data from baseline and the 4-year follow-up in the OAI study, and from baseline and the 5-year follow-up in the MOST and CHECK studies. The OAI study had limited radiographic data at time points beyond 4 years, and the MOST and CHECK studies did not have radiographic data at 4 years, which is why we used the 5-year data for those two cohorts.

Exposure

The exposure in our study was the long-term use of NSAIDs. We determined NSAID use in the OAI and MOST cohorts using the medication inventory method32. This involves participants bringing all their current medications to a study visit, where researchers document the names and other relevant details of their medications. This approach provides an accurate assessment of medication use, minimizing recall bias and allowing for drug verification. In OAI and MOST, NSAID use was defined as any prescribed use of NSAIDs (including COX2 inhibitors) by any route and in any form (including oral capsules) within the last 30 days at the time of the assessment. In the CHECK cohort, we determined NSAID use (including COX2 inhibitors) based on the information reported during the clinical interview. Participants were defined as NSAID users if they answered ‘yes’ to the question of whether they were using ibuprofen, diclofenac, naproxen, celecoxib, or rofecoxib for their hip or knee complaints at the time of the assessment.

A participant was classified as a long-term NSAID user if they met the criteria for NSAID usage, as defined above, at baseline and at all follow-up visits. The follow-up time points occurred annually over 4 years in OAI (years 1–4), annually over 5 years in CHECK (years 1–5), and every 2.5 years over 5 years in MOST (years 2.5 and 5). Based on literature indicating that many patients remain on NSAID treatment for durations averaging between 16 ± 12 months13 and 20 ± 16 months33, we assumed continuous NSAID use between visits. A participant was classified as an NSAID non-user if they did not meet the NSAID usage criteria at any visit, whether at baseline or during follow-up.
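The classification rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study: the function name, the labels, and the choice to leave participants with any missing visit unclassified are assumptions made for the example.

```python
from typing import Optional, Sequence


def classify_exposure(visits: Sequence[Optional[bool]]) -> Optional[str]:
    """Classify a participant from NSAID use at baseline and each follow-up.

    `visits` holds True (NSAID use reported), False (no use), or None
    (visit missing), ordered from baseline to the last follow-up.
    """
    # Assumption: a participant with no visits or any missing visit
    # cannot be classified and is left out of both groups.
    if not visits or any(v is None for v in visits):
        return None
    if all(visits):
        return "long_term_user"  # use reported at baseline and every follow-up
    if not any(visits):
        return "non_user"  # use reported at no visit
    return None  # intermittent use: belongs to neither group
```

For example, a hypothetical OAI participant who reported use at baseline and at years 1 to 4 would yield "long_term_user", whereas one who reported use at only some visits would belong to neither group.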

Outcomes

We investigated the association of NSAID use with KOA outcomes across two domains over a 4-to-5-year follow-up: symptoms and structural changes. Specifically, we analyzed five binary outcomes: three pertaining to symptoms of KOA and two to structural changes. While our primary focus was on binary outcomes, we have also included average changes in symptom severity from baseline to the 4-to-5-year follow-up as secondary, continuous outcomes for supplementary insight, as detailed in the “Statistical analyses” section below.

Symptoms of KOA

The Western Ontario and McMaster Universities Arthritis Index (WOMAC) was used to assess changes in KOA symptoms, specifically pain, disability, and stiffness. WOMAC scores ranged from 0 to 20 for pain, 0 to 68 for disability, and 0 to 8 for stiffness. Higher scores indicate worse symptoms. The minimal clinically important difference (MCID) for symptom worsening was defined as ≥ 6.4 normalized units (NU, 0–100 scale) for pain, ≥ 10.3 NU for disability, and ≥ 2.9 NU for stiffness34. We examined three binary outcomes related to KOA symptoms: worsening pain, worsening disability, and worsening stiffness, each defined as a worsening that met or exceeded the respective MCID value.
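As an illustration of how a raw WOMAC change maps onto the MCID thresholds, the sketch below rescales each subscale to 0–100 and compares the change with the threshold. It assumes a simple linear rescaling by the subscale maximum, which the text does not specify; the function and variable names are hypothetical.

```python
# Subscale maxima and MCID thresholds as stated in the text.
WOMAC_MAX = {"pain": 20, "disability": 68, "stiffness": 8}
MCID_NU = {"pain": 6.4, "disability": 10.3, "stiffness": 2.9}


def to_normalized_units(raw_score: float, subscale: str) -> float:
    """Rescale a raw WOMAC score to 0-100 (assumed linear rescaling)."""
    return 100.0 * raw_score / WOMAC_MAX[subscale]


def worsened_beyond_mcid(baseline_raw: float, followup_raw: float, subscale: str) -> bool:
    """True if the increase from baseline to follow-up is at least the MCID."""
    change = to_normalized_units(followup_raw, subscale) - to_normalized_units(
        baseline_raw, subscale
    )
    return change >= MCID_NU[subscale]
```

Under this assumption, a rise in raw pain score from 4 to 6 corresponds to 10 NU and counts as worsening, whereas a rise from 4 to 5 (5 NU) does not.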

Structural changes in KOA

Structural changes were assessed by radiographic KOA severity grade and by TKR. We used Kellgren/Lawrence (K/L) grades35, which range from 0 to 4, to evaluate KOA severity. Higher K/L grades indicate more severe KOA. We considered two binary outcomes for structural changes: worsening in radiographic KOA severity grade, and incidence of TKR. The outcome of worsening in radiographic KOA severity grade was defined as a knee with an increase of ≥ 1 K/L grade at the 4-to-5-year follow-up compared to baseline. We did not define an increase from K/L grade 0 at baseline to K/L grade 1 at the 4-to-5-year follow-up as a worsening in radiographic KOA because K/L grade 1 is not uniformly considered as KOA35. In addition, we excluded knees with a baseline K/L grade of 4, the highest possible grade, from the analysis of radiographic KOA worsening, as no further deterioration could be detected for these knees. The outcome of incidence of TKR was defined as a knee undergoing TKR at any time point between baseline and the 4-to-5-year follow-up.
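The radiographic worsening rule, including its two special cases, can be expressed as a small function. This is a sketch for illustration only; the function name and the use of `None` to mark excluded knees are assumptions.

```python
from typing import Optional


def kl_worsening(baseline_grade: int, followup_grade: int) -> Optional[bool]:
    """Binary radiographic worsening for one knee, or None if the knee is excluded."""
    for grade in (baseline_grade, followup_grade):
        if grade not in range(5):
            raise ValueError("K/L grades range from 0 to 4")
    if baseline_grade == 4:
        return None  # excluded: no further deterioration can be detected
    if baseline_grade == 0 and followup_grade == 1:
        return False  # grade 0 to 1 is not counted as worsening
    return followup_grade - baseline_grade >= 1
```

With this rule, a knee moving from grade 0 to grade 2 or from grade 1 to grade 2 counts as worsening, while a knee moving from grade 0 to grade 1 does not.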

Inclusion criteria

We only included participants with complete data at baseline for the covariates used in our analyses. Participants also needed to have enough data available to be classified as long-term NSAID users or NSAID non-users to be included in our analyses (Fig. 1).

Figure 1

Selection of participants. * NSAID use reported in the last 30 days at the time of the assessment at baseline and annually over 4 years in OAI; in the last 30 days at the time of the assessment at baseline and every 2.5 years over 5 years in MOST; and at the time of the assessment at baseline and annually over 5 years in CHECK. BMI, body mass index; CHECK, Cohort Hip and Cohort Knee; MOST, Multicenter Osteoarthritis Study; NSAID, non-steroidal anti-inflammatory drug; OAI, Osteoarthritis Initiative; WOMAC, Western Ontario and McMaster Universities Arthritis Index.

Statistical analyses

We performed analyses of individual-level data from specific OA cohorts within a single, unified analysis to estimate the associations between the long-term use of NSAIDs and the above-mentioned outcomes over 4-to-5 years. As a statistical model, we used generalized estimating equations with a logistic link function (i.e., logistic regression), clustering the left and right knee of each participant. The analyses were adjusted for the following covariates: sex, race, and baseline values of age, body mass index (BMI), smoking status, comorbidity score, walking for leisure or activity, WOMAC pain score, WOMAC disability score, WOMAC stiffness score, OA severity by K/L grade, and study cohort (OAI, MOST, or CHECK). These covariates were selected prior to the analyses because previous research has found them to be associated with KOA. In addition, we compared the baseline characteristics of long-term NSAID users and NSAID non-users by calculating absolute standardized mean differences (SMDs) between the groups. An SMD > 0.10 indicates a possible imbalance for that characteristic between groups36.
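For a continuous baseline characteristic, an absolute SMD can be computed as sketched below. The text does not give the exact formula used, so this example assumes the common definition that divides the absolute difference in means by the square root of the average of the two group variances. The function name and the toy values are hypothetical and are not study data.

```python
import math
import statistics
from typing import Sequence


def absolute_smd(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Absolute standardized mean difference for a continuous variable.

    Assumed formula: |mean_a - mean_b| / sqrt((var_a + var_b) / 2),
    using sample variances.
    """
    mean_difference = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_sd = math.sqrt(
        (statistics.variance(group_a) + statistics.variance(group_b)) / 2
    )
    return abs(mean_difference) / pooled_sd


# Toy values for illustration only (not taken from the cohorts).
toy_users = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_non_users = [2.0, 3.0, 4.0, 5.0, 6.0]
possible_imbalance = absolute_smd(toy_users, toy_non_users) > 0.10
```

The SMD is expressed in standard deviation units, so the 0.10 threshold can be applied to characteristics measured on different scales.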

As mentioned above, we assessed the presence or absence of significant symptoms and structural changes using binary, primary outcomes. In addition to these binary, primary outcomes, we also evaluated the severity of symptoms over time through secondary, continuous outcomes, as measured by the WOMAC scores, specifically WOMAC pain, WOMAC disability, and WOMAC stiffness scores. These scores were reported as average differences between baseline and the 4-to-5-year follow-up, both in adjusted and unadjusted forms. We employed linear mixed models for the adjusted estimations of average differences for WOMAC scores, using the participant as a random effect since we included the left and right knee of each participant.

We used STATA/BE 17.0 for Windows (64-bit x86-64) for our analyses. We adjusted the significance level for multiple testing (i.e., Bonferroni adjustment); therefore, we set our threshold for statistical significance as a two-tailed P-value of less than 0.01 (0.05/5 primary outcomes).
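The Bonferroni threshold is simply the nominal significance level divided by the number of primary outcomes. A minimal sketch of that arithmetic follows; the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni adjustment."""
    return alpha / n_tests


def is_significant(p_value: float, alpha: float = 0.05, n_tests: int = 5) -> bool:
    """True if a two-tailed P value falls below the adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)
```

Under this threshold, a P value that would pass the conventional 0.05 cut-off, such as 0.027, does not count as statistically significant.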

Missing data analyses

At baseline, we ensured a complete dataset by excluding any entries with missing data, as detailed in Fig. 1. However, at the 4-to-5-year follow-up, we encountered instances of missing outcome data. We performed missing data analyses to assess the potential impact of missing outcome data at the 4-to-5-year follow-up on our estimates. Our tests showed that the data for those outcomes were missing not at random (MNAR), so we did not perform multiple imputation. Instead, we used best-case and worst-case scenarios by replacing the missing outcomes with positive outcomes (e.g., no worsening pain that was above the MCID value at the 4-to-5-year follow-up) and negative outcomes (e.g., worsening pain that was above the MCID value at the 4-to-5-year follow-up), respectively. Missing data analysis was not performed for the incidence of TKR due to challenges in distinguishing between the actual absence of TKR and the absence of data about TKR within the datasets used.
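The best-case and worst-case replacement can be illustrated as follows. This is a sketch under stated assumptions rather than the study's code: outcomes are coded True when worsening occurred, False when it did not, and None when the outcome is missing, and the function name is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def impute_best_and_worst(
    outcomes: Sequence[Optional[bool]],
) -> Tuple[List[bool], List[bool]]:
    """Return (best_case, worst_case) copies of a binary outcome.

    Best case: every missing outcome is set to False (no worsening).
    Worst case: every missing outcome is set to True (worsening).
    """
    best_case = [False if value is None else value for value in outcomes]
    worst_case = [True if value is None else value for value in outcomes]
    return best_case, worst_case
```

The association is then re-estimated on each completed dataset, which brackets the estimate between the two extreme assumptions about the missing outcomes.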

Sensitivity analysis

As a sensitivity analysis, we matched participants in the long-term NSAID users group with participants in the NSAID non-users group, employing 1:1 nearest-neighbor matching with propensity scores. The propensity scores were calculated based on the variables that we used for adjusting the estimates for our main analyses. These variables were sex, race, and baseline values of age, BMI, smoking status, comorbidity score, walking for leisure or activity, WOMAC pain score, WOMAC disability score, WOMAC stiffness score, osteoarthritis severity as indicated by K/L grade, and study cohort (OAI, MOST, or CHECK). After propensity score matching, we compared the characteristics of participants in the long-term NSAID user and NSAID non-user groups at baseline by calculating absolute standardized mean differences (SMDs) between groups to determine if there were significant differences between the groups. As in the main analysis, we considered an SMD > 0.10 to indicate a possible imbalance for that characteristic between groups. We then compared the outcomes over 4-to-5 years between the groups using the same statistical methods as in the main analyses.
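To make the matching step concrete, the sketch below pairs each long-term user with the non-user whose propensity score is closest. It assumes that the propensity scores have already been estimated, that matching is greedy and without replacement, and that no caliper is applied; none of these details is stated in the text. The identifiers and scores are hypothetical.

```python
from typing import Dict, List, Tuple


def match_nearest_neighbor(
    user_scores: Dict[str, float], non_user_scores: Dict[str, float]
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    Each non-user can be matched at most once (matching without
    replacement, an assumption made for this sketch).
    """
    available = dict(non_user_scores)  # copy so the input is not modified
    pairs = []
    for user_id, user_score in user_scores.items():
        if not available:
            break  # no non-users left to match
        closest_id = min(available, key=lambda c: abs(available[c] - user_score))
        pairs.append((user_id, closest_id))
        del available[closest_id]
    return pairs


# Hypothetical propensity scores for illustration only.
toy_users = {"u1": 0.30, "u2": 0.62}
toy_non_users = {"n1": 0.10, "n2": 0.33, "n3": 0.60, "n4": 0.90}
toy_pairs = match_nearest_neighbor(toy_users, toy_non_users)
```

In this toy example, each user is paired with the non-user whose score lies closest, and unmatched non-users are dropped from the matched sample.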

Results

Participants

Our study included 4,197 participants, of whom 435 were long-term NSAID users and 3,762 were non-users. Long-term NSAID users and NSAID non-users were similar in age (range 45–79 years), race, smoking status, comorbidity scores, and walking status for leisure or exercise. However, long-term NSAID users were more likely than non-users to be female and heavier, had higher scores for symptoms of pain, disability, and stiffness, and were more likely to have radiographic KOA (i.e., K/L grade ≥ 2) (Table 1).

Table 1 Baseline characteristics of participants, by long-term use of non-steroidal anti-inflammatory drugs (NSAIDs).

Outcomes

Symptoms of KOA

At the 4-to-5-year follow-up, compared to non-users, long-term NSAID users had an increase in adjusted mean pain score of 3.95 (95% confidence interval [CI]: 2.68–5.21), adjusted mean disability score of 5.37 (95% CI: 3.98–6.76), and adjusted mean stiffness score of 4.34 (95% CI: 2.73–5.94) (Table 2).

Table 2 Association of long-term use of non-steroidal anti-inflammatory drugs (NSAIDs) with symptoms and structural changes in knee osteoarthritis over 4-to-5 years.

At the 4-to-5-year follow-up, 1,292 (17.22% of) non-users and 264 (30.34% of) long-term NSAID users had worsening pain scores that were above the MCID value. The odds ratio for worsening pain scores that were above the MCID value was 2.04 (95% CI: 1.66–2.49). Increased odds of worsening were also observed for the disability and stiffness scores. At the 4-to-5-year follow-up, 901 (12.06% of) non-users and 194 (22.67% of) long-term NSAID users had worsening disability scores that were above the MCID value, and 1,941 (25.89% of) non-users and 309 (35.52% of) long-term NSAID users had worsening stiffness scores that were above the MCID value. The odds ratio for worsening disability scores that were above the MCID value was 2.21 (95% CI: 1.74–2.80), and for worsening stiffness scores it was 1.58 (95% CI: 1.29–1.93) (Table 2).
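As a rough plausibility check, a crude (unadjusted) odds ratio can be derived from the two reported proportions alone. The sketch below is illustrative and is not the adjusted generalized estimating equation model used in the study; the function name is hypothetical.

```python
def crude_odds_ratio(proportion_exposed: float, proportion_unexposed: float) -> float:
    """Unadjusted odds ratio from the proportion with the outcome in each group."""
    odds_exposed = proportion_exposed / (1.0 - proportion_exposed)
    odds_unexposed = proportion_unexposed / (1.0 - proportion_unexposed)
    return odds_exposed / odds_unexposed


# Worsening pain above the MCID: 30.34% of long-term users vs 17.22% of non-users.
crude_or_pain = crude_odds_ratio(0.3034, 0.1722)
```

For worsening pain, this gives a crude odds ratio of roughly 2.09, close to the adjusted estimate of 2.04. Some difference is expected because the crude figure ignores the covariate adjustment and the clustering of knees within participants.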

Structural changes in KOA

At the 4-to-5-year follow-up, long-term NSAID users had significantly increased odds of worsening in radiographic KOA severity grade (OR: 1.43, 95% CI: 1.15–1.77), and markedly increased odds of having TKR (OR: 3.13, 95% CI: 2.08–4.70), compared to non-users (Table 2).

Missing data analyses

The missing data analyses, which assessed the potential impact of missing outcomes on our estimates, also showed that NSAID users had increased odds of the outcomes. The exception was worsening in radiographic KOA severity grade in the worst-case scenario, for which the association did not reach statistical significance (P value 0.027) (Table 3).

Table 3 Missing data analyses of the potential impact of missing data on the estimations.

Sensitivity analysis

In the sensitivity analysis, in which we used propensity score matching, the matched data were well balanced between the long-term NSAID users group and the non-users group (Fig. 2). All the estimates were similar to or the same as those in the main analysis, with the exception of worsening in radiographic KOA severity grade over the follow-up period (Table 4), for which the odds ratio did not achieve statistical significance (OR: 1.25, 95% CI: 0.94–1.65, P value: 0.12).

Figure 2

Summary of balance for matched data. Here, ‘distance’ quantifies the level of similarity or difference in key characteristics among the subjects in our study, offering insight into the extent to which individuals are comparable or distinct based on the variables selected for analysis. BMI, body mass index; K/L, Kellgren/Lawrence; OA, osteoarthritis; WOMAC, Western Ontario and McMaster Universities Arthritis Index.

Table 4 Association of long-term use of non-steroidal anti-inflammatory drugs (NSAIDs) with symptoms and structural changes in knee osteoarthritis over 4-to-5 years using 1:1 propensity score matching (PSM).

Discussion

Our 4-to-5-year longitudinal study revealed that, in individuals with or at risk of KOA, long-term use of NSAIDs is associated not only with a worsening of KOA symptoms compared to non-use, but also with increased odds of a worsening that exceeds the minimal clinically important difference, as well as with an increased incidence of TKR, without any statistically significant difference in structural degradation of the knee joint. Therefore, it appears that long-term NSAID use may accelerate the pathway to TKR by markedly exacerbating symptoms.

Previous reviews of randomized controlled trials and cohort studies showed that long-term use of NSAIDs is associated with serious adverse drug effects, such as gastrointestinal complications3, cardiovascular disease4, and renal failure5. In addition to these well-documented adverse effects, emerging evidence has suggested a potentially detrimental impact of NSAID use on joint health, particularly in terms of cartilage degeneration37,38,39,40,41,42. However, we did not detect any association of long-term use of NSAIDs with structural degradation of the joint. As our study results suggested that long-term NSAID use may increase KOA symptoms and the need for TKR, healthcare providers are advised to carefully consider the implications of long-term NSAID use for individuals with or at risk of KOA and explore alternative management strategies that may improve patient outcomes. Alternative treatments such as patient education, targeted exercise, and weight management, as recommended in the guidelines for managing KOA43,44,45,46, offer promising avenues for enhancing patient outcomes with minimal risks. Prioritizing these non-pharmacological interventions could enable patients to effectively manage their KOA symptoms without exposing themselves to the potential hazards associated with prolonged NSAID consumption.

Previous systematic reviews and meta-analyses have shown that NSAIDs may provide short-term pain relief and/or improve function in KOA, but the effects of long-term use were unclear16,17,18,19,20,21,22,23,24. Using individual-level participant data, our study suggested that long-term use of NSAIDs over 4-to-5 years may aggravate KOA symptoms. Additionally, studies on the effect of NSAIDs on structural changes in KOA report conflicting results. Some studies, including cohort, in vitro, and animal studies, report beneficial effects or no effect at all27,47,48,49. However, other studies, specifically cohort studies, indicate negative effects50,51,52. Our study showed that 4-to-5 years of NSAID use may not have detrimental effects on structural changes in knees affected by KOA. Our study extends the findings from a recent systematic review and meta-analysis that associated NSAID use with an increased risk of knee replacements53. However, that study53 included only two case–control studies specifically investigating knee replacement, and it reported high heterogeneity in the meta-analysis results, which made its findings challenging to interpret. In contrast, our analysis, using individual-level participant data, provides a more reliable conclusion that long-term use of NSAIDs is associated with increased odds of having TKR, and that this observation could be due to the experience of increased symptoms compared to NSAID non-users.

A few limitations of our study should be acknowledged. Firstly, most of our participants were white, which may limit the generalizability of our findings to other populations. Secondly, we did not adjust our results for over-the-counter painkillers and opioid use, which may have led to an overestimation of the magnitude of the risks of long-term use of NSAIDs in our results. Thirdly, while we assumed continuous NSAID use between visits based on literature on average duration of use, this assumption is a limitation of our study, as we lacked direct data to confirm or disprove continuous NSAID use among participants, and it is possible that participants used NSAIDs intermittently. Similarly, we acknowledge the lack of specific data on NSAID dosage, the duration of NSAID use prior to baseline, and the timing of NSAID prescriptions. As a fourth limitation, although the application of propensity score matching in our sensitivity analysis has improved the reliability of our findings, it is important to recognize that this method cannot eliminate the impact of unmeasured confounders. Moreover, it does not fully resolve the issue of potential bidirectional causality. Specifically, while long-term NSAID use may increase the symptoms of KOA, an increase in KOA symptoms could also lead individuals to increase their use of NSAIDs. In brief, the observational nature of our study limits causal inference. Finally, although no difference was observed in structural degeneration of the knee joint between long-term NSAID users and non-users, unidentified factors may exacerbate symptoms among NSAID users. To overcome these limitations, future research, especially through randomized controlled trials, is vital.

In conclusion, our findings suggest that long-term use of NSAIDs for the management of KOA may lead to worsening symptoms and greater progression to TKR but not structural changes in the knee joint over 4 to 5 years. Healthcare providers should weigh the benefits and risks of NSAID use in patients with KOA and consider non-pharmacological treatments, such as education, exercise, and weight loss, to improve patient outcomes.