Introduction

Rheumatoid arthritis (RA) is a complex disease with fluctuating course of disease activity and progression1. Although treatments have improved in recent years, sustained response is still not guaranteed, with most patients failing to respond to conventional synthetic DMARD therapy, the first line of treatment. The underlying reasons for drug non-response or moderate response are multi-faceted and not well understood2. One of the aims of precision medicine in rheumatology, is to improve drug response, enabling more targeted treatment through better understanding of the aetiology3,4.

Currently, assessment of drug response in clinical drug-testing trials includes discrete scores defined by American College of Rheumatology (ACR) criteria, indicating 20%, 50% or 70% improvement in a selection of clinical parameters chosen to represent disease ailments5. European League Against Rheumatism (EULAR) response criteria can also be used, in which improvement in the disease activity score 28 (DAS28) identifies patients as good, moderate or bad responders.

Previous studies have identified heterogeneity in treatment response trajectories in RA patients demonstrating varying rates of progression of DAS286,7. These studies identified distinct groups of disease progression, either after initiation of treatment or simply through observation. They also state that there were no distinguishing factors available at baseline which were able to distinguish between the groups, aside from higher disease activity.

One explanation of variation in drug response and disease progression may be linked to the underlying endotypic profile of the patients being treated8,9,10,11. An endotype is a specific subtype of a condition identified that explains the observable properties of a phenotype. Endotypes define subgroups or phenotypes based on specific cells or molecules levels in blood or other fluids, and are believed to be more specific, more accurate, way of defining subgroups12. It has previously been postulated that RA may consist of several endotypes9,10, with distinctive underlying pathological mechanisms. This in turn may alter not only the manifestation of the disease in its physical form, but also drug response, and rate of progression. Using synovial biomarkers, molecular and cellular phenotypes were identified, which were closely associated with clinical outcomes to therapies targeting different biological pathways10. Further investigation into the mechanisms which may influence treatment response and disease progression may pave the way for better tools for precision medicine and better patient care3.

The aim of this study was specifically to identify heterogeneity in response to Tocilizumab, an anti-IL6 receptor treatment, in RA patients with moderate to severe active disease, and to identify factors which may describe the heterogeneity using serological biomarkers as well as patient demographics. Through a data-driven approach, we investigated disease heterogeneity using statistical learning methods.

Methods

Patient data

Longitudinal patient data for 485 RA patients receiving 4 or 8 mg/kg Tocilizumab in combination with MTX with a follow-up of up to 52 weeks were extracted from a biomarker sub-study13 of the three arm, phase III clinical trial LITHE dataset (NCT00106535)14. Patients receiving placebo where not included in this analysis. This cohort is comprised of patients with moderate to severe disease activity and inadequate response to DMARDs14. For this study, patients were selected based on a few main selection criteria which can be seen in Fig. 1. This was a post hoc and retrospective analysis of generalized data available from the clinical trials. For historical information on these clinical trials please refer to following original references14,15. Design and statistical plan was approved by the IRB. No new data was generated or attempted to be generated as course of the analysis. No new data was provided back to the original sponsor or PI of the project. Identifying participant information was not made available to the authors.

Figure 1
figure 1

Patient selection flow diagram. Flow diagram of selection of patients and data for latent class mixture modelling to identify distinct trajectories of treatment response.

Demographic data (age, gender, race, disease duration, BMI) and clinical data such as previous oral steroid usage was collected at baseline (BL), upon enrolment of the study. Baseline and follow-up clinical measurements (e.g. Tender Joint Count (TJC), Swollen Joint Count (SJC), VAS pain, Health Assessment Questionnaire (HAQ), DAS28, modified Sharp Score (mTSS), etc.) were recorded every 4 weeks after baseline for 52 weeks, totalling 14 timepoints. For patients receiving escape therapy, all timepoints after initiation of escape therapy were removed (n = 98).

Endpoints were measured according to ACR and EULAR criteria, and including radiographic measurements, DAS28 disease activity scores as well as patient and physician reported outcomes.

Latent class and statistical analysis

For this study, the lcmm package (version 1.7.9)16 in R was used. Different numbers of latent classes were tested ranging from 1 to 5, with the optimal number of classes chosen based on the lowest Bayesian Information Criteria and Akaike Information Criteria. Both linear and non-linear link functions were tested, with 3–5 I-splines with equidistant nodes being investigated. The number of classes chosen was also based on the percentage of the total patient cohort in each class being above 10% to avoid under-populated groups, and the posterior probability being over 0.7, ensuring a high degree of confidence in class membership. These heuristics can be seen in Table 1. Ultimately, three latent classes with 3 knots were chosen since metrics used were extremely close, and more differentiation was sought. Treatment dose and baseline oral steroid use were used as fixed effects in the model. Other variables such as age and sex were omitted due to low statistical importance. The nature of mixed effects models inherently allows for missing data meaning no imputation was performed.

Table 1 Parameter selection criteria for latent class modelling.

In each distinct trajectory group identified, baseline clinical measurements were then used in order to characterise patients by serological biomarker levels, and clinical traits. Follow-up measurements were also used to describe later timepoints of each groups response to treatment including Disease Activity Score 28. Chi-squared tests were used for non-numeric values whilst ANOVA was used for numeric measurements. In the case data was missing for a patient, statistics were computed omitting these patients, with n reported in Table 2. ACR and EULAR response rates were then compared in each group at 20%, 50% and 70% improvement levels independently using a Chi-squared test.

Table 2 Patient characteristics for the whole study population and identified latent classes.

Longitudinal biomarker data for each patient were collected and centred to baseline levels. Linear mixed effects models were used to model change in serological biomarker levels from baseline between each of the latent classes identified. Fixed effects used were time (weeks) and latent class, and patient used as random effect. A likelihood ratio test was then used for each biomarker to identify the statistical significance of including the class variable compared to time alone.

All analysis was computed using R version 3.4.1. Mixed effects models were implemented using the lme4 package17.

Consent for publication

Not applicable.

Results

A total of 485 patients were included for latent class trajectory analysis selected from a biomarker sub-study of the LITHE clinical trial of Tocilizumab. Time points were removed for patients after they escaped the trial or switched to another therapeutic agent. This resulted in 4–14 observations per patient, over a 52-week period.

Latent class trajectory analysis identified three distinct subgroups of patients following different trajectories of drug response. All groups started with a high DAS28, with an average 6.5 ± 0.9. On average, all three classes demonstrated a decline in disease activity, however at significantly different rates and level of change (Fig. 2) corrected for treatment dose and baseline oral steroid use.

Figure 2
figure 2

Latent trajectories of response to Tocilizumab measured by DAS28. (A) All patient trajectories of DAS28 in response to Tocilizumab anti-IL6 treatment, with three identified latent trajectories of DAS28 in response to Tocilizumab over a 52-week clinical trial (B) mean trajectory for each identified latent class from baseline to week 52.

Class 1, (17.5%, 85 patients) showed the smallest decrease in DAS28 over the 52-week period, with patients remaining above moderate disease activity status (3.2 < DAS28 < 5.1) on average after 52 weeks (DAS28 = 4.6 ± 1.1). Class 2, the largest group (69.7%, 338 patients) showed a greater decrease in the disease activity score over the course of the trial. These patients, on average reached low disease activity (2.6 < DAS28 < 3.2) but did not reach remission status after 52 weeks (DAS28 = 3.0 ± 1.1). Class 3, the smallest group (12.8%, 62 patients) showed the largest improvement in disease activity, with all patients achieving remission status after 52 weeks (DAS28 = 1.45 ± 0.6). The mean posterior probabilities of these three classes were 0.71, 0.74 and 0.80, respectively.

Latent class characteristics

At baseline, age, gender, BMI, duration of disease, baseline steroid use and number of previous DMARDS at baseline were not significantly different (p > 0.05, Table 2). No differences in clinical markers of disease activity or severity, VAS Pain and radiographic Sharp score were seen at baseline either.

DAS28 showed a trend to differ at baseline with class 1, the poor responding group having higher score than the other two groups. This was likely driven by differences in CRP and TJC which were elevated in groups 3 and 1, respectively.

Whilst CRP was significantly different between groups (p < 0.05), other biomarkers showed little or no trend at baseline (Table 2).

The number of responders in each group according to ACR criteria was also significantly different between the three groups at week 24 and week 52, according to 20%, 50% and 70% improvement from baseline. Class 1 had significantly lower proportion of patients responding at ACR20 criteria at all time points compared to both class 2 and 3 (p < 0.001, Table 3). Class 3 also had significantly more patients responding at ACR70 criteria (p < 0.001, Table 3).

Table 3 Response criteria of each latent class identified.

Biomarker dynamics

Serological biomarkers were chosen for the pathological mechanism or tissue they represent, central to RA; PINP, CTX-I, ICTP and OC (bone and cartilage), and MMP3, CRP, C1M and VICM (inflammation). Linear mixed effects modelling revealed biomarker change trajectories over five time points for each of the biochemical markers.

When looking at absolute change in biomarker levels from baseline, there are some differences between the three groups which could be observed. Whilst not statistically significant, markers of bone formation, PINP and OC increase more in class 3 than in class 1, which showed little sign of elevation for the first 16 weeks (Fig. 3). Patients in class 3 also demonstrate a more rapid decline in OC and ICTP than those in class 1 (Fig. 3).

Figure 3
figure 3

Change in biomarker levels for each latent class. Estimated means of percentage change in biomarker for PINP, CTX-I, OC and ICTP (bone), and MMP3, CRP, C1M and VICM (inflammation).

Whilst markers of bone were not significantly different between classes, change in levels of MMP3 and CRP from baseline were different between classes (p < 0.001 and p = 0.03 respectively) with class 3 being much more greatly reduced (Fig. 3). VICM levels in classes 2 and 3 followed a similar path to that of C1M, whilst class 1 showed more steady decline, although all groups showed large confidence intervals.

Discussion

The aims of this study were to identify distinct trajectories of treatment response, and to characterise these groups by clinical and longitudinal biochemical profiles. The overreaching goal of these analyses was to gain better understanding of the dynamics of response over time to highlight that different responder endotypes exist.

We identified three classes of drug response, class one, moderate responders with sustained high levels of disease activity, class 2, also moderate responders to therapy, achieving low levels of disease activity, and class 3, adequate responders, achieving remission status on average. Class three also had significantly higher proportions of patients achieving ACR drug response (20%, 50% and 70%) as well as fewer patients having to receive escape therapy. Class 2 fit closely to the median of the data set, whilst classes 1 and 3 were very much at the extremes.

As shown by other authors, response to treatment is not a linear process, and is in fact highly heterogenous7,18. This gives an indication that response to treatment cannot be treated unilaterally across all patients and thus must be treated as a heterogenous process, either through the identification of temporal phenotypes as we have shown here or through sub-classification of patients into disease subtypes.

These differences in the latent classes is partially explained by their clinical and serological markers. Whilst baseline demographics did not differ between the groups, DAS28 and HAQ score were elevated in class 1 and 3 compared to 2, whilst TJC was elevated in class 1 compared to 2 and 3.

We showed that the biomarker dynamics differ significantly between the three trajectories. This gives an indication that the response profiles identified are responding differently based on molecular endotype, or indeed that their molecular signature evolves over time. There was no difference in bone marker changes between the response classes (PINP, OC, CTX-I, and ICTP, although PINP seem to increase the most in class 1. Previous studies have shown that the bone resorption markers CTX-I and ICTP are suppressed by Tocilizumab, whereas neither of the bone formation markers, PINP and OC, were13,19. This may indicate that response measured by DAS trajectories are disassociated from drug induced changes to bone turnover however, that bone resorption itself is associated with IL-6 receptor modulation. In contrast, the tissue inflammation markers MMP3, C1M, as well as CRP, were more suppressed in class 3 as compared to class 1 or 2. The level of suppression of MMP3 and CRP for class 1 and 2 was the same, whereas there was a trend for class 2 compared to class 1 to be more suppressed in C1M. These data indicate that there is a relationship between the turnover of connective tissue collagens, measured by C1M20, and the DAS28 response trajectories. Previous studies have shown linear correlation between these markers and disease activity measures, as well as with drug dose response13,21.

Whilst there were no clear predictive biomarkers indicating membership of a given trajectory, changes in biomarker levels indicate a difference in molecular response between patients. This is true of bone biomarkers as well as inflammatory markers. Interestingly, earlier analysis of the current data set has shown that a combination of bone and connective markers was predictive of radiographic progression more so than any of the disease activity measures—again supporting the missing link between structural burden and progression and changes in DAS or other disease activity measures22,23. Further research into the clinical and molecular differences between patients may allow for more targeted treatment strategies. Comorbidities and multi-morbidities play a large role in defining the underlying biology of a patient, which can have profound effects on treatment response24,25. Adopting a more holistic approach, incorporating multiple data sources may allow for a better understanding of disease aetiology. By tailoring clinical trial design to allow for discrepancies in patient endotypes, one may expect higher response rates and fewer adverse events.

Strengths and limitations

This study was strengthened by the availability of serological biomarkers of tissue turnover, made available at multiple time points. This allowed a more in-depth analysis of each of the trajectories identified. Unfortunately, other variables such as ACPA and RF positivity were not available which may have given more insight in to the response trajectories identified and the underlining phenotype of the patients26,27. Since the data used in this study was from a randomised controlled trial, many conditions may not reflect those seen in the clinic. For that reason, it may be difficult to translate any of the findings directly to clinical use.

The use of DAS28 to profile patient’s response demonstrated the heterogeneity in a clinical trial population with varying degrees of response. This score incorporates subjective measures of pain and swelling which may increase a large degree of noise into the data. By using non-subjective measures of drug response such as imaging biomarkers, molecular disease scoring systems or radiographic scores, a truer picture of physical drug response may be obtained.

Conclusion

Statistical learning allowed the identification of distinct trajectories of patient response to anti-IL6 treatments. Through the identification of more homogenous populations of drug response, more targeted therapeutic treatment regimens may be achieved. Through the incorporation of clinical and biochemical disease parameters, as well as initial disease severity, it maybe possible to improve on current response rates.