Identification of heterogenous treatment response trajectories to anti-IL6 receptor treatment in rheumatoid arthritis

Rheumatoid arthritis (RA) is a chronic inflammatory disease with fluctuating course of progression. Despite substantial improvement in treatments in recent years, treatment response is still not guaranteed. The aim of this study was to identify variation in Disease Activity Score 28 (DAS28) of RA patients in response to Tocilizumab, and to investigate both molecular and clinical factors influencing response. Clinical and biochemical data for 485 RA patients receiving Tocilizumab in combination with methotrexate were extracted from the LITHE phase III clinical study (NCT00106535), and post-hoc analysis conducted. Latent class mixed models were used to identify statistically distinct trajectories of DAS28 after the initiation of treatment. Biomarker measurements were then analysed cross-sectionally and temporally, to characterise patients by serological biomarkers and clinical factors. We identified three distinct trajectories of drug response: class 1 (n = 85, 17.5%), class 2 (n = 338, 69.7%) and class 3 (n = 62, 12.8%). All groups started with high DAS28 on average (DAS28 > 5.1). Class 1 showed the least reduction in DAS28, with significantly more patients seeking escape therapy (p < 0.001). Class 3 showed significantly higher rates of improvement in DAS28, with 58.1% achieving ACR response levels compared to 2.4% in class 1 (p < 0.0001). Biomarkers of inflammation, MMP-3, CRP, C1M, showed greater reduction in class 3 compared to the other classes. Identification of more homogenous patient sub-populations of drug response may allow for more targeted therapeutic treatment regimens and a better understanding of disease aetiology.


Methods
Patient data. Longitudinal patient data for 485 RA patients receiving 4 or 8 mg/kg Tocilizumab in combination with MTX with a follow-up of up to 52 weeks were extracted from a biomarker sub-study 13 of the three arm, phase III clinical trial LITHE dataset (NCT00106535) 14 . Patients receiving placebo where not included in this analysis. This cohort is comprised of patients with moderate to severe disease activity and inadequate response to DMARDs 14 . For this study, patients were selected based on a few main selection criteria which can be seen in Fig. 1. This was a post hoc and retrospective analysis of generalized data available from the clinical trials. For historical information on these clinical trials please refer to following original references 14,15 . Design and statistical plan was approved by the IRB. No new data was generated or attempted to be generated as course of the analysis. No new data was provided back to the original sponsor or PI of the project. Identifying participant information was not made available to the authors.
Demographic data (age, gender, race, disease duration, BMI) and clinical data such as previous oral steroid usage was collected at baseline (BL), upon enrolment of the study. Baseline and follow-up clinical measurements (e.g. Tender Joint Count (TJC), Swollen Joint Count (SJC), VAS pain, Health Assessment Questionnaire (HAQ), DAS28, modified Sharp Score (mTSS), etc.) were recorded every 4 weeks after baseline for 52 weeks, totalling 14 timepoints. For patients receiving escape therapy, all timepoints after initiation of escape therapy were removed (n = 98).
Endpoints were measured according to ACR and EULAR criteria, and including radiographic measurements, DAS28 disease activity scores as well as patient and physician reported outcomes.

Latent class and statistical analysis.
For this study, the lcmm package (version 1.7.9) 16 in R was used.
Different numbers of latent classes were tested ranging from 1 to 5, with the optimal number of classes chosen based on the lowest Bayesian Information Criteria and Akaike Information Criteria. Both linear and non-linear link functions were tested, with 3-5 I-splines with equidistant nodes being investigated. The number of classes chosen was also based on the percentage of the total patient cohort in each class being above 10% to avoid underpopulated groups, and the posterior probability being over 0.7, ensuring a high degree of confidence in class membership. These heuristics can be seen in Table 1. Ultimately, three latent classes with 3 knots were chosen since metrics used were extremely close, and more differentiation was sought. Treatment dose and baseline oral steroid use were used as fixed effects in the model. Other variables such as age and sex were omitted due to low statistical importance. The nature of mixed effects models inherently allows for missing data meaning no imputation was performed.
In each distinct trajectory group identified, baseline clinical measurements were then used in order to characterise patients by serological biomarker levels, and clinical traits. Follow-up measurements were also used to describe later timepoints of each groups response to treatment including Disease Activity Score 28. Chi-squared tests were used for non-numeric values whilst ANOVA was used for numeric measurements. In the case data was missing for a patient, statistics were computed omitting these patients, with n reported in Table 2. ACR and EULAR response rates were then compared in each group at 20%, 50% and 70% improvement levels independently using a Chi-squared test.
Longitudinal biomarker data for each patient were collected and centred to baseline levels. Linear mixed effects models were used to model change in serological biomarker levels from baseline between each of the Scientific RepoRtS | (2020) 10:13975 | https://doi.org/10.1038/s41598-020-70942-x www.nature.com/scientificreports/ latent classes identified. Fixed effects used were time (weeks) and latent class, and patient used as random effect. A likelihood ratio test was then used for each biomarker to identify the statistical significance of including the class variable compared to time alone. All analysis was computed using R version 3.4.1. Mixed effects models were implemented using the lme4 package 17 .

Results
A total of 485 patients were included for latent class trajectory analysis selected from a biomarker sub-study of the LITHE clinical trial of Tocilizumab. Time points were removed for patients after they escaped the trial or switched to another therapeutic agent. This resulted in 4-14 observations per patient, over a 52-week period.
Latent class trajectory analysis identified three distinct subgroups of patients following different trajectories of drug response. All groups started with a high DAS28, with an average 6.5 ± 0.9. On average, all three classes demonstrated a decline in disease activity, however at significantly different rates and level of change (Fig. 2) corrected for treatment dose and baseline oral steroid use.
Class 1, (17.5%, 85 patients) showed the smallest decrease in DAS28 over the 52-week period, with patients remaining above moderate disease activity status (3.2 < DAS28 < 5.1) on average after 52 weeks (DAS28 = 4.6 ± 1.1). Class 2, the largest group (69.7%, 338 patients) showed a greater decrease in the disease activity score over the course of the trial. These patients, on average reached low disease activity (2.6 < DAS28 < 3.2) but did not reach remission status after 52 weeks (DAS28 = 3.0 ± 1.1). Class 3, the smallest group (12.8%, 62 patients)  Table 2). No differences in clinical markers of disease activity or severity, VAS Pain and radiographic Sharp score were seen at baseline either. DAS28 showed a trend to differ at baseline with class 1, the poor responding group having higher score than the other two groups. This was likely driven by differences in CRP and TJC which were elevated in groups 3 and 1, respectively.
Whilst CRP was significantly different between groups (p < 0.05), other biomarkers showed little or no trend at baseline ( Table 2).
The number of responders in each group according to ACR criteria was also significantly different between the three groups at week 24 and week 52, according to 20%, 50% and 70% improvement from baseline. Class 1 had significantly lower proportion of patients responding at ACR20 criteria at all time points compared to both class 2 and 3 (p < 0.001, Table 3). Class 3 also had significantly more patients responding at ACR70 criteria (p < 0.001, Table 3).
Biomarker dynamics. Serological biomarkers were chosen for the pathological mechanism or tissue they represent, central to RA; PINP, CTX-I, ICTP and OC (bone and cartilage), and MMP3, CRP, C1M and VICM (inflammation). Linear mixed effects modelling revealed biomarker change trajectories over five time points for each of the biochemical markers.
When looking at absolute change in biomarker levels from baseline, there are some differences between the three groups which could be observed. Whilst not statistically significant, markers of bone formation, PINP and OC increase more in class 3 than in class 1, which showed little sign of elevation for the first 16 weeks (Fig. 3). Patients in class 3 also demonstrate a more rapid decline in OC and ICTP than those in class 1 (Fig. 3).
Whilst markers of bone were not significantly different between classes, change in levels of MMP3 and CRP from baseline were different between classes (p < 0.001 and p = 0.03 respectively) with class 3 being much more greatly reduced (Fig. 3). VICM levels in classes 2 and 3 followed a similar path to that of C1M, whilst class 1 showed more steady decline, although all groups showed large confidence intervals.

Discussion
The aims of this study were to identify distinct trajectories of treatment response, and to characterise these groups by clinical and longitudinal biochemical profiles. The overreaching goal of these analyses was to gain better understanding of the dynamics of response over time to highlight that different responder endotypes exist.
We identified three classes of drug response, class one, moderate responders with sustained high levels of disease activity, class 2, also moderate responders to therapy, achieving low levels of disease activity, and class 3, adequate responders, achieving remission status on average. Class three also had significantly higher proportions of patients achieving ACR drug response (20%, 50% and 70%) as well as fewer patients having to receive escape therapy. Class 2 fit closely to the median of the data set, whilst classes 1 and 3 were very much at the extremes.
As shown by other authors, response to treatment is not a linear process, and is in fact highly heterogenous 7,18 . This gives an indication that response to treatment cannot be treated unilaterally across all patients and thus Table 1. Parameter selection criteria for latent class modelling. Criteria for selection of number of groups and knots to be used in latent class analysis, in which BIC and AIC should be minimised, posterior probability of class membership (PP) should be > 0.7 for all classes, and % of total population should not be below 10%. www.nature.com/scientificreports/ must be treated as a heterogenous process, either through the identification of temporal phenotypes as we have shown here or through sub-classification of patients into disease subtypes. These differences in the latent classes is partially explained by their clinical and serological markers. Whilst baseline demographics did not differ between the groups, DAS28 and HAQ score were elevated in class 1 and 3 compared to 2, whilst TJC was elevated in class 1 compared to 2 and 3.
We showed that the biomarker dynamics differ significantly between the three trajectories. This gives an indication that the response profiles identified are responding differently based on molecular endotype, or indeed that their molecular signature evolves over time. There was no difference in bone marker changes between the response classes (PINP, OC, CTX-I, and ICTP, although PINP seem to increase the most in class 1. Previous studies have shown that the bone resorption markers CTX-I and ICTP are suppressed by Tocilizumab, whereas neither of the bone formation markers, PINP and OC, were 13,19 . This may indicate that response measured by DAS trajectories are disassociated from drug induced changes to bone turnover however, that bone resorption itself is associated with IL-6 receptor modulation. In contrast, the tissue inflammation markers MMP3, C1M, as well as CRP, were more suppressed in class 3 as compared to class 1 or 2. The level of suppression of MMP3 and CRP for class 1 and 2 was the same, whereas there was a trend for class 2 compared to class 1 to be more suppressed in C1M. These data indicate that there is a relationship between the turnover of connective tissue collagens, measured by C1M 20 , and the DAS28 response trajectories. Previous studies have shown linear correlation between these markers and disease activity measures, as well as with drug dose response 13,21 . Whilst there were no clear predictive biomarkers indicating membership of a given trajectory, changes in biomarker levels indicate a difference in molecular response between patients. This is true of bone biomarkers as well as inflammatory markers. Interestingly, earlier analysis of the current data set has shown that a combination Table 2. Patient characteristics for the whole study population and identified latent classes. Baseline characteristics of each latent class trajectory. Data presented as mean ± SD (n) or n (%) unless indicated otherwise. In the case data was missing, this was omitted from analysis.   Table 3. Response criteria of each latent class identified. *Patients with missing measurements due to lack of sample or receiving escape therapy did not have a change in DAS28 calculated and were therefore omitted from this analysis. Response criteria fulfilled by each latent class trajectory. Data presented as n (%) unless indicated otherwise. In the case data was missing, this was omitted from analysis. www.nature.com/scientificreports/ of bone and connective markers was predictive of radiographic progression more so than any of the disease activity measures-again supporting the missing link between structural burden and progression and changes in DAS or other disease activity measures 22,23 . Further research into the clinical and molecular differences between patients may allow for more targeted treatment strategies. Comorbidities and multi-morbidities play a large role in defining the underlying biology of a patient, which can have profound effects on treatment response 24,25 . Adopting a more holistic approach, incorporating multiple data sources may allow for a better understanding of disease aetiology. By tailoring clinical trial design to allow for discrepancies in patient endotypes, one may expect higher response rates and fewer adverse events. www.nature.com/scientificreports/ Strengths and limitations. This study was strengthened by the availability of serological biomarkers of tissue turnover, made available at multiple time points. This allowed a more in-depth analysis of each of the trajectories identified. Unfortunately, other variables such as ACPA and RF positivity were not available which may have given more insight in to the response trajectories identified and the underlining phenotype of the patients 26,27 . Since the data used in this study was from a randomised controlled trial, many conditions may not reflect those seen in the clinic. For that reason, it may be difficult to translate any of the findings directly to clinical use. The use of DAS28 to profile patient's response demonstrated the heterogeneity in a clinical trial population with varying degrees of response. This score incorporates subjective measures of pain and swelling which may increase a large degree of noise into the data. By using non-subjective measures of drug response such as imaging biomarkers, molecular disease scoring systems or radiographic scores, a truer picture of physical drug response may be obtained.

Conclusion
Statistical learning allowed the identification of distinct trajectories of patient response to anti-IL6 treatments. Through the identification of more homogenous populations of drug response, more targeted therapeutic treatment regimens may be achieved. Through the incorporation of clinical and biochemical disease parameters, as well as initial disease severity, it maybe possible to improve on current response rates.

Data availability
The datasets generated and/or analysed during the current study are not publicly available due to legal and ethical reasons but are available from the corresponding author on reasonable request.