Robenacoxib shows efficacy for the treatment of chronic degenerative joint disease-associated pain in cats: a randomized and blinded pilot clinical trial

The main objective of this pilot clinical trial was to evaluate outcome measures for the assessment of the nonsteroidal anti-inflammatory drug (NSAID) robenacoxib in cats with degenerative joint disease-associated pain (DJD-pain). Otherwise healthy cats (n = 109) with DJD-pain entered a parallel group, randomized, blinded clinical trial. Cats received placebo (P) or robenacoxib (R) for two consecutive 3-week periods. Treatment groups were PP, RR, and RP. Actimetry and owner-assessment data were collected. Data were analyzed using mixed-effects and generalized mixed-effects linear models. Activity data showed high within-cat and between-cat variability, and 82.4% of the values were zero. Compared to placebo, mean total activity was higher (5.7%) in robenacoxib-treated cats (p = 0.24); for the 80th percentile of activity, more robenacoxib-treated cats had a > 10% increase in activity after 3 (p = 0.046) and 6 weeks (p = 0.026). Robenacoxib treatment significantly decreased owner-assessed disability (p = 0.01; 49% reduction in disability; effect size ~ 0.3), and improved temperament (p = 0.0039) and happiness (p = 0.021) after 6 weeks. More robenacoxib-treated cats were successes at 6 weeks (p = 0.018; NNT: 3.8). Adverse effect frequencies were similar across groups. Results identified suitable endpoints for confirmatory studies, while also indicating efficacy of robenacoxib in cats with DJD-pain.

www.nature.com/scientificreports/
Across all methods of data partitioning, cats receiving robenacoxib showed greater increases in activity compared to placebo (Table 4). Analysis of entire-day non-zero activity and dusk-to-dawn activity showed greater, but non-significant, activity increases in robenacoxib-treated cats. Analysis of non-zero dusk-to-dawn values revealed significant increases of approximately 11% after both 3 (C1) and 6 (C2) weeks of treatment with robenacoxib (p = 0.045 and p = 0.040, respectively). No significant deterioration (decrease in activity, C3) was detected across any partitioning of the data.
Activity-within-cat analysis. Group RP contained 35 evaluable cats for within-cat analysis using cumulative distribution function (CDF) analysis (example of CDF analysis in Fig. 2). More cats showed increases in activity with robenacoxib compared to placebo for all cats and the three subgroups (Table 5), with p values ranging from 0.021 to 0.059. Effects were statistically significant for entire-day non-zero activity (10:2 cats [28.6%:5.7%], p = 0.021) and dusk-to-dawn total activity (9:2 cats [25.7%:5.7%], p = 0.035) (Table 5). These ratios corresponded to number needed to treat (NNT) values of 4.4 and 5.0 cats, respectively.
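The NNT values quoted above follow from the standard definition of NNT as the reciprocal of the difference in response proportions. The short Python sketch below is only an illustration of that arithmetic using the counts reported in the text; the function name is hypothetical and it is not the authors' SAS code.

```python
def number_needed_to_treat(n_higher_on_drug: int, n_higher_on_placebo: int, n_cats: int) -> float:
    """NNT = 1 / (difference in response proportions).

    Function name and argument layout are illustrative assumptions.
    """
    difference = (n_higher_on_drug - n_higher_on_placebo) / n_cats
    if difference <= 0:
        raise ValueError("NNT is only defined for a positive difference in response rates")
    return 1.0 / difference


# Counts reported in the text for the 35 evaluable RP cats
nnt_nonzero = number_needed_to_treat(10, 2, 35)      # entire-day non-zero activity
nnt_dusk_dawn = number_needed_to_treat(9, 2, 35)     # dusk-to-dawn total activity
print(round(nnt_nonzero, 1), round(nnt_dusk_dawn, 1))
```

With the reported counts this reproduces the values of approximately 4.4 and 5.0 given in the text.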
CSOM baseline and post-treatment score analysis allowed for calculation of the reduction in pain and disability by treatment group (e.g., RR vs PP). The baseline score allowed for a calculation of how much improvement there could be, and the mean change allowed calculation of the actual improvement as a percentage of what was possible. After 6 weeks of treatment, the improvement of 2.28 points in the PP group equated to a 30% reduction in disability, and the improvement of 3.53 points in the RR group equated to a 49% reduction in disability, an improvement with robenacoxib over placebo of 19 percentage points. The actual CSOM scores for each group at baseline, and at 3 and 6 weeks, and the change from baseline, are shown in Supplementary Table 1.
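The percentage calculation described above can be sketched as follows. This is a minimal illustration that assumes the maximum CSOM total of 12 described in the Methods; the baseline mean used in the example (4.4) is a hypothetical value chosen for illustration, not a figure reported in the text (the actual baseline scores are in Supplementary Table 1).

```python
CSOM_MAX_TOTAL = 12  # three activities, each scored 0-4 (see Methods)


def percent_of_possible_improvement(mean_change: float, baseline_mean: float,
                                    max_score: float = CSOM_MAX_TOTAL) -> float:
    """Express the mean improvement as a percentage of the improvement that was possible."""
    possible = max_score - baseline_mean
    if possible <= 0:
        raise ValueError("baseline is already at or above the maximum score")
    return 100.0 * mean_change / possible


# Hypothetical baseline mean of 4.4 (assumption for illustration only)
example = percent_of_possible_improvement(mean_change=2.28, baseline_mean=4.4)
print(round(example, 1))
```

Under that assumed baseline, a mean improvement of 2.28 points corresponds to roughly 30% of the possible improvement, which matches the order of magnitude reported for the PP group.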
FMPI, quality of life (QoL), temperament and happiness assessments. No significant treatment effects on FMPI scores (nominal or adjusted) were detected (Table 6). Analysis of temperament and happiness compared to before the most recent treatment showed significant improvement following 6 weeks of treatment with robenacoxib compared with placebo [odds ratio, OR, = 4.53 (p = 0.0039) and OR = 2.73 (p = 0.021), respectively], while all other comparisons (3 weeks for temperament and happiness; 3 and 6 weeks for QoL) failed to show significance (Table 7). The frequency distribution tables for temperament, happiness and QoL are shown in Supplementary Tables 2, 3 and 4, respectively.
Safety measures. Adverse events (AEs) were generally mild and self-limiting, typically involving the gastrointestinal tract (Table 8). None of the rates of occurrence of AEs were significantly different between cats receiving placebo or robenacoxib. The proportions of cats with pre-existing chronic kidney disease (CKD) experiencing at least one AE during the study were not significantly different between treatment groups (p = 0.88).
Table 1. Patient demographic data. Sex and breed are presented as count (percentage of treatment group total). Treatment Groups 1-3: lettering designates the medications the patient received during the three treatment periods. P placebo, R robenacoxib, n number of patients, SD standard deviation, CKD chronic kidney disease, IRIS International Renal Interest Society.
No clinically relevant hematological, chemistry, or urinalysis differences between groups were observed, although several statistically significant differences were observed between treatment groups at study exit. Select hematological, chemistry, and urinalysis data, including all statistically significant results, are available as Supplementary Tables 5, 6 and 7, respectively.

Discussion
The study achieved its main objective of identifying suitable outcome measures for testing the efficacy of the NSAID robenacoxib in cats with DJD-pain. Collectively, the data reported in this study also support the hypothesis that cats receiving robenacoxib would show increases in AM-measured physical activity when compared to placebo, and the hypothesis that robenacoxib treatment would be associated with owner-assessed decreases in pain and disability, and improvements in temperament and happiness. However, these improvements were only seen after 6 weeks of treatment, and not after 3 weeks of treatment. As has been reported previously for other NSAIDs 4,5, it was confirmed that between-cat comparison of total activity was not sufficiently discriminating between robenacoxib and placebo, but that analysis of higher levels of activity (either non-zero counts or a pre-specified high percentile of activity), dusk-to-dawn activity and/or within-cat analyses were more sensitive. In addition, owner subjective assessments of disability, temperament and happiness, but not the version of the FMPI employed in this study, appeared to detect treatment benefits. In our study, 82.4% of activity counts were zero, indicating no measurable activity. This information justified partitioning of the data to analyze the more active times. Analysis of partitioned data showed greater activity with robenacoxib versus placebo for dusk-to-dawn activity and non-zero activity. Recent data demonstrate that higher levels of activity are more impacted in cats with DJD-pain and NSAIDs appear to preferentially positively affect these higher levels, or latent states (manuscript in review). Consistent with these observations, in our study robenacoxib produced a significant increase in activity by ≥ 10% in the 80th percentile of activity values.
Table 3. Mean hourly activity data shown for the different analysis contrasts, presented as either arithmetic or percentage change from baseline. Estimate = change from baseline in mean hourly activity counts with robenacoxib relative to placebo. C1 contrast 1 = treatment group PP compared against groups RR and RP following 3 weeks of treatment, C2 contrast 2 = treatment group PP compared against group RR following 6 weeks of treatment, C3 contrast 3 = treatment group RP compared against RR for change in activity between weeks 3 and 6 of treatment, SE standard error, P placebo, R robenacoxib. *P values test whether the contrast is significantly different from zero.
Table 4. Analysis of mean hourly activity with ('all data') and without ('non-zero') minutes when activity was zero. Arithmetic change and relative change (percentage) for three contrasts that compare treatment groups. Results shown for non-zero activity across the whole day; all activity over the dusk-to-dawn time period; and non-zero activity over the dusk-to-dawn time period. Estimate = change from baseline in hourly activity counts with robenacoxib relative to placebo (positive values indicate greater activity when on robenacoxib). SE standard error. The contrasts were: C1 (contrast 1) = treatment group PP compared against groups RR and RP following 3 weeks of treatment; C2 (contrast 2) = treatment group PP compared against group RR following 6 weeks of treatment; C3 (contrast 3) = treatment group RP compared against RR for change in activity between weeks 3 and 6 of treatment. P = placebo; R = robenacoxib. *Significance at 0.05 level.
Between-cat variability is very high for activity 4,14 , supporting the suggestion that within-cat analysis of data may be superior to between-cat analysis (as used in our primary analysis). In our study, between-cat variability for activity was very high and within-cat variability was lower but still high. Due to the study design, within-cat analysis of our data was restricted to the analysis of the RP group of 35 cats, which sequentially received placebo (baseline), robenacoxib and then placebo. Use of a full cross-over design would allow investigators to minimize variability and take advantage of the increased power associated with each cat being compared to itself.
Significantly better outcomes with robenacoxib compared to placebo were obtained with three of the owner-based outcome measures: CSOM, temperament and happiness assessments. We failed to detect any significant differences between groups using the FMPI questionnaire. The FMPI consists of set questions, and overall, in its current form, it does not appear to be as sensitive to treatment effects as the individually tailored CSOM 10. The authors (BDXL, MEG) have recently been revising the FMPI. CSOM (actual scores and success-failure), temperament and happiness were significantly improved following 6 weeks of treatment with robenacoxib, but not after 3 weeks. This lag may be due to delays in owners noticing or learning to detect behavioral changes, time required for cats to un-learn their learned avoidance or fear of activities, or other unknown factors. The lag time may also be due to pharmacokinetic factors. However, this is unlikely because robenacoxib should achieve steady concentrations at sites of inflammation within a few days and exhibits no changes in pharmacodynamic action with time 19.
For activity, we observed increases of approximately 5% and 10% with robenacoxib over placebo for total and non-zero or dusk-to-dawn activity, respectively. These values compare favorably with the previously reported increase of 3.32% (non-significant) over placebo in total activity in cats with DJD receiving the NSAID meloxicam for three weeks 4. However, Gruen et al. 4 evaluated meloxicam administered at 0.035 mg/kg daily, which is less than the maintenance dose of 0.05 mg/kg daily approved in the EU. No studies examining activity changes in humans with OA who are administered analgesics have been performed to help put these changes into context. Overall, little is known about what constitutes a clinically relevant change in activity. Additionally, it is clear from these data that a relatively small proportion of cats significantly increased overall activity. Even if one accepts that 'activity' or 'movement' is improved in cats suffering chronic DJD-pain which are treated with an effective analgesic, much remains to be learned about how to analyze and interpret such data.
Figure 2. Example cumulative distribution function for a single cat (LAS-31), demonstrating a rightward shift of activity during treatment with robenacoxib, as compared against treatment with placebo. This indicates that the activity counts were higher while receiving robenacoxib. P values from the Kolmogorov-Smirnov test were < 0.0001 for the hypothesis that activity with robenacoxib > placebo, and 1.0 for placebo > robenacoxib. Analysis and graphical output performed using SAS software capabilities (Version 9, SAS Institute Inc., Cary, NC).
Table 5. Analysis of within-cat activity data (n = 35), showing the number and percentage of cats in which hourly activity counts were significantly higher during either placebo (P) or robenacoxib (R) treatment for each method of partitioning the data. NNT number needed to treat. *Significance at 0.05 level, n = number
Table 6. Analysis of owner-based assessments of client-specific outcome measures (CSOM) total score and Feline Musculoskeletal Pain Index (FMPI) score. Estimates show the effect of robenacoxib relative to placebo. SE standard error, C1 contrast 1 = treatment group PP compared against groups RR and RP following 3 weeks of treatment, C2 contrast 2 = treatment group PP compared against group RR following 6 weeks of treatment, C3 contrast 3 = treatment group RP compared against RR for change in activity between weeks 3 and 6 of treatment. P = placebo; R = robenacoxib. *Significance at 0.05 level. **Student's t statistic for testing whether the contrast equals zero; two-tailed test.
Table 7. Results of owner-based assessments of quality of life (QoL), temperament, and happiness. The odds ratios show the effect of robenacoxib relative to placebo. C1 contrast 1 = treatment group PP compared against groups RR and RP following 3 weeks of treatment, C2 contrast 2 = treatment group PP compared against group RR following 6 weeks of treatment, C3 contrast 3 = treatment group RP compared against RR for change in activity between weeks 3 and 6 of treatment. The tests are effectively comparing the categorical response probability distributions. An odds ratio greater than 1 indicates a higher likelihood of better outcomes for the robenacoxib treatment. P = placebo; R = robenacoxib. *Significance at 0.05 level.

Table 5 column headings: Group; n (% of total) of cats with activity R > P; n (% of total) of cats with activity P > R; P value; Difference in response rates (R − P); NNT.
In humans, reductions of pain of 15%, 33%, and 50% were reported to correlate with the minimal clinically important difference (MCID), 'much better' improvement 21 , and 'very much improved' 22 , respectively. The MCID of veterinary species is unknown, but our data show a 49% reduction in disability (CSOM) after 6 weeks of robenacoxib treatment which would equate to 'very much improved' in human medicine. Standardized effect sizes are another accepted way of comparing efficacy of treatments across studies. Baseline-adjusted CSOM scores indicated an effect size (ES) (for treatment over placebo) of 0.26 when comparing groups PP and RR following 6 weeks of treatment. While no ES data for any outcome measures are available for dogs or cats with DJD treated with NSAIDs, the ES for efficacy of NSAIDs in hip and knee OA in humans is approximately 0.3 when based on high-quality trials 23,24 . Another method used to compare the efficacy of treatments across studies is number needed to treat (NNT). Based on the within-cat analysis of change in activity while on placebo and robenacoxib, the NNT values compare favorably with those for NSAIDs used in humans to treat chronic pain (between 3 and 13 depending on the criteria for success) 25 . In contrast to NNT for chronic pain, the NNTs for NSAID alleviation of acute pain tend to be lower 26 .
The current data did not support the hypothesis that the study would detect a deterioration in outcome measures following discontinuation of robenacoxib, as compared against cats which continued to receive the medication. While a single previous study has reported detection of deterioration after withdrawal of the NSAID meloxicam 6, there are no published reports of this approach having been replicated. Our study may have been insufficiently powered to detect deterioration, and deterioration was only assessed after 3 weeks of treatment; a greater effect might be detected after longer treatment.
Our CSOM data confirm previous reports of a significant caregiver placebo effect in feline chronic pain studies. Placebo effect sizes were calculated using the following equation (Cohen's d for the placebo group alone):
$$ d = \frac{\text{MeanScore}_{\text{Placebo}} - \text{MeanScore}_{\text{baseline}}}{\text{Pooled standard deviation}} $$
The resulting values were 1.40 and 1.68 following 3 and 6 weeks of administration of placebo, respectively. These values are at the top end of the placebo effect sizes reported by Gruen et al. 3, and research is needed to understand what drives this placebo effect, and how to mitigate or control it in clinical studies. High placebo effects make it difficult to detect positive treatment effects.
The safety data we report corroborate the previously published clinical safety of robenacoxib in OA-affected cats (with and without CKD) 27. Most AEs were self-limiting and did not require medical intervention. Furthermore, cats with International Renal Interest Society (IRIS) stage 1 or 2 CKD were no more likely to experience an AE, which is important given the high prevalence of both CKD and DJD in cats 28. However, the study design was optimized for efficacy rather than safety assessment.
This study has several limitations. There is no easy definitive way to diagnose DJD-pain in cats, so the approach to diagnosis was based on what has been established through clinical research and published in the literature, and used a combination of owner assessment and veterinarian assessment, as well as radiography. The current study did not attempt to grade DJD, or grade the impact of DJD-pain. The disease of DJD could be graded by assigning severity scores to radiographs; grading the impact of DJD-pain on the whole individual has not been described in veterinary medicine other than the use of owner assessments.
The observed high between-cat variability relative to sample size is likely responsible for the lack of statistical significance in several comparisons. The within-cat analysis of activity data was based on CDFs of a single treatment arm (RP) rather than a full crossover design, and the data partitions used may not be equally applicable to all cats. For example, dusk-to-dawn was predefined as 20:00 to 08:00 and individual cats differ from each other in their bimodal activity distribution over the 24-h period of a day. Nonetheless, the current approach enabled us to evaluate each cat in a binary sense as having or not having a significant improvement under robenacoxib compared to placebo. However, all of these cats received robenacoxib and then placebo, and the carry-over effect of robenacoxib on the placebo phase is unknown. The outcomes of QoL, temperament and happiness have not been previously reported, and the questionnaires we used have not been validated either as measures of these factors, or for responsiveness validity. No attempt was made to explain the terms "QoL", "happiness" or "temperament" to owners; rather, each owner was allowed to interpret the terms themselves. These assessments were included because of our (BDXL, MEG) increasing experience and belief that many dimensions are impacted by pain 10, and objective accelerometry and clinical metrology instruments based on activity and mobility (e.g. CSOM, FMPI) do not capture all these dimensions, especially the affective dimensions. Previous research has identified some of the aspects that owners consider important to QoL 29, confirming that non-active aspects are important to owners. Further research should investigate what the terms "happiness" and "temperament" actually mean to owners.
Regardless, the use of these novel assessments revealed positive treatment effects in the current blinded, placebo-controlled study.
Our findings may not be generalizable to the entire population of cats with OA/DJD-pain. The majority of enrolled cats were from a single study site, presenting a geographical bias. Furthermore, while some comorbidities were allowed by the inclusion/exclusion criteria, cats were required to be overall healthy, meaning findings may be different in the general population of cats with mobility impairment. Lastly, inclusion criteria required that cats be moderately to severely impaired, meaning that results may be different in cats with milder impairment.
In conclusion, results of this pilot study identified suitable outcome measures and approaches to data partition for testing the efficacy of the NSAID robenacoxib in cats with DJD-associated pain in a subsequent confirmatory study. Additionally, overall, we detected significant robenacoxib-associated improvements in both objective and subjective outcome measures following 6 weeks of treatment. However, many data were not significant, and activity data were only significant for particular time periods of the day. No deterioration following masked discontinuation of robenacoxib was detected, possibly reflecting the lack of a significant effect at 3 weeks. While significant effects on patient temperament and happiness were detected, these measures have not been previously validated, and no explanations were given to owners as to what these terms meant. Notwithstanding these comments, the level of evidence for the efficacy of robenacoxib derived from this study is, in our view, at least as good as that published for other NSAIDs in cats with DJD-associated pain.

Materials and methods
All procedures performed in this study were approved by the relevant North Carolina State University Institutional Committees. All methods were carried out in accordance with relevant guidelines and regulations. This study was approved by the Animal Care and Use Committees at North Carolina State University College of Veterinary Medicine (IACUC protocol 14-009-O), University of Georgia College of Veterinary Medicine (IACUC protocol CR-447) and Novartis Animal Health. Written owner consent was provided for each case before pre-enrollment after verbal discussion of the study. This manuscript was prepared after consultation of the CONSORT checklist for reporting of parallel-group randomized trials 30.
Study design. This study was conducted in compliance with Good Clinical Practices and was a double-blind, placebo-controlled, randomized study with 3 parallel arms (groups) (Table 9). All three groups had an open (unblinded) baseline (BL) period, and then two (blinded) treatment periods of three weeks each (Placebo-Placebo (PP); Robenacoxib-Robenacoxib (RR); Robenacoxib-Placebo (RP)) (Fig. 3). Study days were defined in relation to the first day of blinded treatment (designated Day 0), with Day −14, Day 0, Day 21 (3 weeks), and Day 42 (6 weeks) involving site visits by the owner and cat, except for Day 21 when just the owner visited (Fig. 3). A minimum sample size of 20 cats per group was estimated based on previous work with accelerometers 5. Using previous data from an NSAID study (a treatment-placebo group difference in hourly activity of 75.2 and an SD of 92.6), the power was calculated to be 80% with group sizes of 25 cats per group and 90% with 33 cats per group. Cats were randomized in a 1:1:1 ratio, via permuted block randomization with block size of 3, according to pre-determined randomization tables (generated by the original study statistician using SAS (Version 9, SAS Institute Inc., Cary, NC)) for each site. Medication dispensing was performed by pharmacy personnel not involved in patient assessment or data collection. All people involved in the study were blinded to the treatments until after the database was locked, with the exception of one sponsor representative (SBK).
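The power figures quoted above can be roughly checked with a normal approximation to the two-sample comparison of means. The sketch below is an illustration only: it assumes a two-sided test at the 0.05 level and equal group sizes, and it is not the authors' actual calculation (which was presumably t-based and may differ slightly). The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(delta: float, sd: float, n_per_group: int,
                            alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-sided, two-sample comparison of means.

    Ignores the negligible probability of rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standardized difference scaled by the standard error of a two-group contrast
    noncentrality = (delta / sd) * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit)


power_25 = approx_power_two_groups(delta=75.2, sd=92.6, n_per_group=25)
power_33 = approx_power_two_groups(delta=75.2, sd=92.6, n_per_group=33)
print(f"{power_25:.2f} {power_33:.2f}")
```

Because the normal approximation is slightly optimistic relative to a t-based calculation, the values it returns sit a little above the 80% and 90% targets stated in the text.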
Cats were client-owned with naturally occurring DJD-associated pain and owner-assessed mobility impairment. Cats remained in the care of their owners at home throughout the study except for visits to the clinic. Subjects were recruited using advertising to owners and veterinarians. All study-related costs, including to the owners, were covered by the sponsor, as were recruitment incentives. Patient screening and data collection were performed at both the North Carolina State and University of Georgia study sites. Cats were screened for eligibility and enrolled in a similar manner to previous DJD-associated pain studies in cats performed by the authors 4. On the day of screening (Day −14), cats underwent physical, orthopedic, and neurological exams. Blood and urine samples were obtained for hematology, serum biochemistry, urinalysis with sedimentation, and serum T4 analysis at an external laboratory (Antech Diagnostics, Southaven, MS). Complete axial and appendicular orthogonal radiographs were obtained under sedation and were reviewed by a board-certified veterinary radiologist and the lead investigator (BDXL).
Inclusion/exclusion criteria. Cats were required to have at least moderate owner-assessed mobility or activity impairment (CSOM < 6, see below), evidence of pain during orthopedic evaluation of at least two joints or spinal segments, and radiographic changes associated with DJD in at least two of the joints or spinal segments identified as painful. The same investigator at each site performed orthopedic pain assessments. Cats were also required to be at least 1 year of age, between 2.5 and 12.0 kg in weight (to allow dosing with available tablets), and to be generally healthy and not currently receiving analgesic or anti-inflammatory medications (including potential analgesics). Cats with controlled diabetes, hyperthyroidism, or stable CKD (International Renal Interest Society (IRIS) stages 1 and 2) were allowed to participate. Cats were required to be primarily indoor only to avoid potential loss of the activity monitor (AM).
Cats meeting these criteria were pre-enrolled and entered an approximately 2-week acclimation/baseline period. Activity measurements over BL provided the baseline AM data, although only the 7 days prior to Day 0 were used in efficacy analyses in order to try to avoid the confounding effects of the screening visit, sedation medications, and acclimation to the collar or harness. This BL period, during which known placebo was administered, also served as an additional screening step to determine an owner's ability to administer medication (unmasked placebo, see below) and keep daily records of dosing and patient behaviors.
Following BL, cats were randomized and entered into the blinded portion of the study (by DA or SB), and allocated to one of three treatment groups (Table 9) to receive a daily minimum oral dosage of 1 mg/kg (range 1-2.4 mg/kg) of robenacoxib (the commercially available formulation (Onsior) supplied as 6 mg tablets) or an equivalent number of placebo tablets. The robenacoxib and placebo tablets appeared identical and were supplied in packaging identical to that of the commercially available formulation (Onsior), in aluminum blister pack cards of 6 tablets each. Owners were instructed to administer the tablet(s) directly into the cat's mouth or mixed with a small amount of food (one third or less of the daily food ration).
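The stated dose range can be related to the tablet strength and the permitted body weights with a short calculation. The sketch below assumes that each cat received the smallest whole number of 6 mg tablets that met the 1 mg/kg minimum; the text does not give the actual dosing table, so this rule and the function names are assumptions for illustration.

```python
import math

TABLET_MG = 6.0        # tablet strength stated in the text
MIN_DOSE_MG_KG = 1.0   # minimum daily dosage stated in the text


def tablets_for_weight(weight_kg: float) -> int:
    """Smallest whole number of tablets meeting the minimum dose (assumed rule)."""
    return math.ceil(weight_kg * MIN_DOSE_MG_KG / TABLET_MG)


def achieved_dose_mg_per_kg(weight_kg: float) -> float:
    """Dose actually delivered once the tablet count is rounded up."""
    return tablets_for_weight(weight_kg) * TABLET_MG / weight_kg


for weight in (2.5, 6.0, 7.0, 12.0):
    print(weight, tablets_for_weight(weight), round(achieved_dose_mg_per_kg(weight), 2))
```

Under this assumed rule, the lightest eligible cat (2.5 kg) receives one tablet at 2.4 mg/kg and the heaviest (12.0 kg) receives two tablets at 1 mg/kg, which is consistent with the 1-2.4 mg/kg range given in the text.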

Outcome measures.
Although this was a pilot study, primary and secondary outcome measures were predefined in the protocol.
Primary outcome measure. The primary outcome measure was the change from baseline in mean hourly activity as measured by the AMs. Physical activity was measured on a per-minute basis using the Actical AM device. Devices were configured as previously described 4 , with the epoch set at 1 min. The AM device was placed on a non-breakaway collar and placed upright on the patient's ventral neck.
Activity values within each hour were summed for use in statistical analysis, and mean hourly activity levels were computed for each cat over the last seven days of BL ("Day 0 mean") and over the first 20 days of the T1 and T2 periods separately ("Day 21 mean" and "Day 42 mean", respectively). Days 0, 21 and 42 were 'travel' days for the cat and were not included in the analyses.
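The aggregation described above (per-minute counts summed within each hour, then averaged over a window of days) can be sketched as follows. This is a minimal illustration on synthetic data; the record layout of (timestamp, count) pairs and the function names are assumptions, not the format exported by the device.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_sums(minute_records):
    """Sum per-minute activity counts within each clock hour."""
    sums = defaultdict(int)
    for timestamp, count in minute_records:
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        sums[hour_start] += count
    return dict(sums)


def mean_hourly_activity(minute_records, window_start, window_end):
    """Mean of the hourly sums for records with window_start <= timestamp < window_end."""
    in_window = [(t, c) for t, c in minute_records if window_start <= t < window_end]
    sums = hourly_sums(in_window)
    return sum(sums.values()) / len(sums) if sums else 0.0


# Synthetic example: one hour at 1 count/min followed by one hour at 3 counts/min
start = datetime(2020, 1, 1, 0, 0)
demo = [(start + timedelta(minutes=m), 1 if m < 60 else 3) for m in range(120)]
demo_mean = mean_hourly_activity(demo, start, start + timedelta(hours=2))
print(demo_mean)
```

In practice the window would be chosen to exclude the travel days (Days 0, 21 and 42), as described in the text.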
Secondary outcome measures. Secondary outcome measures were additional analyses of the activity data plus owner-based subjective assessments of mobility impairment and pain, QoL, temperament and happiness.
Secondary outcome measures-activity. (a) Activity-success/failure analysis: Using the 80th percentile of activity data (approximating the time that most cats are active), the number of cats in each group that increased their activity by 10% or greater over baseline levels was calculated, and groups compared for weeks 1 to 3, and weeks 1 to 6.
(b) Activity-partitioning for non-zero counts and dusk-to-dawn activity: Previous work had shown minimal changes in mean daytime activity in cats given meloxicam compared to placebo over 3 weeks 4 . In addition, there are data indicating increases in night-time activity but not mean daytime activity in research cats with OA which were administered meloxicam 11 , and also that most cats are inactive for the majority of the day (> 70% of the time) 14 . Based on a review of previous data 4 , the data set was also partitioned into "daytime" (defined as 08:00-20:00) and "dusk-to-dawn" (defined as 20:00-08:00) observations. These data subsets allowed for combinations of mean hourly total values and mean hourly non-zero values, for the entire day, daytime, or dusk-to-dawn comparisons.
(c) Activity-within-cat analysis: Previous studies have shown the value of a cross-over design, since between-cat variability is high in cats with DJD-pain 4,5,9 . In our study, group 2 (RP) cats received robenacoxib then placebo in T1 and T2, respectively, allowing for within-cat analysis of activity. Within-cat analysis of activity used the average per-minute values on an hourly basis.
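The partitioning in (b) and the success criterion in (a) can be sketched as below. Several details are assumptions made for illustration because the text does not specify them: the nearest-rank percentile definition, the treatment of a zero baseline percentile as undefined, and all function names.

```python
from datetime import datetime


def is_dusk_to_dawn(timestamp: datetime) -> bool:
    """Dusk-to-dawn is defined in the text as 20:00-08:00."""
    return timestamp.hour >= 20 or timestamp.hour < 8


def select_records(records, period="entire_day", nonzero_only=False):
    """Filter (timestamp, count) records by time-of-day partition and zero counts."""
    selected = []
    for timestamp, count in records:
        night = is_dusk_to_dawn(timestamp)
        if period == "dusk_to_dawn" and not night:
            continue
        if period == "daytime" and night:
            continue
        if nonzero_only and count == 0:
            continue
        selected.append((timestamp, count))
    return selected


def percentile(values, pct: int):
    """Nearest-rank percentile (assumed definition; pct is an integer 1-100)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def activity_success(baseline_values, treatment_values, pct=80, min_increase=0.10):
    """True if the chosen percentile rose by at least min_increase over baseline.

    Returns None when the baseline percentile is zero (relative change undefined).
    """
    base = percentile(baseline_values, pct)
    if base == 0:
        return None
    return (percentile(treatment_values, pct) - base) / base >= min_increase


demo_records = [
    (datetime(2020, 1, 1, 21, 0), 0),
    (datetime(2020, 1, 1, 21, 1), 5),
    (datetime(2020, 1, 1, 10, 0), 3),
]
night_nonzero = select_records(demo_records, period="dusk_to_dawn", nonzero_only=True)
baseline = list(range(1, 11))
print(len(night_nonzero), activity_success(baseline, [v * 1.2 for v in baseline]))
```

Returning `None` for a zero baseline is a deliberate choice in this sketch: with most per-minute counts equal to zero, a high percentile can itself be zero unless the values are first aggregated (for example to hourly sums).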
The CSOM required owners to select three activities (either from the FMPI or self-generated) that their cat had difficulty performing, instructing the owner to rate the cat's ability to perform the task over the past week. Owners assigned an integer score from 0 to 4 (0 = impossible, 4 = no problem) for each activity, with a CSOM total score ranging from 0 to 12. Data were evaluated based on actual scores, change from baseline, and also on a success-failure basis. Success was defined for each cat as an increase in total CSOM score of 2 or greater (decrease in disability), with no individual question score getting worse. A change in CSOM score greater than or equal to 2 has been used in dogs 31 and was considered clinically relevant.
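The CSOM success rule described above can be expressed compactly. The sketch below is an illustration of the stated definition (total score increase of at least 2, with no individual activity score getting worse); the function names and the validation step are assumptions.

```python
def csom_total(scores):
    """Total CSOM score for three activities, each scored as an integer 0-4."""
    if len(scores) != 3 or any(s not in range(5) for s in scores):
        raise ValueError("expected three integer scores between 0 and 4")
    return sum(scores)


def csom_success(baseline_scores, followup_scores, min_gain=2):
    """Success: total improves by >= min_gain and no individual activity worsens."""
    gain = csom_total(followup_scores) - csom_total(baseline_scores)
    no_item_worse = all(after >= before
                        for before, after in zip(baseline_scores, followup_scores))
    return gain >= min_gain and no_item_worse


print(csom_success([1, 2, 1], [2, 3, 1]))  # total +2, nothing worse
print(csom_success([1, 2, 1], [4, 1, 1]))  # total +2, but one activity worse
```

The second example shows why the "no individual question worse" condition matters: a large gain on one activity cannot offset a decline on another.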
For the FMPI questionnaire, owners rated their cat's ability to perform 17 set activities on an integer scale of 0 to 4 (0 = not at all, 4 = normal), with a total FMPI score ranging from 0 to 68. Scores were adjusted if questions were unanswered or not applicable by taking the sum of scores for answered questions and multiplying by 68/(4 × the number of questions answered). Two final questions assessed cat pain levels on a visual analogue scale from 0 to 100 mm, with 0 representing "severe pain" and 100 representing "no pain". The owner's mark on the line was measured and converted to a number and analyzed separately.
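The adjustment for unanswered or not-applicable FMPI questions can be illustrated as follows. This is a minimal sketch of the rescaling rule stated above; representing skipped questions as `None` and the function name are assumptions.

```python
FMPI_MAX_TOTAL = 68      # 17 questions, each scored 0-4
FMPI_MAX_PER_ITEM = 4


def fmpi_adjusted_score(answers):
    """Rescale the sum of answered items to the full 0-68 range.

    `answers` holds integers 0-4, with None for unanswered or not-applicable items.
    """
    answered = [a for a in answers if a is not None]
    if not answered:
        raise ValueError("no answered questions to score")
    return sum(answered) * FMPI_MAX_TOTAL / (FMPI_MAX_PER_ITEM * len(answered))


complete = [2] * 17
partial = [4] * 15 + [None, None]
print(fmpi_adjusted_score(complete), fmpi_adjusted_score(partial))
```

When all 17 questions are answered the adjustment leaves the total unchanged, since 68/(4 × 17) = 1.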
At each study visit, QoL, temperament, and happiness in comparison to before the initiation of treatment (the study) were rated by the owner. Each outcome was rated on a 5-point Likert-type scale. Both QoL and temperament compared to before were rated from "much worse" to "greatly improved". Finally, happiness was rated from "much more unhappy" to "much more happy" compared to before. Owners were not provided any descriptors of QoL, temperament or happiness; they were instructed to complete the form based on how they interpreted the terms. This form is available in Supplementary Figure 1. The responses were summarized into frequency distributions for analysis.
Safety assessments. Owners were instructed to immediately report any AEs to the investigators, and owners were proactively questioned about AEs (including anorexia, diarrhea, lethargy, vomiting) at each visit. AEs were defined as any observations in the cat that were deemed unfavorable and unintended that occurred during the study period, whether they were considered treatment-related or not. Serious AEs (SAEs) were defined as AEs that were fatal or life-threatening, required veterinary intervention, or were considered clinically serious by the investigators. Owners were required to bring their cat to the study site or a local veterinarian if an SAE was suspected.
Statistical methods. Data were analyzed on both ITT and per-protocol data sets. The ITT data set was used for safety analysis and consisted of data from all randomized animals that received at least one dose of study medication after Day 0. The per-protocol data set was used for efficacy analysis. The statistical model used was repeated measures ANOVA. A mixed-effects linear model was used for continuous response variables and a generalized mixed-effects linear model was used for categorical response variables. Both models included sequence group, treatment period, treatment, and treatment × period as fixed effects, and site and site × treatment as random effects. These were implemented using the MIXED and GLIMMIX procedures of SAS software, respectively. Most analyses used baseline-adjusted responses (arithmetic or relative change from baseline). Three linear contrasts (termed C1, C2, and C3) were evaluated using the group- and period-specific least-squares means. C1 compared the placebo (P) to robenacoxib (R) during T1 (i.e., over the first 3 weeks). C2 compared the placebo to robenacoxib responses during T1 and T2 combined (i.e., over all 6 weeks). Finally, C3 compared changes in mean responses after discontinuation of robenacoxib in T2 (i.e., deterioration), using groups 2 and 3. The contrasts are expressed in terms of sequence-by-period means, where the upper case letter denotes the treatment period whose mean is being used: Rp, Rr, and Pp represent means across the three arms for the T1 period, and rP, rR, and pP correspond to means for the T2 period. The contrasts are linear functions of these means reflecting T1-level effects and T1 + T2 level effects, and their estimated values indicate the magnitude and direction (robenacoxib versus placebo) of the associated effects.
All analyses were conducted at a two-sided 0.05 level of significance. No adjustments for multiple comparisons were made, so as not to inflate the type II error rate: the study was a pilot, and many of the efficacy endpoints may be correlated (i.e., not independent), so adjusting the P value using techniques such as the Bonferroni method might be overly conservative.
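The verbal descriptions of C1, C2, and C3 can be written as linear functions of the six sequence-by-period means. The form below is only a sketch. The equal weighting of the two robenacoxib arms in T1, the use of the PP and RR arms alone in C2, and the difference-of-changes form of C3 are assumptions made for illustration and are not weights reported by the study.

```latex
\begin{align*}
C_1 &= \tfrac{1}{2}\,(\mathrm{Rr} + \mathrm{Rp}) - \mathrm{Pp} \\
C_2 &= \tfrac{1}{2}\,(\mathrm{Rr} + \mathrm{rR}) - \tfrac{1}{2}\,(\mathrm{Pp} + \mathrm{pP}) \\
C_3 &= (\mathrm{rP} - \mathrm{Rp}) - (\mathrm{rR} - \mathrm{Rr})
\end{align*}
```

Under this sign convention, positive values of C1 and C2 favor robenacoxib, and a negative C3 would indicate deterioration after robenacoxib was withdrawn.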
For the activity endpoint (i.e., mean hourly activity level within treatment periods), the response variables were the arithmetic and the relative (percent) change from baseline. For baseline values, the last 7 days of the BL period were used, thus allowing for acclimation during the initial portion of the BL period. The model described above was applied for full-day data and for data partitioned by time-of-day periods.
A generalized linear model with logit link was used to test the success-failure rates for the increase in activity within the 80th percentile of hourly activity, setting the threshold for success as an increase of at least 10% over the baseline percentile.
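The per-cat success criterion for the 80th percentile of hourly activity can be sketched as follows. The source does not state how the percentile was computed, so the interpolation method used here (`statistics.quantiles` with `method="inclusive"`) is an assumption, as are the function names.

```python
import statistics
from typing import Sequence


def p80(hourly_counts: Sequence[float]) -> float:
    """80th percentile of hourly activity (interpolation method is an assumption)."""
    # quantiles(n=5) returns the 20th, 40th, 60th and 80th percentile cut points
    return statistics.quantiles(hourly_counts, n=5, method="inclusive")[3]


def activity_success(baseline: Sequence[float], treatment: Sequence[float],
                     min_increase: float = 0.10) -> bool:
    """Success: 80th percentile rises by at least 10% over the baseline percentile."""
    base = p80(baseline)
    if base <= 0:
        # Relative change is undefined when the baseline percentile is zero
        raise ValueError("baseline 80th percentile must be positive")
    return (p80(treatment) - base) / base >= min_increase


# Hypothetical hourly counts (illustrative only, not study data)
baseline = list(range(101))               # 80th percentile is 80
print(activity_success(baseline, [x + 20 for x in baseline]))  # True: +25%
print(activity_success(baseline, [x + 4 for x in baseline]))   # False: +5%
```

The resulting success-failure indicator per cat is the binary response that a logit-link model would then compare between groups.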
For within-cat analysis of activity, the cumulative distribution functions (CDFs) of hourly activity data for both treatment periods were examined and tested for significant differences. The nonparametric two-sample Kolmogorov-Smirnov test statistics (D− and D+) of the maximum observed differences between the robenacoxib and placebo empirical distribution functions were computed, and the associated P values were calculated for each cat individually. In this study, D− corresponds to stochastically higher activity levels for robenacoxib compared to placebo, while D+ corresponds to the opposite case. The proportion of cats demonstrating significantly higher levels of activity under robenacoxib (P < 0.05) was compared with the proportion that had significantly higher activity under the placebo using McNemar's test. The NNT, i.e., the estimated number of cats that need to receive robenacoxib in order for one cat to benefit, on average, was calculated as the inverse of the difference between the proportions of cats improved with robenacoxib compared with placebo.
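As an illustration of the two quantities described above, the sketch below computes the one-sided Kolmogorov-Smirnov distances between two empirical CDFs and the NNT from two proportions. It is a simplified stand-in: it does not compute P values, the function names are hypothetical, and the statistics are labeled by their meaning rather than as D− or D+ because that mapping depends on software conventions.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def ecdf_at(sorted_sample: Sequence[float], x: float) -> float:
    """Empirical CDF of a sorted sample evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def one_sided_ks_statistics(placebo: Sequence[float],
                            robenacoxib: Sequence[float]) -> Tuple[float, float]:
    """Return (evidence of higher activity on robenacoxib, same for placebo)."""
    p, r = sorted(placebo), sorted(robenacoxib)
    grid = p + r  # ECDF differences can only change at observed values
    # If robenacoxib activity is stochastically higher, its ECDF lies below placebo's
    higher_on_r = max(ecdf_at(p, x) - ecdf_at(r, x) for x in grid)
    higher_on_p = max(ecdf_at(r, x) - ecdf_at(p, x) for x in grid)
    return higher_on_r, higher_on_p


def nnt(prop_improved_treatment: float, prop_improved_control: float) -> float:
    """Number needed to treat: inverse of the difference in improved proportions."""
    diff = prop_improved_treatment - prop_improved_control
    if diff <= 0:
        raise ValueError("NNT is undefined unless more cats improve on treatment")
    return 1.0 / diff


# Hypothetical values (illustrative only, not study data)
print(one_sided_ks_statistics([1, 2, 3], [4, 5, 6]))  # (1.0, 0.0)
print(nnt(0.50, 0.25))                                # 4.0
```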
CSOM data were analyzed using a mixed linear model for total score (unadjusted) and the change in total score relative to baseline (adjusted). Additionally, success-failure analysis was based on a decrease in CSOM total scores of 2 units or more defining 'deterioration' (as previously reported 6), and the proportions were compared between the RR and RP groups using a generalized linear model (logit link).
A mixed effects linear model was used to analyze FMPI score 4. Standardized effect sizes (ES) were calculated for CSOM and FMPI endpoints using the equation: [mean of treatment group minus the mean of the control group]/[pooled standard deviation].
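The standardized effect size formula can be sketched as below. The source does not say how the standard deviation was pooled, so the degrees-of-freedom-weighted pooling used here is an assumption, and the function name is hypothetical.

```python
import math
import statistics
from typing import Sequence


def standardized_effect_size(treatment: Sequence[float],
                             control: Sequence[float]) -> float:
    """(mean of treatment - mean of control) / pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    var_t = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    var_c = statistics.variance(control)
    # Pooling weighted by degrees of freedom: an assumption of this sketch
    pooled_sd = math.sqrt(((n_t - 1) * var_t + (n_c - 1) * var_c)
                          / (n_t + n_c - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


# Hypothetical scores (illustrative only, not study data)
print(standardized_effect_size([2, 4, 6], [1, 3, 5]))  # 0.5
```

In this toy example both groups have a sample variance of 4, so the pooled standard deviation is 2 and a mean difference of 1 gives an effect size of 0.5.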
Happiness, temperament and QoL responses were analyzed using a generalized mixed linear model for ordered categories, which modeled the probabilities of levels of the response variable having lower ordered values (cumulative logit link). For these analyses, ORs were calculated and expressed so that values greater than 1 indicated the occurrence of more positive outcomes associated with the robenacoxib treatments.
Frequency data (e.g., for AEs and changes in clinical pathology variables) were analyzed using Fisher's exact test. Body weight, complete blood count (CBC), clinical chemistry and urinalysis variables were analyzed using ANOVA.

Ethics declarations and approvals for animal experiments. This study was approved by the Animal Care and Use Committees at North Carolina State University College of Veterinary Medicine (IACUC protocol 14-009-O), University of Georgia College of Veterinary Medicine (IACUC protocol CR-447) and Novartis Animal Health. Approval by University IACUC committees is only granted if studies follow Animal Welfare Act guidelines. Written owner consent was provided for each case before pre-enrollment after verbal discussion of the study. This manuscript was prepared after consultation of the CONSORT checklist for reporting of parallel-group randomized trials.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.