Synovial tissue transcriptomes of long-standing rheumatoid arthritis are dominated by activated macrophages that reflect microbial stimulation

Advances in microbiome research suggest involvement in chronic inflammatory diseases such as rheumatoid arthritis (RA). Searching for initial trigger(s) in RA, we compared transcriptome profiles of highly inflamed RA synovial tissue (RA-ST) and osteoarthritis (OA)-ST with 182 selected reference transcriptomes of defined cell types and their activation by exogenous (microbial) and endogenous inflammatory stimuli. Screening for dominant changes in RA-ST demonstrated activation of monocytes/macrophages with gene-patterns induced by bacterial and fungal triggers. Gene-patterns of activated B- or T-cells in RA-ST reflected a response to activated monocytes/macrophages rather than inducing their activation. In contrast, OA-ST was dominated by gene-patterns of non-activated macrophages and fibroblasts. The difference between RA and OA was more prominent in transcripts of secreted proteins and was confirmed by protein quantification in synovial fluid (SF) and serum. In total, 24 proteins of activated cells were confirmed in RA-SF compared to OA-SF and some like CXCL13, CCL18, S100A8/A9, sCD14, LBP reflected this increase even in RA serum. Consequently, pathogen-like response patterns in RA suggest that direct microbial influences exist. This challenges the current concept of autoimmunity and immunosuppressive treatment and advocates new diagnostic and therapeutic strategies that consider microbial persistence as important trigger(s) in the etiopathogenesis of RA.

corresponding inflammation related proteins in synovial fluid of RA but not OA patients. Although these proteins were diluted and in part neutralised in the blood, these differences between RA and OA were even evident in serum.

RA-ST transcriptomes indicate involvement of both innate and adaptive immunity.
Samples of highly inflamed synovial tissue (ST) areas from RA patients and representative specimens from OA patients were collected during open surgery. Transcriptome comparisons identified extensive differences in RNA expression. In total, 2019 Affymetrix probe-sets (~1580 genes) were selected, 1010 up- and 1009 down-regulated (supplementary table 1). Hierarchical clustering (HC) and principal component analysis (PCA) of these transcripts demonstrated a clear separation between the two diseases (Fig. 2A-C). OA-ST is frequently used as the control group because access to normal synovial tissue is limited. The specificity of RA-ST genes relative to OA-ST was therefore confirmed by comparing RA-ST with synovial tissue samples obtained i) after joint trauma or ii) post mortem from tissue donors. However, both types of "normal" synovial tissue displayed some abnormalities, such as gene-patterns of unspecific inflammation or hypoxia-induced alterations (supplementary figure 1).
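The Figure 2 legend states that signals were log-transformed and z-normalized per probe-set and that samples were clustered by Euclidean distance. The following is a minimal sketch of that preprocessing, not the authors' pipeline. The function names, the toy signals, the choice of log base 2 and the use of the population standard deviation are assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def z_normalize_rows(signals):
    """Log2-transform each probe-set (row), then scale it to mean 0 and SD 1."""
    normalized = []
    for row in signals:
        logged = [math.log2(value) for value in row]
        mu = mean(logged)
        sd = pstdev(logged)  # population SD; an assumption, not stated in the text
        if sd == 0:
            # A flat probe-set carries no relative information.
            normalized.append([0.0] * len(logged))
        else:
            normalized.append([(value - mu) / sd for value in logged])
    return normalized


def sample_distance(matrix, col_a, col_b):
    """Euclidean distance between two samples (columns) of a probe-set matrix."""
    return math.sqrt(sum((row[col_a] - row[col_b]) ** 2 for row in matrix))


# Hypothetical toy signals: 2 probe-sets (rows) x 4 samples (columns).
toy_signals = [
    [100.0, 200.0, 400.0, 800.0],
    [50.0, 50.0, 50.0, 50.0],
]
z_scores = z_normalize_rows(toy_signals)
```

Distances computed with `sample_distance` on the z-scores could then be passed to an average-linkage clustering routine. Scaling per probe-set keeps strongly expressed genes from dominating the distance.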
Functional annotation of the 2019 transcripts with GSEA, IPA and DAVID suggested infiltration and activation of immune cells, which differs between RA- and OA-ST and is well known from histological investigations 23,24. In RA-ST, GSEA annotated expression of transcripts to TLR-, T-, B-, and NK-signalling pathways and emphasised involvement of various cytokines (supplementary table 2; Fig. 2D-H). In OA-ST, GSEA suggested alterations in processes related to focal adhesion and extracellular matrix organisation and involvement of TGFβ- and WNT-signalling pathways (supplementary table 2; Fig. 2I-M). Correspondingly, IPA pointed in RA-ST to TCR-,

Figure 2. Transcriptome profiles distinguish RA- from OA-synovial tissues (ST) and reveal involvement of innate and adaptive immunity in RA pathogenesis. (A) Hierarchical clustering (Euclidean distance, average linkage) with 2019 differentially expressed probe-sets (rows), which were identified by pair-wise comparisons between RA (n = 10) and OA (n = 10) synovial tissue transcriptomes, separates RA from OA samples (columns). Signals were log-transformed and z-normalized for each probe-set to display relative intensities as indicated by the scale bar. (B) Principal component analysis (PCA) was performed for the 20 samples based on the differentially expressed genes. (C) Based on this PCA, a synchronized representation of the 2019 probe-sets is displayed. The first 3 principal components, PC1, PC2, and PC3, reflect 42%, 9% and 7% of variance, respectively. Samples from RA patients are coloured in red, those from OA in green (A and B). Probe-sets highest in RA-ST (n = 1010) are red and those highest in OA-ST (n = 1009) are blue (A and C). Gene set enrichment analysis (GSEA) of the 2019 differentially expressed probe-sets identified particular KEGG pathways for RA-ST and OA-ST, which suggest different pathomechanisms in these two diseases. The KEGG pathways presented here with enrichment plots and heatmaps of gene-sets accentuate the role of innate immunity, cytokines, B-, T- and NK-cells in RA pathogenesis (D-H) and tissue damage without substantial activation of the immune system in OA (I-M).

These included: synovial fibroblasts (SFbl) (n = 4, 2 from RA and 2 from OA patients; dark blue), endothelial cells (EC) (n = 4; light blue), platelets (Plt) (n = 3; cyan), CD19+ B cells (n = 3; green), CD4+ T cells (n = 3; yellow), CD8+ T cells (n = 3; yellow), CD56+ NK cells (n = 3; yellow), CD1+ DC (n = 3; red), CD14+ monocytes (n = 3; red), macrophages (n = 3; differentiated for 3 days from blood monocytes of healthy donors; dark red), macrophages isolated from synovial fluid of RA patients (n = 3; dark red) and CD15+ granulocytes (n = 3; pink). The overview of reference transcriptomes is provided in supplementary table 4. Co-expression matrices (B) and (E) were generated by correlating expression of the 1010 and 1009 probe-sets in the reference transcriptomes (C) and (F), respectively. These matrices of correlation coefficients were hierarchically clustered to group co-expressed genes for pattern search in the reference transcriptomes. This order of genes was applied to sort probe-sets in RA-ST and OA-ST (A and D) and in reference transcriptomes (C and F). This alignment identified the patterns that were characteristic for different cell types.

respectively appeared only in RA-ST. Two leukocyte gene-patterns appeared in OA-ST: 1) one common for monocytes, CD1c+ DC and lymphocytes and 2) one characteristic for monocytes and Mf. Thus, infiltration of monocytes/Mf in both RA-ST and OA-ST, but of lymphocytes only in RA-ST, indicates significant differences in the involvement of the immune system in these two diseases.

RA-ST transcriptomes show activation patterns that indicate macrophage activation by microbial triggers and inflammatory mediators.
To investigate immune activation of various types of leukocytes in RA-ST and OA-ST, the initial analysis with the cell type specific transcriptomes was extended to 182 reference transcriptomes (supplementary table 4). These included i) B cells (naïve-, memory-, germinal centre B-cells and plasma cells), ii) T-cells (Th1, Th2, Th17, naïve T-cells, regulatory T-cell and γδT-cells), and iii) myeloid cells stimulated for differentiation or activation by various microbial and inflammatory stimuli. For each stimulation experiment an unstimulated control was included, which allowed identification of overlapping and stimulus-specific gene-patterns (  cell culture or during isolation. In general, H5N1 and YFV stimulation revealed a minor set of genes that was part of the IFN induced response (IFNα or IFNγ). The gene-patterns associated with IFNγ and IFNα stimulation of monocytes were small by the number of genes but strong with respect to their expression. TNF, IL15 and IL1β related monocyte stimulation patterns were similar but weaker in intensity and smaller in number of genes when compared with bacterial stimuli. RA-ST did not reveal substantial patterns related to IL4 and IL10 stimulation of monocytes.
A pattern with dominance in granulocytes was evident only in RA-ST and included molecules involved in phagocytosis, complement activation, pathogen recognition, alarmins, and cytosolic factors (Fig. 4C). Although dominant in granulocytes, this pattern overlapped with Mf from RA-SF, blood monocytes and in vitro differentiated Mf.
The gene-pattern related to B-cells in RA-ST in the previous analysis was common to CD19+, naïve, memory and germinal centre B-cells and plasma cells but exhibited the highest expression in plasma cells (Fig. 4C).
The T-cell gene-patterns of RA-ST suggested infiltration of naïve, T-reg, activated and differentiated T-cells (Fig. 4C). The gene-pattern of activated T-helper subsets (Th1, Th2 and Th17) reflected a general activation by TCR signalling, because all these T-cell reference transcriptomes were experimentally triggered by PMA/ionomycin 32.
In OA-ST, the most prominent gene-patterns were related to fibroblasts and differentiated myeloid cells, more precisely monocyte-derived macrophages (MDM) and monocyte-derived DC (MDDC) (Fig. 4F). This myeloid pattern partially overlapped with expression patterns of synovial fibroblasts and did not exhibit activation patterns related to pathogens or inflammatory mediators. B- or T-cell specific transcripts were not identified in OA-ST.

(supplementary table 7, worksheet-C) revealed association with non-inflammatory macrophage differentiation (MDM) and were best represented by the normal synovial tissue transcriptome when compared to the condition of cultured and proliferating synovial fibroblasts. This corresponds to a rather normal tissue phenotype in OA when compared to RA, where normal synovium related patterns were significantly underrepresented (Fig. 5D).

Monocyte response to microbes is quantitatively the leading functional response in
To estimate whether the RA-ST genes have a significant influence in the 35 reference comparisons (supplementary table 6

To investigate which of the 35 reference comparisons have similar response patterns, we correlated them on the basis of their scores for the top 100 RA genes. Monocyte activation by bacterial, fungal, TNF, IL15 and IL1β stimulations clustered together (supplementary figure 9A). The cluster of viral and IFN stimulation was partially shared with M1-Mf and C. pneumoniae induced monocyte activation, which is expected for M1-Mf because these in vitro polarized cells were activated with IFNγ in addition to LPS. T-cell patterns also overlapped in part but formed their own cluster.
To clarify the impact of endogenous triggers on monocyte activation patterns similar to exogenous (bacterial/fungal) stimulation, reference transcriptomes were screened for expression of those cytokines that were used for monocyte stimulation (TNF, IL1β, IL4, IL10, IL15, IFNα, IFNγ, CSFs). It appeared that TNF, IL15 and IL1β, which were the cytokines that induced patterns similar to bacterial and fungal triggering, were also part of the innate response to bacterial and fungal stimulation of myeloid cells. Besides monocytes, TNF was also induced in activated Th1, Th2, and Th17 cells (supplementary figure 9B). However, these T-cells were stimulated by CD14+ monocytes as antigen presenting cells (Th17) or by PMA and ionomycin as a substitute (Th1, Th2) 32,33. This suggested again that innate triggering of monocytes appears to be indispensable.

Fig. 6), we also included molecules that reflect activation of synovial fibroblasts (MMP3), endothelial cells (sSELE, sVCAM1), platelets (sSELP), activation of Th1- and Th17-cells (TNF, IFN, MIF), as well as early differentiation of macrophages (SPP1). Additionally, S100P and IL1R2 were included as part of the gene-patterns from RA blood monocytes in our previous study 34. In total, 27 proteins were determined in paired samples of SF and serum from 18 RA and 15 OA patients and in serum from 14 healthy donors (Table 1).
Out of these proteins, 23 were elevated in RA-SF when compared to OA-SF and confirmed the differential expression of transcripts. Only MCP4 (CCL13) was lower in RA-SF than in OA-SF and contrasted transcriptome data. Based on the concentrations of the 23 molecules and MCP4, the majority of RA patients were separated from OA patients by HC and PCA (supplementary figures 10A and 10C).
Compared to SF, serum levels of the 27 proteins were lower for the majority of the markers, indicating that most of the factors are produced at the site of inflammation in the arthritic joints, from where they are distributed and appear diluted in the blood. Dilution and "neutralization" by binding to serum proteins, metabolism in the liver or excretion into urine reduce the discriminatory power of these proteins in serum. A few molecules revealed higher concentrations in serum than in SF (A1AT, IL1R2, LBP, sICAM1, sSELE, sSELP), suggesting sites of production other than the synovium, such as the liver for acute phase response proteins or endothelial cells for adhesion molecules. Both PCA and HC revealed incomplete separation by serum molecules between RA, OA and HD, and demonstrated the heterogeneity between RA patients. However, RA exhibited the greatest distance from HD, while OA patients were closer to HD than to RA. This confirmed the inflammation patterns in RA sera that match with pathogen-like stimulation of myeloid cells in RA-ST (supplementary figures 10B and 10D).

Concentration of top proteins, which arise in RA joints and spread into blood, correlated with disease activity.
In total, 7 molecules revealed higher concentration in RA compared to OA in both SF and serum, and also in serum from RA when compared to HD (Table 1, Fig. 7). This selection included sCD14, S100A8/A9, S100P, LBP, CXCL13, MMP3 and CCL18. All revealed good correlation between SF and serum concentrations and correlated with DAS28-ESR (supplementary table 9 and supplementary figure 11). With the exception of MMP3, their concentrations in OA-SF were lower than or equal to those in sera from HD (Fig. 7).
This suggests that these mediators produced in the inflamed joints of RA patients not only confirm the concept of molecular pattern selection and interpretation but may also serve as serum markers of RA synovitis activity.

Discussion
In this study we demonstrated that the dominant changes in long-standing RA-ST consist of infiltrating monocytes/macrophages with activation patterns that correspond best to activation induced by microbial stimuli. Taking advantage of the many cell type and stimulation specific transcriptomes, which we generated on RA-ST, OA-ST and various immune cells in the past and extended with data specifically retrieved from the large GEO data collection, we performed a comprehensive transcriptome analysis of RA and OA synovium. Long-standing RA-ST revealed patterns that overlapped best with those induced by bacterial and fungal activation of myeloid cells. Patterns of activated B- or T-cells in RA-ST suggested that these lymphocytes respond to but do not initiate monocyte/macrophage activation. Part of these myeloid patterns were proinflammatory chemokines and cytokines, which thus contribute to the attraction of T- and B-cells into RA synovium. In contrast to RA, OA-ST displayed weak gene-patterns of normal tissue macrophages, which seem to reflect a response to tissue damage but without development of innate inflammatory patterns. These results were confirmed for proteins secreted and shed by activated myeloid cells at the site of inflamed synovial tissue in patients with long-standing RA. Chemokines like CXCL9, CXCL10, CXCL13, CCL18 and especially alarmins like S100A8/A9 and S100P, which were among the top ranks of RA-ST, declined markedly from a high level in the synovial fluid to a much lower level in serum. The importance of these results was supported by correlation of the concentrations of these proteins between SF and serum, and correlation of both with the disease activity score DAS28.

in sera. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001; ns: non-significant. # Below detection limit in most of OA patients. ## Below detection limit in most of HD, and OA patients.
Mann-Whitney and Kruskal-Wallis (with Dunn's multiple comparisons test) were applied for calculating p-values in comparisons between RA and OA in SF and between RA, OA and HD in sera, respectively. p-value less than 0.05 was considered as significant; * < 0.05; ** < 0.01; *** <0.001; **** < 0.0001; ns-non-significant. # In most of OA patients values were below limit of detection; ## in most of HD and OA patients values were below limit of detection. RA-rheumatoid arthritis; OA-osteoarthritis, HD-healthy donors.
There is increasing evidence that monocytes/macrophages are key players in initiating and driving chronicity of joint inflammation in RA. Some of the inflammatory molecules produced by monocytes/macrophages have already been included for quantitative assessment of disease activity in RA and are part of the multi-biomarker test Vectra 35. Predominant infiltration of macrophages into RA synovium is known from histological scoring and was recently affirmed by single cell RNA-sequencing of RA synovial tissues 23,24,36-38. In our previous study, we demonstrated that RA monocytes have an increased turnover, characterised by more rapid production and release from bone marrow into the blood, reduced time in circulation and pronounced recruitment into synovium 34. Furthermore, CD14++CD16+ monocytes with an intermediate-like phenotype were a dominant subset in RA synovial fluid and seem to shed surface molecules and secrete proinflammatory cytokines as a sign of activation after infiltration into the joint 34.
Here, monocyte/macrophage dominance in RA-ST transcriptomes from patients with long-standing disease was confirmed and the identified inflammatory pattern was deciphered by comparison with many different transcriptomes of monocytes activated by i) exogenous triggers like bacteria, fungi, viruses, and stimuli with pathogen-associated molecular patterns (PAMP) like LPS, zymosan, NOD2-ligand (muramyl dipeptide), TLR2/1-ligand (triacylated lipopeptide), or by ii) endogenous inflammatory mediators including S100A8, TNF, IFNγ, IL15 and IL1β. These inflammatory mediators included many cytokines/chemokines with some of them produced after bacterial and fungal stimulation. For example, production of IL15 and IL1β depended on innate triggers and these cytokines alone induced a stimulation pattern that correlated with bacterial/fungal stimulations but was weaker in intensity. Significant differences, however, were observed between these bacterial/fungal triggers and viral stimulation. Direct stimulation of monocytes in vitro with H5N1 revealed an IFNα imprint, which overlapped in part with those induced by bacterial stimulation in monocytes/macrophages. The significant overlap between the IFNα and IFNγ induced patterns in monocytes, as we showed here and in our previous study, may explain the observed IFN imprint in RA-ST that is a result of IFNγ production, predominantly by activated T-cells and not IFNα by plasmacytoid DCs 30 .
Although recent RNA-sequencing analyses of monocyte activation by immune complexes (plate-coated IgG (cIgG), which mimics deposition of IgG-IC) suggested their contribution to inflammation, the resulting pattern was clearly different from the RA-ST pattern in our study 39,40. In these sequencing data, 77 genes overlapped with our top 100 RA-ST genes. When the cumulative sum score was applied to the RNA-sequencing comparison between cIgG treated monocytes and unstimulated controls, some genes were also increased, but the overall score of cIgG alone did not show a substantial overlap with gene regulation of the top 100 RA-ST genes. Only RNA-sequencing of LPS-stimulated monocytes or of LPS and cIgG co-stimulation revealed a relevant overlap with RA-ST gene regulation (supplementary figure 12).
Recently Zhang et al. characterized by scRNA-sequencing four different monocyte subsets (SC-M1 to SC-M4) in RA-ST and demonstrated that the activated phenotype was more abundant in lymphocyte-rich RA-ST than in OA-ST or lymphocyte-poor RA-ST 41 . This supports the assumption that the strength of monocytes/macrophages activation is important for RA subtypes that differ in disease activity and response to treatment [42][43][44][45][46] . Mapping their transcripts, which characterized the 4 different macrophage types in RA-ST (SC-M1 to SC-M4) with the 203 reference transcriptomes applied in our study, revealed that SC-M2 perfectly overlapped with differentiated Mf in OA-ST and that SC-M1, SC-M3 and SC-M4 phenotypes contained transcripts that in part corresponded with innate stimulation patterns in monocytes/macrophages. In fact, the pattern in SC-M1 was also associated with "LPS stimulated macrophages" and "monocytes treated with LPS" by immunologic gene sets in the MSigDB as a part of GSEA and was named IL-1β + proinflammatory monocytes 41 . This naming according to one candidate (IL-1β + ) instead of the suggested trigger (LPS) may reflect that scRNA-sequencing of inflammatory synovial monocytes in RA was not sensitive enough to detect the complete proinflammatory pattern and thus gave comparatively weak indications for a more specific functional annotation in RA (supplementary figure 13). This was also stated by these authors, who could find high expression of chemokines in RA-ST only when analysing bulk RNA-sequencing data of the monocyte population, which emphasised that the limited depth of scRNA-sequencing information requires the investigation of whole populations for confirmation of newly identified subsets.
As we showed in this study, the approach of pattern matching with reference transcriptomes and reference comparisons is reasonable and valid. Furthermore, microarray profiles seem to provide at this time much more depth of transcriptional information, higher dimensionality and less susceptibility to technical and biological artefacts. This is an essential advantage for the application as reference transcriptomes and for robust pattern discovery. Our analysis also suggests that many functional gene expression patterns are sufficiently dominant and specific to be recognized in complex data sets with sufficient depth of information, even when analysing whole tissue transcriptomes. It may even provide reference information to characterize scRNA-sequencing data as shown in the supplementary figure 11 and discussed by Zhang et al. or others 41,47 . Thus, investigating the extensive data repertoire collected by microarray transcriptome studies for functional patterns of known triggers in diseases of unknown origin like RA can inspire etiopathogenesis research in chronic inflammation.
Interestingly, improvement of arthritis upon fasting is a well-known observation and we could recently associate this improvement with a reduction in monocyte turnover, suggesting that gut microbiota derived triggers may exist in RA 48,49. The highly elevated levels of these "innate triggering-associated" proinflammatory chemokines in synovial fluid compared to blood raise the question of how this triggering can develop especially in inflamed joints and how fasting, as an intervention that influences diet and gut microbiota, can actively decrease or suppress inflammation. While detection of microbial DNA in arthritic joints by sequencing has produced inconclusive results, spreading of microbial antigens from the gut is discussed as an alternative and is probably the most likely mechanism 50,51. Furthermore, searching for and finding microbial triggers in early RA as well, or at the disease stage when adaptive immunity does not dominate disease pathogenesis, should facilitate implementation of effective strategies for achieving clinical remission and preventing development and progression of early RA.
Although the search for pathogen related triggers in RA continues and might be related not only to infections but also to bacterial and fungal dysbiosis on mucosal surfaces, in this study we showed differences between the pathogen-like inflammatory response in RA and the reaction to "wear and tear" pathology mechanisms in OA, two joint diseases that require better understanding of pathophysiology to develop causal therapies.

Methods
Patients' characteristics. Synovial tissue samples were obtained from RA (n = 10) and OA (n = 10) patients during joint replacement surgery (synovectomy) by macroscopic selection of highly inflamed and vascularised nonfibrotic villi. Synovitis scores and histological evaluation of cellular composition were performed as previously described 36 and are provided in supplementary table 8. Macroscopically healthy joints of tissue donors (n = 10) ≥8 hours post mortem were selected as described earlier 52. Joint trauma samples were collected during arthroscopic intervention after injuries. After removal, tissue samples were frozen and stored at −70 °C. RA patients were classified according to the American College of Rheumatology criteria valid in the sample assessment period and OA patients were classified according to the respective criteria for OA.
Paired samples of synovial fluid and serum from RA (n = 18) and OA (n = 15) patients and sera from healthy donors (n = 14) were utilised for protein analysis. RA patients were defined by ACR criteria and joint destruction was confirmed radiographically. The healthy donors had no signs of inflammation and were not receiving medications. All RA patients were treated with methotrexate, low dose corticosteroids and/or non-steroidal anti-inflammatory drugs (NSAIDs) but no biologics. Patients' characteristics are summarized in supplementary table 8. Patients were not involved in the design, conduct, reporting or dissemination of our research. All participants provided written consent before samples were collected. The study was approved by the Ethics committee of the Charité Universitätsmedizin Berlin, and all experiments were performed in accordance with relevant guidelines and regulations.
Calculation of the Disease Activity Score 28 (DAS28) included the erythrocyte sedimentation rate (ESR) as blood marker of inflammation, together with the number of tender and swollen joints and the patient global health assessment. DAS28-ESR ≤ 3.2 indicates low, >3.2 and ≤5.2 moderate and >5.2 high disease activity.
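The DAS28-ESR calculation and the activity categories can be sketched as follows. The coefficients are those of the commonly published DAS28-ESR formula, which the text itself does not state, so they are an assumption here. The function and parameter names are hypothetical. The cut-offs default to the values given above (3.2 and 5.2).

```python
import math


def das28_esr(tender28, swollen28, esr_mm_per_h, global_health_mm):
    """DAS28-ESR from 28-joint counts, ESR (mm/h) and patient global health (0-100 mm).

    Coefficients follow the commonly published formula (an assumption;
    the text does not list them).
    """
    return (
        0.56 * math.sqrt(tender28)
        + 0.28 * math.sqrt(swollen28)
        + 0.70 * math.log(esr_mm_per_h)
        + 0.014 * global_health_mm
    )


def activity_category(score, low_cutoff=3.2, high_cutoff=5.2):
    """Map a DAS28-ESR score to low, moderate or high disease activity."""
    if score <= low_cutoff:
        return "low"
    if score <= high_cutoff:
        return "moderate"
    return "high"


# Hypothetical patient: 16 tender joints, 9 swollen joints, ESR 1 mm/h, global health 50 mm.
example_score = das28_esr(16, 9, 1, 50)
example_category = activity_category(example_score)
```

Keeping the cut-offs as parameters makes it easy to rerun the classification with a different threshold convention.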
Statistical and functional analyses of microarray data. Analysis with the BioRetis database (www.bioretis.com), consisting of MAS5.0 pair-wise comparison statistics, was performed to select differentially expressed probe-sets as previously described 30,53-55. Probe-sets (Affymetrix-IDs) of genes were selected as differentially expressed if they 1) revealed either increased or decreased expression in at least 50% of all pair-wise comparisons between RA (n = 10) and OA (n = 10) samples, and 2) showed more homogeneous expression in the experimental group than in the baseline group and increased or decreased expression in 30% of pair-wise comparisons, as previously described. Selection of probe-sets included signal log ratio (SLR) t-tests with Bonferroni correction for multiple testing 53.

Functional analyses of microarray data. MeV (MultiExperiment Viewer, version 4.0, MA, USA) was applied for hierarchical clustering of signal and correlation coefficient matrices.
Qlucore (Lund, Sweden) was applied for principal component analyses (PCA) of samples and variables. Gene set enrichment analysis (GSEA) (The Broad Institute/MIT, USA) was performed with pre-processed datasets, namely the 2019 differentially expressed probe-sets identified in pair-wise comparisons between RA and OA synovial tissue transcriptomes. Although this pre-selection affects the enrichment score statistics included in GSEA (ES: enrichment score, NES: normalized enrichment score, NOM: nominal p-value, FDR: false discovery rate q-value, and FWER: familywise-error rate p-value), we performed this step because our criteria for selection of genes for analyses are far more stringent than those from GSEA 56. GSEA was run with 1000 phenotype permutations, weighted enrichment statistics, log2 ratio of classes as the metric for ranking genes, and a minimal gene set size of 10. In total, the top 20 pathways enriched in RA and OA were selected and are presented in supplementary table 2.
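The ranking metric named above can be illustrated with a small sketch. It assumes that "log2 ratio of classes" means the log2 of the ratio of class means computed on linear-scale signals. The function names and the toy signals are hypothetical and this is not the GSEA implementation.

```python
import math


def log2_ratio_of_classes(class_a, class_b):
    """log2 of (mean signal in class A / mean signal in class B)."""
    mean_a = sum(class_a) / len(class_a)
    mean_b = sum(class_b) / len(class_b)
    return math.log2(mean_a / mean_b)


def rank_probe_sets(signals_a, signals_b):
    """Rank probe-sets from most A-enriched to most B-enriched."""
    scores = {
        probe: log2_ratio_of_classes(signals_a[probe], signals_b[probe])
        for probe in signals_a
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical linear-scale signals for three probe-sets in two samples per class.
toy_ra = {"probe_1": [8.0, 8.0], "probe_2": [2.0, 2.0], "probe_3": [4.0, 4.0]}
toy_oa = {"probe_1": [2.0, 2.0], "probe_2": [8.0, 8.0], "probe_3": [4.0, 4.0]}
ranking = rank_probe_sets(toy_ra, toy_oa)
```

In an enrichment analysis, a gene set whose members accumulate at either end of such a ranked list receives a high absolute enrichment score.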
Ingenuity pathway analysis (IPA) was applied to assign the RA and OA profiles from synovial tissues to distinct molecular networks and canonical signalling pathways (IPA, Qiagen Redwood City, USA).
DAVID functional annotation tool was applied to annotate differentially expressed probe-sets to biological processes (BP) and cellular components (CC) determined by gene ontology (GO) 57.

For analysis of immune cell activation, 182 reference transcriptomes from 64 data sets were applied. Besides the above-mentioned 38 arrays, an additional 144 reference transcriptomes were included. Out of these 64 data sets, 14 were generated in our Affymetrix core facility and 50 were selected from the GEO repository. Each data set contains 2-4 microarrays. The reference transcriptomes belong to the HG-U133A or HG-U133 Plus 2.0 series and are summarized in supplementary table 5. They portrayed activation and differentiation of B-cells, T-cells and monocytes and included the following data sets:

To harmonize the data from different studies, all reference transcriptomes were integrated into the BioRetis database, quantile normalised and subsequently applied for co-expression analysis. Pearson correlation coefficients were calculated between the probe-sets differentially expressed in RA-ST and OA-ST on the basis of the signals in the 182 different reference transcriptomes. Hierarchical clustering of this gene-to-gene correlation matrix was performed by applying Euclidean distance and average linkage as an agglomeration rule, as previously described 34.
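The harmonization and co-expression steps described above can be sketched in a few lines. This is an illustrative sketch under simplifying assumptions, not the BioRetis implementation. Ties are not handled specially in the quantile normalization, and all function names and toy values are hypothetical.

```python
import math


def quantile_normalize(arrays):
    """Give every array (list of signals) the same distribution: the mean of sorted values."""
    n = len(arrays[0])
    sorted_arrays = [sorted(array) for array in arrays]
    rank_means = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for array in arrays:
        order = sorted(range(n), key=lambda i: array[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized


def pearson(x, y):
    """Pearson correlation coefficient of two equally long signal vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for flat profiles
    return cov / math.sqrt(var_x * var_y)


def coexpression_matrix(profiles):
    """Gene-to-gene correlation matrix from probe-set profiles across references."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}


# Hypothetical data: two arrays with three probe-sets each, then three probe-set profiles.
normalized_arrays = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
toy_profiles = {
    "probe_A": [1.0, 2.0, 3.0, 4.0],
    "probe_B": [2.0, 4.0, 6.0, 8.0],
    "probe_C": [4.0, 3.0, 2.0, 1.0],
}
correlations = coexpression_matrix(toy_profiles)
```

The resulting correlation matrix is the input that would be hierarchically clustered to group co-expressed probe-sets.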