A predominant involvement of the triple seropositive patients and others with rheumatoid factor in the association of smoking with rheumatoid arthritis

The major environmental risk factor for rheumatoid arthritis (RA) is smoking, which according to a widely accepted model induces protein citrullination in the lungs, triggering the production of anti-citrullinated protein antibodies (ACPA) and RA development. Nevertheless, some research findings do not fit this model. Therefore, we obtained six independent cohorts with 2253 RA patients for a detailed analysis of the association between smoking and RA autoantibodies. Our results showed a predominant association of smoking with the concurrent presence of the three antibodies: rheumatoid factor (RF), ACPA and anti-carbamylated protein antibodies (ACarPA) (3 Ab vs. 0 Ab: OR = 1.99, p = 2.5 × 10⁻⁸). Meta-analysis with previous data (4491 patients) confirmed the predominant association with the concurrent presence of the three antibodies (3 Ab vs. 0 Ab: OR = 2.00, p = 4.4 × 10⁻¹⁶) and revealed that smoking was exclusively associated with the presence of RF in patients with one or two antibodies (RF+1+2 vs. RF−0+1+2: OR = 1.32, p = 0.0002). In contrast, no specific association with ACPA or ACarPA was found. Therefore, these results show the need to understand how smoking favors the concordance of RA-specific antibodies and RF triggering, perhaps involving smoking-induced epitope spreading and other hypothesized mechanisms.


Material and Methods
Patients and samples. Patients from five Spanish (IDIPAZ, PEARL, IDIS, IdISSC, and IDIVAL) and one Italian (Rome) RA cohorts were considered as replication sets (n = 2253 with complete data). The Spanish data were directly available to us, whereas the information corresponding to the Italian patients was extracted from a publication 21. Two of the replication collections were prospective early arthritis clinics (EAC; IDIPAZ and PEARL), whereas the remaining four replication cohorts (IDIS, IdISSC, IDIVAL and Rome) were from established RA patients. Entry criteria for IDIPAZ 22 and PEARL 23 were: 2 or more swollen joints for less than a year and absence of previous treatment with Disease Modifying Anti-Rheumatic Drugs (DMARD). In addition, the patients with RA according to the 1987 ACR criteria 24 at 2 years of follow-up and with serum samples and smoking information in the baseline visit (IDIPAZ, n = 243; and PEARL, n = 264) were selected. The established Spanish RA patients (according to 1987 ACR criteria, with serum samples and information on smoking) were from IDIS 25, n = 470, IdISSC 26, n = 508, and IDIVAL 27, n = 459. In turn, the Italian patients (n = 309) were classified according to the 2010 ACR criteria 21. Smoking information was obtained as never smoker, past smoker or current smoker in response to the written questionnaire given to the patients at recruitment, either at the first visit (all patients in IDIPAZ and PEARL) or at any time (in the remaining cohorts). No information on smoking intensity was available from most patients. All the patients included in this study granted their written informed consent. The study was designed and conducted according to the Declaration of Helsinki, the Belmont Report and the Spanish Law. For meta-analysis, data from the three patient collections (NOAR, EAC Leiden and BARFOT) included in the van Wesemael study (n = 2238 with complete information) were retrieved 20.
Anti-CarP antibodies and other RA autoantibodies. Anti-CarP IgG antibodies were assessed by ELISA as previously described 25,28. IgM-RF was determined by rate nephelometry, whereas ACPA were determined by ELISA. The ACPA were tested as anti-CCP2 with the EDIA enzyme-linked immunosorbent assay kit in all the IDIS and IDIVAL patients and with the Immunoscan RA in all the IDIPAZ and IdISSC patients (both assays from Euro Diagnostica, Malmö, Sweden). The patients of PEARL were tested with Immunoscan RA until October 2010. Afterward, they were assayed with the QUANTA Lite CCP3 IgG and IgA assay (Inova Diagnostics, San Diego, CA).

Statistical analysis.
Results from the different patient cohorts, and from previously reported cohorts 20,21, were combined by meta-analysis with the R package meta 29. Subgroup meta-analysis comparing EAC and prevalent RA cohorts was done with Review Manager Version 5.3 30. Smoking habit was considered as ever or never smoker status. In most analyses, the RF−ACPA−ACarPA− triple negative patients (0 Ab) were used as the reference. Alternative meta-analyses compared other patient subgroups, which are indicated in the text with a combination of the antibody abbreviation (RF, ACPA or ACarPA), the plus and minus superscripts for presence/absence, and the number of antibodies considered (from 0 to 3) as subscripts. In this way, the patients bearing only RF are coded as RF+1, and those bearing RF and a second antibody as RF+2. See Supplementary Table S1 for the remaining codes. Heterogeneity between cohorts was assessed with the inconsistency parameter I². By default, the fixed effects model was applied for meta-analysis, weighting the contribution of each cohort with the inverse variance method. The random effects model according to DerSimonian and Laird was preferred when heterogeneity was notable (I² > 50%) and reported as OR_re and p_re. P values lower than 0.05 were considered statistically significant.
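For readers who wish to follow the pooling logic, the sketch below illustrates inverse-variance fixed-effects pooling, the DerSimonian and Laird random-effects estimate, and the inconsistency statistic I². It is written in Python for illustration only; the analyses in this study were run with the R package meta and Review Manager, and the function name, argument names and returned keys are assumptions of this sketch.

```python
import math

def pool_log_odds_ratios(log_ors, variances):
    """Pool per-cohort log odds ratios (illustrative sketch, not the study code).

    log_ors   : list of log(OR), one per cohort
    variances : list of the corresponding variances of log(OR)
    """
    k = len(log_ors)
    # Fixed effects: each cohort is weighted by the inverse of its variance.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum_w

    # Cochran's Q and the inconsistency statistic I^2 (as a percentage).
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-cohort variance tau^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects: tau^2 is added to every within-cohort variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum_w_re

    return {
        "or_fixed": math.exp(fixed),
        "se_fixed": math.sqrt(1.0 / sum_w),
        "or_random": math.exp(random),
        "se_random": math.sqrt(1.0 / sum_w_re),
        "q": q,
        "i2": i2,
        "tau2": tau2,
        # Reporting rule described in the text: random effects when I^2 > 50%.
        "preferred": "random" if i2 > 50.0 else "fixed",
    }
```

With two hypothetical cohorts, `pool_log_odds_ratios([0.0, 1.0], [0.1, 0.1])` gives I² = 80%, so the random-effects estimate would be the one reported under the rule above.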
Additionally, we used exploratory analysis on the pooled data across the cohorts for interpretation of the findings. It included graphic representation and classification trees. For the former, we employed a double-decker plot 31. The classification trees, in turn, were done with the General Classification and Regression Trees module of Statistica (v7.0, StatSoft, Tulsa, OK), which performs an exhaustive and recursive search for the best classification. We considered nine dichotomous variables to classify the patients according to smoking. The variables represented the presence or absence of the antibodies, their combinations and number: RF, ACPA, ACarPA, RF&ACPA, RF&ACarPA, ACPA&ACarPA, one antibody, two antibodies, and three antibodies. The priors were considered proportional to the smoking class sizes, the misclassification costs were taken to be equal for every class, the splits were selected based on the Gini index of node impurity, and no stops were imposed. This procedure searches for the minimum number of univariate splits that produces the tree with the fewest misclassified patients, without being limited by the high dependence between the classification variables.
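The split criterion can be illustrated with a small Python sketch that derives the nine dichotomous variables from the three antibody results and selects the single split with the largest reduction in Gini impurity. This is only a one-level illustration under our own naming (the function names and the tuple layout are assumptions); it does not reproduce the recursive search, priors or cost settings of the Statistica module.

```python
def gini(labels):
    """Gini impurity of a list of booleans (True = ever smoker)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def antibody_features(rf, acpa, acarpa):
    """Nine dichotomous variables built from the three antibody results."""
    n = int(rf) + int(acpa) + int(acarpa)
    return {
        "RF": rf,
        "ACPA": acpa,
        "ACarPA": acarpa,
        "RF&ACPA": rf and acpa,
        "RF&ACarPA": rf and acarpa,
        "ACPA&ACarPA": acpa and acarpa,
        "one antibody": n == 1,
        "two antibodies": n == 2,
        "three antibodies": n == 3,
    }

def best_split(patients):
    """Return (variable, impurity reduction) for the best single split.

    patients: list of (rf, acpa, acarpa, ever_smoker) tuples of booleans.
    """
    rows = [(antibody_features(r, a, c), s) for r, a, c, s in patients]
    parent = gini([s for _, s in rows])
    best = None
    for name in rows[0][0]:
        left = [s for f, s in rows if f[name]]
        right = [s for f, s in rows if not f[name]]
        if not left or not right:
            continue  # this variable does not divide the node
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        gain = parent - weighted
        if best is None or gain > best[1]:
            best = (name, gain)
    return best
```

Applying `best_split` again to each resulting subset would give the recursive behaviour of a full tree.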

Results
Replication of the association of smoking with concurrent autoantibodies. The six newly obtained cohorts of RA patients included 2253 patients with complete data (Table 1). Two of the cohorts were prospective EAC (n = 507); the remaining cohorts included patients with established RA (n = 1746). About half (43.6-55.1%) of the patients were ever smokers, except in one of the cohorts where the frequency of smokers was notably lower (20.4%). This circumstance is characteristic of the population attending the recruiting hospital, as previously noted 32. The other critical characteristic for this study, the presence of autoantibodies, varied between the cohorts within the commonly observed range: mean percentages of RF+ = 61.7%, ACPA+ = 57.7%, and ACarPA+ = 34.4%. The patients were divided into four subgroups by the number of antibodies they presented. These subgroups were of comparable size except for the patients with one antibody, who were the least abundant: 26.6, 18.8, 28.8 and 25.8% for the groups with 0, 1, 2 and 3 antibodies, respectively.
We assessed the relationship of smoking with the patients grouped according to the number of antibodies they presented (Table 2). Only the patients positive for the three autoantibodies were significantly associated with smoking (3 Ab vs. 0 Ab: OR = 1.99, p = 2.5 × 10⁻⁸). In stark contrast with this highly significant result, the patients with one or two autoantibodies were not significantly different from the patients without antibodies (Table 2). In addition, the association of smoking with the concurrent presence of the three antibodies was significant not only relative to the patients without antibodies (Table 2), but also relative to the patients with one antibody (3 Ab vs. […]
Combined meta-analysis of the available data. We combined the 6 replication cohorts with the 3 cohorts from van Wesemael et al. 20 for a total of 4491 patients with RA. The summary data showed that the association with the concurrent presence of the three antibodies was highly significant (Fig. 1C, 3 Ab vs. 0 Ab: OR = 2.00, p = 4.4 × 10⁻¹⁶, I² = 17%), and significantly stronger than that observed in the patients carrying two antibodies when directly compared (3 Ab vs. 2 Ab: OR = 1.54, 95% CI 1.29-1.84, p = 1.4 × 10⁻⁶, I² = 12%). Even so, the patients with two concordant positive antibodies were associated with smoking (Fig. 2B, 2 Ab vs. 0 Ab: OR = 1.26, p = 0.009, I² = 41%). In contrast, the patients carrying only one antibody were not significantly associated with smoking (Fig. 2A, 1 Ab vs. 0 Ab: OR_re = 1.12, p_re = 0.4, I² = 56%). These associations were not different in the EAC and the prevalent RA cohorts (Supplementary Table S2).
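Each of the comparisons above rests on a 2 × 2 table of ever and never smokers in the subgroup of interest and in the reference subgroup. The Python sketch below shows one common way to obtain an odds ratio and its 95% confidence interval from such a table (the Woolf logit method); the counts in the usage note are hypothetical and are not taken from Table 2, and the function and argument names are assumptions of this sketch.

```python
import math

def odds_ratio(smokers_group, never_group, smokers_ref, never_ref, z=1.96):
    """Odds ratio of ever smoking in a subgroup versus a reference subgroup.

    Returns (OR, lower, upper, log_or, variance); the last two values are
    what an inverse-variance meta-analysis takes as input for each cohort.
    Assumes all four counts are greater than zero.
    """
    a, b, c, d = smokers_group, never_group, smokers_ref, never_ref
    log_or = math.log((a * d) / (b * c))
    # Woolf variance of the log odds ratio.
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    se = math.sqrt(variance)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper, log_or, variance
```

For example, with hypothetical counts of 60 ever and 40 never smokers among 3 Ab patients and 50 ever and 50 never smokers among 0 Ab patients, `odds_ratio(60, 40, 50, 50)` returns an OR of 1.5 together with its confidence limits.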
Association of smoking with the presence of RF in the patients with one or two positive antibodies. The patients carrying RF accounted for the smoking associations in the patients without the concurrent presence of the three antibodies. First, the specific RF association was found in RA patients with only one positive antibody (1 Ab) (Supplementary Table S3 […] positive antibodies (RF−1) were indistinguishable from the triple negative patients (RF−1 vs. 0 Ab: OR_re = 1.01; 95% CI = 0.65-1.56; p_re = 0.97). In addition, smoking was also associated with the presence of RF in the RA patients carrying two positive antibodies (2 Ab) (Supplementary Table S4). In effect, the patients in whom one of the two positive antibodies was RF (RF+2) were significantly associated with smoking (RF+2 vs. 0 Ab: OR = 1.30; 95% CI = 1.09-1.55; p = 0.004), whereas the other patients carrying two antibodies (RF−2) were indistinguishable from the triple negative patients (RF−2 vs. 0 Ab: OR = 0.95; 95% CI = 0.64-1.39; p = 0.78). Similar analyses centered on the presence of ACPA or ACarPA did not show any significant association.
A notable finding of the preceding analysis was the very similar association (OR = 1.28 and 1.30) of smoking with patients carrying RF in the patients with only one positive antibody (RF+1) and with two positive antibodies (RF+2). This equivalence was confirmed when the associations of smoking with these two subgroups of patients were directly compared (RF+2 vs. RF+1: p = 0.9). As a consequence, the RF+ patients carrying one or two positive antibodies were grouped (RF+1+2). As expected, smoking was associated with this single RF+ subgroup (Fig. 2A). This association was slightly reinforced when all the RF− patients were used as reference (Fig. 2B, RF+1+2 vs. RF−0+1+2: p = 0.0002, I² = 0%). It should be remarked that the patients without RF in this latter analysis (RF−0+1+2) included patients positive for ACPA, ACarPA or both antibodies. The previous associations were not significantly different in the EAC and the cohorts including prevalent RA patients (Supplementary Table S5).
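The subgroup codes used in these comparisons can be derived mechanically from the three antibody results. The following Python sketch shows one possible mapping; the plain-text labels (for example "RF+1" and "RF-0+1+2") and the function names are assumptions made here for illustration, and the complete set of codes is the one given in Supplementary Table S1.

```python
def subgroup_code(rf, acpa, acarpa):
    """Code a patient by RF status and number of positive antibodies."""
    n = int(rf) + int(acpa) + int(acarpa)
    if n == 0:
        return "0 Ab"
    if n == 3:
        return "3 Ab"
    sign = "+" if rf else "-"
    return f"RF{sign}{n}"  # e.g. RF+1, RF-1, RF+2, RF-2

def pooled_rf_group(rf, acpa, acarpa):
    """Grouping used for the RF+1+2 vs. RF-0+1+2 comparison.

    Triple-positive patients are analysed separately, so None is returned.
    """
    n = int(rf) + int(acpa) + int(acarpa)
    if n == 3:
        return None
    return "RF+1+2" if rf else "RF-0+1+2"
```

Under this mapping a patient positive only for ACPA and ACarPA is coded RF-2 and falls in the RF-0+1+2 reference group, in line with the remark above that this reference group includes ACPA and ACarPA positive patients.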
Global exploratory analysis. We used two exploratory techniques to check if a global analysis of all patient subgroups together was consistent with the sequential analyses in the preceding paragraphs without generating redundant statistical tests.
First, the relative frequencies of ever smokers and never smokers in each of the autoantibody-defined strata were displayed with a double-decker plot. Consistent with the results obtained in the sequential analyses, the smokers were enriched in the RF+ patients relative to the corresponding RF− patients in all the strata (Fig. 3).
The second exploratory analysis consisted of an exhaustive search of the best classification tree discriminating ever smokers from never smokers based on the antibodies and their combinations (Fig. 4). The first split of the tree was according to the concurrent presence of the 3 antibodies. The 1187 patients with concordant triple seropositivity (3 Ab) included 61% smokers, whereas the remaining patients included only 48% smokers. No other antibody or combination contributed to the classification of the concordant patients. In contrast, two more divisions were observed in the non-concordant patients. The second split was on the presence of RF. The RF+ group (RF+1+2) contained more smokers than the RF− stratum (RF−0+1+2). The third split was unanticipated and complex. It divided the RF+ patients into a smoker-enriched subgroup in whom RF was the only present antibody (RF+1), and a smoker-depleted subgroup in which RF was present concurrently with other antibodies (RF+2). This latter division reinforced the exclusivity for RF of the smoking association in the patients with one or two antibodies.

Discussion
Our main findings were the predominant association of smoking with the concordant presence of the three RA antibodies, and the exclusive association with the presence of RF in the seropositive patients with one or two antibodies (RF+1+2). In addition, there was no significant association of smoking with the presence of ACPA or ACarPA. These findings represent a significant advance. In van Wesemael et al. 20, the association of smoking with RA could represent a gradual increase in the strength of association with the number of antibodies. The evidence presented here excludes this putative mechanism by demonstrating the exclusive association of smoking with RF, independently of the presence or absence of other antibodies, in the patients carrying one or two antibodies, and by the clear distinction between the patients with three antibodies and the other seropositive patients. This fundamental insight determines the nature of the models aiming to explain the effect of smoking on RA susceptibility. In addition, the exploratory techniques have made the relationship between smoking and the antibodies easier to understand. They also showed that the results of the statistical tests were faithful to the data.
The originality of our findings is reflected in the absence of any other study analyzing the association of smoking with combinations of the three antibodies included here and in van Wesemael et al. 20. Therefore, we considered the studies assessing the combination of RF and ACPA as antecedents. We have found only reports from four large cohorts 20,33-35, one of them included in van Wesemael et al. but different from the RA cohorts included in our meta-analysis. The four cohorts were large, with ≈9500 healthy Japanese subjects 20, ≈2000 UK RA patients 33, ≈1500 USA RA patients 34, and ≈3600 Swedish RA patients 35. The first three have shown a significant association of smoking only with the concurrent presence of the two antibodies, not with either of them in isolation. The fourth, by Hedstrom et al., showed a stronger association with the concurrent presence of RF and ACPA, followed by RF and less significantly by ACPA (more on the results of this study below). These results and those of another study with two smaller UK RA collections 36 are fully compatible with our findings. In addition, the bibliographic search brought to our attention another important fact. None of the studies that support the pathogenic model linking smoking with RA through the production of ACPA has accounted for the association of smoking with the concurrent presence of ACPA and RF 10,11,37-39 until very recently 35.
We do not know the mechanism behind the predominant association of smoking with the concurrent presence of RA autoantibodies. However, it is well-known that the status of the RA autoantibodies is much more concordant than at random 25,28,40. This circumstance reveals the existence of pathogenic mechanisms that are shared by the various antibodies. Some of these mechanisms contribute to epitope spreading, which characterizes the progression of T and B cell responses in autoimmune diseases including the preclinical phase of RA 3-5,41,42. In effect, the earliest seropositive samples from patients that will develop RA years later often recognize a single epitope, whereas samples taken near the clinical onset recognize multiple epitopes 3-5. Therefore, smoking could promote concordant seropositivity by broadening and accelerating epitope spreading.
The known factors affecting epitope spreading include the availability of epitope-specific lymphocytes, reflecting incompetent tolerance, and favorable T-B cell interactions and antigen presentation 41,42. The latter interactions could be boosted by bystander activation, tissue damage and inflammation. Therefore, smoking could promote epitope spreading through these multiple mechanisms. This is possible because the triggering of inflammation and tissue damage, the recruiting and activation of neutrophils, monocytes, and macrophages, and abnormalities in NK, dendritic cells, B cells and many subtypes of T lymphocytes are some of the many effects of smoking on the immune responses 43,44. The overall balance of this range of actions is an increased predisposition to autoimmunity and the production of autoantibodies 43,44. Specifically, smoking has been associated with the production of anti-dsDNA in SLE 45, anti-Jo1 in inflammatory myopathies 46 and of RF and other autoantibodies in smokers without any autoimmune disease 20,47-50.
The sharing of immunological mechanisms between the autoantibodies is also the most likely explanation for the correlation between their antibody titers, a correlation that we have also observed in our analyses 25,51-53. These correlations have been observed independently of the disease stage and, most significantly, are maintained as parallel titer decreases in response to treatment, revealing that the antibodies respond similarly to the control of inflammation 51-53. These correlations lack any specific direction and extend to the thresholds for establishing the positive/negative status (Supplementary Table S6). Therefore, they are unlikely to denote any hierarchy of precedence or causality between the antibodies. Just recently, some of these examples of shared and overlapping immunological mechanisms have been characterized as the convergent pathways model of RA pathogenesis 54.
The association of smoking with RF in the patients with one or two antibodies suggests that RF could precede other autoantibodies in the smokers that will become RA patients. However, we lack support for this interpretation given the cross-sectional nature of our sample collections. In addition, the studies of preclinical RA samples have been discordant in the order of antibody appearance: IgA RF and IgM RF were the first antibodies in a Swedish cohort 55, whereas ACPA 4 and ACarPA 56 preceded IgM RF in a Dutch cohort, and again IgG ACPA preceded RF (no ACarPA analysis included) in American military personnel 57. Similarly, the presence of RF in smokers without RA, which has been known for more than two decades 49,50, cannot be taken as evidence of RF preceding the other autoantibodies because recent studies have also found an increased presence of ACPA in smokers without RA 20,58. Consequently, we will need to wait for new studies to solve this question.
Independently of the order of appearance, we need an explanation for the specific association of smoking with RF. A couple of possible mechanisms have already been proposed. One of them was originally developed to explain the production of autoantibodies in smokers without autoimmune diseases 48 . It starts by the induction of heat-shock protein 70 (Hsp70) expression and of antibodies against Hsp70 by smoking. In the next step, the production of RF is triggered by the two signals given by the Hsp70 immune complexes (IC): through the BCR recognizing the anti-Hsp70 IgG, and through CD91 binding Hsp70. These details were delineated in mouse experiments 48 , but their reproducibility in RA patients is unclear. A second hypothesis proposes that smoking leads to increased lung production of IgG, which would be recognized by RF in its native form and by ACPA and ACarPA as the citrullinated and carbamylated modified IgGH fragment, respectively 59 . This hypothesis has the appeal of the simplicity of considering a single protein as the link between the different RA antibodies. Accordingly, it has been characterized as the common antigen model 54 . However, only RF is known to recognize the IgGH fragment, in the form of IgGH/HLA class II complexes 60 , whereas the binding of the modified IgGH fragment by ACPA or ACarPA has not been reported. The two hypotheses could become the starting point for future experiments.
The association of smoking with RF positive RA was known before the association with the ACPA positive patients 6,7; however, the latter displaced RF from the focus of attention 1,2. The peculiar nature of RF probably has some role in this displacement. In effect, RF can be described as an antibody against IC with a role in the development of the early antibody repertoire that in the adult can be induced in the course of sustained immunological responses that include other diseases besides RA 61-63. However, the nature of the RF in patients with RA and in non-RA subjects differs significantly. In RA, the RF production is sustained and reaches higher titers; also, the range of Ig V genes that are used is broader, and the response shows signs of maturation such as isotype switch and mutations changing the sequence of the CDRs, characteristics that are absent or restricted in the RF of healthy subjects 62,63. It is understood that ACPA and citrullinated proteins, or ACarPA and carbamylated proteins, are the IC recognized in RA, but other antibodies are possible, such as the anti-Hsp70 antibodies (in relation to smoking) and antibodies against microbial antigens (including those from viruses and the mucosal microbiota). These diverse IC could act as disease triggers 48,61. Once RF binds to the IC, the IC become larger and capable of more efficient immune stimulation. A model of RA has been proposed in which two waves of IC formation, without RF and with RF, are followed by complement activation and the production of inflammatory and chemotactic mediators 61. It is also possible that the RF-specific B cells have a critical role in the early phases of the autoimmune response before high titers of RF have been produced 63. In effect, the RF-specific B cells are abundant in healthy subjects and are the only B cells capable of efficient presentation to T cells of the antigens trapped in IC 64.
These various roles of RF and the RF-specific B cells could contribute to epitope spreading and the concordance and correlation between the RA autoantibodies.
The fact that our meta-analysis did not detect any specific association of smoking with the presence of ACPA in the patients without RF (OR = 0.95, 95% CI = 0.78-1.20) does not exclude association with a subset of the ACPA+ patients. Examples of such subsets could be the patients with shared epitope HLA alleles or patients with heavy and current smoking. Concerning the HLA, the presence of the shared epitope is specifically associated with the presence of ACPA and there are arguments to think it could potentiate the smoking association. The clearest argument comes from a recent study by Hedstrom et al. showing association with ACPA+/RF− 35, whereas previous studies failed to stratify by the two antibodies or did not find a significant association 10,11,33,34,36-39. The other example is supported by the same recent study, which showed differential association of heavy smokers and light smokers with RA. This difference was most marked in the ACPA+/RF+ patients, followed by the ACPA−/RF+ patients and finally the ACPA+/RF− patients 35. This differential association is in agreement with our results placing the concordant seropositivity at the top and RF afterwards. However, the ACPA+/RF− patients in Hedstrom et al. were significantly associated with smoking and they were not in our meta-analysis.
Our study lacks healthy controls and therefore, we were restricted to comparisons between patient subsets. However, this is a minor limitation because the association of smoking with seropositive RA patients is well-established 6,7. In addition, the study lacks information on smoking intensity and HLA alleles. These two types of information could have provided additional insight into the relationship between smoking and the different autoantibodies. In any case, a whole analysis of the potential interactions between smoking and the HLA in the three autoantibodies is not yet possible because only the HLA alleles associated with ACPA are well-defined 65. The other antibodies, ACarPA and RF, could be specifically associated with HLA-DRB1 alleles that are not included in the shared epitope, but that have not yet been completely defined 66-68. Finally, we should note that the classification tree in Fig. 4 is not appropriate to separate smokers and non-smokers. This algorithm was only intended as a tool to explore the relationships between smoking and the antibodies present in the data.
Scientific Reports | (2020) 10:3355 | https://doi.org/10.1038/s41598-020-60305-x
In summary, we can conclude that smoking is predominantly and reproducibly associated with the triple (RF, ACPA, and ACarPA) concordant seropositive RA. This result highlights the need to consider the mechanisms leading to concurrent seropositivity. In the patients that are not concordant for the three antibodies, smoking was exclusively associated with RF positivity in our meta-analysis. This latter association is weaker than the association with the triple concordant patients and its exclusivity needs to be replicated, ideally in studies with detailed information on smoking intensity. These results call for a pathogenic model that incorporates the predominant association with multiple antibodies, which could be explained by accelerated and broadened epitope spreading reflecting some of the actions of smoking on the immune system. In addition, an explanation for the particular association of smoking with the production of RF should be sought.