Identification of potential autoantigens in anti-CCP-positive and anti-CCP-negative rheumatoid arthritis using citrulline-specific protein arrays

The presence or absence of autoantibodies against citrullinated proteins (ACPAs) distinguishes two main groups of rheumatoid arthritis (RA) patients with different etiologies, prognoses, disease severities, and, presumably, disease pathogenesis. The heterogeneous responses of RA patients to various biologics, even among ACPA-positive patients, emphasize the need for further stratification of the patients. We used high-density protein array technology for fingerprinting of ACPA reactivity. Identification of the proteome recognized by ACPAs may be a step to stratify RA patients according to immune reactivity. Pooled plasma samples from 10 anti-CCP-negative and 15 anti-CCP-positive RA patients were assessed for ACPA content using a modified protein microarray containing 1631 different natively folded proteins citrullinated in situ by protein arginine deiminases (PADs) 2 and PAD4. IgG antibodies from anti-CCP-positive RA plasma showed high-intensity binding to 87 proteins citrullinated by PAD2 and 99 proteins citrullinated by PAD4 without binding significantly to the corresponding native proteins. Curiously, the binding of IgG antibodies in anti-CCP-negative plasma was also enhanced by PAD2- and PAD4-mediated citrullination of 29 and 26 proteins, respectively. For only four proteins, significantly more ACPA binding occurred after citrullination with PAD2 compared to citrullination with PAD4, while the opposite was true for one protein. We demonstrate that PAD2 and PAD4 are equally efficient in generating citrullinated autoantigens recognized by ACPAs. Patterns of proteins recognized by ACPAs may serve as a future diagnostic tool for further subtyping of RA patients.

Broad reactivity by low-intensity autoantibodies. We observed low reactivity of IgG antibodies against a large number of citrullinated proteins but not against the corresponding native proteins. After incubation with a pool of anti-CCP-positive plasma, 632 proteins showed more than twofold higher binding of IgG after citrullination with PAD2 than in their native form, and the corresponding number was 629 proteins after citrullination with PAD4, suggesting that these proteins were recognized by ACPAs (Fig. 2, Supplementary Dataset 2).
Surprisingly, citrullination also enhanced the binding of IgG autoantibodies to a significant number of proteins when the array was incubated with the anti-CCP-negative plasma pool. This was true for 408 proteins after citrullination with PAD2 and 133 proteins after citrullination with PAD4 (Fig. 2, Supplementary Dataset 2).
We found only a few differences between IgG antibody binding to PAD2- and PAD4-citrullinated proteins when the protein array was incubated with the anti-CCP-negative plasma pool. Three proteins were targeted to a higher degree when citrullinated by PAD2 than by PAD4 (PAD2/PAD4 ratio: approximately 3): caspase recruitment domain-containing protein 9 (CARD9), 3-phosphoinositide-dependent protein kinase 1 (PDPK1), and protein kinase C and casein kinase substrate in neurons protein 3 (PACSIN3). Only one protein was targeted to a lower degree when citrullinated by PAD2: protein CBFA2T3 (ratio PAD2-citrullinated/PAD4-citrullinated: approximately 0.3).
We did not identify native proteins that were targeted by autoantibodies to a higher degree in the native form than in the citrullinated form. We found a total of 844 citrullinated proteins recognized by ACPAs from RA patients. A list of all identified antigens can be found in Supplementary Dataset 2.
Binding pattern of autoantibodies from anti-CCP-positive patients. Many of the proteins identified as targets for ACPAs above showed low-intensity staining for IgG antibodies. Proteins that are autoantigens in vivo can be expected to show staining with high intensity; therefore, to identify proteins that may be autoantigens in vivo, we limited the analysis to include only proteins with z scores > 2 (Tables 1, 2).
After incubation of the arrayed proteins with the anti-CCP-positive plasma pool, two well-established autoantigens in RA fulfilled this criterion: vimentin and keratin 8. For both proteins, the binding of IgG autoantibodies increased markedly when they were citrullinated by PAD4 compared to native proteins, while the same only applied to keratin 8 after citrullination with PAD2 (Table 1). Irrespective of whether PAD2 or PAD4 was used for citrullination, more antibody binding was observed after incubation with the anti-CCP-positive plasma pool than after incubation with the anti-CCP-negative plasma pool (Table 2). The proteins that showed the greatest increase in autoantibody capture after citrullination with PAD2 compared to native proteins were interferon-induced 35 kDa protein (IRF5; 29.5-fold increase after citrullination), cas scaffolding protein family member 4 (CASS4; 15.4-fold increase), and endophilin-A2 (SH3GL1; 6.5-fold increase). The proteins with the greatest change in autoantibody capture after citrullination with PAD4 were IRF5 (34.7-fold increase), double-stranded RNA-binding protein Staufen homolog 1 (STAU1; 20.0-fold increase), and melanoma-associated antigen B1 (MAGEB1; 10.1-fold increase).
Figure 2. Quantitative analysis of arrayed proteins recognized by autoantibodies. Bar chart showing the number of proteins recognized by autoantibodies to a higher degree than native proteins. Plasma from anti-CCP-positive or anti-CCP-negative RA patients was incubated with Immunome protein microarray slides containing native proteins or proteins citrullinated by PAD2 or PAD4. The number above the bars indicates the number of proteins recognized by autoantibodies under the given conditions (defined as more than two-fold differences in fluorescence intensity compared to native proteins, an intraprotein CV < 15, and a P value < 0.05). Specific targets for autoantibodies are shown in Supplementary Dataset 2.
Binding pattern of autoantibodies from anti-CCP-negative patients. We next examined the binding of autoantibodies contained in the plasma pool from anti-CCP-negative patients to citrullinated and native proteins. Even after exclusion of proteins with z scores < 2, we identified several proteins that showed increased IgG autoantibody binding after citrullination. This applied to 29 proteins after citrullination by PAD2 and 26 proteins after citrullination by PAD4 (Table 1).

Comparison between anti-CCP-positive and anti-CCP-negative plasma.
Finally, we examined the binding of IgG autoantibodies to the protein array after incubation with the anti-CCP-positive versus the anti-CCP-negative plasma pool (Table 2). When PAD2 was used for citrullination, 91 proteins showed more than twofold higher binding of autoantibodies when incubated with anti-CCP-positive plasma than with anti-CCP-negative plasma. After citrullination with PAD4, the corresponding number was 98 proteins. The most significant differences were observed for calcium-regulated heat-stable protein 1 (CARHSP1; ratio anti-CCP-positive plasma/anti-CCP-negative plasma: 12.5), macrophage migration inhibitory factor (MIF; ratio 10.9), and keratin type II cytoskeletal 8 (KRT8; ratio 9.9) when PAD2 was used for citrullination. When PAD4 was used for citrullination, the greatest anti-CCP-positive/anti-CCP-negative ratios were observed for calcium-regulated heat-stable protein 1 (CARHSP1; ratio 13.3), acetyl-CoA acetyltransferase (ACAT2; ratio 12.7), and protein E6 (E6; ratio 12.4).
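The pool-to-pool comparison described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code (their analysis was performed in R); the function name `ratio_hits`, the protein labels, and all intensity values are hypothetical placeholders, and only the twofold threshold is taken from the text.

```python
def ratio_hits(pos, neg, min_ratio=2.0):
    """Return (protein, ratio) pairs whose positive/negative signal ratio
    exceeds min_ratio, sorted with the largest ratio first."""
    hits = []
    for protein, pos_signal in pos.items():
        neg_signal = neg.get(protein)
        # Skip proteins without a usable signal in the comparison pool.
        if neg_signal is None or neg_signal <= 0:
            continue
        ratio = pos_signal / neg_signal
        if ratio > min_ratio:
            hits.append((protein, ratio))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical normalized intensities (placeholders, not study data).
positive_pool = {"protein_A": 1250.0, "protein_B": 300.0, "protein_C": 800.0}
negative_pool = {"protein_A": 100.0, "protein_B": 200.0, "protein_C": 100.0}

top = ratio_hits(positive_pool, negative_pool)
for protein, ratio in top:
    print(f"{protein}: ratio {ratio:.1f}")
```

The same helper could be applied to PAD2- versus PAD4-citrullinated signals by passing those two sets of intensities instead of the two plasma pools.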

Discussion
We performed a high-throughput high-density protein microarray analysis on pools of plasma from 15 anti-CCP-positive and 10 anti-CCP-negative RA patients to identify proteins recognized by IgG autoantibodies before and after citrullination. The method proved successful, and we provide here a list of 844 out of 1631 arrayed proteins that were recognized by autoantibodies after citrullination, i.e., recognized by ACPAs. To our knowledge, this is the largest number of proteins identified as potential targets of ACPAs to date. Previous studies have used other types of citrullinated protein arrays to investigate ACPA reactivity but have focused on a single or a few citrullinated proteins, usually known RA antigens such as vimentin, fibrinogen, and alpha-enolase, or have used processed sample material 11,22-29 . To our knowledge, this is the first investigation of autoantibody reactivity against citrullinated proteins on the KREX protein array platform using pure plasma samples from RA patients. Although we demonstrate here that more than 800 proteins can be recognized by ACPAs, they are not necessarily autoantigens in vivo, where several requirements must be met for citrullination to occur: the protein must localize to the same compartment as PAD2 or PAD4, and requirements for pH, reducing conditions, and calcium concentration must be met 30-33 . More research is needed to clarify which of the proteins shown to bind ACPAs under the optimal in vitro conditions used here also do so in vivo, and elaborate validation experiments are critical in this regard.
Per se, the high numbers of identified ACPA targets suggest that PAD enzymes are promiscuous in generating citrullinated neoepitopes recognized by ACPAs. On the other hand, approximately half of the proteins in the protein array used here were not recognized by ACPA, suggesting either that those proteins lack surface-exposed arginine residues or that they lack citrullination motifs for PADs.
Many of the abovementioned proteins that bound IgG autoantibodies showed low staining intensity. ACPAs appear to consist of a pool of either specific or cross-reactive antibodies, and it can be speculated that the low staining intensity is a result of cross-reactive antibodies 34 . The literature on monoclonal ACPAs shows extensive cross-reactivity, especially if glycine is present in the +1 position of citrulline 35,36 . This fact may be important to consider in any multiplex ACPA assay using on-array citrullination so that there is complete control of which epitopes are citrullinated and which are not. Proteins that are potential autoantigens in vivo are likely to have relatively high affinity and/or concentrations, so in an effort to narrow down the list to potential genuine autoantigens, we implemented an additional filtration using z score (cut-off > 2) that excluded low-intensity antigens. Approximately 100 proteins in anti-CCP-positive plasma were identified. Among them was vimentin, a well-known autoantigen in RA. Other proteins showed strong increases in IgG binding intensity after citrullination, e.g., IRF5, CASS4, SH3GL1, and STAU1. Further studies are needed to determine whether the citrullinated forms of these proteins contain T-cell epitopes in addition to being targeted by ACPAs.
Interestingly, we also identified a rather large number of proteins recognized by autoantibodies from anti-CCP-negative individuals. This has been shown several times before and may point to a subgroup of RA patients not identified using traditional serological testing 37-39 . Furthermore, this supports the conclusion of Wagner and colleagues that the commonly used commercial anti-CCP assays fail to identify some ACPA-positive RA patients (at least 10% in the authors' setup) 40 . ACPA-positive and ACPA-negative RA have quite different pathogenesis 6,7 , and when future treatment targeting ACPA-positive RA specifically (e.g., PAD inhibitors) becomes available, protein arrays such as the one employed here may discriminate ACPA-positive and ACPA-negative RA better than the anti-CCP test. Another explanation for the recognition of several proteins by autoantibodies from anti-CCP-negative patients may lie in the consequences of citrullination rather than in the citrullinated epitope itself. As already mentioned, the citrullination process results in conformational changes of the proteins, which may lead to recognition of unmodified epitopes by antibodies from the anti-CCP-negative patient pool and not necessarily recognition of the citrullinated epitope.
Scientific Reports | (2021) 11:17300 | https://doi.org/10.1038/s41598-021-96675-z www.nature.com/scientificreports/
The relative efficiency of PAD2 and PAD4 in generating epitopes recognized by ACPAs has been a matter of some controversy. One study showed that at high antibody titers (1:250 and 1:1000) but not low titers (1:40 and 1:100), ACPAs preferentially bind to fibrinogen citrullinated by PAD4, while we have previously reported that PAD2 and PAD4 are equally efficient in generating epitopes for the binding of ACPAs to fibrinogen and alpha-enolase 10,11 . In a similar setup, we previously showed that PAD4 was the dominant isoform in generating ACPA-binding sites in histone H3 11 . At the serum dilution used in the present study (1:200), the staining by IgG ACPAs was equally intense when proteins were citrullinated by PAD2 and PAD4, except for only four of the 1631 proteins.
The protein array methodology to investigate posttranslationally modified epitopes may not only be relevant for RA but may also be used in diseases where autoantibodies against other modified proteins have been shown, such as oxidized proteins in type 1 diabetes or autoantigens phosphorylated during stress-induced apoptosis in systemic lupus erythematosus 5,41,42 . Furthermore, it may be relevant to investigate at risk individuals to compare citrullination profiles or investigate other PAD enzymes such as the P. gingivalis PAD enzyme, which has been proposed to trigger RA even though conflicting studies exist 43,44 .
A limitation to the current study is that the Immunome protein arrays were not specifically enriched for antigens of particular relevance for RA, i.e., proteins that are present in joints or proteins that have previously been identified as autoantigens in RA, although numerous prominent ribonuclear proteins and other well-known autoantigens are present on the arrays. The development of focused arrays containing such proteins would be a natural next step. Another limitation is the use of plasma pools rather than individual plasma samples. Using individual samples, we could compare specific clinical phenotypes to autoantibody patterns or demonstrate the potential of subdifferentiation of patients based on their autoantibody profile 45-48 . This first study of its kind was merely a proof-of-concept study; it proves that further development of the technique is warranted.

Conclusion
We present a list of 844 citrullinated proteins recognized by ACPAs from RA patients. We demonstrate that PAD2 and PAD4 are equally efficient at generating binding sites for ACPAs. We present a list of approximately 100 potential autoantigens in RA, and we suggest that the pattern of autoantibody recognition may form a basis for subgrouping of anti-CCP-positive RA patients and anti-CCP-negative patients that rightfully should be considered ACPA-positive. The next steps in the development of the technique should be the production of arrays with RA-associated antigens and comparison of ACPA reactivity patterns with clinical phenotypes.

Materials and methods
Collection of patient plasma. Plasma samples were obtained from 10 anti-CCP-negative RA patients and 15 anti-CCP-positive RA patients. Patient data can be seen in Table 3. Plasma from anti-CCP-negative and anti-CCP-positive RA patients was pooled separately before protein array analysis. Individual patient response data have previously been published, and we have previously used the same patient cohorts in another study to investigate autoantibody reactivity against native autoantigens 11,20 . RA patients fulfilled the American College of Rheumatology and European League Against Rheumatism criteria for the diagnosis of RA 49 . Plasma was isolated from peripheral venous blood and drawn into BD Vacutainers containing EDTA (BD, Plymouth, UK). The use of patient samples was approved by the local ethics committee of the Institute of Rheumatology in Prague, Czech Republic, and written informed consent was obtained from all patients before initiation of the study (June 26, 2012, No. 3294/2012). All methods were performed according to relevant guidelines and regulations and in accordance with the Declaration of Helsinki.
Sample preparation for protein array analysis. The Immunome (v1) protein microarray (Sengenics, Singapore) consists of 1631 proteins, in addition to several control proteins, spotted in quadruplicates, allowing assessment of spot-to-spot variation and across-slide variation in background intensity (Fig. 3A). The microarray consists of a variety of different proteins representing different categories, such as cancer-associated antigens, transcription factors, kinases, and other proteins involved in inflammation and cell signaling (Fig. 3B). Each protein in the array is coupled to a biotin carboxyl carrier protein tag, which ensures correct three-dimensional folding during expression. Four slides were carefully transferred to different quadriperm chambers (Greiner BioOne, Kremsmünster, Austria) containing citrulline reaction buffer consisting of 1.2 µg/mL recombinant human PAD2 or PAD4 (Cayman Chemicals, Ann Arbor, MI, USA), 1 mM 1,4-dithiothreitol (DTT), 10 mM CaCl2, and 100 mM Tris-HCl and incubated at 37 °C for 3 h while shaking under horizontal rotation at 50 rpm (IKA, Königswinter, Germany). The slides were washed two times using cold serum albumin buffer (SAB) containing 0.1% Triton X-100 and 0.1% bovine serum albumin (BSA) in phosphate-buffered saline (PBS) and placed in a new quadriperm chamber. Additionally, two slides were processed following the same procedure but without the addition of PAD enzymes.
Four milliliters of diluted pooled plasma (1:200) from anti-CCP-positive or anti-CCP-negative RA patients was added to the new chambers and incubated at 20 °C for 2 h at 50 rpm. The slides were washed using SAB buffer and added to a new quadriperm chamber. The binding of IgG antibodies was detected using Cy3-conjugated (GE Healthcare, Chicago, Ill, USA) polyclonal rabbit anti-human IgG (Dako, Santa Clara, CA, USA) diluted 1:1000 (v/v) in SAB buffer. The slides were covered in tinfoil and incubated at 20 °C for 2 h at 50 rpm. Finally, the slides were washed twice in SAB buffer and three times in ultrapure water followed by centrifugation at 240g for 5 min to dry the slides. Slides were stored at room temperature and scanned within 24 h.
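The dilution volumes implied above can be worked out with a short Python sketch. It assumes that "1:200" means one part sample in 200 parts final volume (v/v), which the text does not state explicitly; the helper name `dilution_volumes` is hypothetical.

```python
def dilution_volumes(total_ul, dilution_factor):
    """Split a final volume (in µL) into sample and diluent volumes for a
    1:dilution_factor dilution, read as 1 part sample per
    dilution_factor parts of final volume (an assumption)."""
    if total_ul <= 0 or dilution_factor < 1:
        raise ValueError("total_ul must be positive and dilution_factor >= 1")
    sample_ul = total_ul / dilution_factor
    diluent_ul = total_ul - sample_ul
    return sample_ul, diluent_ul


# 4 mL of pooled plasma diluted 1:200, as described for the array incubation.
plasma_ul, buffer_ul = dilution_volumes(4000, 200)
print(f"plasma: {plasma_ul:.0f} µL, SAB buffer: {buffer_ul:.0f} µL")
```

Under this reading, each chamber would need 20 µL of pooled plasma; the same helper applies to the 1:1000 secondary antibody dilution once a working volume is chosen.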
Protein array imaging. The intensity of the individual spots was measured using a microarray laser scanner (Innoscan 710AL, Innopsys, Carbonne, France) using Mapix software (Ver. 8.2.2, Innopsys). The scan settings were as follows: 532 nm laser with low laser power (5 V), PMT gain at 60%, 5 µm resolution, and a scan speed of 35 px/s. Spotxel (SICASYS ver. 1.7.6) was used to automatically annotate each protein on the slide. Semiautomatic array alignment was used to specify the location of each spot. The median pixel intensity for each spot was used to eliminate the effect of outliers. Background intensity levels were extracted from the intensity of the adjacent spot. The data were exported as CSV files, and further data analysis was performed in R (Ver. 1.1.456, R Core Team).
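The reason for summarizing each spot by its median pixel intensity can be shown with a minimal Python sketch. The pixel values are hypothetical, and subtracting a median local background is an assumption made for illustration; the text only states that background levels were extracted and does not specify how they were applied.

```python
from statistics import mean, median


def spot_signal(spot_pixels, background_pixels):
    """Median spot intensity minus median background intensity.
    The subtraction step is an illustrative assumption."""
    return median(spot_pixels) - median(background_pixels)


# Hypothetical pixel intensities; the last value mimics a bright artifact
# such as a dust particle inside the spot.
spot = [100, 102, 98, 101, 5000]
background = [10, 12, 11]

print("mean of spot pixels:  ", mean(spot))
print("median of spot pixels:", median(spot))
print("background-corrected: ", spot_signal(spot, background))
```

A single outlier pixel shifts the mean by an order of magnitude while leaving the median essentially unchanged, which is the effect the median-based readout is meant to suppress.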
Protein array quantitation data analysis. Raw intensities were normalized using a combination of quantile and intensity-based normalization 50 . Based on the normalized intensity levels, a z score, percent coefficient of variation (CV%), and Chebyshev inequality precision (CI-P) were calculated for each protein. An intraprotein CV% cutoff of < 15 was applied to ensure high reproducibility between the same protein spots (n = 4) on each microarray and to demonstrate an equal degree of citrullination. A CI-P cutoff of < 0.05 was applied to ensure that the identified intensities did not belong to the negative control distribution. We applied a z score with a cutoff of > 2 to discard low RFU intensities. Two-sample t tests with Benjamini-Hochberg FDR were performed to identify any statistically significant changes between the positive spots and the corresponding spots on the other slides. Finally, ratios were calculated for the proteins with statistically significant changes, and fold differences below 2 were discarded.
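The filtering steps above can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' R code: the normalization and the CI-P calculation are omitted because their details are not given here, p values are taken as inputs rather than computed by a t test, the use of the sample standard deviation is an assumption, and all function names are hypothetical. The cutoffs (CV% < 15, z score > 2, adjusted P < 0.05, fold difference > 2) follow the text and the caption of Fig. 2.

```python
from statistics import mean, stdev


def cv_percent(replicates):
    """Percent coefficient of variation of replicate spots (sample SD assumed)."""
    return 100.0 * stdev(replicates) / mean(replicates)


def z_scores(values):
    """z score of each value relative to all values on the same array."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_hit(native_reps, cit_reps, z, adj_p):
    """Apply the CV%, z score, FDR, and fold-difference cutoffs to one protein."""
    reproducible = cv_percent(native_reps) < 15 and cv_percent(cit_reps) < 15
    fold = mean(cit_reps) / mean(native_reps)
    return reproducible and z > 2 and adj_p < 0.05 and fold > 2


# Hypothetical quadruplicate intensities for one protein (not study data).
native = [100, 105, 95, 100]
citrullinated = [300, 310, 290, 300]
print(is_hit(native, citrullinated, z=2.5, adj_p=0.01))
```

Keeping each cutoff in its own small function makes it easy to see which criterion excludes a given protein, for example when comparing the broad list of 844 proteins with the shorter list obtained after the z score filter.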
Mass spectrometry sample preparation. Fibrinogen (Cayman Chemical) was incubated for 3 h at 37 °C in citrulline reaction buffer containing 1.2 µg/mL PAD2 or PAD4 (Cayman Chemicals), 1 mM DTT, 10 mM CaCl2, and 100 mM Tris-HCl to citrullinate fibrinogen. Digestion of fibrinogen was performed using filter-aided sample preparation 51 . Briefly, samples were transferred to Amicon Ultra 0.5 centrifugal filters (10 kDa; Merck Millipore, MA, USA) containing 0.5% sodium deoxycholate (SDC) in 50 mM triethylammonium bicarbonate (TEAB) buffer and centrifuged at 14,000g for 15 min. Next, the samples were reduced and alkylated by incubating in 10 mM tris(2-carboxyethyl)phosphine hydrochloride and 50 mM chloroacetamide for 30 min at 37 °C. Samples were washed in 0.5% SDC in 50 mM TEAB, and each wash was followed by centrifugation at 14,000g for 15 min at 20 °C. Next, samples were digested using 1 µg trypsin/100 µg sample protein in 0.5% SDC in TEAB and incubated overnight at 37 °C. Peptides were eluted by centrifugation at 14,000g for 15 min followed by the addition of 200 µL TEAB buffer and another centrifugation step. Next, the peptides were isolated by phase separation using ethyl acetate and acidified with trifluoroacetic acid. Phase separation was repeated two times, and the aqueous phase containing the peptides was recovered. All samples were dried down and stored at −20 °C until the time of analysis.
Ultra-performance liquid chromatography tandem mass spectrometry. Fibrinogen samples were rehydrated in 2% acetonitrile and 0.1% formic acid. Protein concentration was measured using a DeNovix spectrophotometer DS-11 FX+ (DeNovix, Wilmington, DE, USA), and 0.4 µg was loaded per sample. Peptides were separated by reverse-phase liquid chromatography on a UPLC system (Dionex RSLX, Thermo Scientific), ionized by a nanoelectrospray ion source (CaptiveSpray, Bruker Daltonics), and analyzed using a timsTOF PRO mass spectrometer (Bruker Daltonics, Bremen, Germany). Samples were injected directly onto a C18 reversed-phase column (IonOpticks, 25 cm × 75 µm ID, 1.6 µm C18) kept at 40 °C. The peptides were eluted with a constant flow rate of 400 nL/min using solvent A (0.1% formic acid in water) and solvent B (acetonitrile with 0.1% formic acid) with a total runtime of 60 min. The gradient was as follows: 0-16 min at 2% B, 16-45 min at 5% B, 45-48 min at 35% B, 48-52 min at 95% B, and 52-60 min at 2% B. Raw files were loaded into PEAKS (Bioinformatics Solution Inc, v. 10.5) and analyzed following the standard analysis pipeline with the addition of citrulline as a variable modification.
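The gradient program above can be written down as a small data structure and checked for gaps, as in the following Python sketch. The tuple layout and both helper names are hypothetical, and each segment is treated as a constant step at the stated percentage of solvent B, which is a simplifying assumption because the text does not say whether the values are held or ramped.

```python
# (start_min, end_min, percent_B) for each segment, as listed in the text.
GRADIENT = [(0, 16, 2), (16, 45, 5), (45, 48, 35), (48, 52, 95), (52, 60, 2)]


def gradient_is_contiguous(segments, total_min):
    """True if segments start at 0, end at total_min, and leave no gaps or overlaps."""
    if not segments or segments[0][0] != 0 or segments[-1][1] != total_min:
        return False
    return all(prev[1] == nxt[0] for prev, nxt in zip(segments, segments[1:]))


def percent_b_at(segments, minute):
    """Percent solvent B at a given time, assuming step-wise segments."""
    for start, end, percent_b in segments:
        if start <= minute < end:
            return percent_b
    raise ValueError(f"minute {minute} is outside the gradient program")


print(gradient_is_contiguous(GRADIENT, 60))
print(percent_b_at(GRADIENT, 50))
```

A check of this kind confirms that the listed segments account for the full 60 min runtime before the method is transferred to another instrument.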
Ethics approval and consent to participate. The use of patient samples was approved by the local ethics committee of the Institute of Rheumatology in Prague, Czech Republic, and written informed consent was obtained from all patients before initiation of the study (June 26, 2012, No. 3294/2012).

Data availability
The microarray raw data are available from the corresponding author on reasonable request. The mass spectrometry proteomics data generated during the current study are available from the ProteomeXchange Consortium via the partner repository with the dataset identifier PXD024955 52 .