Polymorphisms of RAD51B are associated with rheumatoid arthritis and erosion in rheumatoid arthritis patients

Rheumatoid arthritis (RA) is a common, chronic autoimmune disease affecting 0.5–1.0% of adults worldwide, including approximately 4.5–5.0 million patients in China. The genetic etiology and pathogenesis of RA have not yet been fully elucidated. Recently, one new RA susceptibility gene (RAD51B) has been identified in Korean and European populations. In this study, we designed a two-stage case-control study to further assess the relationship of common variants in the RAD51B gene with increased risk of RA in a total of 965 RA patients and 2,511 unrelated healthy controls of Han Chinese ancestry. We successfully identified a common variant, rs911263, as being significantly associated with the disease status of RA (P = 4.8 × 10−5, OR = 0.64). In addition, this SNP was shown to be related to erosion, a clinical assessment of disease severity in RA (P = 2.89 × 10−5, OR = 0.52). These findings shed light on the role of RAD51B in the onset and severity of RA. More research in the future is needed to clarify the underlying functional link between rs911263 and the disease.

Scientific Reports | 7:45876 | DOI: 10.1038/srep45876
recombinational repair (HRR) of DNA double-strand breaks (DSBs) to maintain cell genomic stability and is a promising candidate oncogene and biomarker for cancer diagnosis and prognosis 24,27-29 . Indeed, the absence of RAD51B may disrupt the formation of RAD51B nucleoprotein filaments, the initial stage of HRR, thereby resulting in DNA mutations, rearrangements and/or loss of chromosomes 27 . RAD51B has been shown to form a stable heterodimer with the family member RAD51C, which further interacts with other family members such as RAD51, XRCC2, and XRCC3. Because the underlying biological mechanisms of RA remain largely unknown, the effects of RAD51B on RA have not been clarified, despite the evidence of strong significant associations within Korean and European populations 8,13,14,30 .
Considering that the role of RAD51B in RA susceptibility in Han Chinese has not been assessed, we performed a two-stage case-control study to evaluate the transferability of discovered RA susceptibility loci in Han Chinese individuals to improve our current understanding of the role of the RAD51B gene in predisposition to RA. Moreover, there are no reports on the association between RAD51B and clinical manifestations of RA, such as the 28-joint disease activity score (DAS28) and clinical severity. The other aim of our study was to assess the role of potential associated variants in the clinical manifestations of RA, which may help in defining the primary set of risk alleles for RA susceptibility and provide clues to the mechanisms involved in the etiology and pathogenesis of RA.

Materials and Methods
Subjects. Two independent cohorts of RA patients and controls were included in this study. In the discovery stage, we recruited 402 RA patients (age 33-61 years) and 969 unrelated healthy controls (age 33-61 years) from Honghui Hospital and the First Affiliated Hospital of Xi'an Jiaotong University. In the replication stage, the 2,105 subjects comprised 563 RA patients (age 36-64 years) and 1,542 unrelated healthy controls (age 36-64 years) who were enrolled from the Orthopedic Hospital of Henan Province. All subjects included in the study were randomly chosen, genetically unrelated Han Chinese individuals without migration history within the previous three generations. All patients were diagnosed with RA according to the 2008 Classification Criteria of the American College of Rheumatology, and all healthy controls had no history of rheumatism or infectious or chronic inflammatory autoimmune diseases. This study was performed in accordance with the ethical guidelines of the Helsinki Declaration of 1975 (revised in 2008) and was approved by the Local Ethics Committee of Xi'an Honghui Hospital. Informed consent was obtained from subjects.
Clinical assessments. The history of all RA patients was recorded, particularly presenting symptoms, joints affected, extra-articular features and medications. When disease activity was assessed according to the DAS28, it was confirmed that the RA patients had not been treated with intra-articular corticosteroids, methotrexate (MTX) or biological agents. Joint erosion in RA patients was evaluated by X-rays of the hands and feet. We recorded only the presence or absence of erosion, without calculating a radiological score. Patients were assigned to two groups according to erosion (erosive RA and non-erosive RA). They were also grouped according to the presence or history of extra-articular features. Laboratory parameters were recorded, including rheumatoid factor (RF), anti-cyclic citrullinated peptide antibody (aCCP), anti-glucose phosphate isomerase (aGPI) and the erythrocyte sedimentation rate (ESR). In addition, we obtained a visual analogue scale (VAS) score from each RA patient. Demographic information was obtained from each subject at enrollment.
SNP selection and genotyping. As an initial screen of common SNPs in the Han Chinese population, we searched the 1000 Genomes CHB database for all SNPs in the RAD51B gene with a minor allele frequency (MAF) ≥ 0.05. Tag SNPs were then selected by pair-wise tagging with cutoff criteria of MAF ≥ 0.05 and r² ≥ 0.5, resulting in 62 tag SNPs covering the RAD51B region (Supplementary Table S1).
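The pair-wise tagging step can be illustrated with a minimal greedy sketch. The paper does not name the tagging tool or algorithm, so the greedy strategy, the function name `greedy_tag_snps`, and the toy MAF and r² values below are assumptions for illustration only; only the two cutoffs (MAF ≥ 0.05, r² ≥ 0.5) come from the text.

```python
def greedy_tag_snps(maf, r2, maf_min=0.05, r2_min=0.5):
    """Pick tag SNPs so every common SNP is a tag or has r2 >= r2_min with one.

    maf: dict mapping SNP id -> minor allele frequency
    r2:  dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2
    """
    # Keep only common SNPs; sorting makes tie-breaking deterministic.
    candidates = sorted(s for s, f in maf.items() if f >= maf_min)

    def tagged_by(snp):
        # A SNP tags itself and every candidate in sufficient LD with it.
        return {t for t in candidates
                if t == snp or r2.get(frozenset((snp, t)), 0.0) >= r2_min}

    untagged = set(candidates)
    tags = []
    while untagged:
        # Greedy choice: the SNP covering the most still-untagged SNPs.
        best = max(candidates, key=lambda s: len(tagged_by(s) & untagged))
        tags.append(best)
        untagged -= tagged_by(best)
    return tags


# Hypothetical toy data (not from the study).
toy_maf = {"snpA": 0.20, "snpB": 0.18, "snpC": 0.30, "snpD": 0.02}
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.9,
    frozenset(("snpA", "snpC")): 0.1,
    frozenset(("snpB", "snpC")): 0.2,
}
toy_tags = greedy_tag_snps(toy_maf, toy_r2)
```

In this toy input, `snpD` is dropped by the MAF filter and `snpB` is captured by `snpA`, so two tags suffice for three common SNPs.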
Peripheral venous blood samples were collected in plain tubes, and genomic DNA was isolated from peripheral blood leukocytes according to the manufacturer's protocol (Genomic DNA kit, Axygen Scientific, Inc., CA, USA). SNP genotyping was performed using the high-throughput Sequenom MassARRAY platform with iPLEX GOLD chemistry (Sequenom, San Diego, CA, USA) based on the manufacturer's protocols 31 . The results were processed using Sequenom Typer 4.0 software, and genotype data were generated from the samples 32 . For quality control, the case or control status of the samples was blinded during genotyping, and a random 5% of the samples were genotyped again, with a concordance of 100%.

Statistical analyses and power analyses.
Power Analyses. To estimate the statistical power of our study design, we implemented a comprehensive power analysis using Genetic Power Calculator 33 . The results of this power analysis are summarized in Supplemental Table S2. As shown, if the underlying risk allele of RA has an MAF of ~0.1 and OR greater than 1.3, our sample will achieve statistical power of > 0.8.
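As a rough cross-check of this kind of calculation, the sketch below approximates the power of a two-sided allelic test with a normal approximation. It is not the method used by Genetic Power Calculator; the function name `allelic_power`, the default significance level of 0.05, the multiplicative allelic model, and the use of the MAF as the control allele frequency are all assumptions made for illustration.

```python
from statistics import NormalDist


def allelic_power(n_cases, n_controls, maf, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    norm = NormalDist()
    p0 = maf  # assumed allele frequency in controls
    # Allele frequency in cases implied by the per-allele odds ratio.
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    # Each subject contributes two alleles.
    n1, n0 = 2 * n_cases, 2 * n_controls
    p_bar = (n1 * p1 + n0 * p0) / (n1 + n0)
    se_null = (p_bar * (1 - p_bar) * (1 / n1 + 1 / n0)) ** 0.5
    se_alt = (p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0) ** 0.5
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # The opposite rejection tail is ignored; it is negligible for real effects.
    return norm.cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


# Combined sample sizes from the study; MAF and OR from the scenario in the text.
power_combined = allelic_power(965, 2511, maf=0.1, odds_ratio=1.3)
```

Under these assumptions the combined sample exceeds a power of 0.8 for MAF 0.1 and OR 1.3 at a significance level of 0.05; a stricter threshold such as the Bonferroni-corrected one would lower the figure.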
Genetic Association Analyses. We conducted association analyses at two levels: the single-marker level and the haplotype level. For single-marker-level analyses, we conducted logistic regression for each SNP marker to evaluate their underlying effects on the onset of RA. The SNP markers were coded in three modes: additive, dominant and recessive. In each logistic model, sex and age were included as two covariates to remove potential confounding effects. We implemented a two-stage study design. In the discovery stage, we tested all 62 tag SNPs. In the validation stage, we only included those SNPs with nominal significance in the discovery stage (and SNPs strongly related with these SNPs). Bonferroni corrections were applied to genetic association analyses. The P value threshold in the discovery stage was 0.0008 (0.05/62). For haplotype-level analyses, linkage disequilibrium (LD) blocks were constructed for the 62 SNPs selected for the discovery stage. Because analyses of several SNPs are insufficient to draw a conclusion 34-36 , haplotype-based analyses were then conducted according to these LD blocks. In addition, we also performed haplotype-based analyses for all the SNPs selected for genotyping in validation stage.
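The three genotype coding modes and the Bonferroni threshold described above can be sketched as follows. The function names are hypothetical, and coding genotypes by the count of minor alleles is an assumption; the paper states only that additive, dominant and recessive modes were used and that the discovery threshold was 0.05/62.

```python
def code_genotype(minor_allele_count, model):
    """Encode a genotype (0, 1 or 2 copies of the minor allele) for regression."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        return minor_allele_count            # per-allele effect
    if model == "dominant":
        return int(minor_allele_count >= 1)  # one copy is enough
    if model == "recessive":
        return int(minor_allele_count == 2)  # two copies required
    raise ValueError(f"unknown model: {model}")


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Discovery stage: 62 tag SNPs tested at a family-wise level of 0.05.
discovery_threshold = bonferroni_threshold(0.05, 62)
```

The encoded value would enter each logistic model as the SNP predictor alongside the sex and age covariates.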
In addition to genetic association analyses targeting the disease status of RA, we also conducted association analyses between significant SNPs (with RA status) and four RA-related indicators or phenotypes: VAS, DAS28, erosions and extra-articular involvement. DAS28 is an important clinical indicator measuring the disease activity of RA. A DAS28 score greater than 5.1 is considered to be indicative of high disease activity, between 3.2 and 5.1 of moderate disease activity and less than 3.2 of low disease activity. Erosions and extra-articular manifestations are two clinical assessments of the severity of RA. RA with erosions and extra-articular manifestations was considered to be the severe type. Only RA patients (n = 965) were included in these analyses. For quantitative traits (DAS28), linear regression was implemented; for qualitative traits such as erosions and extra-articular manifestations, logistic models were fitted. Age and sex were also included as covariates in the model fitting. All these genetic association analyses were implemented in Plink 37 . The regional association plot was generated using LocusZoom 38 .
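The DAS28 activity bands stated above translate directly into a small classifier. This is a sketch only: the function name is hypothetical, and treating scores of exactly 3.2 and 5.1 as moderate is an assumption, since the text does not say how boundary values are handled.

```python
def das28_activity(score):
    """Map a DAS28 score to the disease activity band described in the text."""
    if score > 5.1:
        return "high"
    if score < 3.2:
        return "low"
    # Assumption: the boundary values 3.2 and 5.1 fall in the moderate band.
    return "moderate"


# Hypothetical scores, not patient data.
example_bands = [das28_activity(s) for s in (2.8, 4.0, 5.6)]
```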
Bioinformatic Analyses and Data Mining. The web-based population genetics software SNAP 39 was utilized to identify SNPs that were not genotyped in this study but in strong LD in the Chinese population with significant SNPs. Data from the Chinese population in the 1000 genomes project were used as the reference in this analysis. To predict the potential functional significance of SNPs (especially for intronic/synonymous SNPs), we utilized RegulomeDB, a database that annotates SNPs with known and predicted regulatory elements in intergenic regions of the Homo sapiens genome 40 . In addition, STRING 41 , a functional protein-protein interaction network database, was utilized to investigate the network neighbors of our candidate gene RAD51B.

Results
Characteristics of the subjects. A total of 965 RA patients and 2,511 healthy controls were included in our two-stage case-control study. In each stage, the RA patients and healthy controls were matched by mean age, and there were no significant differences in gender distribution between the cases and controls (Table 1). The demographic and clinical data of the RA patients are presented in Table 1.
Genetic association analyses with RA status. Three SNPs (rs911263, rs2525504, rs17756404) were nominally significant in the discovery stage. These SNPs and 4 other SNPs that are strongly correlated with them were genotyped and analyzed in the validation stage (Table 2), and only one, rs911263, was successfully validated (P = 4.8 × 10−5). The C allele of this SNP showed a strong protective effect on RA (OR = 0.64). The association results of the discovery stage based on 62 SNPs are shown in Fig. 1. As shown in this regional association plot, most of the SNPs showed little correlation with rs911263. The complete results of single-marker-based analyses in the discovery stage are summarized in Supplemental Table S3. Two 2-SNP LD blocks were constructed for haplotype analyses based on the discovery dataset, but they were not associated with RA status in our sample (Supplemental Table S4). In addition, haplotype-based analyses using combined data for all 7 SNPs are summarized in Supplemental Table S5 and indicated an association pattern similar to that of the single-marker-based analysis.
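To make the interpretation of an OR below 1 concrete, the sketch below computes an unadjusted allelic odds ratio with a Woolf (log-scale) confidence interval from a 2 × 2 table of allele counts. This is not the study's method: the reported ORs come from logistic regression adjusted for sex and age. The function name and the allele counts are hypothetical.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Unadjusted odds ratio for the alternative allele with a Woolf interval."""
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / case_alt + 1 / case_ref
                       + 1 / control_alt + 1 / control_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from the study.
or_est, ci_low, ci_high = allelic_odds_ratio(100, 300, 200, 400)
```

An estimate below 1 with an interval that excludes 1 means the allele is less frequent among cases than controls, which is the sense in which the C allele is described as protective.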
Genetic association analyses with disease severity and activity indicators. We implemented association analyses between the significant SNP rs911263 and four disease activity- and severity-related clinical assessments of RA. Our results indicated that one, erosion, was significantly associated with this SNP (P = 2.89 × 10−5). Our findings showed that the C allele of rs911263 is associated with a lower incidence rate of erosion in RA patients (OR = 0.52). The complete results of these association analyses can be found in Table 3.
Bioinformatic analyses. Using SNAP and 1000 genomes project data, we identified 3 ungenotyped SNPs, rs3784099, rs7148416 and rs10129646, as being in strong LD in the Asian population with our significant SNP rs911263. None of these 3 SNPs or rs911263 is exonic. Their potential functional significance was evaluated using RegulomeDB, which has a systematic scoring system in which an SNP is assigned a score ranging from 1 to 6: the lower the score, the more functional significance the SNP might have. A score of 4 was found for our targeted SNP rs911263, and the other three SNPs, rs3784099, rs7148416 and rs10129646, had scores of 3, 6 and 4, respectively. All four SNPs showed moderate functional significance. We also investigated protein-protein interaction network neighbors of our candidate gene RAD51B and identified 10 other genes with strong interactions with RAD51B. Among them, RAD51D, RAD51C, RAD52, RAD54L and RAD54B belong to the RAD gene family (Fig. 2).

Discussion
Multiple previous studies have indicated a connection between RAD51B and RA. In a meta-analysis conducted by McAllister et al. 23 , rs911263 in RAD51B was identified as being significantly associated with RA susceptibility. The odds ratio for the protective C allele (or the G allele if using a different reference) in that study was approximately 0.8; however, our findings indicate a stronger protective effect, with odds ratios of approximately 0.5-0.6. The difference between our study and that of McAllister et al. can be explained by the difference in genetic background of the study subjects. The meta-analysis was based on a sample of Europeans, whereas our study was based on the Chinese Han population. In addition, a study based on Korean and European populations also identified a significant association between rs911263 and RA susceptibility 30 . Compared to these previous studies, one advantage of our study is that we not only checked the association between rs911263 and RA status but also investigated the potential link between this SNP and disease activity and severity in RA patients. Our findings regarding the connection between rs911263 and erosions indicated that the C allele of rs911263 is associated with a lower incidence rate of erosion in RA patients. To our knowledge, this finding has not been reported before. Further replication of our results in other populations is needed.
Table 2. Summarized results of the association analyses on RA status for SNPs included in the validation stage. Significant SNPs are highlighted in bold. The results shown are for SNPs coded under the additive model.
RAD51B is an important member of the RAD51 protein family, whose members are evolutionarily conserved proteins essential for DNA repair via homologous recombination.
The function of this gene is rather fundamental in human metabolism, which may partly explain why the RAD51 family is evolutionarily conserved: any mutations with high functional significance might be lethal. The SNP rs911263 has been identified as significantly associated with RA susceptibility in multiple previous studies, and this was validated in our large sample based on the Han Chinese population. Therefore, the chance that this is merely a false positive signal due to confounding factors is very low. The next question to address is how this SNP affects RA susceptibility. Three hypotheses can be invoked. The first is that rs911263 has direct functional significance and thus could have a direct effect on the transcription or translation of RAD51B. However, our investigation using RegulomeDB for this SNP does not support this: rs911263 showed only moderate functional significance with a score of 4. Another hypothesis is that rs911263 is simply a surrogate for some underlying common SNPs not genotyped in our study. Again, scrutinizing the potential functional significance of the three other common SNPs in strong LD with rs911263 (rs3784099, rs7148416 and rs10129646) tends to negate this hypothesis. Our findings showed that, similar to rs911263, these three common SNPs had only moderate functional significance (RegulomeDB scores ranging from 3 to 6). The last hypothesis is that SNP rs911263 is a surrogate for a combination of multiple rare variants. However, due to the limitations of our study, it is difficult to validate this hypothesis. More studies, especially those employing sequencing technology, which can provide information for both rare and common variants, should be conducted to examine the direct link between the association signal and functional effects of this SNP on RA onset.
Table 3. Summarized results of association analyses between seven SNPs and four clinical assessments of RA in the combined sample of RA patients from the discovery and validation stages. The four clinical assessments, VAS, DAS28, extra-articular manifestations and erosion, are abbreviated as VAS, DAS28, EA and ERO, respectively. Significant results are indicated in bold.
Despite the advantages of our study described above, there are also several limitations. First, population stratification is one of the most important confounding factors for most population-based genetics studies. In GWASs, this confounding factor can be adjusted for by principal component analysis (PCA), which requires thousands or even tens of thousands of markers. Due to funding limitations, it was impossible for us to conduct PCA to adjust for population stratification. Instead, we implemented certain criteria to confine the genetic background of our study subjects during the sample recruitment stage to avoid potential population stratification 42,43 . Another limitation is that we only evaluated one gene: RAD51B. However, human metabolism and disease onset are complex processes that might involve multiple related genes and several functionally related pathways 44 . Thus, it may be necessary for researchers to thoroughly investigate the entire RAD51 gene family and RAD51B network neighbor genes in the future.

In summary, we investigated the potential association between common polymorphisms in RAD51B and RA susceptibility in the Chinese Han population. We successfully identified an intronic SNP, rs911263, as being significantly associated with the disease status of RA in our study subjects. Furthermore, we investigated the potential connection between this SNP and certain disease activity and severity indicators of RA. Our results indicated that SNP rs911263 is significantly associated with erosions occurring in RA patients. Despite these statistical findings, more research in the future is needed to clarify the underlying functional link between rs911263 and RA.