Association between AIRE gene polymorphism and rheumatoid arthritis: a systematic review and meta-analysis of case-control studies

Autoimmune regulator (AIRE) is a transcription factor that has recently attracted attention in immunological research. In the thymus, it has a pivotal role in the negative selection of naive T-cells during central tolerance. Experimental studies have shown that single nucleotide polymorphisms (SNPs) alter transcription of the AIRE gene. SNPs may thereby lead to less efficient negative selection, promote survival of autoimmune T-cells, and elevate susceptibility to autoimmune diseases. To date, only rheumatoid arthritis (RA) has been analysed by epidemiological investigations in relation to SNPs in AIRE. In our meta-analysis, we sought to encompass case-control studies and confirm the association between SNP occurrence and RA. After robust searches of Embase, PubMed, Cochrane Library, and Web of Science databases, we found 19 articles that included five independent studies. Out of 11 polymorphisms, two (rs2075876, rs760426) were each examined in at least four of the five case-control studies. Thus, we performed a meta-analysis for rs2075876 (7145 cases and 8579 controls) and rs760426 (6696 cases and 8164 controls). Our results show that rs2075876 and rs760426 are significantly associated with an increased risk of RA in allelic, dominant, recessive, codominant heterozygous, and codominant homozygous genetic models. These findings are primarily based on data from Asian populations.

encodes a 545 amino acid protein of 58 kDa from 14 exons [7][8][9] . The AIRE protein is a transcription factor that is indispensable for the negative selection of immature T-cells (thymocytes). Cooperating with DNA-binding proteins, AIRE controls the promiscuous expression of peripheral tissue antigens (PTAs). Mutations in the protein-coding sequence of AIRE result in the development of autoimmune polyendocrinopathy candidiasis-ectodermal dystrophy, an autoimmune deterioration of numerous organs [10][11][12] .
To date, increasing numbers of publications have suggested that SNPs in the gene sequence affect AIRE transcription. The SNPs thereby alter the functional activity of AIRE and potentially elevate disease susceptibility 7 . A recent experimental study described two distinct SNPs of AIRE: AIRE−230Y and AIRE−655G. The AIRE−230T haplotype transcriptionally modifies AIRE expression and influences negative selection, elevating the risk of autoimmunity 13 . Various SNPs in the AIRE genetic sequence have garnered attention; however, to date, only a minority of case-control studies have observed an association between gene polymorphism and susceptibility to diseases, including vitiligo 7,14 , alopecia areata 7,15 , melanoma 7,16 , systemic sclerosis 7,17 and RA 7,[18][19][20][21][22] . Among these diseases, only RA has been analysed by multiple case-control studies and, therefore, seems best suited to analysing positive or negative associations 7 . Xu et al. have reported that AIRE polymorphism was associated with an increased risk of RA 23 . Here, we present a systematic review and the first meta-analysis that includes case-control studies to verify the association of SNPs rs2075876 and rs760426 in the AIRE gene with RA.

Results
Characteristics of included studies. We identified 19 publications after a thorough search of Embase, PubMed, Cochrane Library, and Web of Science databases. After removing duplicates, we reviewed the remaining 11 studies for eligibility and selected five publications for inclusion in our meta-analysis. The PRISMA flow chart of the search process is shown in Fig. 1. Asian and Caucasian ethnicities were involved. Diagnosis of RA was determined according to the 1987 American College of Rheumatology classification criteria 24 . The overall mean age of RA patients was 54.1 ± 2.4 years, and the percentage of female cases was 73.34%. Genotyping was conducted by microarrays, single base extension methods (SNaPshot), and TaqMan SNP Genotyping Assays. By further reviewing the five eligible publications, we identified 11 SNPs of the AIRE gene (rs2075876, rs760426, rs1800250, rs2776377, rs878081, rs1055311, rs933150, rs1003854, rs2256817, rs374696, rs1078480). Only rs2075876 and rs760426 were examined in four or more studies; therefore, we performed meta-analysis for rs2075876 (7145 cases and 8579 controls) and rs760426 (6696 cases and 8164 controls). All genotype frequencies of the controls were in Hardy-Weinberg Equilibrium (HWE). Characteristics of the included studies on rs2075876 and rs760426 are summarized in Table 1.

Meta-analysis of SNP rs2075876 (G > A).
Five studies were identified that investigated the association between SNP rs2075876 and RA susceptibility [18][19][20][21][22] . Most of the publications doubled the number of individuals to account for alleles; thus, to normalize the data, we also calculated with duplicated values (see Supplementary Table S1). The GWAS by Terao et al. 18 served as three independent case-control studies (denoted with A, B, C). With the exception of García-Lozano et al. 19 , all of the studies described the genotype distribution for GG, AG, AA. Therefore, we calculated odds ratios (ORs) for genetic models where there was no available or feasible data in the given study (Table 2). Results for each genetic model are shown in Fig. 2. For the allelic model (A vs. G, Fig. 2A  Results of heterogeneity analysis for each genetic model are shown in Supplementary Table S2. For the allelic model P_h = 0.439 and I² = 0%, for the dominant model P_h = 0.011 and I² = 66.2%, for the recessive model P_h = 0.005 and I² = 69.9%, for the codominant heterozygous model P_h = 0.004 and I² = 70.5%, and for the codominant homozygous model P_h = 0.012 and I² = 65.4%. Moderate heterogeneity was found in the dominant, recessive, codominant heterozygous, and codominant homozygous models. Only four out of 31 ORs were not statistically significant, and the ORs revealed that SNP rs2075876 (G > A) is associated with an elevated risk of RA. These results therefore suggest a link between AIRE SNP rs2075876 (G > A) and RA susceptibility.
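To illustrate how ORs for the five genetic models can be derived from genotype counts, the following Python sketch builds the corresponding 2×2 tables and computes each OR with a 95% CI. The function names, dictionary keys, and genotype counts are hypothetical, and the CI uses the standard Woolf (log-based) method as an assumption; this is not the procedure of the original software.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """OR with a Woolf (log-based) 95% CI for a 2x2 table.

    a, b: cases with / without the exposure category
    c, d: controls with / without the exposure category
    All four counts must be non-zero.
    """
    or_value = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(or_value) - z * se_log)
    high = math.exp(math.log(or_value) + z * se_log)
    return or_value, low, high

def genetic_model_ors(cases, controls):
    """Return {model: (OR, CI low, CI high)} from genotype counts.

    Each argument is a dict with the hypothetical keys 'ref_hom', 'het'
    and 'risk_hom' (for rs2075876 these would be GG, AG and AA).
    """
    def split(g):
        r, h, v = g["ref_hom"], g["het"], g["risk_hom"]
        # (exposed, unexposed) counts for each genetic model
        return {
            "allelic": (2 * v + h, 2 * r + h),        # risk vs. reference alleles
            "dominant": (v + h, r),                   # carriers vs. ref homozygotes
            "recessive": (v, h + r),                  # risk homozygotes vs. the rest
            "codominant_heterozygous": (h, r),        # heterozygotes vs. ref homozygotes
            "codominant_homozygous": (v, r),          # risk vs. ref homozygotes
        }

    case_split, control_split = split(cases), split(controls)
    return {m: odds_ratio(*case_split[m], *control_split[m]) for m in case_split}

# Hypothetical genotype counts, for illustration only
example_cases = {"ref_hom": 100, "het": 80, "risk_hom": 20}
example_controls = {"ref_hom": 120, "het": 70, "risk_hom": 10}
example_ors = genetic_model_ors(example_cases, example_controls)
```

Note that the allelic model counts alleles rather than individuals, which is why each individual contributes two observations there.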

Meta-analysis of SNP rs760426 (A > G).
Four studies investigated the association between SNP rs760426 and RA susceptibility 18,[20][21][22] . Most of the publications doubled the number of individuals; thus, to normalize the data, we also calculated with duplicated values, as was done for SNP rs2075876 (see Supplementary Table S1). Again, the GWAS by Terao et al. 18 served as three independent case-control studies (denoted with A, B, C). With the exception of Feng et al. 21 , all studies described the genotype distribution for AA, GA, GG. We therefore calculated ORs for all the genetic models that were not published in the original articles (Table 2). Furthermore, we excluded the OR, 95% CI and p-value of Feng et al. 21     Only four out of 26 ORs were not statistically significant, and the ORs showed that SNP rs760426 (A > G) is associated with an elevated risk. These results therefore suggest a link between AIRE SNP rs760426 (A > G) and RA susceptibility.

Sensitivity analysis.
To detect the influence of each case-control study on the whole meta-analysis, we performed sensitivity analysis by omitting one study at a time. Heterogeneity was not found in SNP rs2075876 or rs760426 by investigating allelic (Fig. 4), dominant, recessive, codominant heterozygous, and codominant homozygous genetic models (see Supplementary Figs S3 and S4).
Publication bias. Bias analysis was performed by generating funnel plots for each polymorphism of the allelic (Fig. 5), dominant, recessive, codominant heterozygous, and codominant homozygous genetic models (see Supplementary Figs S5 and S6). After analysis, all funnel plots were perfectly symmetric, and no publication bias was detected for SNP rs2075876 or rs760426.
Trial sequential analysis. We performed a TSA for the allelic models (Fig. 6) of SNPs rs2075876 (G > A) and rs760426 (A > G). Results of allelic models for both polymorphisms showed that the cumulative z-curve (blue line) crossed the TSA monitoring boundary and that the required sample size was reached. Therefore, we observed robust evidence for the association between SNPs rs2075876 (G > A) and rs760426 (A > G) and RA risk. These results suggest that no further studies are necessary to confirm the association.
Table 2. ORs, 95% CIs, and P-values for each genetic model in the association of SNPs rs2075876 (G > A) and rs760426 (A > G) with RA risk (NA = not available; OR = odds ratio; CI = confidence interval; * literature data).

Discussion
RA is a multifactorial disorder in which genetic and environmental events equally contribute to disease onset 1,25 . The latest GWAS meta-analysis discovered and screened 42 novel RA risk SNPs at a genome level from 98 candidate biological RA risk genes 6 . The detected risk genes, including AIRE, are mainly in the category of primary immunodeficiency (PID), HIV, and immune dysregulation. With the exception of AIRE, none of the other associated proteins have been directly related to central tolerance 6 . Self-tolerance involving negative selection, the machinery of which is directed by AIRE, is a central immuno-physiological process required to create a normal adaptive immune system. We believe that polymorphisms in this indispensable gene lower the protein expression of AIRE, decrease the presentation of self-antigens, reduce negative selection, and contribute to the escape and survival of autoimmune T-cells. On reaching the periphery, mature autoimmune T-cells promote autoantibody production and contribute to numerous immune disorders, including RA. In support of this belief, Lovewell et al. 13 have concluded, through a gene reporter assay, that specific haplotypes (AIRE−655G AIRE−230T) can dramatically reduce AIRE transcription. Furthermore, with in vitro and in vivo experiments, Kont et al. 26 have demonstrated that the presentation of PTAs by medullary thymic epithelial cells (mTECs) is quantitatively affected by these reductions in AIRE expression.
An in silico investigation by Terao et al. 18 , which analysed the expression profile of 210 lymphoblastoid cells in the Gene Expression Omnibus (GEO) database, has demonstrated a statistically significant (p < 0.001) correlation between the rs2075876 risk allele (A) and decreased AIRE transcription 27 . No association was found in GEO between rs760426 (G) and AIRE expression. Additionally, García-Lozano et al. 19 found statistically significant decreases in the expression levels of rs878081 C allele by analysing GEO database. This SNP is located in the Exon 5 region of AIRE; however, rs2075876 (G > A) is located in Intron 5 and rs760426 (A > G) in Intron 12 7 . The latter SNPs may affect the transcription of AIRE by modifying alternative splicing or intron-mediated enhancement 28 . The reduction in transcription, in turn, provides lower amounts of PTAs ectopically on the major histocompatibility complex/human leukocyte antigen of mTECs, which thereby contributes to the failure of negative selection in the thymus and increases the survival of autoimmune T-cells. In individuals who carry these SNPs, this sequence increases RA susceptibility. By analysing allelic, dominant, recessive, codominant heterozygous, and codominant homozygous models, we demonstrated that the SNPs rs2075876 (G > A) and rs760426 (A > G) occur more frequently in RA patients than in controls.
There are some limitations in our meta-analysis. We cannot extrapolate the findings for rs2075876 (G > A) and rs760426 (A > G) to Caucasians due to the limited number of studies. Based on García-Lozano et al. 19 , the results are not statistically significant; however, the rs878081 C allele seemed to occur more frequently in RA patients. Furthermore, considering the GWAS of Terao et al. 18 , the association of AIRE with RA among Caucasians was not supported. The number of included studies also limited our meta-analysis; however, Terao et al. 18 provided three case-control studies in one publication, which increased the number of included epidemiological studies. In the future, further European case-control studies, GWAS, and stratified subgroup analyses (age, smoking) are needed in order to better elucidate the association between RA and AIRE polymorphism. To our knowledge, this is the first time that the association between SNPs rs2075876 (G > A) and rs760426 (A > G) and RA susceptibility has been statistically estimated in one meta-analysis. We used multiple genetic models for each polymorphism, sensitivity analyses, and TSA to confirm the robustness of the association. In conclusion, our meta-analysis confirmed with each genetic model that the presence of SNPs rs2075876 (G > A) and rs760426 (A > G) is significantly associated with an increased risk of RA.

Methods
Search strategy. We searched for related literature in the PubMed, Embase, Cochrane Library, and Web of Science databases in accordance with the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement 29 . The search was completed on 16 May 2017. Two independent investigators searched with the keywords "autoimmune regulator", "AIRE", "polymorphism", and "rheumatoid arthritis". All studies were published between April 2011 and June 2016.

Inclusion and Exclusion criteria.
In order for studies to be included, publications had to demonstrate that (1) the study focused on the association between SNPs or haplotypes within the AIRE gene and RA susceptibility, (2) the study was case-control-designed, (3) all RA patients met the American College of Rheumatology classification and diagnostic criteria, and (4) detailed genotype data and feasible ORs, 95% CIs, and p-values were available. Publications were excluded if (1) a previous study was duplicated or (2) the given polymorphism was not found in at least four studies. Review articles were also excluded. Inclusion and exclusion criteria were independently screened by two investigators.
Statistical analysis. HWE was calculated by the chi-squared test for each study in the control groups.
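The HWE check in the control groups can be sketched as follows, under stated assumptions. This minimal Python example compares observed genotype counts with those expected from the estimated allele frequency, using a chi-squared test with one degree of freedom. The function name and the example counts are hypothetical, and the p-value uses the identity that the upper tail of a one-degree-of-freedom chi-squared distribution equals erfc(sqrt(x/2)).

```python
import math

def hwe_chi_squared(n_hom1, n_het, n_hom2):
    """Chi-squared test (1 df) for Hardy-Weinberg equilibrium.

    n_hom1, n_het, n_hom2: observed genotype counts (e.g. GG, AG, AA).
    Returns (chi-squared statistic, p-value).
    """
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)   # frequency of the first allele
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("HWE cannot be tested for a monomorphic locus")
    observed = [n_hom1, n_het, n_hom2]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of chi-squared with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical control group that is exactly in HWE (p = q = 0.5)
stat_ok, p_ok = hwe_chi_squared(25, 50, 25)
# Hypothetical control group with no heterozygotes at all
stat_bad, p_bad = hwe_chi_squared(50, 0, 50)
```

A control group would typically be treated as consistent with HWE when the p-value exceeds the chosen threshold (0.05 is a common convention, assumed here and not stated in the text).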
Pooled ORs and 95% CIs were calculated to examine the strength of the association between rs2075876 and rs760426 polymorphisms and RA. We used the random-effects model of DerSimonian and Laird 30 because of the different ethnicities of the included populations. Heterogeneity between studies was tested with two methods. First, we employed Cochran's Q homogeneity test, in which the Q statistic is compared with the upper-tail critical value of the chi-square distribution on k-1 degrees of freedom (k being the number of studies); a p-value of less than 0.10 was considered suggestive of significant heterogeneity. Second, we used the inconsistency (I²) index. I² is the proportion of total variation contributed by between-study variability. I² values of 25, 50 and 75% correspond to low, moderate, and high degrees of heterogeneity, respectively, based on the Cochrane Handbook 31 . Sensitivity analyses were performed to identify the influence of each study on the pooled ORs and 95% CIs. Publication bias was examined by visual inspection of funnel plots where the standard error was plotted against the log odds ratio. Meta-analytic calculations were performed with Comprehensive Meta-Analysis software Version 3 (Biostat, Inc., Englewood, NJ, USA).
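The pooling and heterogeneity statistics described above can be sketched in a few lines of Python. This is a minimal illustration of the DerSimonian-Laird estimator, Cochran's Q, the I² index, and a leave-one-out sensitivity analysis, assuming each study is summarized by a log OR and its standard error. All function names and input values are hypothetical, and the sketch does not reproduce the commercial software used for the actual calculations.

```python
import math

def dersimonian_laird(log_ors, ses, z=1.96):
    """Random-effects pooling of log ORs (DerSimonian-Laird)."""
    log_ors, ses = list(log_ors), list(ses)
    k = len(log_ors)
    w = [1 / se ** 2 for se in ses]                      # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))  # Cochran's Q
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I² in percent
    w_re = [1 / (se ** 2 + tau2) for se in ses]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    # The heterogeneity p-value would be the upper tail of a chi-square
    # distribution with df degrees of freedom at q (for example via
    # scipy.stats.chi2.sf(q, df)); it is omitted to stay stdlib-only.
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se_pooled),
        "ci_high": math.exp(pooled + z * se_pooled),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }

def leave_one_out(log_ors, ses):
    """Re-pool the data k times, omitting one study each time."""
    log_ors, ses = list(log_ors), list(ses)
    return [
        dersimonian_laird(log_ors[:i] + log_ors[i + 1:], ses[:i] + ses[i + 1:])
        for i in range(len(log_ors))
    ]

# Hypothetical inputs: three identical studies, then two discordant ones
homogeneous = dersimonian_laird([math.log(1.2)] * 3, [0.1] * 3)
heterogeneous = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
sensitivity = leave_one_out([math.log(1.2)] * 3, [0.1] * 3)
```

When the studies agree, the between-study variance is estimated as zero and the result coincides with a fixed-effect analysis. When they disagree, the random-effects weights become more equal and the CI widens.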
Trial sequential analysis (TSA). Meta-analyses may be prone to type I errors owing to an increased risk of random error when sparse data are analysed, combined with repeated testing on accumulating data. To avoid this problem and to increase the robustness of conclusions, we used trial sequential analysis (TSA) [32][33][34] . TSA combines an estimation of the required sample size with an adjusted threshold for statistical significance. The relationship between the cumulative z-curve and the trial sequential monitoring boundary indicates how conclusive the meta-analysis is. If the cumulative z-curve crosses the trial sequential monitoring boundary, and the cumulative sample size of the meta-analysis reaches the required sample size, firm evidence can be observed. When the cumulative z-curve crosses the boundaries, but the sample size does not reach the required information size, a sufficient level of evidence for the anticipated intervention effect may have been reached and no further trials are needed. If the z-curve does not cross any of the boundaries and the required sample size has not been reached, evidence to reach a conclusion is insufficient 35 . For calculation of the information size, we used a heterogeneity-adjusted assumption with a relative risk reduction of 10%, an overall type I error of 5%, and a type II error of 10% for both polymorphisms. The adjusted CIs for rs2075876 and rs760426 are 1.13-1.31 and 1.11-1.26, respectively. For calculations we used the Trial Sequential Analysis software tool from Copenhagen Trial Unit, Center for Clinical Intervention Research, Denmark (version 0.9 beta, www.ctu.dk/tsa).
All data generated or analysed during this study are included in this published article and its Supplementary information file.