Introduction

Rheumatoid arthritis (RA) is a common autoimmune disease associated with chronic synovial inflammation. The resultant symmetrical polyarticular arthritis, combined with extra-articular complications, leads to functional impairment. In developed countries, disease prevalence is 0.5–1% of the adult population, and the annual incidence has been reported to be 5–50 per 100,0001. Although the aetiology of RA has not been completely elucidated, numerous publications agree that autoimmune T-cells may escape from the adaptive immune system and, by migrating to the synovium, initiate disease development2,3,4,5. RA susceptibility is determined by multiple environmental and genetic factors, including several risk alleles. The latest trans-ethnic genome wide association study (GWAS), which involved 29,980 RA cases and 73,578 controls, completely screened novel polymorphisms in genes contributing to the disease6. One of the associated genes that seems to play a pivotal role in controlling autoimmunity is autoimmune regulator (AIRE). The gene is located in the 21q22.3 region, is 12.5 kb long, and encodes a 545 amino acid protein of 58 kDa by 14 exonial sequences7,8,9. The AIRE protein is a transcription factor that is indispensable with regards to the negative selection of immature T-cells (thymocytes). Cooperating with DNA-binding proteins, AIRE controls the promiscuous expression of peripheral tissue antigens (PTA). Mutations in the protein coding gene sequence of AIRE results in the development of autoimmune polyendocrinopathy candidiasis-ectodermal dystrophy, an autoimmune deterioration of numerous organs10,11,12.

To date, increasing numbers of publications have suggested that SNPs in the gene sequence affect AIRE transcription. The SNPs thereby alter the functional activity of AIRE and potentially elevate disease susceptibility7. A recent experimental study described two distinct SNPs of AIRE. AIRE−230Y, and AIRE−655G. AIRE−230T haplotype transcriptionally modifies AIRE expression and influence negative selection, elevating the risk of autoimmunity13. Various SNPs in the AIRE genetic sequence have garnered attention; however, to date, only a minority of case-control studies have observed an association between gene polymorphism and susceptibility to diseases, including vitiligo7,14, alopecia areata7,15, melanoma7,16, systemic sclerosis7,17 and RA7,18,19,20,21,22. Among the latter diseases, only RA has been analysed by multiple case-control studies and, therefore, seems to be optimal to analyse positive or negative associations7. Xu et al. have published that AIRE polymorphism was associated with the increased risk of RA23. Here, we present a systematic review and first meta-analysis that includes case-control studies to verify the association of SNPs rs2075876 and rs760426 in the AIRE gene with RA.

Results

Characteristics of included studies

We identified 19 publications after a thorough search of Embase, PubMed, Cochrane Library, and Web of Science databases. After removing duplicates, we reviewed the remaining 11 studies for eligibility and selected five publications for inclusion in our meta-analysis. Our PRISMA flow chart of the searching process is shown in Fig. 1. Asian and Caucasian ethnicities were involved. Diagnosis of RA was determined according to the American College of Rheumatology classification criteria in 198724. The overall mean age of RA patients was 54.1 ± 2.4 years, and the percentage of female cases was 73.34%. Genotyping was conducted by microarrays, single base extension methods (SNaPshot), and Taqman SNP Genotyping Assays. By further reviewing the five eligible publications, we identified 11 SNPs of the AIRE gene (rs2075876, rs760426, rs1800250, rs2776377, rs878081, rs1055311, rs933150, rs1003854, rs2256817, rs374696, rs1078480). Only rs2075876 and rs760426 were involved in four or more studies; therefore, we performed meta-analysis for rs2075876 (7145 cases and 8579 controls) and rs760426 (6696 cases and 8164 controls). All genotype frequencies of the controls were in Hardy-Weinberg Equilibrium (HWE). Characteristics of the included studies on rs2075876 and rs760426 are summarized in Table 1.

Figure 1
figure 1

PRISMA flow diagram for inclusion and exclusion of studies in the meta-analysis.

Table 1 Characteristics of the included studies on SNP rs2075876 (G > A) and rs760426 (A > G) (SNP = single nucleotide polymorphism; NA = not available; HB = hospital based).

Meta-analysis of SNP rs2075876 (G > A)

Five studies were identified that investigated the association between SNP rs2075876 and RA susceptibility18,19,20,21,22. Most of the publications doubled the individual number to account for alleles; thus, to normalize the data, we also calculated with duplicated values (see Supplementary Table S1). GWAS by Terao et al.18 served as three independent case-control studies (denoted with A, B, C). With the exception of García-Lozano et al.19, all of the studies described the genotype distribution for GG, AG, AA. Therefore, we calculated odds ratios (ORs) for genetic models where there was no available or feasible data in the given study (Table 2). Results for each genetic model are shown in Fig. 2. For the allelic model (A vs. G, Fig. 2A) the ORs were 1.21 (95% CI 1.09–1.36, P < 0.001), 1.18 (1.07–1.30, P = 0.001), 1.15 (1.06–1.24, P < 0.001), 1.02 (0.42–2.42, P = 0.964), 1.32 (1.04–1.69, P = 0.021), 1.30 (1.12–1.50, P < 0.00), and 1.41 (1.16–1.70, P < 0.001). For the dominant model (AG + AA vs. GG, Fig. 2B) the ORs were 1.18, (95% CI 1.06–1.32, P = 0.002), 1.31 (1.19–1.45, P < 0.001), 1.18 (1.09–1.27, P < 0.001), 1.41 (1.08–1.84, P = 0.010), 1.55 (1.32–1.82, P < 0.001), and 1.48 (1.22–1.78, P < 0.001). For the recessive model (AA vs. AG + GG, Fig. 2C) the ORs were 1.53 (95% CI 1.31–1.79, P < 0.001), 1.09 (0.95–1.26, P = 0.204), 1.25 (1.12–1.39, P < 0.001), 1.52 (1.13–2.05, P = 0.006), 1.25 (1.05–1.49, P = 0.010), and 1.78 (1.36–2.35, P < 0.001). For the codominant heterozygous model (AG vs. GG, see Supplementary Fig. S1), the ORs were 1.08 (0.96–1.21, P = 0.168), 1.32 (1.20–1.47, P < 0.001), 1.13 (1.05–1.23, P = 0.002), 1.28 (0.97–1.70, P = 0.077), 1.51 (1.28–1.79, P < 0.001), and 1.34 (1.10–1.64, P = 0.003). For the codominant homozygous model (AA vs. GG, see Supplementary Fig. S1) the ORs were 1.60 (95% CI 1.35–1.89, P < 0.001), 1.27 (1.09–1.48, P = 0.002), 1.34 (1.19–1.51, P < 0.001), 1.78 (1.26–2.52, P = 0.001), 1.62 (1.32–1.99, P < 0.001), and 2.09, (1.56–2.81, P < 0.001).

Table 2 ORs, 95% CIs, and P-values for each genetic model in the association of SNPs rs2075876 (G > A) and rs760426 (A > G) with RA risk (NA = not available; OR = odds ratio; CI = confidence interval; *literature data.
Figure 2
figure 2

The association of SNP rs2075876 (G > A) with RA risk in different genetic models. (A) Allelic model (A vs. G). (B) Dominant model (AG + AA vs. GG). (C) Recessive model (AA vs. AG + GG).

Results of heterogeneity analysis for each genetic model are shown in Supplementary Table S2. For the allelic model Ph = 0.439 and I2 = 0%, for the dominant model Ph = 0.011 and I2 = 66.2%, for the recessive model Ph = 0.005 and I2 = 69.9%, for the codominant heterozygous model Ph = 0.004 and I2 = 70.5%, and for the codominant homozygous model Ph = 0.012 and I2 = 65.4%. Moderate heterogeneity was found in dominant, recessive, codominant heterozygous, and codominant homozygous models.

Only four out of 31 ORs were statistically insignificant, and the ORs revealed that SNP rs2075876 (G > A) is associated with an elevated risk of RA. These results therefore suggest a link between AIRE SNP rs2075876 (G > A) and RA susceptibility.

Meta-analysis of SNP rs760426 (A > G)

Four studies investigated the association between SNP rs760426 and RA susceptibility18,20,21,22. Most of the publications doubled the individual number; thus, to normalize the data, we also calculated with duplicated values, as was conducted with rs2075876 SNP (see Supplementary Table S1). Again, GWAS by Terao et al.18 served as three independent case-control studies (denoted with A, B, C). With the exception of Feng et al.21, all studies described the genotype distribution for AA, GA, GG. We therefore calculated ORs for all the genetic models that were not published in the original articles (Table 2). Furthermore, we excluded the OR, 95% CI and p-value of Feng et al.21 from the statistical analysis due to the asymmetry of the OR. Results for each genetic model are shown in Fig. 3. For the allelic model (G vs. A), Fig. 3A, the ORs were 1.23 (95% CI 1.10–1.37, P < 0.001), 1.13 (1.02–1.25, P = 0.011), 1.16 (1.08–1.26, P < 0.001), 1.25 (0.98–1.60, P = 0.062), 1.25 (1.04–1.52, P = 0.018). For the dominant model (GG + GA vs. AA, Fig. 3B), the ORs were 1.16 (1.04–1.29, P = 0.007), 1.19 (1.08–1.31, P < 0.001), 1.18 (1.09–1.27, P < 0.001), 1.19 (0.92–1.55, P = 0.171), and OR 1.32 (1.10–1.59, P = 0.003). For the recessive model (GG vs. GA + AA, Fig. 3C) the ORs were 1.66 (95%CI 1.43–1.94, P < 0.001), 1.15 (1.00–1.32, P = 0.047), 1.19 (1.07–1.32, P = 0.001), 1.55 (1.16–2.08, P = 0.003), and 1.36 (1.05–1.77, P = 0.020). For the codominant heterozygous model (GA vs. AA, see Supplementary Fig. S2) the ORs were 1.03, (95% CI 0.92–1.16, P = 0.582), 1.17 (1.06–1.30, P = 0.002), 1.15 (1.06–1.25, P = 0.001), 1.04 (0.79–1.38, P = 0.741), and 1.26 (1.03–1.54, P = 0.020). For the codominant homozygous model (GG vs. AA, see Supplementary Fig. S2), the ORs were 1.69 (1.43–2.00, P < 0.001), 1.25 (1.08–1.46, P = 0.003), 1.29 (1.15–1.44, P < 0.001), 1.60 (1.15–2.24, P = 0.006), and 1.54 (1.16–2.04, P = 0.003).

Figure 3
figure 3

The association of SNP rs760426 (A > G) with RA risk in different genetic models. (A) Allelic model (A vs. G). (B) Dominant model (AG + AA vs. GG). (C) Recessive model (AA vs. AG + GG).

Results of heterogeneity analysis for each genetic model are shown in Supplementary Table S2. For the allelic model Ph = 0.737, I2 = 0%, for the dominant model Ph = 0.822, I2 = 0%, for the recessive model: Ph = 0.001, I2 = 76.7%, for the codominant heterozygous model: Ph = 0.323, I2 = 14.2%, and for the codominant homozygous model Ph = 0.038, I2 = 60.5%. Moderate heterogeneity was found in recessive and codominant homozygous models.

Only four out of 26 ORs were statistically insignificant, and the ORs showed that rs760426 (A > G) SNP is associated an elevated risk. These results therefore suggest a link between AIRE SNP rs760426 (A > G) and RA susceptibility.

Sensitivity analysis

To detect the influence of each case-control study on the whole meta-analysis, we performed sensitivity analysis by omitting one individual study. Heterogeneity was not found in SNP rs2075876 or rs760426 by investigating allelic (Fig. 4), dominant, recessive, codominant heterozygous, and codominant homozygous genetic models (see Supplementary Figs S3 and S4).

Figure 4
figure 4

Sensitivity analysis for the allelic models of (A) SNP rs2075876 (G > A) and (B) rs760426 (A > G).

Publication bias

Bias analysis was performed by generating funnel plots for each polymorphism of the allelic (Fig. 5), dominant, recessive, codominant heterozygous, and codominant homozygous genetic models (see Supplementary Figs S5 and S6). After analysis, all funnel plots were perfectly symmetric, and no publication bias was detected for SNP rs2075876 or rs760426.

Figure 5
figure 5

Funnel plots of allelic genetic models of (A) SNP rs2075876 (G > A) and (B) rs760426 (A > G).

Trial sequential analysis

We performed a TSA for the allelic models (Fig. 6) of SNPs rs2075876 (G > A) and rs760426 (A > G). Results of allelic models for both polymorphisms showed that the blue line of cumulative z-curve crossed the TSA monitoring boundary and the cumulative sample size was reached. Therefore, we observed robust evidence in the association between SNPs rs2075876 (G > A) and rs760426 (A > G) and RA risk. These results suggest that no further studies are necessary to confirm the association.

Figure 6
figure 6

Trial sequential analysis for allelic genetic models of (A) SNP rs2075876 (G > A) and (B) rs760426 (A > G).

Discussion

RA is a multifactorial disorder where genetic and environmental events equally contribute to disease commencement1,25. The latest GWAS meta-analysis discovered and screened 42 novel RA risk SNPs at a genome level from 98 candidate biological RA risk genes6. The detected risk genes, including AIRE, are mainly in the category of primary immunodeficiency (PID), HIV, and immune dysregulation. With the exception of AIRE, none of the other associated proteins have been directly related to central tolerance6.

Self-tolerance involving negative selection, the machinery of which is directed by AIRE, is a central immuno-physiological process required to create a normal adaptive immune system. We believe that polymorphisms in this indispensable gene lower the protein expression of AIRE, decrease the presentation of self-antigens, reduce negative selection, and contribute to the escape and survival of autoimmune T-cells. Reaching the periphery, matured, autoimmune T-cells are a source of autoantibodies and serve as a medium for numerous immune disorders, including RA. In support of this belief, Lovewell et al.13 have concluded, through a gene reporter assay, that specific haplotypes (AIRE−655G AIRE-230T) can dramatically reduce AIRE transcription. However, with in vitro and in vivo experiments, Kont et al.26 have demonstrated that the presentation of PTAs from mTECs is quantitatively affected by these reductions in AIRE expression.

An in silico investigation by Terao et al.18, which analysed the expression profile of 210 lymphoblastoid cells in the Gene Expression Omnibus (GEO) database, has demonstrated a statistically significant (p < 0.001) correlation between the rs2075876 risk allele (A) and decreased AIRE transcription27. No association was found in GEO between rs760426 (G) and AIRE expression. Additionally, García-Lozano et al.19 found statistically significant decreases in the expression levels of rs878081 C allele by analysing GEO database. This SNP is located in the Exon 5 region of AIRE; however, rs2075876 (G > A) is located in Intron 5 and rs760426 (A > G) in Intron 127. The latter SNPs may affect the transcription of AIRE by modifying alternative splicing or intron-mediated enhancement28. The reduction in transcription, in turn, provides lower amounts of PTAs ectopically on the major histocompatibility complex/human leukocyte antigen of mTECs, which thereby contributes to the failure of negative selection in the thymus and increases the survival of autoimmune T-cells. In individuals who carry these SNPs, this sequence increases RA susceptibility. By analysing allelic, dominant, recessive, codominant heterozygous, and codominant homozygous models, we demonstrated that the SNPs rs2075876 (G > A) and rs760426 (A > G) occur more frequently in RA patients than in controls.

There are some limitations in our meta-analysis. We cannot extrapolate the findings of rs2075876 (G > A) and rs760426 (A > G) to Caucasians due to the limited study number. Based on García-Lozano et al.19, the results are not statistically significant; however, rs878081 C allele seemed to occur more frequently in RA patients. Furthermore, considering GWAS of Terao et al.18, the association of AIRE with RA among Caucasians was not supported. The number of the included studies also limited our meta-analysis; however, Terao et al.18 provided three case-control studies in one publication, which elevated the number of the included epidemiological studies. In the future, further European case-control, GWAS, and stratified subgroup analyses (age, smoking) are needed in order to better elucidate the association between RA and AIRE polymorphism.

To our knowledge, this is the first time that the association between SNP rs2075876 (G > A), rs760426 (A > G), and RA susceptibility was statistically estimated in one meta-analysis. We used multiple haplotype investigations for each polymorphism, sensitivity analyses, and TSA to confirm the robustness of association. In conclusion, our meta-analysis clearly confirmed with each genetic model that the presence of SNPs rs2075876 (G > A) and rs760426 (A > G) is significantly associated with an increased risk for RA.

Methods

Search strategy

We searched for related literature in the PubMed, Embase, Cochrane Library, and Web of Science databases in accordance with the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement29. On 16th May 2017 we completed the search. Keywords (“autoimmune regulator”; “AIRE”; “polymorphism”; “rheumatoid arthritis”) were thoroughly used by two independent investigators. All studies were published from April 2011 to June 2016.

Inclusion and Exclusion criteria

In order for studies to be included, publications had to demonstrate that (1) the study focused on the association between SNPs or haplotypes within the AIRE gene and RA susceptibility, (2) the study was case-control-designed, (3) all RA patients met the American College of Rheumatology classification and diagnostic criteria, and (4) detailed genotype data and feasible ORs, 95% CIs, and p-values were available. Publications were excluded if (1) a previous study was duplicated or (2) the given polymorphism was not found in at least four studies. Review articles were also excluded. Inclusion and exclusion criteria were independently screened by two investigators.

Statistical analysis

HWE was calculated by the chi-squared test for each study in the control groups. Pooled ORs and 95% CIs were calculated to examine the strength of the association between rs2075876 and rs760426 polymorphisms and RA. We used the random effect model by DerSimonian and Laird30 because of the different ethnicities of those included. Heterogeneity between trials was tested with two methods. First, we employed the Cochrane’s Q homogeneity test, which exceeds the upper-tail critical value of chi-square on k–1 degrees of freedom, with a p-value of less than 0.10 considered suggestive of significant heterogeneity. Second, we used the inconsistency (I2) index. I2 is the proportion of total variation contributed by between-study variability. I2 values of 25, 50 and 75% correspond to low, moderate, and high degrees of heterogeneity, respectively, based on Cochran’s handbook31. Sensitivity analyses were performed to identify the influence of each study on the pooled ORs and 95% CIs. Publication bias was examined by visual inspection of funnel plots where the standard error was plotted against the log odds ratio. Meta-analytic calculations were performed with Comprehensive MetaAnalysis software Version 3 (Biostat, Inc., Englewood, NJ, USA).

Trial sequential analysis (TSA)

Meta-analyses may be biased in type I errors owing to an increased risk of random error when sparse data are analysed, combined with reduplicative testing on accumulating data. To avoid this problem and to increase the robustness of conclusions, we used trial sequential analysis (TSA)32,33,34. TSA combines an estimation of the required sample size with an adjusted threshold for statistical significance. The relationship between the cumulative z-curve and the trial sequential monitoring boundary shows the expressiveness of the meta-analysis. If the cumulative z-curve crosses the trial sequential monitoring boundary, and the cumulative sample size of the meta-analysis reaches the required sample size, firm evidence can be observed. When the cumulative z-curve crosses the boundaries, but the sample size does not reach the required information size, a sufficient level of evidence for the anticipated intervention effect may have been reached and no further trials are needed. If the z-curve does not cross any of the boundaries and the required sample size has not been reached, evidence to reach a conclusion is insufficient35. For calculation of the information size, we used a heterogeneity adjusted assumption with 10% of relative risk reduction, 5% of overall Type-I-Error, and 10% of Type-II-Error for the case of both gene alleles. The adjusted CIs for rs2075876 and rs760426 are 1.13–1.31 and 1.11–1.26, respectively. For calculations we used the Trial Sequential Analysis software tool from Copenhagen Trial Unit, Center for Clinical Intervention Research, Denmark (version 0.9 beta, www.ctu.dk/tsa).

All data generated or analysed during this study are included in this published article and its Supplementary information file.