Association of NCF1 polymorphism with systemic lupus erythematosus and systemic sclerosis but not with ANCA-associated vasculitis in a Japanese population

Genome-wide association studies of systemic lupus erythematosus (SLE) in Chinese and Korean populations demonstrated strong association of single nucleotide polymorphisms (SNPs) located in the GTF2I-NCF1 region, rs73366469 (GTF2I), rs117026326 (GTF2I), rs80346167 (GTF2IRD1) and rs201802880 (NCF1). This region has also been associated with susceptibility to Sjögren syndrome and rheumatoid arthritis; however, association studies with systemic sclerosis (SSc) and ANCA-associated vasculitis (AAV) have not been reported. Here we attempted to confirm their associations with SLE in the Japanese population, to find the primarily associated SNP, and to investigate whether these SNPs are also associated with susceptibility to SSc and AAV. By genotyping these four SNPs in 842 SLE, 467 SSc and 477 AAV patients and 934 healthy controls, striking association was confirmed in Japanese SLE. In addition, these SNPs were significantly associated with susceptibility to SSc, but not with AAV. Conditional logistic regression analysis revealed that the association of NCF1 rs201802880, a missense SNP encoding p.Arg90His, can account for the association of the other SNPs by linkage disequilibrium. These results suggested that the GTF2I-NCF1 region is associated with susceptibility to multiple autoimmune rheumatic diseases but not with AAV, and that the primarily associated variant may be the missense SNP in NCF1.

Autoimmune diseases are caused by a combination of multiple genetic and environmental factors, but the precise mechanisms of their development are largely unestablished. Genome wide association study (GWAS) is an efficient approach to identify the genetic factors of such complex disorders. GWAS of autoimmune rheumatic diseases including rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), systemic sclerosis (SSc) and ANCA-associated vasculitis (AAV) unanimously demonstrated that the strongest association signal is present within the major histocompatibility complex (MHC) 1 until 2013, when GWAS of Sjögren's syndrome (SS) in the Chinese population surprisingly demonstrated striking associations of single nucleotide polymorphisms (SNPs), rs73366469 (T > C), rs117026326 (C > T) and rs80346167 (G > A), in a region encoding general transcription factors GTF2I and GTF2IRD1, which were even stronger than that of the MHC region 2 .
Subsequently, Immunochip and replication studies in Chinese and Korean populations also demonstrated strong association of the SNPs located at GTF2I region with susceptibility to SLE 3 . Furthermore, this region was also reported to be associated with susceptibility to RA in Korean and Japanese populations 4 . This region has also been shown to be associated with susceptibility to SLE in European American populations, albeit more weakly 5 . Thus, the region appears to be one of the strongest genetic factors for multiple autoimmune rheumatic diseases in East Asian populations.
GTF2I encodes general transcription factor II-I (TFII-I). TFII-I usually localizes in the cytoplasm. It is translocated into the nucleus after activation by growth factors, B cell and T cell receptor triggering factors, and endoplasmic reticulum stress. In the nucleus, TFII-I binds to promoter regions of target genes and promotes transcription 6 . In addition, cytoplasmic TFII-I regulates surface expression of Ca 2+ channel protein TRPC3 6 . Thus, TFII-I has relevant functions to autoimmune diseases.
On the other hand, NCF1 gene encoding neutrophil cytoplasmic factor 1, a subunit of NADPH oxidase, is one of the responsible genes for chronic granulomatous disease, and is located close to GTF2I and GTF2IRD1 genes. A naturally occurring reduction-of-function polymorphism of Ncf1 has been positionally identified to be associated with severity of pristane-induced arthritis in rats 7 . Subsequently, introduction of Ncf1 mutation in mice has been shown to be associated with arthritis, autoimmune encephalomyelitis 8 , and also lupus-like phenotypes with glomerulonephritis and type I interferon signature 9 . In humans, a missense variant (p.Arg90His, rs201802880) in NCF1, leading to reduction-of-function of NADPH oxidase, has also been associated with susceptibility to SLE. The NCF1 and GTF2I region variants are in linkage disequilibrium (LD), and two studies strongly suggested that the causative variant of this region is the NCF1 missense variant 10,11 . However, because of the complicated genomic configuration of this region with the presence of NCF1 copy number variation (CNV) and highly homologous pseudogenes (NCF1B and NCF1C), further studies from various populations will be informative in establishing the genetic contribution of each variant of this chromosomal region. SLE and SSc are both characterized by antinuclear antibodies, and a small proportion of patients exhibit symptoms of both diseases (SSc-SLE overlap syndrome). In a recent cohort study of SSc in Toronto, the prevalence of SSc-SLE overlap was 6.8% 12 . Similarly, although rare, co-occurrence of SLE and AAV has been reported especially in MPO-ANCA positive AAV, and a concept of SLE-AAV overlap syndrome has been proposed 13 . Such co-occurrence suggests the presence of shared genetic factors. 
With respect to the overlap of susceptibility alleles, out of 116 non-HLA loci associated with SLE with P < 5 × 10 −8 in a large-scale Immunochip analysis (based on the summary statistics downloaded from the NHGRI-EBI GWAS Catalog 14 for study 5 downloaded on 07/23/2019) and 18 confirmed SSc susceptibility loci 15 , 10 loci were shared by SLE and SSc. As for AAV, only three loci (PTPN22, PRTN3, SERPINA1) have been confirmed as susceptibility loci except for HLA, among which only PTPN22 is shared with SLE 16,17 . Thus, a rather small proportion of SLE susceptibility loci appear to be shared with SSc and AAV. To distinguish the susceptibility loci shared by multiple autoimmune rheumatic diseases and those specific for each disease will eventually lead us to deeper understanding of pathogenesis of these diseases.
Although GTF2I-NCF1 region associations have been reported in SLE, SS and RA, association studies have not been reported for SSc and AAV. In addition, to our knowledge, association study between this region and SLE has not been reported in the Japanese population. In this study, we examined whether the SNPs in GTF2I-NCF1 region are associated with susceptibility to SSc and AAV in addition to SLE. We also made an attempt to identify which SNP plays the primary role among these four SNPs.

Results
Association of GTF2I-NCF1 region SNPs with overall SLE and SSc. First, we examined whether the GTF2I-NCF1 region SNPs are also associated with susceptibility to SLE in the Japanese population. The previously reported risk alleles at the four SNPs were strikingly increased in patients with SLE in comparison with healthy controls also in the Japanese population (Table 1).
Next we performed the association tests of these SNPs with SSc. When compared with healthy controls, the same alleles as in SLE were significantly associated with SSc (Table 1). In contrast, significant association was not detected in AAV (Table 1). The statistical power to detect association in AAV was calculated to be 51.2% (rs73366469), 48.4% (rs117026326), 73.4% (rs80346167) and 73.4% (rs201802880) for the risk allele with the OR of 1.3.
Primary role of NCF1 rs201802880 among the four SNPs. Next we constructed the LD plot of the SNPs of 876 healthy control samples using Haploview 4.2 software. All of the four SNPs were found to be in LD; however, LD between NCF1 rs201802880 and GTF2I SNPs was moderate (Fig. 1).
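As an illustration of what the pairwise LD measures in such a plot summarize, the following minimal sketch computes D' and r² for two biallelic SNPs. It assumes the haplotype frequency is already known; Haploview estimates it from unphased genotypes, a step that is not reproduced here. The function name and the frequencies in the usage note are hypothetical and are not taken from the study data.

```python
from typing import Tuple


def ld_from_haplotypes(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D', r^2) for two biallelic SNPs.

    p_ab : frequency of the haplotype carrying allele A (SNP 1) and allele B (SNP 2)
    p_a  : frequency of allele A at SNP 1
    p_b  : frequency of allele B at SNP 2
    """
    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b
    # D is bounded by the allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # Squared correlation between the two alleles.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

For example, with hypothetical inputs `ld_from_haplotypes(0.15, 0.3, 0.2)`, D' is well above r², which is the typical pattern when two SNPs differ in allele frequency: one allele can be almost fully carried on the other's haplotype while the correlation between them remains only moderate.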
To determine the primarily associated SNP among the four, conditional logistic regression test was performed with adjustment by each SNP. Notably, the associations of rs201802880 remained significant when conditioned on other SNPs. In contrast, when conditioned on rs201802880, no significant difference remained in other SNPs (Table 2). Therefore, NCF1 rs201802880 was considered to be primarily associated with SLE and SSc, while the associations of rs73366469, rs117026326 and rs80346167 were thought to be secondarily caused by LD with rs201802880.
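The conditioning logic can be sketched as follows. This is not the authors' R code; it is a minimal illustration which assumes that "conditioning on a SNP" means adding that SNP's additively coded genotype (0, 1 or 2 risk alleles) as a covariate alongside sex in an ordinary logistic regression. All function names are hypothetical, and the fit uses a plain Newton-Raphson loop with a Wald test.

```python
from math import erfc, exp, sqrt
from typing import List, Sequence, Tuple


def _solve(a: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def _score_info(X, y, beta):
    """Gradient (score) and Fisher information of the logistic log-likelihood."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + exp(-sum(b * x for b, x in zip(beta, xi))))
        w = p * (1.0 - p)
        for i in range(k):
            grad[i] += (yi - p) * xi[i]
            for j in range(k):
                info[i][j] += w * xi[i] * xi[j]
    return grad, info


def fit_logistic(X, y, max_iter: int = 50, tol: float = 1e-10) -> Tuple[List[float], List[float]]:
    """Maximum-likelihood logistic regression; returns (coefficients, standard errors)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad, info = _score_info(X, y, beta)
        step = _solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    # Standard errors are the square roots of the diagonal of the inverse information.
    _, info = _score_info(X, y, beta)
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(sqrt(_solve(info, unit)[j]))
    return beta, se


def conditional_test(target, conditioning, sex, status) -> Tuple[float, float]:
    """Per-allele OR and Wald P value of `target`, adjusted for `conditioning` and sex."""
    X = [[1.0, float(t), float(c), float(s)] for t, c, s in zip(target, conditioning, sex)]
    beta, se = fit_logistic(X, status)
    z = beta[1] / se[1]
    return exp(beta[1]), erfc(abs(z) / sqrt(2.0))
```

Calling `conditional_test` twice with the two SNPs swapped mirrors the comparison described above: if one SNP stays significant after adjustment for the other while the reverse does not hold, the first is the better candidate for the primary signal. When two SNPs are in near-complete LD the covariates become collinear and such a fit is unstable, so this sketch is only meaningful for moderate LD.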
Association of NCF1 rs201802880 with clinical characteristics of SLE and SSc. Finally, we tested whether NCF1 rs201802880 is associated with specific clinical characteristics of SLE and SSc. Patients with SLE were stratified according to the age of onset ( < 20 years or ≥ 20 years), presence of renal disorders, neurological disorders, anti-dsDNA, anti-Sm and anti-RNP antibodies, and patients with SSc according to diffuse cutaneous SSc (dcSSc) or limited cutaneous SSc (lcSSc), presence or absence of anti-topoisomerase I antibody (ATA), anti-centromere antibody (ACA), and interstitial lung disease (ILD), and case-case analysis was performed. As shown in Table 3, rs201802880 A allele was significantly enriched in the patients with SLE with the age of onset <20 years as compared with the patients with the age of onset ≥ 20 years.
Among the SSc patients, 23 were complicated by RA, SS and/or SLE. Because SLE, SS and RA were already associated with GTF2I-NCF1 SNPs 2-4,10 , association analysis was also performed after excluding these patients from the SSc group. Significant difference remained after the exclusion of these patients (n = 303, P = 6.58 × 10 −4 , OR = 1.48, 95% CI 1.18-1.85), indicating that the association with SSc did not derive from the patients complicated by SLE, RA and SS.
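For readers less familiar with how an odds ratio and its 95% confidence interval relate to allele counts, the sketch below computes an unadjusted allelic OR with a Woolf (log-scale) interval. This is a simplification: the values reported in this study come from sex-adjusted logistic regression under the additive model, so the sketch would not reproduce them exactly. The function name and the counts in the usage note are hypothetical.

```python
from math import exp, log, sqrt
from typing import Tuple


def allelic_odds_ratio(case_risk: int, case_other: int,
                       ctrl_risk: int, ctrl_other: int,
                       z: float = 1.959964) -> Tuple[float, float, float]:
    """Unadjusted allelic odds ratio with a Woolf confidence interval.

    Arguments are allele counts (two alleles per individual);
    z = 1.959964 corresponds to a two-sided 95% interval.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) from the four cell counts.
    se = sqrt(1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

With hypothetical counts, `allelic_odds_ratio(200, 400, 300, 900)` gives an OR of 1.5 with an interval that excludes 1, which is the pattern that corresponds to a significant association at the 5% level.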

Discussion
In this study, GTF2I-NCF1 region SNPs were strikingly associated with susceptibility to SLE also in the Japanese population. More importantly, the same alleles were found to be associated with susceptibility to SSc for the first time. On the other hand, association was not detected in AAV. Taken together with previous observations on RA 4 and SS 2 , GTF2I-NCF1 region represents a shared genetic factor for multiple autoimmune rheumatic diseases, but not for AAV.
NCF1 is located adjacent to GTF2I and GTF2IRD1, and variants in these genes are in LD. The genomic structure of the NCF1 region is extremely complicated due to the presence of two pseudogenes highly homologous to NCF1. Two recent studies performed careful association analysis of the GTF2I-NCF1 region with SLE, [...] also in human SLE. Although the association of GTF2I and GTF2IRD1 region SNPs reported by GWAS was weaker in the European than in the Asian population, the ORs of NCF1 rs201802880 were comparable in both populations; thus, the difference in the GTF2I associations is likely to be caused by the difference in the LD with NCF1 between these populations. The risk allele rs201802880 A (the same allele is denoted as NCF1 −339T in Olsson et al. 11 ) was shown to be associated with reduced function of NADPH oxidase, leading to the reduced production of reactive oxygen species (ROS) 11 . Interestingly, the reduced production of ROS has recently been shown to be associated with autoimmune diseases with elevated interferon response in rodents and humans, especially SLE 18 , suggesting a regulatory role of ROS against autoimmunity.
Table 2. Primary association of NCF1 rs201802880 among the GTF2I-NCF1 region SNPs demonstrated by conditional logistic regression analysis. Conditional logistic regression analysis was performed under the additive model using R software. P values (P) and odds ratios (OR) were adjusted for sex. P and OR on rows "before" are before adjustment by any other SNPs. P and OR on rows "after" are adjusted by each SNP. In this table, P values are not adjusted for multiple testing. CI, confidence interval.
The present study also detected that the susceptibility allele rs201802880 A is significantly enriched in SLE patients with younger age of onset, which is consistent with the previous observations in the European population that the age at diagnosis of SLE was significantly younger in the patients carrying the susceptibility allele 10,11 .
On the other hand, lack of association of GTF2I-NCF1 region with susceptibility to AAV was an unexpected observation, because the role of neutrophil extracellular traps (NETs) has been strongly implicated in AAV as well as in SLE 19 . This lack of association is unlikely to be caused by lack of detection power, because our sample size had 73.4% detection power for a risk allele at NCF1 rs201802880 with OR of 1.3, and we did not observe even a trend for association. These results suggested that it is unlikely that this allele has substantial genetic contribution to overall AAV, although the possibility that the genetic effect of NCF1 plays a role in granulomatosis with angiitis (GPA) or proteinase 3-ANCA positive AAV which are rare in the Japanese population cannot be excluded at this point.
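A power figure of this kind can be approximated with a two-sample comparison of allele frequencies, as in the sketch below. This section does not state which method or allele frequencies the authors used, so this is only one common approximation (normal approximation, two-sided test, upper tail only), and the control allele frequency in the usage note is a placeholder. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(n_cases: int, n_controls: int, control_freq: float,
                  odds_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of an allelic test to detect a given odds ratio.

    control_freq is the risk allele frequency in controls; the frequency in
    cases is derived from it and the assumed odds ratio.
    """
    p0 = control_freq
    p1 = odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)
    alleles_cases, alleles_controls = 2 * n_cases, 2 * n_controls
    # Standard error under the null hypothesis uses the pooled allele frequency.
    pooled = (alleles_cases * p1 + alleles_controls * p0) / (alleles_cases + alleles_controls)
    se_null = sqrt(pooled * (1 - pooled) * (1 / alleles_cases + 1 / alleles_controls))
    # Standard error under the alternative uses group-specific frequencies.
    se_alt = sqrt(p1 * (1 - p1) / alleles_cases + p0 * (1 - p0) / alleles_controls)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)
```

A call such as `allelic_power(477, 934, control_freq=0.10, odds_ratio=1.3)` uses the AAV and control sample sizes from this study with a placeholder frequency. Because power depends strongly on the risk allele frequency, SNPs with different frequencies yield different power at the same sample size and OR, which is consistent with the range of values reported across the four SNPs.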
In view of the complexity of this genomic region, as well as potential functional relevance of both GTF2I/GTF2IRD1 and NCF1, further studies are required to dissect the genetic contribution of this region and to determine whether a single causally associated variation can account for the genetic effect, or multiple variants are independently involved.
In conclusion, the association between GTF2I-NCF1 region SNPs and susceptibility to SLE was replicated in the Japanese population. In addition, the same alleles were also associated with susceptibility to SSc, but not with AAV. Furthermore, NCF1 rs201802880 appears to be primarily associated and could account for the genetic associations of the other three SNPs. Further studies on the GTF2I-NCF1 region are required to establish the effect size of this shared genetic risk factor among multiple autoimmune rheumatic diseases.
[...] 20,21 . Presence or absence of renal disorders and neurological disorders in SLE was classified by the same criteria 20 . dcSSc and lcSSc were determined according to the classification criteria by LeRoy et al. 22 . The diagnosis of interstitial lung disease (ILD) was made by site investigators based on chest radiography and/or thoracic computed tomography. AAV patients were classified according to the European Medicines Agency (EMEA) algorithm 23 . Autoantibody profiles were determined by ELISA.
TaqMan SNP genotyping assay. The genotypes of SNPs were determined by TaqMan SNP genotyping assays (ABI 7300, Applied Biosystems). For rs73366469 and rs80346167, the premade primer/probe sets were used (Assay ID: rs73366469: C__97234117_10 and rs80346167: C_100871497_10; Applied Biosystems), and for rs117026326 and rs201802880, the customized primer/probe sets were used (Applied Biosystems; the sequences are shown in Supplementary Table 1). For PCR, DNA samples were added to the reaction mixture containing TaqMan ® Genotyping Master Mix (Applied Biosystems) and TaqMan probes. The PCR conditions consisted of initial denaturation at 95 °C for 10 min, followed by 40 cycles (for rs73366469, rs80346167 and rs117026326) or 25 cycles (for rs201802880) of denaturation at 95 °C for 15 s and annealing at 60 °C for 60 s.

Statistical analysis.
Association analysis was performed by logistic regression using R software (https://journal.r-project.org) with adjustment for sex. The analysis was performed under the additive, dominant and recessive models (Table 1, Supplementary Tables 2 and 3). Because Akaike's Information Criterion (AIC) was the lowest for all SNPs under the additive model in SLE, and almost equal under the three models in SSc and AAV (Supplementary Table 4), the additive model was selected for the association analysis throughout the study. P values for all case-control (Table 1) and case-case analyses (

Data availability
Based on the "Act on the Protection of Personal Information" enforced in Japan and the conditions on which the informed consent was given, it is not permitted to disclose an individual's genotypes and clinical information. All publicly available data generated or analyzed during this study are included in this published article and its Supplementary Information.