Genetic variants in systemic lupus erythematosus susceptibility loci, XKR6 and GLT1D1 are associated with childhood-onset SLE in a Korean cohort

Impact of genetic variants on the age of systemic lupus erythematosus (SLE) onset was not fully understood. We investigated a cumulative effect of SLE-risk variants on the age of SLE onset and scanned genome-wide SNPs to search for new risk loci of childhood-onset SLE (cSLE). We analyzed 781 Korean single-center SLE subjects who previously genotyped by both Immunochip and genome-wide SNP arrays. Individual genetic risk scores (GRS) from well-validated SLE susceptibility loci were calculated and tested for their association with cSLE (<16 years at onset). Single-variant association tests were performed using a multivariable logistic regression adjusting for population stratification. GRS from SLE susceptibility loci was significantly higher in cSLE than aSLE (p = 1.23 × 10−3). Two SNPs, rs7460469 in XKR6 (p = 1.26 × 10−8, OR = 5.58) and rs7300146 in GLT1D1 p = 1.49 × 10−8, OR = 2.85), showed the most significant associations with cSLE. The model consisting of GRS of SLE and two newly identified loci showed an area under curve (AUC) of 0.71 in a receiver operating characteristics (ROC) curve for prediction of cSLE. In conclusion, cSLE is associated with a high cumulative SLE-risk effect and two novel SNPs rs7460469 and rs7300146, providing the first predictive model for cSLE in Koreans.

the largest ancestral group in that study. Additionally, both the studies used HLA SNPs that are hard to explain majority of the SLE associations in the HLA region and they calculated individual's genetic burdens from the small numbers of SLE loci found at the time of publications.
Here, we investigated the genetic contribution on cSLE, overcoming the limitations of the previous researches. A cumulative SLE-risk effect in an individual from the current HLA amino-acid haplotypes and the latest non-HLA SLE-risk variants was calculated and tested for the association with cSLE (diagnosed before the age of 16). We also searched for new risk loci of cSLE using genome-wide SNP data in a Korean population.

Results
Characteristics of cSLE and aSLE. Mean age of SLE onset was 12.5 ± 2.5 years in 96 cSLE patients and 29.0 ± 9.4 in 685 aSLE patients. cSLE showed worse clinical outcomes regarding to SLE specific manifestations and disease activity. The number of cumulative American College of Rheumatology (ACR) criteria and the adjusted mean SLE disease activity index (AMS) were higher in cSLE than aSLE during follow-up period (p < 0.05, Table 1).
Association of rs7460469 and rs7300146 with cSLE. A total of 648,077 genome-wide SNPs were obtained with the average call rate of 99.7%, after the quality control procedure. We identified two loci associated with cSLE that surpassed the genome-wide significance level in multivariable logistic regression model (Table 2 and Fig. 2A), with the genomic inflation factor of 1.003 (Fig. 2B). The most significant signal (p = 1.26 × 10 −8 , OR = 5.58) was at the genotyped SNP rs7460469 in XKR6 (Fig. 2C). Notably, this locus was located between FAM167A (previously referred to as C8orf13) and BLK, which were demonstrated as an SLE risk loci in Asians as well as Caucasians [5][6][7][8] . The SNP rs7460469 was polymorphic only in Asian populations according to the 1000 Genomes Project data. The SLE-risk variants reported in FAM167A-BLK (rs922483 and rs1382568) 5 were not correlated with rs7460469 in XKR6 and showed no association with cSLE in the study subjects (p > 0.05).
We note that the local association was supported by only a single SNP rs7460469 without its correlated SNPs with similar association P values. In order to ensure the association of rs7460469, we checked that the raw fluorescence signals in the rs7460469 assays clearly cluster into three discrete genotypic groups (data not shown) and the distribution of genotypes in the subjects follows Hardy-Weinberg equilibrium (HWE; P HWE = 1.00). In addition, we re-genotyped the same variant in a subset (n = 110 with 1 AA, 56 AG and 53 GG) of the study subjects using the Sanger sequencing method. The genotype results between GWAS array and sequencing were concordant 100% in the 110 samples. Linkage disequilibrium (LD) between rs7460469 and its neighboring SNPs was also calculated, based on the Asian LD information in the 1000 Genomes Projects database. There was no SNP in LD (r 2 > 0.2) with rs7460469, which supports the observed regional association plot with only an associated variant (Fig. 2C).
The other significant signal (p = 1.49 × 10 −8 , OR = 2.84) was detected at rs7300146 in GLT1D1 (Fig. 2D). High genotyping quality for rs7300146 was assessed by checking the raw fluorescence signals in the genotyping assays (data not shown). We re-genotyped the same variant in the subset (n = 110 with 24 CC, 57 AG and 29 GG) of the subjects using Sanger sequencing and confirmed a concordance rate of 100% between array and sequencing data. After a regional imputation and fine mapping for the rs7300146 region, 8 variants exceeding the genome-wide significance threshold were identified (Supplementary Table S1). Among them, rs122989222 (p = 6.16 × 10 −9 , OR = 3.02) showed the most significant association and the rs12309809 (p = 8.96 × 10 −8 , OR = 2.93) was associated with the cis expression quantitative trait locus (eQTL) for SLC15A4, which has been known an SLE susceptibility gene in a Chinese population.
Predictive model for cSLE using an ROC curve analysis. We further estimated the predition efficacy of the model constructed with weighted SLE GRS and two novel cSLE-risk loci. These predictors were added seperately or together into a multivariate model and AUCs of the ROC curve were calculated. The model with the weighted SLE GRS showed a ROC curve with an AUC of 0.6 and the model with the two cSLE-risk SNPs (rs7460469 and rs7300146) showed an AUC of 0.68, bringing the full model combined with the separate models to the highest predictive value with the AUC of 0.71 (Fig. 3).

Discussion
We have demonstrated that cSLE known as worse phenotype than aSLE is associated with a higher accumulation SLE-risk effect, which might explain worse clinical feature in cSLE. In addition, for the first time, we newly identified two genes associated cSLE susceptibility, XKR6 and GLT1D1. There findings allowed us to construct a single model to predict cSLE.
Webb et al. demonstrated that cSLE (<18 years at onset) was associated with the increased number of SLE-risk variants than aSLE 4 . They counted 19 SLE-risk alleles in each individual and found the significant difference between cSLE and aSLE in Gullah and African-American patients with SLE, but not Hispanic and European-American patients with SLE. Asian patients with SLE were excluded due to small sample size. In our study, we showed that SLE patients with higher GRS were significantly enriched in the cSLE than aSLE group. In contrast to Webb et al. study, we used large-size Korean cohort and calculated GRS from the most updated, well-validated Asian SLE-risk loci and HLA-DRβ1 amino-acid haplotype model. The weighted GRS used in the study was calculated based on the odds ratio as well as the total number of risk-variant copies.
In addition, our study revealed the genetic contribution on the age of SLE onset at individual variants within the two loci. First, GLT1D1 around rs7300146 has been known as a candidate oncogene of colorectal cancer and relapse marker of multiple myeloma 9,10 . The SNP rs12309809, the proxy SNP of rs7300146 in GLT1D1 was associated with expression level of the neighboring gene SLC15A4. It is known that SLC15A4 is indispensable to innate immunity and antibody isotype switching to IgG2c 11,12 . Lack of SLC15A4 is related with impaired function of SLE pathogenesis associated cytokines and protein, such as toll-like receptor 7 (TLR7)-and TLR9 dependent cytokines including type 1 interferon (IFN) and nucleotide oligomerization domain-1 (NOD1) 13,14 . Interestingly, the genetic variants in SLC15A4 was associated with SLE risk in a Chinese population 15 and lupus nephritis in a Caucasian population 16 .     Another new locus associated with cSLE was XKR6, which is located between FAM167A and BLK. This locus was associated with lupus nephritis as well as SLE susceptibility in multiple ancestreis [5][6][7][8]17 . However, the cSLE-associated SNP in XKR6 that we identified in the study is Asian-specific and has no correlation with the SLE-risk, functional SNPs reported in previous studies 5 . Furthermore, the associated SNP in XKR6 had no proxy SNPs in the flanking region and no functional effects reported so far. Thus, it may need more studies on this region to understand biological importance.
Consistent with reported observations 2,18,19 , lupus nephritis was more common in cSLE (60.4%) than aSLE (46.4%) in the study cohort (p = 0.010). Although the regions around the two novel cSLE-risk variants have been associated with lupus nephritis 16,17 , we note that the associations of the two newly identified variants were not explained by lupus nephritis. There was no association between lupus nephritis and the two cSLE-risk variants (p > 0.05 in 376 lupus nephritis-positive SLE vs. 405 lupus nephritis-negative SLE).
The limitation of the study is the relatively small sample size (n = 781) and lack of replication in independent cohorts. However, we could use the samples with high-density genotyping data and accurate clinical information including onset age which was gathered in the prospective Hanyang BAE Lupus cohort.
In conclusion, we reported the presence of genetic effects on the age of SLE onset, especially in the SLE susceptibility loci and the two novel cSLE loci, providing the first model to predict cSLE in Koreans.

Methods
Ethic statement. This study protocol was approved by the Institutional Ethics Review Board of Hanyang University Hospital and written informed consent was obtained from all participants in accordance with the principles of the Helsinki Declaration. All methods were carried out in accordance with the relevant guidelines and regulations.

Study population.
A total of 781 Korean patients with SLE were selected from the Hanyang BAE Lupus cohort 20 . cSLE (n = 96) was defined as SLE patients with onset age of below 16 and aSLE (n = 685) was defined as those with onset age of 16 or above 2 . As it is well known that hormones such as estrogen influence the development of SLE and flares, we restrict cSLE to patients below 16 based their stage of puberty, in which stage 5 of puberty is defined as the stage after 16 years when height no longer increases.
Genotype data. All the study subjects were previously genotyped by both Immunochip and HumanOmniExpress genome-wide arrays, as previously described 21,22 . In brief, the merged array data was passed the general quality control thresholds such as call rates per individual or SNP (>95%), HWE in autosomal SNPs (p < 1 × 10 −5 ), and minor allele frequency (MAF > 0.01). Principal component (PC) analysis calculated PCs to adjust for population stratification among the subjects in the subsequent statistical models. We noted that there were no genetic outliers of >6 standard deviations for each of the top 10 PCs. Ungenotyped variants in the associated loci were imputed by IMPUTE2 and passed the imputation quality score (info) of 0.6. Genetic risk score for SLE risk. Individual's weighted GRS from well-validated SLE susceptibility loci were calculated to measure the accumulation effect of SLE susceptibility loci on the age of SLE onset. The SLE susceptibility loci used in calculation of GRS were composed with 45 Asian confirmed non-HLA SNPs from Sun et al. study 21 and HLA-DRB1 haplotypes (constructed from HLA-DRβ1 amino-acid positions 11, 13, and 26) in our previous study 23 . We weighted the number of effect alleles by the previously reported effect size (=the natural logarithm of the previously reported odds ratio) for both non-HLA and HLA loci, and then summed all the locus-specific weighted score, as previously described 24 . Statistical analysis. The associations of weighted GRS with the age of SLE onset and the development of cSLE were tested using a linear regression and a logistic regression adjusting for the top 10 PCs as covariates, respectively. The association of each of single variants in the whole genome with cSLE were evaluated by a multivariable logistic regression analysis controlling for the top 10 PCs to calculate an odds ratio (OR) and its 95% confidence interval (95% CI) of the minor allele. Then, we estimated the predictive power of weighted GRS and/ or risk loci of cSLE by measuring the area under the curve (AUC) of receiver operating characteristics (ROC). All data were analyzed with using the PLINK and R.