Introduction

Systemic lupus erythematosus (SLE) is a chronic autoimmune disease with complex genetic etiology that can affect the majority of organs and tissues. The strong familial aggregation and high disease concordance in monozygotic twins (24–56%) suggest a genetic component to SLE predisposition1. To date, based on hypothesis-free genome-wide association studies (GWAS), 62 SLE susceptibility loci have been identified in single ancestries2, providing important clues for understanding the genetic architecture of SLE.

However, the incidence of SLE varies across populations, with a markedly higher prevalence in Asians than in Europeans3,4, suggesting genetic heterogeneity across populations. Although most of the novel SLE risk loci identified by GWAS have been suggested to be shared between Asians and Europeans2,5, a trans-ancestral comparison would help to uncover genetic heterogeneity and to detect novel susceptibility loci in different populations6. Recently, 10 novel SNPs were identified by a GWAS of SLE in European populations7. Among them, rs7726414 in TCF7 has also been reported to be associated with SLE in Asian patients8. Thus, in this study, by genotyping all 10 novel SNPs without selection, we first aimed to explore the genetic similarities and differences between Asians and Europeans by comparing the allelic associations, odds ratios (ORs), risk allele frequencies (RAFs), and population-attributable risk percentages (PARPs) of these novel risk alleles.

In addition, considering that most of the SLE-associated variants are located in non-coding regions of the genome, the public Encyclopedia of DNA Elements (ENCODE) databases and expression quantitative trait loci (eQTL) mapping could provide a novel perspective for interpreting functional single nucleotide polymorphisms (SNPs) with regulatory effects9. Thus, to prioritize the plausible functional SNPs and target genes at the associated loci in this study, we integrated different layers of functional data, including DNase I hypersensitivity (DHS) analysis results, DNase I footprints, chromatin immunoprecipitation followed by sequencing (ChIP-seq) data, and eQTL mapping results, following a previous report10.

Results

Allelic association analyses

After quality control, 493 cases and 628 controls were included in the analysis. All 10 SNPs were in Hardy-Weinberg equilibrium in patients and controls (P > 0.05). As shown in Table 1, three SNPs, rs6740462 in SPRED2, rs564799 in IL12A, and rs2286672 in PLD2, were significantly associated with SLE (with P values ranging from 3.51 × 10−2 to 9.36 × 10−5). The variant rs7726414 in TCF7 showed marginal significance in the discovery population (P = 5.37 × 10−2), while no solid evidence of association was observed for the others (with P values of 0.16–0.77). For replication, genotype data for the 4 associated SNPs were then extracted from our previous study on East Asians, including Koreans, Han Chinese and Malaysian Chinese8. Consistent associations of rs564799 in IL12A and rs7726414 in TCF7 were observed, and the significance was strengthened by meta-analysis. Notably, the associations between these two SNPs and SLE remained significant after correction for multiple testing (P values were 5.91 × 10−4 and 4.12 × 10−8, respectively, using the Bonferroni method on 4 SNPs) (Table 2). The effects of the two associated alleles were in the same direction (either risk or protective factors for SLE) as reported in Europeans.
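As an illustration of the allelic test and the Bonferroni adjustment described above, the following is a minimal sketch in Python. It is not the PLINK implementation used in this study; it assumes a 1-degree-of-freedom chi-square test on a 2 × 2 table of allele counts, and the function names and the allele counts in the usage note are hypothetical.

```python
import math


def allelic_test(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic association on a 2 x 2 table of allele counts.

    Returns (odds_ratio, chi_square, p_value) for a 1-df chi-square test
    without continuity correction (an assumption of this sketch).
    """
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    if min(a, b, c, d) <= 0:
        raise ValueError("all allele counts must be positive")
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, chi2, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

For the made-up counts allelic_test(60, 40, 40, 60), this sketch gives an OR of 2.25, a chi-square statistic of 8.0 and a P value of about 0.0047; bonferroni(p, 4) then corresponds to a correction over 4 SNPs.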

Table 1 Associations between the 10 newly mapped loci and susceptibility to systemic lupus erythematosus.
Table 2 Independent replications of associated single nucleotide polymorphisms and meta-analysis.
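The meta-analysis model is not detailed in this section. As a hedged sketch only, the following Python code assumes a fixed-effect inverse-variance model on log odds ratios, with per-cohort standard errors recovered from 95% confidence intervals. The function name and the input values in the usage note are hypothetical and are not taken from Table 2.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def fixed_effect_meta(cohorts):
    """Pool (odds_ratio, ci_lower, ci_upper) tuples by inverse variance.

    Returns (pooled_or, ci_lower, ci_upper, p_value). The fixed-effect
    model is an assumption of this sketch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, lower, upper in cohorts:
        beta = math.log(odds_ratio)
        # Standard error of log(OR) recovered from the 95% CI width.
        se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * beta
        weight_total += weight
    pooled_beta = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return (
        math.exp(pooled_beta),
        math.exp(pooled_beta - Z_95 * pooled_se),
        math.exp(pooled_beta + Z_95 * pooled_se),
        p_value,
    )
```

With two hypothetical cohorts that each report OR = 1.2 (95% CI 1.0 to 1.44), the pooled OR stays at 1.2 while the P value becomes smaller than in either cohort alone, which is the sense in which consistent signals are strengthened by pooling.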

The detection powers for the 10 SNPs in the Chinese Han population, assuming the odds ratios (ORs) from the published GWAS, were 55.1%, 61.6%, 71.7%, 75.8%, 80.9%, 92.6%, 95.1%, 95.7%, 96.8%, and 98.3% for rs9652601, rs887369, rs4902562, rs10774625, rs3768792, rs2286672, rs6740462, rs7726414, rs564799, and rs3794060, respectively. For the 4 SNPs taken forward to replication, rs7726414, rs564799, rs2286672, and rs6740462, the statistical powers in the combined set of 2978 SLE cases and 4575 controls were 95.7%, 96.8%, 97.7%, and 99.5%, respectively. Thus, the lack of replication for the remaining SNPs may be due to sample heterogeneity or limited detection power (lower minor allele frequencies (MAFs) compared to Europeans).
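To show how detection power depends on allele frequency, OR and sample size, the following Python sketch uses a normal approximation for a two-sided comparison of allele frequencies between cases and controls. This is an assumption made for illustration; it is not the algorithm of the Power and Sample Size Calculations software used in this study, so it is not expected to reproduce the percentages above. The function names and the RAF and OR values in the usage note are hypothetical.

```python
import math
from statistics import NormalDist


def case_raf(control_raf, odds_ratio):
    """Risk allele frequency expected in cases for a given allelic OR."""
    if not 0.0 < control_raf < 1.0:
        raise ValueError("control_raf must lie strictly between 0 and 1")
    odds = odds_ratio * control_raf / (1.0 - control_raf)
    return odds / (1.0 + odds)


def allelic_power(control_raf, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    p0 = control_raf
    p1 = case_raf(p0, odds_ratio)
    # Each individual contributes two alleles.
    se = math.sqrt(p1 * (1.0 - p1) / (2 * n_cases)
                   + p0 * (1.0 - p0) / (2 * n_controls))
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(p1 - p0) / se
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)
```

For instance, allelic_power(0.3, 1.3, 493, 628) uses the discovery sample size reported here together with a made-up RAF of 0.3 and OR of 1.3; lowering the RAF or the OR in this call reduces the estimated power.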

Comparisons of risk allele frequencies, effect sizes and risk across cohorts

As shown in Table 3, the risk allele frequencies (RAFs) of all 10 SNPs in the controls were significantly higher in Asians than in Europeans, with P values ranging from 1.93 × 10−266 to 4.54 × 10−2. In particular, for rs4902562 in RAD51B and rs2286672 in PLD2, the minor alleles in Europeans were the major alleles in Asians. Consistent with the clear differences in RAFs between Asians and Europeans, the PARPs were higher for most of the 10 SNPs in Asians than in Europeans, suggesting that these loci may play more pivotal roles in Asian patients2,5. Notably, the PARP value of the significant SNP rs564799 in IL12A was almost three times as high in Asians as in Europeans. In contrast, as noted above, the effect sizes of all 10 SNPs, in terms of OR value and direction, were comparable in Asians and Europeans.

Table 3 Comparison of risk allele frequencies, odds ratios and population-attributable risk percentages between Chinese and Europeans.

Systematic annotation and prioritization of the functional SNPs

For the two replicated SNPs, 11 proxy SNPs (r2 > 0.8) were extracted, resulting in 13 candidate SNPs for functional annotation. Overall, we found that all of the variants were located in non-coding regions of the genome and overlapped with at least one layer of ENCODE data, indicating that these SNPs are likely to influence SLE through mechanisms regulating gene expression. As shown in Table 4, among the lead SNP rs564799 and its proxies, rs485789 showed the most layers of functional information (i.e., the highest RegulomeDB score + promoter/enhancer histone marks and DNase sites + protein-binding site + matched motifs). This concordance of signals at rs485789, which linked disease susceptibility with IL12A expression, made this SNP a strong candidate functional SNP, with IL12A as the potentially causal gene. Among the lead SNP rs7726414 and its proxies, rs7726414 itself intersected with the most layers of functional data (i.e., the highest RegulomeDB score + promoter/enhancer histone marks and DNase sites + protein-binding site + matched motifs) and thus was prioritized as the most likely functional SNP. However, no cis-eQTL effects of rs7726414 or its proxies were identified in the databases used in the current study.

Table 4 Detailed annotation information on the SLE-associated single nucleotide polymorphisms and their proxies.

Discussion

By investigating the 10 SLE-related SNPs in 3 independent East Asian SLE populations, we detected one significant novel locus (IL12A) and confirmed one previously reported locus (TCF7)8. With the current replication population, the statistical powers for the two significant association signals were 96.8% and 95.7%, respectively. Notably, the locus IL12B was identified as a novel SLE-related gene in East Asian populations by high-density genotyping8, emphasizing the validity and immune relevance of these regions. Moreover, markedly higher RAFs and PARPs for these SNPs were observed in Asian populations compared with Europeans, providing further evidence for a genetic background for the difference in prevalence. The risk alleles and their effects (both effect size and direction) were shared by Asians and Europeans. Consistent with previous studies2,5, both similarities and differences with respect to RAFs, PARPs and ORs were observed across ethnicities.

However, genetic heterogeneity across ancestries could lead to different association results. For the 8 of 10 SNPs for which we did not detect consistent association signals in the current study, different distributions of RAFs and PARPs were also observed. Even with a similar number of cases and controls for both ethnicities, differences in the power to detect significant associations for individual SNPs across ethnicities appear to depend largely on their allele frequencies. As mentioned above, the detection powers for the 8 un-replicated SNPs were about 50–80%. In future work, independent replication in larger populations, especially for the SNPs with lower allele frequencies, will be needed.

More importantly, using publicly available databases, we were able to zoom in on the likely functional SNPs at the loci tagged by the significant SNPs rs564799 and rs7726414, which are proposed to affect SLE pathology. We found that most of the SLE-related SNPs were located in non-coding regions of the genome and may play a role in disease pathogenesis by altering target gene expression. On the one hand, rs485789, which is in high LD with rs564799 (r2 = 1), showed the strongest regulatory evidence among the lead SNP rs564799 and its proxies. It also had a cis-eQTL effect on IL12A, pointing to rs485789 as the functional SNP and IL12A as the potential causal gene. IL12A encodes IL-12α, a component of IL-12 (made in B cells, macrophages, dendritic cells and neutrophils). IL-12 is a critical secreted signal in T cell activation. On the other hand, the lead SNP rs7726414 itself was annotated as the strongest regulatory variant. Although no cis-eQTL effects of rs7726414 were detected in the current study, the annotated gene TCF7 seemed the most likely causal gene. TCF7 is a T cell–specific transcription factor that regulates the expression of CD3. A mouse Tcf7 knockout showed reduced immune-competence of T cells in the periphery. Thus, further fine-mapping analysis and functional studies are still needed to clarify the role of TCF7 in the pathogenesis of SLE.

In summary, two novel loci reported by SLE GWAS in Europeans were significantly replicated in three independent East Asian populations. The comparison of RAFs and PARPs in Europeans and Asians provides further evidence for a genetic basis of the higher incidence of SLE in Asia compared to Europe. By integrating multiple layers of regulatory information and eQTL mapping, the most likely functional SNPs and genes were prioritized.

Materials and Methods

Subjects

The current association analysis was conducted in two stages. In the discovery stage, a discovery cohort of Chinese Han ancestry from Northern China was recruited, including 493 SLE cases (age 32.52 ± 12.31 years, female 86.07%) and 628 unrelated healthy controls (age 41.40 ± 11.01 years). In the replication stage, three independent East Asian cohorts, including Koreans, Han Chinese and Malaysian Chinese8, were included to validate the associated SNPs (P < 0.1). A flowchart of the current study is presented in Fig. 1.

Figure 1: Workflow of our study design.

The study was designed in three stages. First, we genotyped the 10 novel loci showing genome-wide association with SLE in Europeans in a Han Chinese cohort from Beijing and compared the genetic similarities and differences between the two ancestries. Four out of 10 loci were identified as associated (P < 0.1). Second, we performed independent replications of these 4 loci in three cohorts of Korean, Han Chinese and Malaysian Chinese ancestry. Consistent associations were identified in our discovery and replication cohorts for two loci (i.e., IL12A and TCF7). Third, by integrating different layers of functional data, we identified the most likely functional SNPs for these two loci. Abbreviations: GWAS: genome-wide association study; SLE: systemic lupus erythematosus; SNP: single nucleotide polymorphism.

All patients met the revised SLE criteria of the American College of Rheumatology11. This investigation was conducted according to the Declaration of Helsinki. The medical ethics committee of Peking University approved the study. All participants gave informed consent.

SNP selection and genotyping

Although the variant rs7726414 in TCF7 was also discovered and replicated in our previous study8, for consistency and to assess the replication, we evaluated the 10 novel SNPs reported in a recent SLE GWAS conducted in Europeans7 without selection. Genotyping was conducted using TaqMan allele discrimination assays as previously reported12,13,14.

To comprehensively evaluate the genetic heterogeneity between Asians and Europeans, we retrieved the summary data of the 10 SNPs from the published SLE GWAS in Europeans7. The Han Chinese replication population8 was also included in the analysis because both this population and the discovery population were recruited from Northern China. Genotype data for 5 SNPs, including rs6740462, rs564799, rs7726414, rs10774625, and rs9652601, were extracted from the Immunochip, while the remaining 5 SNPs for this cohort were genotyped using TaqMan allele discrimination assays.

Systematic annotation

To prioritize potential functional SNPs and causal genes at the replicated susceptibility loci, we integrated multiple layers of functional data. The detailed procedures of the prioritization process are presented in Fig. 2. Considering that there are often SNPs showing high linkage disequilibrium (LD) with the associated SNPs, we first extracted the proxies (r2 > 0.8, 1000 Genomes Project, Asian population as reference) for the significant replicated SNPs using the HaploReg v4.1 database (http://www.broadinstitute.org/mammals/haploreg/haploreg.php), forming the set of candidate SNPs. The potential functional consequences of the candidate SNPs were predicted using the rSNPBase (http://rsnp.psych.ac.cn/) and RegulomeDB (http://www.regulomedb.org/) databases. The rSNPBase database provides regulatory information on SNPs with experimentally validated regulatory elements controlling transcriptional and post-transcriptional events. RegulomeDB ranks SNPs based on the amount of regulatory information with which an SNP intersects. Then, eQTL mapping data were used to prioritize the replicated SNPs. As a discovery set, the comprehensive and versatile eQTL database seeQTL (http://www.bios.unc.edu/research/genomicsoftware/seeQTL/), which includes various eQTL studies and a meta-analysis of HapMap eQTL information, was investigated, and the results were replicated in lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project15.
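The prioritization logic described above, in which the candidate SNP intersecting the most layers of functional data is ranked first, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the exact procedure of this study: the SnpAnnotation class, its fields, the numeric regulome_rank (lower meaning more evidence) and the tie-breaking rule are all hypothetical, and the annotations would have to be collected from HaploReg, rSNPBase, RegulomeDB and the eQTL resources beforehand.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SnpAnnotation:
    """Hypothetical per-SNP summary of the functional annotation layers."""
    rsid: str
    regulome_rank: int      # assumed numeric rank; lower means more evidence
    histone_marks: bool     # promoter/enhancer histone marks
    dnase_site: bool        # DNase I hypersensitive site
    protein_binding: bool   # protein-binding (ChIP-seq) site
    motif_match: bool       # matched transcription factor motif
    cis_eqtl: bool          # reported cis-eQTL effect

    def layers(self) -> int:
        """Number of functional layers this SNP intersects."""
        return sum([self.histone_marks, self.dnase_site,
                    self.protein_binding, self.motif_match, self.cis_eqtl])


def prioritize(candidates: List[SnpAnnotation]) -> List[SnpAnnotation]:
    """Order candidates by most layers first, then by best regulome_rank."""
    return sorted(candidates, key=lambda s: (-s.layers(), s.regulome_rank))


# Placeholder identifiers and values, for illustration only.
example = [
    SnpAnnotation("snp_lead", 4, True, True, False, False, False),
    SnpAnnotation("snp_proxy", 2, True, True, True, True, True),
]
ranked = prioritize(example)
```

In this toy input, the proxy is ranked ahead of the lead SNP because it intersects more layers, which mirrors the situation in which a proxy rather than the lead SNP is prioritized as the candidate functional variant.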

Figure 2: Annotation strategy for the significant variants and their proxies.

The figure shows the detailed strategy of our annotation approach used to prioritize functional SNPs. HaploReg v4.1 (http://www.broadinstitute.org/mammals/haploreg/haploreg.php) was used to search proxies (r2 ≥ 0.8 in Asian) of the associated SNPs and their binding motifs and epigenetic marks. The rSNPBase (http://rsnp.psych.ac.cn/) database was used to search for regulatory SNPs with experimentally validated regulatory elements controlling transcriptional and post-transcriptional events. RegulomeDB (http://www.regulomedb.org/) was used to search for the regulatory scores of SNPs according to their amount of regulatory information. Abbreviations: eQTL: expression quantitative trait locus; GWAS: genome-wide association study; rSNP: regulatory single nucleotide polymorphism; SLE: systemic lupus erythematosus; SNP: single nucleotide polymorphism.

Statistical analysis

Quality control of genotyping, Hardy-Weinberg equilibrium tests, and allelic association analyses were performed using PLINK16. Because this was a replication study, no correction for multiple testing was applied, and P < 0.05 was considered significant. ORs and allele frequencies are presented according to the risk alleles identified in Europeans. The contribution of each SNP to the risk of SLE was estimated with the PARP, which considers both the OR and the RAF in the general population, using the formula17 PARP = RAF × (OR − 1)/[RAF × (OR − 1) + 1] × 100%. Statistical power was estimated using Power and Sample Size Calculations Version 3.0 (http://biostat.mc.vanderbilt.edu/PowerSampleSize) with a two-sided type I error rate of 0.05.
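The PARP formula given above can be written as a short Python function. This is a minimal sketch; the function name and the input checks are illustrative additions, and the values in the usage note are hypothetical rather than taken from Table 3.

```python
def parp(raf: float, odds_ratio: float) -> float:
    """Population-attributable risk percentage.

    Implements RAF(OR - 1) / [RAF(OR - 1) + 1] x 100%, where raf is the
    risk allele frequency in the general population (0 to 1).
    """
    if not 0.0 <= raf <= 1.0:
        raise ValueError("raf must lie between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    excess = raf * (odds_ratio - 1.0)
    return excess / (excess + 1.0) * 100.0
```

With the same hypothetical OR of 1.5, parp(0.1, 1.5) is about 4.8% whereas parp(0.4, 1.5) is about 16.7%, showing how a higher RAF alone raises the PARP even when the effect size is unchanged.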

Additional Information

How to cite this article: Zhang, Y.-m. et al. Evaluation of 10 SLE susceptibility loci in Asian populations, which were initially identified in European populations. Sci. Rep. 7, 41399; doi: 10.1038/srep41399 (2017).
