Investigation of C1-complex regions reveals new C1Q variants associated with protection from systemic lupus erythematosus, and affect its transcript abundance

Although rare C1Q deficiency variants have been identified as a causative risk for systemic lupus erythematosus (SLE), reports on the role of common C1Q polymorphisms in SLE susceptibility are limited and inconsistent. Furthermore, there are no reports on whether polymorphisms of C1S, C1R, and C1RL confer susceptibility to SLE. We therefore evaluated 22 SNPs across six C1-complex genes in two independent case-control cohorts and identified four novel SNPs that confer protection from SLE. All four SNPs are located in C1Q genes. In particular, the variant rs653286 was independently associated with reduced risk of SLE (OR 0.75, P = 2.16 × 10⁻³) and of anti-dsDNA antibodies (OR 0.68, P = 0.024). In bioinformatics analysis, SNPs rs653286 and rs291985 displayed striking cis-eQTL effects on C1Q gene expression. Individuals homozygous for the ‘protective’ allele at the four SNPs had significantly higher levels of serum C1q (rs680123–rs682658: P = 0.0022; rs653286–rs291985: P = 0.0076). To our knowledge, this is the first study to demonstrate that, among the C1-complex genes, only C1Q polymorphisms are associated with SLE. The C1Q SNP rs653286 confers an independent protective effect on SLE susceptibility and affects transcript abundance.

Discovery screening of C1-complex genomic regions reveals C1Q polymorphisms associated with SLE susceptibility. We first investigated potential associations between the selected SNPs residing in C1-complex genes and SLE susceptibility in the discovery cohort. As shown in Tables 2 and S2, no association was observed at the allele level, but at the genotype level three SNPs conferred nominal protective effects against SLE under the dominant model (C1QA rs680123: OR 0.72, P = 0.028, q = 0.235; C1QC rs682658: OR 0.72, P = 0.032, q = 0.235; C1QC rs653286: OR 0.70, P = 0.018, q = 0.235). A suggestive reduced-risk effect was also found for C1QB rs291985 (dominant model: OR 0.75, P = 0.059, q = 0.325). Notably, all four SNPs reside in C1Q genes. No association with SLE susceptibility was observed for polymorphisms outside C1Q (Supplementary Table S1).
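The q-values quoted above are false discovery rate adjustments of the dominant-model P values. The text does not name the adjustment procedure, so the following is only a minimal sketch that assumes the Benjamini-Hochberg step-up method with 22 tests (one per SNP in the discovery screen). The function name `bh_q_values` and its `n_tests` parameter are illustrative, and the 18 unreported P values are assumed to be larger than the four listed and large enough not to lower their q-values.

```python
def bh_q_values(p_values, n_tests=None):
    """Benjamini-Hochberg q-values for the supplied p-values.

    n_tests is the total number of tests performed. When it exceeds
    len(p_values), the missing p-values are ASSUMED to rank after the
    supplied ones and to be too large to affect their q-values.
    """
    m = n_tests if n_tests is not None else len(p_values)
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    q = [0.0] * len(p_values)
    running_min = 1.0  # q-values are capped at 1
    # Walk from the largest p-value to the smallest so that q-values
    # stay monotone in the rank of the p-value.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Dominant-model P values reported in the text for the four C1Q SNPs.
reported = {"rs680123": 0.028, "rs682658": 0.032,
            "rs653286": 0.018, "rs291985": 0.059}
q_values = bh_q_values(list(reported.values()), n_tests=22)
for snp, q in zip(reported, q_values):
    print(f"{snp}: q = {q:.4f}")
```

Under these assumptions the sketch gives about 0.2347 for the first three SNPs and about 0.3245 for rs291985, which is close to the q-values quoted in the text.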
Replication study and joint analysis confirm the reduced risk associated with the four C1Q polymorphisms, but only SNP rs653286 has an independent protective effect on SLE susceptibility. To confirm the associations observed in the discovery cohort, the top four SNPs were genotyped in an independent replication case-control cohort. Joint analysis was then performed by combining results from the discovery and replication cohorts.
Subsequent haplotype analysis gave consistent results supporting the protective effects of the four C1Q polymorphisms on SLE susceptibility (Supplementary Results, Table S3). The variant pairs rs680123-rs682658 and rs653286-rs291985 are in almost perfect LD (r² = 0.96/0.97 in controls/cases, respectively) (Supplementary Figure S1).
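For reference, r² between two biallelic SNPs can be computed directly from phased haplotype counts. The sketch below is illustrative only: the study estimated LD with SHEsis from genotype data, whereas this function (a hypothetical helper named `ld_r_squared`) assumes that the four haplotype counts are already known, and the counts in the example are invented.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic loci from phased haplotype counts.

    A/a are the alleles at the first locus and B/b at the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B  # linkage disequilibrium coefficient D
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denominator


# Hypothetical haplotype counts, NOT taken from the study.
print(round(ld_r_squared(590, 5, 5, 400), 3))
```

Values near 1 mean that one SNP almost fully predicts the other, which is why tightly linked pairs are often treated as a single unit in downstream genotype grouping.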
To test the independence of the four SNPs, a stepwise (forward conditional) logistic regression analysis was performed in the combined cohort. As shown in Table 3, only rs653286 was independently associated with the disease at both the allelic (P = 0.022) and genotypic (dominant model: P = 2.16 × 10⁻³) levels. Subsequent addition of rs291985, rs682658, and rs680123 to the model did not yield significant associations with SLE susceptibility.
To investigate whether the independent protective SNP rs653286 predisposes to any particular disease manifestation(s), we assessed the association(s) between rs653286 and clinical/serologic features in a case-only cohort. Following stratification, rs653286 was significantly associated with anti-dsDNA antibodies (dominant model: OR 0.68, P = 0.024). No association was observed for other SLE manifestations (Table S4), which might be explained by the limited statistical power of this analysis.
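As an illustration of what a dominant-model odds ratio measures, the sketch below collapses genotype counts into carriers versus non-carriers of the minor allele and computes an unadjusted OR with a Wald (Woolf) 95% confidence interval. This is a simplification: the study used logistic regression adjusted for age and sex, and the function name `dominant_model_or` and all counts are hypothetical.

```python
import math


def dominant_model_or(group1_counts, group2_counts, z=1.96):
    """Unadjusted dominant-model odds ratio with a Wald confidence interval.

    Each argument is a (MM, Mm, mm) tuple of genotype counts, where m is
    the minor allele. group1 is the group of interest (for example cases,
    or antibody-positive patients) and group2 is the comparison group.
    """
    a = group1_counts[1] + group1_counts[2]  # carriers in group 1
    b = group1_counts[0]                     # non-carriers in group 1
    c = group2_counts[1] + group2_counts[2]  # carriers in group 2
    d = group2_counts[0]                     # non-carriers in group 2
    if min(a, b, c, d) == 0:
        raise ValueError("all four cells must be non-zero")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts, NOT taken from the study.
or_value, low, high = dominant_model_or((100, 80, 20), (80, 90, 30))
print(f"OR = {or_value:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An OR below 1 with an interval that excludes 1 corresponds to a protective effect of carrying the minor allele in this unadjusted setting.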

Bioinformatics annotations support potential regulatory function(s) of the four C1Q variants. As the four variants are located either near the 5′-UTR or in introns, we performed bioinformatics analysis to assess their potential regulatory effects. As annotated in the rVarBase database, all four variants displayed potential regulatory activities, such as location within TF (transcription factor) binding sites or chromatin interactive regions, LD-proxy relationships with rSNPs or overlap with rCNVs, regulation of TSSs (transcription start sites) or transcriptional enhancers, and/or association with mRNA abundance (Fig. 1A). In the RegulomeDB database, the four variants also showed regulatory potential through effects on protein binding, chromatin structure, and histone modifications (Supplementary Table S5). In the Blood eQTL database, SNPs rs653286 and rs291985 displayed striking cis-eQTL (expression quantitative trait loci) effects on expression of C1QB (rs653286: P = 2.91 × 10⁻⁷⁷, FDR = 0.00, Z score = 18.61; rs291985: P = 1.17 × 10⁻⁷⁹, FDR = 0.00, Z score = 18.90), C1QA (rs653286: P = 6.32 × 10⁻⁶, FDR = 0.01, Z score = 4.52; rs291985: P = 1.34 × 10⁻⁵, [...]
Genotype-dependent expression analyses are further evidence of the 'protective' role of the four C1Q variants. Supported by the functional annotations, we next sought to determine whether the 'protective' alleles at the four SNPs had any impact on serum C1q levels. As SNPs rs680123-rs682658 and rs653286-rs291985 are in almost complete LD (r² = 0.96/0.97 in controls/cases, respectively), we defined individuals homozygous for the minor alleles of the four SNPs as the "m/m" genotype and individuals homozygous for the major alleles as the "M/M" genotype. As shown in Fig. 1C, in concordance with our association data and in silico functional annotations, individuals homozygous for the 'protective' allele ("m/m") at all four SNPs had significantly higher serum C1q levels than individuals homozygous for the major allele ("M/M") (rs680123-rs682658: P = 0.0022; rs653286-rs291985: P = 0.0076) or heterozygous individuals ("M/m") (rs680123-rs682658: P = 0.0058; rs653286-rs291985: P = 0.0233).
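The genotype-group comparisons of serum C1q were made with the Mann-Whitney test (see Statistical analyses). A minimal sketch of that test is given here, assuming a two-sided normal approximation with tie correction and no continuity correction; statistical packages such as the SPSS version used in the study may apply exact or corrected variants. The function name `mann_whitney_u` and the serum values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1..j
        t = j - i                   # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += mid_rank * sum(1 for k in range(i, j)
                                     if combined[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all observations are identical")
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Hypothetical serum C1q values (arbitrary units), NOT study data.
mm_group = [210.0, 195.5, 230.2, 188.0, 221.7]
MM_group = [160.3, 172.8, 150.1, 181.4, 166.0]
u_stat, p = mann_whitney_u(mm_group, MM_group)
print(f"U = {u_stat}, p = {p:.4f}")
```

Because the test uses ranks only, it makes no assumption that serum C1q levels are normally distributed within a genotype group.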

Discussion
Although rare deficiencies of C1Q and C1S have been reported as causative genetic risks for SLE and common variants residing in C1Q have been linked to SLE, there are no reports concerning polymorphisms of C1R and C1S in susceptibility to SLE. In the present study, we investigated the possible association(s) of the six C1-complex genes, i.e. C1QA, C1QC, C1QB, C1S, C1R, and a novel human complement-related gene, C1RL, with SLE susceptibility or its clinical/serologic manifestations. We demonstrate that only polymorphisms in C1Q, but not in C1R, C1RL, or C1S, are negatively associated with SLE. The C1QC SNP rs653286 has an independent protective effect on SLE susceptibility. Bioinformatics annotations and genotype-dependent expression analyses further support a potential regulatory role for each of the four C1Q variants.
Several studies have linked common C1Q polymorphisms to SLE susceptibility or sub-phenotypes 8,9,12. Among those, Martens et al. 8 reported that C1QB rs631090 was associated with SLE susceptibility, and that C1QA rs292001 and C1QC rs294183 were associated with more severe disease, in a Caucasian cohort of 103 patients with SLE and their first-degree relatives. Although these three SNPs were not tested in the present study, they were captured by our candidate SNPs rs629409 (rs631090-rs629409, r² = 0.88), rs680123 (rs292001-rs680123, r² = 1.0), and rs653286 (rs294183-rs653286, r² = 0.84), respectively (Supplementary Figure S2). Of these, rs680123 and rs653286 were associated with SLE in the present study. Interestingly, both SNPs conferred protective effects in our study. Although rs629409 is in strong LD with rs631090 and has also been reported as a risk variant for SLE 9, no association between this variant and SLE was observed in the present work. We also genotyped three additional SNPs, rs12033074, rs4655085, and rs672693, known as risk variants for SLE or its subphenotypes 9. However, none of these SNPs showed any association in our study. This finding may indicate that genetic heterogeneity exists among different populations. In addition, C1QA rs172378 has been reported to be associated with photosensitivity in lupus patients in African American and Hispanic populations 9. However, in a previous study that used a candidate gene approach, rs172378 was not associated with SLE in a Han Chinese cohort 10. As rs172378 is not a tag-SNP according to the HapMap phase III CHB panel, it was not included in our study.
As the four polymorphisms reside near the 5′-UTR or in introns, their functional consequences and the mechanism(s) underlying this genetic association remain unclear. However, in silico functional annotation indicated potential regulatory activity for all four variants. SNPs rs653286 and rs291985 also showed strong cis-eQTL effects on C1Q gene expression. Furthermore, several studies have suggested that functionally impaired C1q may contribute to SLE pathogenesis 13,14. Active disease in patients with SLE is often accompanied by low levels of C1q and other classical complement components 15,16. Restoration of C1q levels by plasma transfusion in C1q-deficient lupus patients resulted in amelioration of the disease 17. These data are biologically consistent with our finding that individuals homozygous for the 'protective' allele ("m/m") at the four SNPs had significantly higher levels of serum C1q. Interestingly, the 'protective' T allele at rs653286 also conferred reduced risk for anti-dsDNA antibodies. Anti-dsDNA antibodies are one of the putative serologic markers for diagnosis of SLE, and levels of circulating anti-dsDNA antibodies fluctuate with disease activity in lupus patients. Furthermore, Yang et al. reported that the combination of anti-C1q and anti-dsDNA autoantibodies indicated higher renal disease activity and predicted poor renal outcome in patients with lupus nephritis 18. Human anti-DNA antibodies can cross-react with C1q and deposit in the kidney in lupus patients 19. The protective effect of the rs653286 T allele on this serologic manifestation may explain, at least in part, its consistent association with increased serum C1q expression in SLE patients.
In summary, to the best of our knowledge, this is the first report to investigate the genetic association between the six C1-complex genes and susceptibility to SLE. Our data indicate that, among these genes, only C1Q polymorphisms are associated with susceptibility to SLE. The four novel variants all confer reduced risk of SLE and affect C1Q transcript abundance. The C1QC SNP rs653286 has an independent protective effect on SLE susceptibility. Association studies in other populations will be required to confirm our findings, and functional characterization of these polymorphisms is warranted to fully understand their contributions to SLE pathogenesis.

Materials and Methods
Study design and study population. A two-stage case-control study was conducted. Two independent cohorts were enrolled: a discovery cohort of 384 SLE patients and 384 healthy controls, and a replication cohort of 507 SLE patients and 645 controls. Patients with SLE satisfied the 1982 revised American College of Rheumatology classification criteria for a diagnosis of SLE 20. Autoantibodies, including anti-nuclear antibodies (ANA), anti-double stranded DNA (dsDNA) antibodies, anti-SSA/SSB antibodies, anti-Smith (Sm) [...] The SLE patients in the discovery and replication cohorts were recruited from the Department of Rheumatology at Peking University People's Hospital and People's Hospital of Xinjiang Province, respectively. The healthy controls were recruited from Health Care Centers of People's Hospital. All patients and healthy controls were Han Chinese. The baseline demographic characteristics of patients and healthy controls are summarized in Table 4.
The study was approved by the Medical Ethics Committee of Peking University People's Hospital, and written informed consent was obtained from all participants. All methods were performed in accordance with the relevant guidelines and regulations.
SNP selection and genotyping. A total of 22 SNPs were selected for the discovery screening, including 18 tag-SNPs spanning the C1QA, C1QC, C1QB, C1S, C1R, and C1RL loci and 4 additional SNPs known to be associated with SLE or related symptoms (Table 1). Linkage disequilibrium (LD) tag-SNPs were selected with a threshold of r² < 0.8 and a minor allele frequency (MAF) ≥ 5% using Haploview v4.2, according to the HapMap phase III Han Chinese in Beijing (CHB) panel (http://hapmap.ncbi.nlm.nih.gov/, Figure S2). For the validation analysis, the top 4 SNPs with significant or nominal associations in the discovery cohort were further genotyped in the replication cohort.
All SNPs were genotyped using the Sequenom MassArray platform (Sequenom, San Diego, California) at Beijing SequeSci Co., Ltd. Briefly, DNA samples from study subjects were randomly assigned to 96-well plates, and genotyping was performed blind to the status of all samples. Genotyping was repeated in 5% of the samples for validation and quality control. The genotyping error rate was less than 0.1%. Individuals with genotyping success rates of less than 90% were excluded from the analyses. Individual SNP markers with more than 10% missing genotypes were also removed from the analyses.
Quantification of human [...] instructions (eBioscience, San Diego, CA). In brief, an anti-human C1q monoclonal antibody was coated and adsorbed onto microwells. The absorbance was measured at 450 nm. The concentration of C1q in a serum sample was determined by matching its absorbance with the corresponding C1q concentration on the standard curve. Samples were run in duplicate and analyzed individually. All cases had genotyping data.
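Reading a sample concentration off an ELISA standard curve can be sketched as follows. The text does not say how the curve was fitted, so this example assumes simple linear interpolation between adjacent standards; kit software often fits a smooth curve instead. The function name `concentration_from_absorbance` and the standard values are hypothetical.

```python
def concentration_from_absorbance(absorbance, standards):
    """Interpolate a concentration from (concentration, absorbance) standards.

    Uses straight-line interpolation between the two standards whose
    absorbances bracket the sample. Values outside the range of the
    standards are rejected rather than extrapolated.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are required")
    points = sorted(standards, key=lambda point: point[1])
    if not points[0][1] <= absorbance <= points[-1][1]:
        raise ValueError("absorbance lies outside the standard curve")
    for (conc_lo, abs_lo), (conc_hi, abs_hi) in zip(points, points[1:]):
        if abs_lo <= absorbance <= abs_hi:
            if abs_hi == abs_lo:
                raise ValueError("standards must have distinct absorbances")
            fraction = (absorbance - abs_lo) / (abs_hi - abs_lo)
            return conc_lo + fraction * (conc_hi - conc_lo)


# Hypothetical standards as (concentration, absorbance at 450 nm) pairs.
standards = [(0.0, 0.05), (25.0, 0.40), (50.0, 0.80), (100.0, 1.50)]
print(concentration_from_absorbance(0.60, standards))
```

Rejecting out-of-range absorbances mirrors the usual practice of diluting and re-running samples that fall above the highest standard.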
Bioinformatics analysis. The variants' potential regulatory features were annotated according to RegulomeDB (http://regulome.stanford.edu/) and rVarBase (http://rv.psych.ac.cn/). RegulomeDB is a database that annotates SNPs with known and predicted regulatory features in the non-coding regions of the human genome. Known and predicted regulatory DNA elements include sites of DNase hypersensitivity, binding sites for transcription factors, and promoter regions that have been biochemically characterized to regulate transcription. The sources of these data include public datasets from GEO, the ENCODE project, and published literature 21. rVarBase annotates a variant's regulatory features in the following aspects: the chromatin state of the region surrounding the variant, regulatory elements that overlap the variant, and the variant's potential target genes. rVarBase also provides extended annotations for variants, including LD-proxies of known SNPs, SNPs/CNVs that overlap or co-localize with the queried variant, and traits (disease and expression quantitative traits) associated with the variant 22. The blood eQTL data were derived from the Blood eQTL browser (http://genenetwork.nl/bloodeqtlbrowser/) 23,24.
Statistical analyses. The Hardy-Weinberg equilibrium (HWE) test was performed for each polymorphism using Pearson's goodness-of-fit chi-square test. Heterogeneity among study cohorts was evaluated with the Mantel-Haenszel method using Review Manager 5 software (www.cc-ims.net/RevMan). A significant I² statistic (I² > 30%, P < 0.05) indicated heterogeneity of ORs across studies. The fixed-effects model was applied in the current heterogeneity analyses. Allele and genotype frequencies were compared between cases and controls using Pearson's chi-square test and logistic regression adjusted for age and sex, respectively. Odds ratios (ORs) with 95% confidence intervals (CIs) were calculated to estimate the relative risk of developing SLE or clinical/serologic manifestations. A stepwise (forward conditional) logistic regression analysis was performed to test the independence of the identified SNPs. LD and haplotypes were calculated using the online software SHEsis (http://analysis2.bio-x.cn/myAnalysis.php). The Mann-Whitney test was applied to compare serum C1q levels between two genotypic groups. Statistical analyses were conducted using SPSS 13.0 software (SPSS Inc., Chicago, IL). The false discovery rate (FDR, q-value) was applied for multiple-testing correction. A P-value ≤ 0.05 was considered nominally statistically significant. A q-value ≤ 0.10 was considered statistically significant.
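The Hardy-Weinberg goodness-of-fit test mentioned at the start of this section can be sketched as below. This is a minimal illustration that assumes a biallelic SNP and a chi-square test with one degree of freedom; the function name `hwe_chi_square` and the genotype counts are hypothetical and are not study data.

```python
import math


def hwe_chi_square(n_MM, n_Mm, n_mm):
    """Pearson goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_MM + n_Mm + n_mm
    p = (2 * n_MM + n_Mm) / (2 * n)  # frequency of allele M
    q = 1 - p                        # frequency of allele m
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        raise ValueError("the SNP must be polymorphic")
    observed = (n_MM, n_Mm, n_mm)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts.
chi2, p_value = hwe_chi_square(180, 160, 44)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

In case-control studies this check is typically run on controls, where a marked departure from equilibrium can point to genotyping problems.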