Replication of Gout/Urate Concentrations GWAS Susceptibility Loci Associated with Gout in a Han Chinese Population

Gout is a chronic disease resulting from elevated serum urate (SU). Previous genome-wide association studies (GWAS) have identified dozens of susceptibility loci for SU/gout, but few have been conducted in populations of Chinese descent. Here, we investigate whether these loci contribute to gout risk in Han Chinese. A total of 2255 variants in linkage disequilibrium (LD) with GWAS-identified SU/gout associated variants were analyzed in a Han Chinese cohort of 1255 gout patients and 1848 controls. Cumulative genetic risk score analysis was performed to assess the cumulative effect of multiple “risk” variants on gout incidence. Of the LD-pruned variant set (n = 56), 23 variants (41%) showed nominal association with gout in our sample (p < 0.05). Several previously reported gout associated loci, including ABCG2, SLC2A9, GCKR, ALDH2 and CNIH2, were replicated; ALDH16A1 was the exception. Cumulative genetic risk score analyses showed that the risk of gout increased progressively in individuals carrying a growing number (≥8) of risk alleles at gout associated loci. Most of the gout associated loci identified in previous GWAS were confirmed in an independent Chinese cohort, and the SU associated loci also confer susceptibility to gout. These findings provide important information on the genetic basis of gout.

genetic studies for SU/gout have been conducted in Han Chinese [16][17][18][19]. However, most of them examined only a small minority of loci. In the present study, we determine whether the previously identified SU/gout loci affect susceptibility to gout in Chinese using our recent gout GWAS dataset.

Methods
Samples, genotyping and variant selection. All samples, comprising 1255 clinically ascertained gout patients and 1848 healthy controls, were from Han Chinese males who provided written informed consent, as described in our recent gout GWAS paper (Supplementary Methods) [13]. Clinical characteristics of the samples are shown in Supplementary Table S1. Genotyping was conducted using the Affymetrix Axiom Genome-Wide CHB Array. Quality control and imputation were performed as described previously [13], and a brief description is given in Supplementary Methods. All previously identified genome-wide significant loci (p < 5.0 × 10⁻⁸) related to gout/SU were obtained from the NHGRI GWAS catalog (as of May 12, 2015) and from further fine-mapping or mutation analysis studies [20][21][22] (Supplementary Tables S2 and S3). Because linkage disequilibrium (LD) patterns at the same susceptibility locus might differ across ethnicities, we included all available variants in LD (r² > 0.6) with the genome-wide significant variants, based on 1000 Genomes Project datasets. A total of 2255 variants in 1255 gout patients and 1848 controls were retained for subsequent analyses.
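The r² > 0.6 inclusion rule can be illustrated with the standard definition of r² between two biallelic variants. The following is a minimal sketch only: the function names and the haplotype frequencies in the usage note are hypothetical, and the study itself derived LD from 1000 Genomes Project data rather than from code like this.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A at the first
          variant and allele B at the second variant
    p_a:  frequency of allele A
    p_b:  frequency of allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        # r^2 is undefined when either variant is non-polymorphic
        raise ValueError("both variants must be polymorphic")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def passes_ld_filter(p_ab: float, p_a: float, p_b: float,
                     threshold: float = 0.6) -> bool:
    """True if the pair exceeds the r^2 threshold (0.6 in the text)."""
    return ld_r2(p_ab, p_a, p_b) > threshold
```

For example, with hypothetical frequencies `p_ab = 0.3`, `p_a = 0.3`, `p_b = 0.3` the two alleles always travel together and r² is 1, whereas `p_ab = 0.09` with the same allele frequencies corresponds to independence and r² of 0.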
Statistical analysis. Association analysis was performed using logistic regression with 20 principal components as covariates to correct for potential population stratification (PCA adjustment analysis, Supplementary Methods). To approximate the number of independent variants within each region, we pruned the variants based on LD; an r² threshold of 0.2 yielded 56 LD-pruned variants. A simple Bonferroni correction for multiple comparisons (n = 2255) was applied, so the statistical significance level was set at 2.22 × 10⁻⁵ (0.05/2255). An uncorrected p value below 0.05 was considered nominal evidence of association. For variants in the gout associated loci, nominal association was treated as successful replication, because previous evidence for the associations between these loci and gout was solid. The association and LD pruning analyses were performed using PLINK [23]. The exact binomial test was performed in R by comparing the direction of effect of the tested SNPs between our dataset and the previous reports; the p value was computed under the null hypothesis that the probability of a consistent direction is 0.50 (H0: p = 0.50). Cumulative genetic risk score analysis was conducted by counting risk alleles for each individual (unweighted) and estimating the effect on gout risk using logistic regression adjusted for the principal components.
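The significance thresholds and the unweighted risk score described above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code used in the study: the function names and the genotype dictionary layout are hypothetical, and the logistic regression step with principal components is not reproduced here.

```python
N_TESTS = 2255                 # number of variants tested (from the text)
ALPHA = 0.05                   # nominal significance level
BONFERRONI = ALPHA / N_TESTS   # about 2.22e-5, as stated in the text


def classify_p_value(p: float) -> str:
    """Label a p value using the two thresholds described in the text."""
    if p < BONFERRONI:
        return "significant"
    if p < ALPHA:
        return "nominal"
    return "not significant"


def unweighted_risk_score(risk_allele_copies: dict, variants: list) -> int:
    """Count risk alleles across the selected variants for one individual.

    risk_allele_copies maps a variant ID to the number of risk-allele
    copies carried (0, 1 or 2). This layout is an assumption made for
    the sketch.
    """
    total = 0
    for variant in variants:
        copies = risk_allele_copies[variant]
        if copies not in (0, 1, 2):
            raise ValueError(f"invalid genotype for {variant}: {copies}")
        total += copies
    return total
```

As a usage example, `classify_p_value(1.94e-5)` falls below the Bonferroni threshold, while `classify_p_value(9.58e-5)` is only nominal. A hypothetical individual with `{"rs2231142": 2, "rs1260326": 1, "rs671": 0}` scored over those three variants has an unweighted score of 3.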
The study protocol was approved by the Ethics Committee of the Affiliated Hospital, Qingdao University. All procedures were conducted in accordance with the Declaration of Helsinki [24].
Data availability. The results are available upon request by contacting Li CG or Shi YY. Any additional data (beyond those included in the main text and Supplementary Information) that support the findings of this study are also available from the corresponding author upon request.
Ethics approval. This study was approved by the relevant ethics review board at the Affiliated Hospital of Qingdao University.
Previously identified susceptibility SNPs are usually considered the more important variants, and the non-synonymous ones deserve priority because they are the most likely to have functional consequences and to be involved in the pathology of gout. We therefore performed further analysis of the previously reported non-synonymous variants (Supplementary Table S3). Eight of these variants were available in our dataset (Table 2), and all the gout-risk and SU-raising alleles were overrepresented in our cases (exact binomial test, p = 7.81 × 10⁻³). Of them, two variants exhibited statistically significant associations (ABCG2 Q141K (rs2231142), p = 3.83 × 10⁻¹⁰, and SLC17A1 I269T (rs1165196), p = 1.94 × 10⁻⁵) and three showed nominally significant associations (SLC17A4 A318T (rs11754288), p = 9.58 × 10⁻⁵; GCKR L446P (rs1260326), p = 2.23 × 10⁻⁴; and ALDH2 E504K (rs671), p = 6.80 × 10⁻³). SLC2A9 V253I (rs16890979) and ABCG2 Q126X (rs72552713) were rare in our sample, with a minor allele frequency of about 1%. Because Q126X is this rare, the effect of ABCG2 Q141K may mask it, so we performed a multivariate logistic regression including only Q126X and Q141K of ABCG2 (Supplementary Table S6). Compared with the univariate analysis, the effect of Q126X increased (the OR rose from 1.404 to 2.027), consistent with a similar analysis in the Japanese study [14]. However, it remained non-significant (p = 0.1612). Such rare variants often require a larger sample size for significant associations to be detected. SLC22A12 G65W (rs12800450) and ALDH16A1 P476A (rs150414818) were absent from our dataset. Both are low-frequency variants reported to be associated with gout in European and/or American samples [9, 10]; however, they were non-polymorphic in the 1000 Genomes Project datasets.
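The exact binomial p value reported above can be checked by hand: if all eight variants show a direction-consistent effect and the null probability of consistency is 0.50, the two-sided p value is 2 × 0.5⁸ = 0.0078125, or 7.81 × 10⁻³. The sketch below is a minimal pure-Python version of a two-sided exact binomial test. The function name is hypothetical, and the text does not say which R function was used, so the tie-handling rule here is an assumption.

```python
from math import comb


def exact_binomial_two_sided(successes: int, trials: int,
                             p0: float = 0.5) -> float:
    """Two-sided exact binomial p value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null success probability p0.
    """
    pmf = [
        comb(trials, i) * p0 ** i * (1.0 - p0) ** (trials - i)
        for i in range(trials + 1)
    ]
    observed = pmf[successes]
    # small relative tolerance so that ties are counted despite rounding
    p_value = sum(q for q in pmf if q <= observed * (1.0 + 1e-7))
    return min(1.0, p_value)


# Eight of eight variants with a direction-consistent effect
print(exact_binomial_two_sided(8, 8))  # 0.0078125
```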
Of note, the other gout associated SNPs identified in the Japanese study [14] also showed direction-consistent associations with nominal significance in our dataset (rs4073582, p = 0.0339, and rs3775948, p = 3.09 × 10⁻³).
We then investigated the cumulative effect of risk alleles of gout associated variants at these loci. Conditional analysis was used to test for independent effects at loci with multiple significant SNPs. Previously identified SNPs (especially the gout associated and non-synonymous ones) were given higher priority in the analysis because of their greater importance. The conditional analysis indicated seven independent variants for the gout associated loci: rs1260326 (L446P) of GCKR, rs11722228 of SLC2A9, rs12505410 and rs2231142 (Q141K) of ABCG2, rs4073582 of CNIH2, rs671 (E504K) of ALDH2 (MYL2-CUX2) and rs9895661 of BCAS3 (Supplementary Table S7); only these independent variants were included in the cumulative genetic risk score analysis. We observed a strong increase in the OR with increasing risk allele load (Fig. 1). Compared with the reference category of five or fewer risk alleles, the ORs for having 8, 9, 10, 11, or 12 or more risk alleles were 1.310, 2.925, 4.158, 6.892 and 16.361, respectively (Supplementary Methods and Table S8).
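The binning of risk allele counts and the comparison of each bin with the reference category can be sketched as follows. Note the assumption: the ORs reported in this study come from logistic regression adjusted for principal components, whereas this sketch computes a crude (unadjusted) OR with a Woolf-type 95% CI from a 2 × 2 table. The function names are hypothetical, and any counts passed in are illustrative rather than taken from the study.

```python
from math import exp, log, sqrt


def risk_bin(count: int, low: int = 5, high: int = 12) -> str:
    """Map a risk-allele count to a bin: <=5, 6 to 11 individually, >=12."""
    if count <= low:
        return f"<={low}"
    if count >= high:
        return f">={high}"
    return str(count)


def crude_odds_ratio(cases_bin: int, controls_bin: int,
                     cases_ref: int, controls_ref: int):
    """Unadjusted OR of a bin versus the reference bin, with a 95% CI.

    Returns (odds_ratio, lower_95, upper_95). All four cell counts must
    be positive for the estimate and its standard error to be defined.
    """
    cells = (cases_bin, controls_bin, cases_ref, controls_ref)
    if min(cells) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (cases_bin * controls_ref) / (controls_bin * cases_ref)
    se_log_or = sqrt(sum(1.0 / c for c in cells))
    lower = exp(log(odds_ratio) - 1.96 * se_log_or)
    upper = exp(log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper
```

With seven variants an individual carries between 0 and 14 risk alleles, so `risk_bin(3)` falls in the reference bin and `risk_bin(14)` in the top bin. An adjusted analysis would replace `crude_odds_ratio` with a regression that includes the bin indicators and the principal components as covariates.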
One of the other 12 significant variants from the SU associated loci, rs68094823 (p = 4.33 × 10⁻⁶, OR = 0.546), was statistically significant after Bonferroni correction (Table 1). rs68094823 is an intron variant of SLC17A1 (also known as NPT1) and is in strong LD with a previously identified SU associated variant (rs1165151, r² = 0.90) [11]. Haplotype analysis suggested that our findings were consistent with previous findings; that is, the gout risk allele is in high LD with the SU-raising allele. Within the SLC17A1 locus, a common missense variant, rs1165196 (I269T), requires special attention. A previous study showed that rs1165196 was significantly associated with renal underexcretion gout (a major subtype of gout), but not with gout overall [20]. In the present study, rs1165196 reached statistical significance (p = 1.94 × 10⁻⁵, OR = 0.570), confirming its association with gout (Table 2). The conditional analysis showed that rs1165196 could be the single independent variant in the SLC17A1 locus (Supplementary Table S7).
For the SU associated loci, we observed 12 independent variants (rs17632159, rs6935612, rs1165196 (SLC17A1 I269T), rs3734692, rs9321446, rs9314273, rs10821871, rs2361216, rs11172134, rs7978353, rs61168554 and rs11150190). In the cumulative genetic risk score analysis of these variants, we also observed a trend of increasing gout risk with a growing number of risk alleles (Supplementary Methods and Figure S1). With eight or fewer risk alleles as the reference group, ORs for the groups with more risk alleles ranged from 1.644 to 8.884 (Supplementary Table S9). We also found an additive effect of the variants from the gout and SU associated loci: the trend of increasing ORs for the cumulative effect of the seven variants at gout associated loci became steeper when additional risk alleles at SU associated loci were considered (Fig. 1). Compared with the reference category of five or fewer risk alleles at the variants in gout associated loci, ORs ranged from 1.697 to 30.230 for the categories with 8 or more risk alleles at gout associated loci and, at the same time, nine or more risk alleles at SU associated loci (Supplementary Methods and Table S10).
Table 2. Association results for the selected important variants. CHR, chromosome; SNP, dbSNP rs number; BP, position based on hg19; A1, minor allele for the whole sample; Freq., frequency of A1 for cases/controls; OR, odds ratio for A1; L95, lower endpoint of the 95% confidence interval (CI); U95, upper endpoint of the 95% confidence interval; P, p value; Reported gene(s), the gene(s) reported in the previous GWAS; aa_change, amino acid change; Gout or SU, whether the locus was found to be associated with gout or SU. Variants with p < 0.05 are indicated in bold. All ORs (95% CI) and p values reported in this study are based on the PCA adjustment analysis.
Scientific Reports | 7: 4094 | DOI:10.1038/s41598-017-04127-4

Discussion
We used Han Chinese GWAS data with clinically defined gout cases to investigate whether variants associated with gout/SU in other studies can be replicated. For the previously reported gout associated loci, we provide further solid support that the well-known urate transporter genes (ABCG2 and SLC2A9) and the glucokinase regulatory protein gene (GCKR) are associated with gout [2,9,11,14,25]. For the first time, we replicated the associations of the CNIH2 and MYL2-CUX2 (ALDH2) loci [14,22] with gout using data from a different ethnic group. Moreover, one additional SU associated locus (SLC17A1) was found to be significantly associated with gout. Cumulative effects on gout risk were observed in our samples for the variants from the gout associated loci. A similar result was observed for the variants from the SU associated loci, although the increase in OR was more moderate. Combined analysis of the gout and SU loci showed an additional additive effect.
This study represents a comprehensive evaluation, in a Han Chinese cohort, of the individual and cumulative effects on gout risk of loci previously identified by GWAS of gout/SU. Replication across different ethnic groups provides stronger evidence for the associations between gout and these loci, and their biological mechanisms will become increasingly important for understanding the etiology of gout. However, it should be noted that our data did not provide very strong support for most of the loci, which might be due to the limited sample size of this study and the modest effect sizes of the risk variants. Additional studies with larger sample sizes and functional studies (mechanistic studies, functional assays, etc.) will be needed to further clarify the roles of these loci in gout risk. Meanwhile, large-scale GWAS in multiple populations are necessary to uncover additional genetic factors, especially those with small to moderate effect sizes, and to further understand the genetic architecture of gout.
Figure 1. Cumulative effect of the associated variants from gout associated loci and gout + SU associated loci on gout incidence. For the analysis using variants from the gout associated loci (GOUT, blue), seven variants (rs1260326 (L446P) of GCKR, rs11722228 of SLC2A9, rs12505410 and rs2231142 (Q141K) of ABCG2, rs4073582 of CNIH2, rs671 (E504K) of ALDH2 (MYL2-CUX2) and rs9895661 of BCAS3) were included and eight bins (≤5, 6, 7, 8, 9, 10, 11, and ≥12) were generated. Using the ≤5 bin as the reference category, the OR and 95% CI for each of the other bins (6, 7, 8, 9, 10, 11, and ≥12) were assessed using logistic regression. For the combined analysis of variants from gout and SU associated loci (GOUT + SU, red), we also used the ≤5 bin from the gout associated loci analysis as the reference, and excluded individuals with ≤8 risk alleles in the SU associated loci analysis from the test bins (Supplementary Methods).