Uric acid is the final product of purine metabolism in humans1. Uric acid is produced primarily in the liver, and 70% of daily uric acid excretion occurs via the kidney1. The uric acid transport system in the kidney plays an important role in its excretion2, since only 10% of the initially filtered uric acid is excreted through urine3. The serum uric acid (SUA) level is determined by the balance of renal clearance, extrarenal clearance in the gut, and purine metabolism in the liver. The absence of the enzyme uricase and the presence of the renal uptake system contribute to higher SUA levels in humans, compared to that in other mammals1,4. SUA levels in the general population follow a normal distribution2. The physiological normal range of uric acid is 2.5-7.5 and 2.0-6.5 mg/dL for men and women, respectively5. The female hormones that influence the renal uptake of uric acid6,7 and the effects of genetic variations in SLC2A9 and ABCG28,9 are reported as sex-related factors.

Hyperuricemia is the result of overactive hepatic metabolism and high cell turnover or renal under-excretion/extra-renal under-excretion or a combination of both10. Generally, renal and gut under-excretion account for two-thirds11 and one-third12 of hyperuricemia caused by under-excretion. In addition to environmental factors, genetic factors identified by previous genome-wide association studies (GWASs) have been implicated in the critical process of uric acid excretion and its role in hyperuricemia13,14,15,16,17,18. SLC2A9 and ABCG2 were found to be the most prominent loci in a GWAS of more than 143,160 participants of European ancestry9 and a trans-ancestry GWAS of 457,690 individuals19.

A rare, loss-of-function variant (p.W258X) of SLC22A12 has been reported to be a founder mutation of familial renal hypouricemia (RHUC1, OMIM #220150) in the Japanese population20. An earlier GWAS reported an association between common variants of the SLC22A12 locus and SUA levels15. The loss-of-function variant (p.W258X) could not be identified in the Asian Genetic Epidemiology Network (AGEN) GWAS owing to its rare occurrence in East Asian populations21. Recently, rare functional variants (p.R325W, p.R405C, and p.T467M) of the SLC22A12 gene have been detected in European and African-American populations, highlighting the importance of population-specific rare and common variants22.

Global research in GWAS suffers from a Eurocentric bias and, as a consequence, disproportionately underrepresents non-European populations23,24. Although trans-ancestry GWASs have identified more than 180 loci associated with SUA25,26, the current challenges include unearthing explanations for the remaining missing heritability of SUA by identifying additional genetic variants in diverse populations and a systematic evaluation of differential effects of common and rare variants. Recently, it has been shown that common and low-frequency variants can accurately be imputed (R2 ≥ 0.8) using a large imputation panel of the Haplotype Reference Consortium (HRC) consisting of 64,976 haplotypes27. Here, we performed a genome-wide association analysis of SUA levels in the Korean population and evaluated the genetic effects on SUA levels using imputed single-nucleotide polymorphism (SNP) chip data. We also calculated polygenic risk scores (PRS) to evaluate the variability of SUA in multiple Korean cohorts.


Characteristics of the participants

In total, 6,881 individuals from urban and rural cohorts were included in the discovery phase, and a total of 3,194 individuals from the Ansan-Ansung and Kangbuk Samsung Hospital (KBSMC) cohorts were used for model validation (Table 1). The mean characteristics of the participants were slightly or substantially different between the cohorts. Overall, the urban and Ansan-Ansung cohorts included more males and more participants with hypertension, hyperlipidaemia, and diabetes than the rural cohort, which were included as covariates in the association analysis. Our discovery (n = 6,881) and replication (n = 3,194) sets consisted of individuals with preserved renal function (eGFR > 60 ml/min); however, we did not exclude participants with hypertension or diabetes.

Table 1 Clinical characteristics of the study cohort.

Meta-analysis of SUA levels

We analysed a total of 6,129,701 SNPs in the discovery phase to identify the association with SUA levels in individuals using linear regression and discovered eight independent SNPs (two low-frequency and six common) that reached the threshold of statistical significance (Table 2). In the discovery phase, the inflation factor for the meta-analysis of SUA was 0.998 after performing a genomic control (1.02 without genomic control) (Supplementary Fig. S1). A Manhattan plot of the meta-analysis on SUA levels is shown in Fig. 1. A total of 1,149 SNPs that reached a level of genome-wide significance (P-value <5.0 × 10−8) for association with SUA levels are listed in Supplementary Table S1, and the 10 independent nonsynonymous SNPs that passed Bonferroni’s correction for 11,600 nonsynonymous SNPs (P-value <4.31 × 10−6) are listed in Supplementary Table S3. The regional plots for genome-wide significant association are shown in Supplementary Fig. S3. Conditional analyses identified multiple variants, including rs184521656, a low-frequency (minor allele frequency (MAF) = 0.01) intronic variant in FRMD8, that reached a level of genome-wide significance for association with SUA levels after conditioning with pre-selected lead SNPs in the flanking region (Supplementary Fig. S4a). Other regions did not show multiple independent signals at the level of genome-wide significance (Supplementary Fig. S4). The results of the main meta-analysis were similar to the simple model that used only sex, age, and the first 10 principal components of genetic variants as covariates, except for the chromosome 17 locus near the BCAS3 gene, which is a previously reported locus (Supplementary Fig. S5)9.

Table 2 Lead variants associated with SUA from meta-analysis of GWASs.
Figure 1
figure 1

Manhattan plot depicting genome-wide association analysis of serum uric acid. Each dot represents a variant plotted as −log10(P-value) on the y axis against the corresponding variant position on the x axis. Dots in blue indicate variants in loci characterised by a genome-wide significance level (shown by the pink dashed line).

Of note, a rare nonsense variant in the SLC22A12 locus showed the strongest association with SUA levels (rs121907892, P-value = 7.4 × 10−54; β = −1.15 mg/dl; standard error (SE) = 0.07 mg/dl) in the discovery phase (Supplementary Fig. S2). The locus was monomorphic in most populations, but the variant was found to be present at a very low frequency in only East Asian populations. This variant has been reported in a previous exome-wide association study in the Japanese population28. It is also present in very low linkage disequilibrium with most of the other neighbouring variants. Another low-frequency missense variant in SLC2A9, previously identified in GWAS studies9,29, also showed significant associations with SUA levels (rs16890979, P-value = 5.86 × 10−7; β = −0.47 mg/dl; SE = 0.09 mg/dl) in the discovery phase. This variant is a common polymorphism observed in most tested populations. Other significant common variants were found to be located near the ABCG2, SLC2A9, NRXN2, NAA25, SLC17A3-SLC17A2, and BCAS3 genes (Supplementary Table S5). Although most of these loci were previously reported in GWASs, the lead SNPs identified in the present study are different from those reported in European populations (Supplementary Fig. S3)9.

Sex-stratified meta-analysis of SUA levels

In addition to the lead variants in ABCG2 (rs2231142, male; P-value = 5.9 × 10−14; β = 0.27 mg/dl; SE = 0.04 mg/dl, female; P-value = 4.3 × 10−18; β = 0.19 mg/dl; SE = 0.02 mg/dl) and SLC2A9 (rs4529048, male; P-value = 6.2 × 10−06; β = 0.15 mg/dl; SE = 0.03 mg/dl, female; P-value = 2.2 × 10−31; β = 0.23 mg/dl; SE = 0.02 mg/dl) that were previously known to show sex-specific patterns8,9, our sex-stratified analysis identified intronic variants in CDH13 on chromosome 16 that were significant in the analysis of female participants (rs8063966, P-value = 1.6 × 10−08; β = −0.18 mg/dl; SE = 0.03 mg/dl) (Supplementary Fig. S7).

Alcohol intake-adjusted associations in the region on chromosome 12

We performed meta-analysis of SUA levels using the alcohol intake-adjusted association results to check for the influence of the identified loci on uric acid levels via alcohol-related pathways. The variants in the chromosome 12 region did not show significant associations after adjusting alcohol intake, whereas other loci remained significant after this adjustment (Supplementary Fig. S6a,b). In particular, the chromosome 12 region was significant in the genome-wide meta-analysis of alcohol intake (Supplementary Fig. S6c). A common, missense variant in ALDH2 in this region, known to be associated with alcohol metabolism and drinking behaviour30,31, showed the strongest association with alcohol intake (rs671, P-value = 9.4 × 10−101; odds ratio = 0.049; 95% confidence interval = 0.037 to 0.065) and was in high linkage disequilibrium with rs116873087, the lead SNP in NAA25 (r2 = 0.985). Variants in the ATXN2 and CUX2 genes associated with SUA levels also showed association with alcohol intake (Supplementary Fig. S6d).

Comparison of significant variants identified in our study and the European study

Allele frequencies, haplotype structure, and effect sizes of the SNPs identified in the Korean cohorts may be different from those reported in European populations, which may lead to differences in the proportion of variance explained by the variants identified in the Korean and European populations. We compared the effect sizes and the proportion of variance for phenotype explained by a given SNP (PVE) for the significant variant sets from our study and from the Global Urate Genetics Consortium (GUGC) for the European population. The effect sizes of the lead SNPs in our study were approximately linear to those in the GUGC study and showed the same direction of effects as the GUGC study. Of the 27 significant lead variants from the GUGC study, 22 showed significant associations (P-value <0.05) in our study and the same direction of effects between the two studies.

Spearman’s correlation coefficient of effect sizes for the GUGC GWAS hit SNPs was 0.743, and it was calculated to be 0.878 for the lead SNPs in our study (Fig. 2a,c). The Cohen’s kappa (κ) coefficients of effect sizes were also high (Supplementary Table S2). In contrast, the sum of the PVE values was different between the two cohorts. The sum of PVE for the lead SNPs in our study was larger in Koreans (0.135) than in Europeans (0.071), whereas that for GUGC GWAS hit SNPs was slightly larger in Europeans (0.055) than in Koreans (0.048) (Fig. 2b,d).

Figure 2
figure 2

Comparative analysis of effects of and variations in variants identified in the Korean and European cohort studies. (a) β coefficients and (b) PVE values calculated for GUGC GWAS hit SNPs. (c) β coefficients and (d) PVE values calculated for lead SNPs identified in our study. Plots indicate high concordance in effect size but different proportion of variance explained by the SNPs identified in our cohort (Korean cohort; βKorean and PVEKorean) and the GUGC cohort (European cohort; βEUR and PVEEUR). Each dot indicates a variant and different colours represent the significance level of the variant; genome-wide significance (P-value <5 × 10−8) in both ethnic cohorts (blue), genome-wide significance in only one cohort but nominal significance in the other (P-value <0.05) (sky-blue), insignificance (P-value ≥ 0.05) (black) or missing (grey) significance in the other cohort. The regression line (pink) shows the similarity between the two cohorts. There is greater dissimilarity between the two cohorts when the regression line is far from the y = x line (dotted line). Variants missing in any cohort were excluded from the regression analysis. Abbreviations: ρ, Spearman’s correlation coefficient; κ, Cohen’s kappa coefficient; β, coefficient of each SNP obtained by linear regression; PVE, proportion of variance in phenotype explained by a given SNP.

Validation of low-frequency variants and PRSs of common SNPs

The linear regression model that comprised PRSs of common variants associated with SUA levels, the two low-frequency variants, and covariates was fitted to the discovery phase cohorts and validated in the two independent Ansan-Ansung and KBSMC cohorts. The PRSs were normally distributed and the overall distribution correlated with SUA levels (Fig. 3).

Figure 3
figure 3

Distribution of polygenic risk scores (PRS) and serum uric acid (SUA) levels. (a) Polygenic risk scores corresponding to significantly common SNPs were binned by 0.25. (b) Overall pattern of PRS corresponding to the distribution of SUA levels. Participants in each cohort were binned into 10 deciles according to the PRS, and each dot indicates average SUA levels.

The highest and lowest bins of the PRSs corresponded to the highest and lowest levels of SUA, respectively. The PRS was significant in both validation cohorts (P-value = 7.06 × 10−13, β = 0.24 mg/dL, SE = 0.03 mg/dL in the Ansan-Ansung cohort and P-value = 7.93 × 10−21, β = 0.21 mg/dL, SE = 0.02 mg/dL in the KBSMC cohort) (Table 3). Among the two low-frequency variants, the p.W258X nonsense mutation of SLC22A12 showed nominal association (P-value = 0.047). There was insufficient data to verify the significance of these low-frequency variants because of the small sample size in the cohort study.

Table 3 Estimated coefficients and standard errors obtained on modelling trends in SUA by linear regression.


The main findings of our study include the identification of genome-wide significant association of common and low-frequency variants with SUA levels in Koreans, and the validation of the PRS model of SUA in different cohort studies. The strength of our study is that it includes both urban and rural dwelling participants across Korea and represents an investigation on alcohol consumption. In addition, we have assessed our PRS model in independent cohorts and used available data of genotype-phenotype associations from a large European GWAS for comparison. Our study evaluates the contribution of both low-frequency and common variants to SUA levels by using an HRC-based imputation.

Our results demonstrate the benefits of using a large imputation panel such as the HRC reference panel for discovery, and characterisation of common and low-frequency variants contributing to SUA levels in the general population. We replicated several previously known genetic loci associated with SUA levels (ABCG2, SLC2A9, NRXN2, NAA25, SLC17A3-SLC17A2, SLC22A12, and BCAS3) reported for the European population and identified independent lead SNPs in two loci (SLC22A12 and NAA25) between the European GWAS and the present study. Comparison with results of the GUGC European study showed that effect sizes of common variants were comparable between East Asian and European populations, whereas allele frequencies and sum of PVE of variants were higher in the population in which the corresponding variants were identified. These results suggest that SUA levels may be affected by population-specific variants as well as shared variants at the same locus, and in particular by population-specific low-frequency variants.

The population-specific SLC22A12 p.W258X variant and the SLC2A9 p.W282I variant were accurately imputed (R2 ≥ 0.8) using the HRC reference panel. We identified an association between a low-frequency, SLC22A12 loss-of-function variant (rs121907892) and SUA levels in the general population at a genome-wide significance level (P-value <5 × 10−8)9,15,32. SLC22A12 p.W258X is a well-known variant for hypouricemia in Koreans33. This variant has been earlier identified only by sequencing analysis20,34 because of its rarer frequency and weaker linkage disequilibrium with other neighbouring common variants analysed in GWASs. Other common variants in this region were identified in the European population study (rs505802)15 and AGEN study (rs504915) including a combined total of ~110,347 Asian individuals21. The imputation in AGEN was performed on the basis of the HapMap Phase 2 reference panel. In a more recent analysis by Lee et al., who used 1000 Genomes phase 3 haplotypes as a reference panel for imputation, the founder mutation was still not identified35. The observed significant association near the NAA25 gene on chromosome 12 was previously reported, although different genes (ALDH2, ATXN2, and CUX) were reported for this region. The lead SNP was in high linkage disequilibrium with rs671 in ALDH2, a common missense variant in East Asians, but mostly absent in other populations. Our results of alcohol intake-adjusted association analysis and comparison with GWAS for alcohol intake suggests that the association of this region with SUA levels is mediated by alcohol consumption or metabolism through ALDH2.

The sex-specific CDH13 variant (rs8063966) was only detected in females. This may be associated with lower levels of SUA in women since oestradiol and progesterone have been shown to be associated with CDH13 expression36. Estrogen is known to exert a protective effect on SUA levels, with SUA starting to rise in the late-menopausal transition stage37. Further research is needed to elucidate the mechanism underlying the interplay of CDH13 and SUA levels in women.

The missense variant SLC2A9 p.W282I (rs16890979) has also been previously reported in European and African populations29. Renal hypouricemia is believed to arise owing to genetic defects in SLC22A12 and SLC2A9; this is based on the “rare disease, rare variant hypothesis”38. In contrast, we tested a regression model of low-frequency variants and PRSs of common alleles associated with SUA levels. Our model showed their independent effects despite predicting a relatively small proportion of variance in SUA levels of the general population, reflecting the “combinational effects of common and rare variants”. Our linear regression model also showed that the SUA levels increased by 0.21–0.28 per 1 standard deviation (SD) increase in PRS and increased by 0.77–0.90 (rs121907892) and 0.44–0.45 (rs16890979) per risk allele of the low-frequent missense variant in the general population.

Our study has several limitations. First, since we enrolled a cohort of ethnically homogeneous Koreans, the generalisability of our results to other populations may be limited. For example, the rare SLC22A12 p.W258X variant has been found only in the East Asian population, whereas different missense SLC22A12 variants (c.1245-1253del and p.T467M) have been found in the Roma population with renal hypouricemia39. Second, we were not able to analyse all the low-frequency variants due to the lack of sequencing data and population-matched reference panels for imputation. The power to detect additional rare and less frequent disease-causing variants is expected be improve with the availability of reference panels that include sequencing data of more diverse racial/ethnic (e.g., East Asians) samples. Third, we conducted the present analysis in the general population, and therefore, we could not assess the role of genetic variants in people with impaired renal functions. Our results may be different from those reported for gout patients or individuals with impaired renal function, who may exhibit distinct pathways for SUA regulation by other transporters such as ABCG210,40. Fourth, potentially confounding environmental factors were not considered in our analysis. A potential contribution of genetic interaction with other factors such as food and drug intake, and chronic disease conditions in the pathophysiology of SUA should be examined. The combination of all the significant genetic loci identified in this study explained approximately 9.55% of the variance in SUA levels of our study, suggesting the need for discovery of additional epigenetic and genetic factors in larger, more comprehensive cohorts.

In summary, we replicated the previously known common genetic variants associated with SUA in the Korean population and identified low-frequency variants that exerted substantial impact on reduced SUA levels. Further studies are needed to identify additional variants associated with SUA and to evaluate our model under different disease conditions (e.g., gout and chronic kidney disease).


Ethics, consent, and permissions

This study was approved by the institutional review board of Sungkyunkwan University (IRB# SKKU 2017-12-007) and Kangbuk Samsung Hospital (KBSMC 2013-01-245-008), and written informed consent was obtained from all participants. All experiments were performed in accordance with all applicable institutional and governmental regulations concerning the ethical use of human participants.

Study participants

The Korean Genome and Epidemiology Study (KoGES) is a national biobank for genomic and epidemiological studies41. The discovery-phase samples comprised two cohorts from KoGES, namely a nation-wide cohort from urban (3,585 individuals) and another from rural (3,296 individuals) areas. In total, 6,881 individuals with no missing covariates were included in the present study. Genotyping was conducted on the Affymetrix 6.0 and the Illumina Omni1 arrays for the urban and rural cohorts, respectively. We evaluated two low-frequency variants and PRS for validation in 1,167 individuals of a community-based KoGES cohort from Ansan and Ansung areas, in which SUA levels were measured, and in 2,027 individuals from the Health Screening and Examination cohort of the Kangbuk Samsung Hospital (KBSMC). Genotyping was conducted using the Affymetrix 5.0 array and Illumina HumanCore BeadChips for the Ansan-Ansung and KBSMC cohorts, respectively.

Quality control, imputation, and annotation

Sample-level and SNP-level quality controls were performed on the genotyped data using PLINK 1.942. SNPs with MAF < 1%, call rate <98%, and deviation from Hardy-Weinberg equilibrium (P-value <1.0 × 10−6) were excluded. Samples were excluded based on criteria including call rate <95%, heterozygosity rate (samples with observed heterozygosity rate 3 SD away from the mean were removed), and principal components derived from the genome-wide genotype data (samples with the first and second principal components 3 SD away from the mean were removed). We excluded one of related pairs of individuals with second-degree or closer relationships using KING 2.143. After quality control, the genotype data was phased using Eagle 2.344 and imputed on the reference panel of the Haplotype Reference Consortium (HRC r1.1 2016) using Minimac345. SNPs with low imputation quality (R2 < 0.8) and MAF < 0.5% were excluded from the main analysis. We annotated gene symbols closest to each SNP using ANNOVAR46; the two nearest genes were annotated for intergenic SNPs.

Association analysis

The association of SNPs with SUA was tested using a linear regression model adjusted for sex, age, body mass index (BMI), estimated glomerular filtration rate (eGFR), diabetes, hypertension, hyperlipidaemia, systolic blood pressure, total cholesterol, high-density lipoprotein (HDL) cholesterol, and triglycerides. Blood pressure medication was measured in the rural cohort and adjusted in linear regression. The equation from the Chronic Kidney Disease Epidemiology collaboration (CKD-EPI) was used for the estimation of eGFR47. We performed an inverse variance-weighted fixed-effects meta-analysis of SUA levels to combine the summary statistics of the association test results from the two discovery cohorts using METAL48. The meta-analysis results were double-genomic-controlled for population structure. In the meta-analysis, SNPs that reached a level of genome-wide significance (P-value <5.0 × 10−8) for association with the SUA levels were considered significant. For nonsynonymous variants, Bonferroni’s correction threshold (P-value <4.31 × 10−6, 0.05 for a total of 11,600 nonsynonymous tested variants) was used.

Conditional analysis

We performed conditional association analysis for each significant locus using GCTA-COJO49 to identify statistically independent SNPs, which have been listed in Table 2. The peak SNPs in each significant locus were selected as the primary associated lead SNP and the association analysis conditioning on the primary lead SNP was conducted for variants within the surrounding 2-megabase pairs region. If there were significant SNPs with a conditional P-value that reached a level of significance described in the association analysis, the peak SNP of conditional analysis was selected as the secondary associated lead SNP. The association analysis conditioning on the primary and secondary lead SNPs was conducted for variants in the surrounding region. This procedure was repeated until there were no more SNPs that reached the significant level in each locus.

Alcohol intake

We observed a high linkage disequilibrium between a lead SNP on chromosome 12 and a known functional missense variant (rs671) in the aldehyde dehydrogenase 2 (ALDH2) gene. Therefore, alcohol intake-adjusted association of SNPs with SUA was tested using a linear regression model adjusted for sex, age, the first 10 principal components derived from the genome-wide genotype data, and dummy variables for the alcohol intake groups. Several alcohol-related variables were investigated in the urban and rural cohorts, including the prevalence of a drinking habit, average intake per drink by type of alcoholic beverage, and average number of intakes per year. We calculated the daily alcohol intake based on these variables using the following equation:

$$Alcohol\,intake(g/day)=\sum \{(Alcohol\,content\,ratio\,)\times (Average\,intake\,per\,drink)\times (Average\,number\,of\,intake\,per\,year)\div365\}$$

The study participants in each cohort were then divided into four groups based on the calculated daily alcohol intake: non-drinker, light drinker (<20 g/day), moderate-heavy drinker (≥ 20 g/day and <50 g/day), and heavy drinker (≥50 g/day). To compare the GWAS results of SUA, we also performed the association analysis of alcohol intake (non-drinkers vs. moderate-heavy or heavy drinkers) using a logistic regression model adjusted for sex, age, and the first 10 principal components.

Comparison with results from the European cohort study

Summary statistics results from the GUGC were used for comparison of significant associations identified in our study with those reported for European populations. We extracted lead SNPs from the present study and GWAS hit SNPs from GUGC. Three of the GUGC GWAS hit SNPs were not found in our datasets and nine of the lead SNPs in our study were absent from the GUGC data. We set the β coefficients of these SNPs to zero. To compare the effect sizes from our meta-analysis with those of GUGC summary statistics, we used Cohen’s kappa (κ) coefficient to determine the direction of effect size and Spearman’s correlation coefficient (ρ). In addition, we used proportion of variance in phenotype explained by a given SNP (PVE) for comparison. The GUGC summary statistics provides sample size, MAF, effect size (β coefficient), and SE of effect size for each SNP above the P-value <5.0 × 10−8 threshold9. We estimated PVE values using the equation below50:

$$PVE=\frac{2{\hat{\beta }}^{2}MAF(1-MAF)}{2{\hat{\beta }}^{2}MAF(1-MAF)+{(se(\hat{\beta }))}^{2}2NMAF(1-MAF)}$$

we used values of the MAF of Europeans from the Genome Aggregation Database (gnomAD) because information on allele frequency was not included in the GUGC summary statistics51. Comparative measurements were calculated after excluding the variants that were missing in our meta-analysis result or from the GUGC summary statistics.

Polygenic risk scoring and model construction

We calculated PRS values as a weighted sum of the effects multiplied by the number of alleles based on the β estimates for SUA-raising alleles for 14 common (MAF ≥ 5%) SNPs that reached a level of genome-wide significance (P < 5.0 × 10−8) for association with SUA levels and were not in a high linkage disequilibrium (r2 ≤ 0.2) with the other variants. The PRS of β estimates for SUA-raising alleles was standardised to zero mean and unit variance. For nonsynonymous variants, Bonferroni’s correction threshold (P < 4.31 × 10−6, 0.05 for a total of 11,600 nonsynonymous variants) was applied to correct for the multiple testing problem. Two low-frequency SNPs (MAF ~1%), rs121907892 and rs16890979, which passed the threshold were also included in the linear regression model-based analysis of SUA levels. We constructed a linear regression model on SUA levels that comprised these PRSs and two low-frequency nonsynonymous variants. The linear regression model was adjusted for the following covariates available in all cohorts for validation: sex, age, BMI, eGFR, diabetes, hypertension, hyperlipidaemia, systolic blood pressure, total cholesterol, HDL cholesterol, and triglycerides.