Epigenome-wide association study of serum urate reveals insights into urate co-regulation and the SLC2A9 locus

Elevated serum urate levels, a complex trait and major risk factor for incident gout, are correlated with cardiometabolic traits via incompletely understood mechanisms. DNA methylation in whole blood captures genetic and environmental influences and is assessed in transethnic meta-analysis of epigenome-wide association studies (EWAS) of serum urate (discovery, n = 12,474, replication, n = 5522). The 100 replicated, epigenome-wide significant (p < 1.1E–7) CpGs explain 11.6% of the serum urate variance. At SLC2A9, the serum urate locus with the largest effect in genome-wide association studies (GWAS), five CpGs are associated with SLC2A9 gene expression. Four CpGs at SLC2A9 have significant causal effects on serum urate levels and/or gout, and two of these partly mediate the effects of urate-associated GWAS variants. In other genes, including SLC7A11 and PHGDH, 17 urate-associated CpGs are associated with conditions defining metabolic syndrome, suggesting that these CpGs may represent a blood DNA methylation signature of cardiometabolic risk factors. This study demonstrates that EWAS can provide new insights into GWAS loci and the correlation of serum urate with other complex traits.


GENOA
The Genetic Epidemiology Network of Arteriopathy (GENOA) study is a community-based study of hypertensive sibships that was designed to investigate the genetics of hypertension and target organ damage in African Americans from Jackson, Mississippi and non-Hispanic whites from Rochester, Minnesota (Daniels, 2004). In the initial phase of the GENOA study (Phase I: 1996-2001), all members of sibships containing ≥ 2 individuals with essential hypertension clinically diagnosed before age 60 were invited to participate, including both hypertensive and normotensive siblings. Exclusion criteria of the GENOA study were secondary hypertension, alcoholism or drug abuse, pregnancy, insulin-dependent diabetes mellitus, or active malignancy. Eighty percent of African Americans (1,482 subjects) and 75% of non-Hispanic whites (1,213 subjects) from the initial study population returned for the second examination (Phase II: 2001-2005). Study visits were made in the morning after an overnight fast of at least eight hours. Demographic information, medical history, clinical characteristics, lifestyle factors, and blood samples were collected in each phase. Written informed consent was obtained from all subjects and approval was granted by participating institutional review boards. DNA methylation levels were measured only in African American participants, so all participants in the current analysis are African American. Participants were excluded from this analysis if they were also participants in the Atherosclerosis Risk in Communities (ARIC) study. Methylation and all kidney-related measures were from the Phase II examination. Support for GENOA was provided by the National Heart, Lung and Blood Institute (U01 HL054457, RC1 HL100185, R01 HL119443, and R01 HL133221) and the National Institute of Diabetes and Digestive and Kidney Diseases (R01 DK073537) of the National Institutes of Health. We appreciate technical assistance from Stephen T. Turner, Pamela I. Hammond, Julie M. Cunningham, and the Mayo Clinic Advanced Genomics Technology Center. We would also like to thank the families that participated in the GENOA study.

JHS
The JHS is a large, population-based observational study evaluating the etiology of cardiovascular, renal, and respiratory diseases among African Americans residing in the three counties (Hinds, Madison, and Rankin) that make up the Jackson, Mississippi metropolitan area. Data and biologic materials have been collected from 5,306 participants, including a nested family cohort of 1,498 members of 264 families. The age at enrollment for the unrelated cohort was 35-84 years; the family cohort included related individuals >21 years old. Participants provided extensive medical and social history, had an array of physical and biochemical measurements and diagnostic procedures, and provided genomic DNA during a baseline examination (2000-2004) and two follow-up examinations (2005-2008 and 2009-2012). The study population is characterized by a high prevalence of diabetes, hypertension, obesity, and related disorders. Annual follow-up interviews and cohort surveillance are ongoing.

KORA F4
The authors thank the LURIC study team who were involved in patient recruitment as well as sample and data handling, in addition to the laboratory staff at the Ludwigshafen General Hospital and the universities of Freiburg, Ulm and Heidelberg, Germany.

Normative Aging Study
The Normative Aging Study (NAS) is a longitudinal study on aging established by the U.S. Department of Veterans Affairs in 1963. Details of the study have been published previously [1]. Briefly, the NAS is a closed cohort of 2,280 male veterans living in the Greater Boston area. Participants were enrolled after an initial health screening to determine whether they were free of known chronic medical conditions. Most of the participants were examined up to four times between 1999 and 2013. They have been reevaluated every three to five years on a continuous rolling basis using detailed on-site physical examinations and questionnaires. To control for heterogeneity by race, a total of 774 Caucasian participants aged 55-85 years at the initial visit were included in the analysis.

Rhineland Study
The Rhineland Study is an ongoing community-based cohort study in which all inhabitants of two geographically defined areas in the city of Bonn, Germany aged 30-100 years are being invited to participate. Persons living in these areas are predominantly German of Caucasian descent. Participation in the study is possible by invitation only. The only exclusion criterion is insufficient German language skills to give informed consent. The Rhineland Study's overarching aims are to investigate the etiology and prediction of age-related (neurodegenerative) diseases and to assess normal and pathological (brain) structure and function over the adult life course.

Supplementary Note 2: Results on the reverse Mendelian randomization (MR) analysis of the causal effect of serum urate on DNA methylation levels
The reverse MR analysis using all available index SNPs of serum urate among persons of European ancestry resulted in five significant CpGs (three in the SLC2A9 region). Leave-one-out analysis showed that the significant causal effects were driven by the index SNP at SLC2A9, rs4447862 (Supplementary Figures 17A to 17E), and were no longer significant upon its removal. Therefore, the significant results that included rs4447862 reflected associations driven by this SNP rather than causal effects of serum urate on DNA methylation levels.
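The leave-one-out reasoning in this note can be illustrated with a minimal sketch. It assumes a fixed-effect inverse-variance weighted (IVW) estimator with first-order weights, which may differ from the estimator actually used in the analysis. The SNP labels and summary statistics are made up for illustration and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Return the fixed-effect IVW causal estimate and its standard error."""
    # Each SNP contributes a Wald ratio (beta_outcome / beta_exposure),
    # weighted by beta_exposure^2 / se_outcome^2 (first-order weights).
    weights = [bx ** 2 / so ** 2 for bx, so in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


def leave_one_out(snps, beta_exposure, beta_outcome, se_outcome):
    """Re-estimate the causal effect with each SNP excluded in turn."""
    results = {}
    for i, snp in enumerate(snps):
        keep = [j for j in range(len(snps)) if j != i]
        results[snp] = ivw_estimate(
            [beta_exposure[j] for j in keep],
            [beta_outcome[j] for j in keep],
            [se_outcome[j] for j in keep],
        )
    return results


# Hypothetical summary statistics (assumptions, not study data):
# one strong instrument plus three weak ones with near-null outcome effects.
snps = ["snp_driver", "snp_b", "snp_c", "snp_d"]
beta_exposure = [0.30, 0.05, 0.04, 0.06]     # SNP effect on serum urate
beta_outcome = [0.15, 0.001, -0.002, 0.001]  # SNP effect on methylation
se_outcome = [0.02, 0.02, 0.02, 0.02]

full_estimate, full_se = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
loo = leave_one_out(snps, beta_exposure, beta_outcome, se_outcome)

print(f"all SNPs: estimate={full_estimate:.3f}, z={full_estimate / full_se:.2f}")
for snp, (est, se) in loo.items():
    print(f"without {snp}: estimate={est:.3f}, z={est / se:.2f}")
```

In this toy setting the estimate is significant with all SNPs and remains so when any weak instrument is dropped, but it collapses toward zero once the single strong SNP is removed. That is the pattern the note interprets as an association driven by one SNP rather than a causal effect.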
Supplementary Note 3: Minimum detectable effect size in forward MR analysis of DNA methylation on serum urate or gout.
* SNP with lowest p-value in EA: rs4447862; independent SNPs identified by GCTA stepwise selection: rs6825187, rs62286563, rs10017305, rs73224492.
** The effect size was estimated in mg/dL per SD of rank-based transformed DNA methylation beta value for urate and as OR per SD of rank-based transformed DNA methylation beta value for gout.

Supplementary Figures 2A to 2AF.
Forest plots by ancestry of DNA methylation beta value per mg/dL of serum urate on 32 replicated CpGs with heterogeneity >50% from the meta-analysis of four ancestry groups: European (17 studies), African American (5 studies), and one study each for South Asian and Sub-Saharan African ancestry. Among these CpGs, 90% (29 CpGs) had consistent effect directions across the ancestry groups. Given that only one study each was available for South Asian and Sub-Saharan African ancestry, some heterogeneity might reflect study heterogeneity. The CpGs are ordered by chromosomal position. Abbreviations: EA: European ancestry, AA: African American, SA: South Asian, AFR: Sub-Saharan Africa. The numbers in parentheses next to the ancestry abbreviations reflect sample size. The I² statistic is provided as a measure of heterogeneity across the four groups. Whiskers correspond to 95% confidence intervals. Sample size for each plot:
For the CpG with a significant causal effect on gout, plots of the association p-value of the meQTLs (association with DNA methylation) included in the Mendelian randomization analysis (top plot) and r² among meQTLs (bottom plot). The p-values were 2-sided and obtained from inverse variance weighted fixed effect meta-analysis. Sample size: GoDMC meQTL study (n=27,750), r² (n=804).
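The two quantities named in these captions, the inverse variance weighted fixed-effect estimate and the I² heterogeneity statistic, can be sketched as follows. This is a generic textbook formulation, not the study's pipeline. The function name and the per-group effect sizes are hypothetical, and the 2-sided p-value assumes a normal approximation.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse variance weighted fixed-effect meta-analysis with I^2."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Two-sided p-value under a standard normal approximation.
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {"beta": pooled, "se": pooled_se, "q": q, "i2": i2, "p": p_value}


# Hypothetical per-ancestry effect estimates (assumptions, not study data).
group_betas = [0.010, 0.030]
group_ses = [0.005, 0.005]
result = fixed_effect_meta(group_betas, group_ses)
print(result)
```

With equal standard errors the pooled estimate is the simple mean of the two group estimates, and the large gap between them relative to their standard errors yields an I² well above the 50% cutoff used to select CpGs for the forest plots.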

Supplementary Figure 7.
Forest plot of leave-one-out analysis for the four CpGs with significant causal effect on serum urate levels: cg02387843 (A), cg03725404 (B), cg11266682 (C), and cg13841979 (D). The x-axis is the causal effect estimate on serum urate in mg/dL per SD of rank-based transformed DNA methylation beta value when each of the SNPs on the y-axis was excluded as the instrument of the CpG. The whiskers represent 95% confidence intervals. Sample size: GoDMC meQTL study (n=27,750), serum urate GWAS (n=288,649). Abbreviations: MR, Mendelian randomization; SD, standard deviation.
Supplementary Figure 8. Forest plot of leave-one-out analysis for the CpG with significant causal effect on gout. The x-axis is the causal effect estimate on gout in log odds ratio per SD of rank-based transformed DNA methylation beta value when each of the SNPs on the y-axis was excluded as the instrument of the CpG. The whiskers represent 95% confidence intervals. Sample size: GoDMC meQTL study (n=27,750), gout GWAS (n=692,537). Abbreviations: MR, Mendelian randomization; SD, standard deviation.
Supplementary Figure 9. Forest plot of the effects of the meQTLs on serum urate included in the forward MR analysis for the four CpGs with significant causal effects: cg02387843 (A), cg03725404 (B), cg11266682 (C), and cg13841979 (D). The x-axis is the causal effect estimate on serum urate in mg/dL per SD of rank-based transformed DNA methylation beta value using each of the SNPs on the y-axis as the instrument of the CpG. The whiskers represent 95% confidence intervals. Sample size: GoDMC meQTL study (n=27,750), serum urate GWAS (n=288,649). Abbreviations: MR, Mendelian randomization; SD, standard deviation.
Supplementary Figure 10. Forest plot of the effects of the meQTLs on gout included in the forward MR analysis for the CpG with significant causal effects. The x-axis is the causal effect estimate on gout in log odds ratio per SD of rank-based transformed DNA methylation beta value using each of the SNPs on the y-axis as the instrument of the CpG. Sample size: GoDMC meQTL study (n=27,750), gout GWAS (n=692,537). The whiskers represent 95% confidence intervals. Abbreviations: MR, Mendelian randomization; SD, standard deviation.
Supplementary Figure 11. SLC2A9 gene expression among white blood cells. Abbreviations: DC, dendritic cells; PBMC, peripheral blood mononuclear cells; T-reg, regulatory T cells; NX, normalized expression. The normalized expression levels were combined from data generated by the Human Protein Atlas, GTEx and FANTOM5. The plot was downloaded from the Human Protein Atlas website on September 23, 2020.
Supplementary Figure 16. Enrichment of transcription factor binding sites in the overlap between DNase I hypersensitive sites and the urate-associated CpGs (p<1E-5 in the combined meta-analysis of the discovery and replication cohorts). The enrichment p-values on the y-axis were 2-sided and obtained using a binomial test. Abbreviations: DMP, differentially methylated positions of the urate-associated CpGs; TF, transcription factor; FDR, false discovery rate using the Benjamini-Yekutieli method.
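The two statistical steps named in the caption above, a 2-sided binomial test followed by Benjamini-Yekutieli FDR adjustment, can be sketched in plain Python. This is a hedged illustration only. The transcription factor labels, counts, and background proportion are hypothetical, and the 2-sided p-value uses the common "sum of outcomes no more likely than the observed one" definition, which may not match the enrichment software used for the figure.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def binom_test_two_sided(k, n, p):
    """Exact 2-sided binomial p-value: sum of all outcomes whose probability
    does not exceed that of the observed count (small relative tolerance)."""
    pmf = [binom_pmf(i, n, p) for i in range(n + 1)]
    cutoff = pmf[k] * (1.0 + 1e-7)
    return min(1.0, sum(x for x in pmf if x <= cutoff))


def benjamini_yekutieli(pvalues):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step down from the largest p-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical inputs (assumptions): for each TF, the number of overlapping
# sites carrying its binding site out of 200, against a 5% background rate.
background_rate = 0.05
tf_counts = {"TF_A": (30, 200), "TF_B": (12, 200), "TF_C": (10, 200)}

raw_p = {tf: binom_test_two_sided(k, n, background_rate)
         for tf, (k, n) in tf_counts.items()}
names = list(raw_p)
fdr = dict(zip(names, benjamini_yekutieli([raw_p[tf] for tf in names])))

for tf in names:
    print(f"{tf}: p={raw_p[tf]:.3g}, BY-adjusted={fdr[tf]:.3g}")
```

The Benjamini-Yekutieli procedure multiplies the usual Benjamini-Hochberg adjustment by the harmonic sum over the number of tests, which keeps the FDR controlled under arbitrary dependence between tests at the cost of being more conservative.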
Supplementary Figure 17. Forest plot of leave-one-out analysis showing that the causal estimates of serum urate on five CpGs were driven by the urate GWAS index SNP at SLC2A9, rs4447862, and became non-significant after removing rs4447862: cg01881899 (A), cg02387843 (B), cg14348967 (C), cg18125510 (D), and cg22821355 (E). The x-axis is the causal effect estimate in SD of rank-based transformed DNA methylation beta value per mg/dL of serum urate when each of the SNPs on the y-axis was excluded as an instrument for serum urate. Sample size: serum urate GWAS (n=288,649), meQTL from FHS (n=3,866). The whiskers represent 95% confidence intervals. Abbreviations: MR, Mendelian randomization; SD, standard deviation.