Abstract
Objectives
Genome-wide association studies (GWAS) have successfully revealed numerous susceptibility loci for obesity. However, identifying the causal genes, pathways, and tissues/cell types responsible for these associations remains a challenge, and standardized analysis workflows are lacking. Additionally, due to limited treatment options for obesity, there is a need for the development of new pharmacological therapies. This study aimed to address these issues by performing step-wise utilization of knowledgebase for gene prioritization and assessing the potential relevance of key obesity genes as therapeutic targets.
Methods and results
First, we generated a list of 28,787 obesity-associated SNPs from the publicly available GWAS dataset (approximately 800,000 individuals in the GIANT meta-analysis). Then, we prioritized 1372 genes with significant in silico evidence against genomic and transcriptomic data, including transcriptionally regulated genes in the brain from transcriptome-wide association studies. In further narrowing down the gene list, we selected key genes, which we found to be useful for the discovery of potential drug seeds as demonstrated in lipid GWAS separately. We thus identified 74 key genes for obesity, which are highly interconnected and enriched in several biological processes that contribute to obesity, including energy expenditure and homeostasis. Of 74 key genes, 37 had not been reported for the pathophysiology of obesity. Finally, by drug-gene interaction analysis, we detected 23 (of 74) key genes that are potential targets for 78 approved and marketed drugs.
Conclusions
Our results provide valuable insights into new treatment options for obesity through a data-driven approach that integrates multiple up-to-date knowledgebases.
Similar content being viewed by others
Introduction
Obesity is a multifaceted condition characterized by excessive fat accumulation in the body, often associated with chronic conditions such as heart disease, diabetes, high blood pressure, and cancers [1]. Despite concerted efforts, the prevalence of obesity has significantly increased, with the proportion of obese adults in the United States rising from 30.5% to 42.4% in less than two decades [2]. While lifestyle modifications have limited efficacy in controlling obesity, few available drugs are serving as anti-obesity agents [3]. Unfortunately, current research methods are insufficient for developing personalized therapies, and the traditional drug discovery process is time-consuming, laborious, expensive, and risky [4]. Furthermore, concerns exist regarding the long-term effects of FDA and EMA-approved weight-loss drugs [5]. Therefore, it is imperative to address obesity seriously, necessitates effective strategies to identify and target associated key genes and pathways.
Genome-wide association studies (GWAS) represent significant advancement in sequencing technology for identifying genetic associations with various traits and diseases. Nevertheless, GWAS encounters several inherent limitations [6], including non-coding variants introducing complexity and necessitating tissue-specific exploration contexts, as well as the proximity of closely situated genes, which complicates the determination of their significance. Furthermore, linkage disequilibrium (LD) can result in false positives, obscuring the identification of true causal variants. Additionally, complex diseases often arise from disruptions in intracellular biological network, rather than single gene abnormalities.
Despite considerable efforts to investigate the functional implications of obesity-related GWAS [7, 8], certain gaps persist. Previous research has investigated the genetic regulation of blood pressure regulatory genes using post-GWAS data [9], but similar investigations for obesity remain limited. Although potential causal SNPs and hub genes [10] have been identified based on their proximity to GWAS signals, the lack of eQTL data and investigation of relevant tissues has hindered causal inference. Additionally, drug repositioning application in the post-GWAS analysis of obesity have not been addressed. Nonetheless, a recent study employed expression datasets to identify differentially expressed genes and screened potential drugs targeting important obesity hub genes [11].
Accordingly, we conducted data-driven integrative analysis by leveraging a credible GWAS dataset [8] with updated bioinformatics tools and knowledgebases [12,13,14], prioritizing obesity-associated genes with significant in silico evidence. Expanding upon previous research [7], our study incorporated a larger study population, allowing us to identify and update the most plausible causal genes and evaluate their clinical relevance as potential therapeutic targets. Protein-protein interaction [15] and network centrality analysis [16] pinpointed key genes, while gene-set enrichment analysis [17] shed light on underlying biological processes and pathways. Drug-gene interactions [18] analysis, as well as adverse drug reactions [19], unveiled promising opportunities for drug repurposing. Our study informs future obesity research and guides future experimental assays to investigate mechanisms and targeted therapies. An overview of this study is illustrated in Fig. 1.
Materials and methods
GWAS SNPs analysis
Associations identified through GWAS provide a foundation for investigating the biological underpinning of obesity. Given this, we compiled a credible catalog of 941 near-independent genome-wide significant SNPs (COJO P < 1E-08), which captured meaningful association in the GWAS while minimizing potential false positives [20], identified from the Genetic Investigation of Anthropometric Traits (GIANT) consortium, the largest centralized BMI GWAS dataset derived from ~800,000 individuals [8]. To identify additional SNPs that equally contribute to obesity, we calculated linkage disequilibrium (LD) using genotype data from the 1000 Genomes Phase 3 Project, focusing on the European ancestry (CEU) population reference panel. Our search criteria included a distance range of ± 500 kb from the query significant SNPs and r2 > 0.9 from pairwise LD calculation. These LD SNPs were consolidated together with the genome-wide significant SNPs to create a list of obesity-associated SNPs.
eQTL analysis
Integrating expression Quantitative Trait Loci (eQTL) data with GWAS offers insights into the genetic variance associated with changes in gene expression of disease phenotypes [21]. To prioritize eQTL-genes and explore their potential regulatory roles, we utilized RegulomeDB [12], a comprehensive knowledgebase that provides functional interpretation of SNPs based on curated references, where these SNPs were scored based on their combinatorial existence of functional categories. We focused specifically on category 1 variants (1a–1f), also known as eQTL-SNPs which demonstrate strong evidence of influencing the expression of eQTL-genes associated with obesity.
Tissue expression analysis
Tissue expression analysis revealed specific tissues where particular genes are expressed, highlighting their potential involvement in disease pathogenesis [22]. To determine specific tissues associated with obesity, we performed tissue expression analysis using FUMA, a web platform capable of performing functional mapping and annotation of genetic variations identified in GWAS studies. Within FUMA, we utilized Multi-marker Analysis of GenoMic Annotation (MAGMA) [23] to evaluate the enrichment of genes in specific tissues and prioritized differentially upregulated genes for further functional analysis of their potential roles in the pathophysiology of obesity.
TWAS analysis
Transcriptome-wide association studies (TWAS) integrate genotype and phenotype data from GWAS with reference expression panels, providing insights into potential causal genes in diseases [24]. TWAS complements GWAS findings by uncovering genes that are missed by GWAS, providing additional regulatory evidence. Using TWAS Hub [14] with searchable access to TWAS results of complex traits and expression studies, we identified gene expression associated with obesity-related phenotypes, including BMI, fat mass, waist circumference, and weight. Considering the emerging role of the brain in weight regulation [25], we focused on transcriptionally regulated genes associated with the brain, such as caudate basal ganglia, cerebellar hemisphere, cerebellum, cortex, frontal cortex BA9, hippocampus, hypothalamus, nucleus accumbens basal ganglia and putamen basal ganglia. These genes were prioritized for further investigation.
Protein-protein interactions and network centrality analysis
Protein-protein interactions (PPI) play a crucial role in regulating biological processes and provide valuable insights into the functions and interactions of proteins within the cells [26]. To better understand the network involved in obesity-associated genes, we used STRING [15] to reconstruct networks by integrating associations between proteins derived from computational approaches and assigning confidence scores (0-1) to quantify the strength of supporting evidence. We chose the default medium confidence score of 0.4 to interpret the interactions between the obesity-associated genes. Given the significance of network centrality analysis in identifying important hub genes [27], we employed cytoHubba [16] to perform a topological analysis of the network structure using centrality algorithms such as betweenness, closeness, and degree to assess the importance of individual proteins within the network. We ranked the top 100 hub genes using these algorithms and identified key genes by finding overlaps in the resulting lists.
Gene set enrichment analysis
Gene Set Enrichment Analysis (GSEA) integrates knowledge about the function of a group of genes, taking into account their involvement in specific biological pathways or co-expression under certain conditions [28]. Gene ontology (GO) is commonly used for gene functional annotation covering biological processes, molecular functions, and cellular components. Similarly, the Kyoto Encyclopedia of Genes and Genomes (KEGG) increases the explanatory power of specific gene sets to gain insights into underlying biology pathways. To identify significantly enriched GO and KEGG terms of the key obesity genes, we utilized Enrichr [17], an integrative web-based application. Enriched GO and KEGG pathways terms with adjusted p-value < 0.05 and involving > 3 genes were considered statistically significant.
Drug-gene interaction and adverse drug reaction analysis
Drug-gene interaction (DGI) refers to the interaction between genes and drugs that can potentially influence drug responses [29]. To identify target genes and potential drugs that interact with key obesity genes, we queried the Drug Gene Interaction Database (DGIdb) [18], a consolidated resource of DGI interactions and druggable genes. We focused on FDA-approved drugs supported by evidence from two or more databases and PubMed sources, considering them as potential candidates for drug repurposing in obesity. Adverse drug reaction (ADR) refers to unintended and potentially harmful effects arising from the therapeutic use of medications [30]. To examine the safety profile of our repurposed drug candidates, we conducted a search using the Side Effect Resource (SIDER) database [19], a resource containing information on potential side effects of approved drugs. Specifically, we focused on side effects classified as “very common” according to the Medical Dictionary for Regulatory Activities (MedDRA) hierarchy, as they have a frequency of ≥ 10% and are likely to occur in a significant proportion of patients using the medication.
Negative control and benchmarking analysis
Negative control and benchmarking analysis are crucial for experimental reliability [31]. In our workflow, we first prioritized obesity-associated genes by integrating multiple human sources. In the second part, we identified highly interconnected hub genes in the PPI network. To validate the effectiveness of each step, we conducted analyses with negative controls, resulting in different sets of key genes. We then calculated the overlap of these genes with genes from the mouse knockout database [32] and drug-related information from the drug-gene interaction database [18]. Additionally, we performed two benchmarking analyses by reanalyzing previous investigations of BMI GWAS study [7], utilizing GWAS summary statistics and prioritized genes as starting materials, respectively.
Results
Identification of obesity-associated SNPs
To obtain a credible list of genetic associations with obesity, we acquired 941 near-independent genome-wide significant SNPs from BMI GWAS of the GIANT consortium, and subsequently examined representative signals of obesity based on LD. From the significant SNPs, we identified 27,846 LD SNPs applying a threshold of ± 500 kb from the query SNPs and an r2 > 0.9. After trimming off overlapping SNPs, a non-redundant list of 28,787 obesity-associated SNPs was assembled, which were then clustered into 640 genomic loci based on < 500 kb distances (Supplementary Table 1).
Prioritization of obesity eQTL-SNPs and eQTL-genes
To identify SNPs located in regulatory regions of the genome, we employed RegulomeDB to score the obesity-associated SNPs for regulatory functions. Of the 28,787 SNPs examined, 25,776 ( ~ 90%) were assigned with putative regulatory functions. From these, we extracted 867 putative eQTL-SNPs exhibiting high potential for regulatory function (score 1) for downstream analysis. These SNPs were predicted to have an influence on the expression of 243 eQTL-genes, which were prioritized based on their potential to cause obesity through changes in gene expression (Supplementary Table 1).
Prioritization of differentially upregulated genes in obesity tissues
To identify the most relevant tissues associated with obesity, we used MAGMA [23], which was incorporated in FUMA [13] for tissue expression analysis. Our findings revealed that nearly all brain tissues (10 out of 13) were significant at P < 0.001, with the brain cerebellum having the strongest p-value (P = 5.45 × 10−15), demonstrating a strong relationship between the brain and obesity. Additionally, pituitary tissue was also significant with a P = 6.48 × 10−5. Subsequent differential analysis identified a set of differentially upregulated genes found in these brain tissues at adjusted P < 0.05 (Supplementary Table 2). Notably, cortex exhibited the strongest upregulation pattern (P = 3.25 × 10−6), followed by frontal cortex BA9 (P = 8.04 × 10−6), and cerebellum (P = 1.17 × 10−5), with slightly weaker but still strongly significant upregulation observed in the cerebellar hemisphere (P = 1.28 × 10−4) and anterior cingulate cortex BA24 (P = 3.05 × 10−4). A total of 845 differentially upregulated genes from these brain tissues were prioritized for downstream analysis on their potential roles in causing obesity.
Prioritization of TWAS genes associated with obesity
TWAS enabled the identification of genes whose expression is associated with obesity. We analyzed TWAS experiments from TWAS Hub [14], specifically targeting 4 phenotypes capable of defining obesity. Given the enrichment of brain tissues in our tissue enrichment analysis, we expanded our analysis to include TWAS experiments to uncover additional genes involved in transcriptional activity related to obesity. We prioritized 396 genes exhibiting transcriptional activity in the brain, which are listed in Supplementary Table 3. A total of 63 genes are shared between both TWAS Hub and FUMA. In addition, 31 genes were found to be common between TWAS Hub and RegulomeDB, with a total of 14 genes identified in all three targeted knowledge resources.
Identification of key obesity genes
We aimed to explore the interconnectedness of genes implicated in obesity, deducing their significance in important biological pathways associated with obesity by being closely integrated within a protein network. To achieve this, we merged the genes prioritized based on in silico evidence from targeted knowledge resources, namely RegulomeDB (243 genes), FUMA (845 genes), and TWAS Hub (396 genes), resulting in a gene set of 1,372 genes associated with obesity.
To identify potential physical and functional associations among these genes, we utilized STRING [15] and applied a minimum interaction score > 0.4. Subsequently, we employed cytoHubba [16] to evaluate the nodes in the PPI network using three centrality parameters, namely betweenness, closeness, and degree, and ranked the top 100 hub genes from each algorithm respectively. By selecting genes that overlapped in all three network centrality analyses, we identified 74 key obesity genes highly integrated within a protein network (Fig. 2A). Through extensive literature review, which included assessing single gene knock-out experiments conducted by the International Mouse Phenotyping Consortium (IMPC), we classified these key genes into predicted novel (n = 37) and functionally validated (n = 37). For the functionally validated known genes, we further categorized them to their association with obesity, based on their regulation of appetite, fat, size, lipid, and glucose (Fig. 2B).
Identification of major biological pathways associated with obesity
To further investigate the biological significance of the key obesity genes, we performed gene set enrichment analysis using Enrichr. To indicate strong enrichment, stringent parameters such as adjusted p-value < 0.05 and the presence of more than 3 genes in a gene set were employed (Fig. 3). Our analysis indicated that the key obesity genes were significantly enriched in a total of 119 GO terms, with 91 terms (~76%) associated with biological processes, 12 terms (~11%) associated with molecular functions, and 16 terms (~13%) associated with cellular components (Supplementary Table 5). In addition, 104 KEGG pathway terms were significantly enriched (Supplementary Table 6).
Identification of repurposed drug candidates and adverse drug reactions
Key genes are essential in maintaining the structure and function of PPI networks, making them attractive candidates for novel therapeutics. In search of potential therapeutic targets, we utilized DGIdb to analyze drug-gene interactions among the key obesity genes. We focused on reliable interactions supported by at least two resources and PubMed references while excluding cancer-specific resources and limiting our search to FDA-approved drugs or drugs in clinical trials (Fig. 4). Among the 74 key obesity genes, we identified 23 drug-related genes (Supplementary Table 7), as well as 51 genes not previously noted. Further analysis of the genes not previously noted showed 37 druggable and 14 non-druggable (Supplementary Table 8). The 23 drug-related genes were found interacting with 78 drugs, where of these drugs, 47 drugs can lead to weight loss, another 19 were associated with weight gain, while 12 had unknown effects based on prior reports.
To evaluate the safety profile, we searched the SIDER database for side effect data of the 78 drug candidates, focusing on side effects classified as “very common” to identify the most frequent occurrences. Among the 78 drug candidates, 19 had reported side effects (Supplementary Fig. 1). These findings could inform future research on the safety and potential use of these drugs for obesity management in clinical settings (Supplementary Table 9).
Validation of workflow through negative control and benchmarking analyses
Our workflow underwent rigorous validation tests to ensure its reliability. It involved two steps: (1) prioritizing 1372 obesity-associated genes by integrating multiple human data sources, and (2) narrowing down the list to 74 key hub genes in the PPI network. The mice knockout database [32] and drug-gene interaction database [18] both revealed 21 genes (~28%) with reported mice knockout abnormalities and 23 drug-related genes (~31%).
To evaluate the effectiveness of the first step, we randomly selected 1372 genes from a pool of 14,937 well-annotated human genes (obtained from Enrichr’s GO Biological Process library) and performed hub genes identification analysis using our workflow ten times (Fig. 5). Our workflow detected a significantly greater percentage of genes with mice knockout abnormalities compared to the negative analyses, indicating effective enrichment of true obesity genes (t-test P < 0.00001). Additionally, the percentage of genes with drug-related information was also significantly greater at P = 0.035517. To evaluate the second step, we randomly selected two groups of 74 genes from the 1372 prioritized obesity-associated gene set to serve as substitutes for hub genes and repeated the analysis ten times. The first set comprised genes with limited connections in the PPI network (node degree < 2), while the second set consisted of genes without considering their interconnectivity information. Our analysis revealed a significantly greater percentage of genes with drug information in our workflow than the negative analyses (t-test P < 0.00001), indicating effective enrichment of true drug target genes. Moreover, genes with lower node degree had a lower likelihood of having drug-related information compared to genes with higher node degree.
Lastly, leveraging the GWAS summary statistics and 195 significant genes prioritized from a previous investigation of BMI GWAS [7], we conducted two benchmarking analyses. Our workflow outperformed both benchmarking analyses in identifying a higher percentage of genes associated with mice knockout abnormalities and drug information. This improvement can be attributed to the utilization of a larger GWAS size (Benchmarking analysis A) and updated bioinformatics tools (Benchmarking analysis B) as demonstrated in Supplementary Table 10. Our findings showed the importance of incorporating larger sample sizes and employing up-to-date bioinformatics tools are crucial for identifying key genes and druggable targets.
Discussions
In this study, we prioritized obesity genes with potential causal roles utilizing targeted knowledgebases and identified relevant biological processes and pathways for clinical translation in obesity drug applications. Leveraging GWAS with a larger sample size, we conducted a series of integrative analyses using the latest bioinformatics tools and knowledgebases. Exploring the relationship between SNPs and gene expression, we prioritized 243 eQTL-genes. We then examined tissue expression and prioritized 845 differentially upregulated genes from the brain and another 396 transcriptionally regulated genes from TWAS experiments, highlighting expression patterns of the brain in obesity. Establishing a ~1400 prioritized obesity-associated gene set, we performed protein-protein interaction analysis and identified 74 key genes enriched in biological pathways regulating feeding behavior, energy expenditure, metabolic homeostasis, and insulin secretion. Lastly, our drug-gene interaction analysis identified 23 genes targeted by 78 existing drugs in repurposing applications, as well as additional 37 druggable genes with potential for drug development, offering valuable insights for new obesity therapeutics. Our workflow was inspected through negative and benchmarking analyses, where a greater percentage of genes associated with mice knockout abnormalities and drug-related information were identified, indicating its effectiveness.
Our study presents two novel findings that contribute to the current understanding of genetic regulation and therapeutic options for obesity. The first finding addresses the lack of knowledge regarding regulatory genes involved in obesity pathogenesis. While comprehensive integrative analysis with updated bioinformatics tools and knowledgebases on post-GWAS data of blood pressure [9], the field of obesity has lagged behind in achieving similar advancements, despite recent computational analysis of obesity-associated GWAS SNPs [10] that focused on genes closest to the GWAS signals. To fill this gap, we implemented a data-driven annotation strategy that employed three-layer evidence from targeted knowledgebases to prioritize regulatory genes associated with obesity. Our analysis prioritized 74 key genes associated with obesity, half of which were predicted novel genes that lacked functional validation in relation to obesity. The second novel finding concerns drug repurposing for obesity treatment. While previous studies [11] have identified drugs using expression datasets, no attempt has been made using post-GWAS obesity data. Our approach identified 23 drug-related genes and 78 drugs supported by evidence from human and animal experiments suitable for repurposing as novel therapeutic options for obesity. On the remaining 51 genes not previously noted, 37 were found to be druggable, providing opportunities for future drug development. To ensure the validity of our findings, negative analyses were performed at every major step of our analysis and demonstrated the significant improvement of our prediction workflow compared to previous reports, predicting and identifying truly enriched key obesity genes, and potential drug target genes for obesity.
In contrast to a previous study [10] that focused on the nearest genes to obesity-associated SNPs, our study instead highlights the relevance of studying eQTL-genes in explaining the regulatory mechanisms of obesity. To investigate this, we gathered significant GWAS SNPs from BMI GWAS of the GIANT consortium and calculated LD to find SNPs contributing equally to obesity. We identified 867 eQTL-SNPs with a RegulomeDB score of ≤ 1, capable of regulating a total of 243 eQTL-genes associated with obesity. Among these, 63 of them were genome-wide significant eQTL-SNPs capable of regulating 76 eQTL-genes. Tissue expression analysis enables the identification of genes expressed in particular tissues, providing insights into their contribution to the various cell types and organs. To supplement the prioritized eQTL-genes that lack tissue-specific information, we conducted tissue expression analysis and prioritized 845 differentially upregulated genes present in the brain tissues, consistent with previous studies from other groups proposing that brain tissues play a crucial role in the regulation of body weight and the development of obesity [7, 33].
Emerging evidence suggests that vulnerability to obesity extends across multiple brain regions, receiving signals from internal and external sources to collectively regulate feeding habits and energy storage [34, 35]. Also, a previous study [36] reported the enrichment of neuronal cells in the brain, providing valuable evidence of the neurological consequences of obesity. The brain is extensively reviewed as a critical regulator of metabolic traits and physiological processes, including energy metabolism and glucose regulation in the central nervous system [37]. Moreover, another study showed that the dysregulation of neuronal pathways in the brain disrupts energy balance, resulting in excessive food intake and reduced energy expenditure in mice [38]. Enrichment in brain tissues prompted us to expand our investigation by integrating TWAS experiments to identify and prioritize an additional 396 transcriptionally regulated genes associated with the brain, 63 of which were previously identified in our tissue expression analysis.
PPI provides valuable insights into the organization and coordination of biological processes underlying diseases. Specifically, our analysis identified 74 key genes, of which 37 were predicted novel and have not been functionally validated in relation to obesity. We further examined the potential association of these novel genes to shed light on their underlying mechanism. ANXA5 is known for anticoagulation and is involved in triglyceride biosynthesis [39]. Additionally, AXIN1 and BTRC regulate adipose tissue lipogenesis through the Wnt signaling pathway [40]. CACNA1D affects insulin secretion and has been linked to various conditions [41]. Other genes, such as CTNNA1 and CTNNA2 are involved in the Hippo signaling pathway, regulating adipogenesis [42]. Impairment of DNM1 causes insulin secretion failure and hyperglycemia in mice [43] while inhibiting the expression of EP300 reduces adiposity in larval zebrafish [44]. FAIM2 is linked to obesity and dyslipidemia in the Chinese population [45], while GNAI3 is associated with non-alcoholic fatty liver disease (NAFLD) [46]. Furthermore, GRIA1 influences appetite in T2D patients [47]. GSK3B regulates inflammation in diabetes patients [48]. MAP2K5 is known to activate ERK5, a critical regulator of adipogenesis through the PKA signaling pathway [49], while inhibiting MTOR signaling resulting in reduced food intake and body weight in mice [50]. The homolog of NUDT3 in Drosophila, Aps, is involved in insulin signaling [51], while defects in PACSIN1 have been associated with schizophrenia-like behavior in mice, another condition linked to obesity [52]. PRKCB deficiency reduces the obesity syndrome of mice [53], while RAB3A is involved in the regulation of insulin secretion [54]. Moreover, RBFOX1 regulates BDNF, crucial for neuronal development and energy metabolism in mice [55]. TFAP2B is linked to insulin resistance and adiposity [56]. These predicted novel genes are involved in various biological pathways, such as lipid and energy metabolism, insulin secretion, adipogenesis, and neural development. Further investigation is required to evaluate the potential link between BSN, GRIN2A, H2AFX, LEO1, NCOR1, NRXN3, PLCB1, POC5, PRRT2, SCN2A, SPI1, STX1B, UBXN7, VCL, YWHAZ, and obesity.
Subsequently, we examined the 74 key obesity genes to determine their presence in prior analysis by Locke’s GWAS dataset [7], as reported by DEPICT software [57]. Among these 74 genes, DEPICT identified 36 genes, while the remaining 38 genes were not reported. Notably, within this group of unreported genes, we observed that 12 genes were situated close to the sentinel SNPs of Yengo’s GWAS dataset [8] (dataset used in this analysis), with distances ranging from 0 (closest) to 18,724 base pairs (farthest). Interestingly, only one gene among these 12 genes was also found to be situated near the sentinel SNPs of Locke’s GWAS datasets [7]. Conversely, the remaining 26 unreported genes were not found to be the nearest genes to the GWAS SNPs. These findings highlight the importance of larger GWAS datasets and updated bioinformatics tools in achieving greater precision in research outcomes.
GO terms and KEGG pathways enrichment analysis offers valuable conclusions about gene sets. Our GO analysis indicated that key obesity genes are enriched in biological processes related to the brain and nervous system, such as myeloid cell differentiation, NMDA receptor activity, and neuron projection development. Deficiencies in myeloid cells protect mice from diet-induced obesity and insulin resistance [58], while NMDA receptor signaling is involved in appetite regulation [59]. Furthermore, these key obesity genes were involved in regulating cation channel activity, synapse maturation, and neurotransmitter receptor activity, which could impact food intake, energy expenditure, and glucose metabolism [60]. Meanwhile, KEGG pathway enrichment analysis revealed that key obesity genes were enriched in signaling pathways in the brain, specifically neurotransmitter signaling, involving dopaminergic, glutamatergic, and cholinergic synapses. Brain scans of humans indicated dopamine-regulated brain circuits were involved in obesity [61], while obese mice on a high-fat diet displayed reduced levels of multiple enzymes involved in dopamine production when switching to the low-fat diet [62]. Changes in glutamate transmission in obese animals showed increased dopamine transmission and altered synaptic functions [63]. Basal forebrain cholinergic signaling was reported to regulate feeding behavior in rats [64], while the frontal cortex and hippocampus displayed functional impairments in cholinergic and synaptic activity, leading to weight gain, hypertension, and dysmetabolism [65]. Moreover, a decrease in growth hormone secretion has been associated with obesity [66] and the suppression of insulin secretion led to weight and fat mass reduction [67]. Our enrichment analysis offers evidence of the complex interplay between key obesity genes and the brain, impacting feeding behavior, energy expenditure, metabolic homeostasis, and insulin secretion.
Hub genes have shown promise as targets for drug development [68]. To validate this hypothesis, we analyzed a recent lipid GWAS [69] and identified hub genes associated with LDL cholesterol from the reported genes. Notably, our analysis revealed the inclusion of HMGCR and PCSK9 as hub genes. These genes have been extensively studied and play crucial roles in hypercholesterolemia treatment. HMGCR is a major target of statins, regulating cholesterol levels by inhibiting its expression [70]. Furthermore, PCSK9 has been linked to blood cholesterol levels, and inhibitors of PCSK9 have proven effective in lowering LDL cholesterol [71]. These findings further support the potential utilities of hub genes, making them attractive targets for drug development and repurposing.
We examined our key genes with DGI analysis and identified 23 drug-related genes that serve as targets for 78 FDA-approved drugs that showed potential in regulating body weight based on previous studies involving human and animal experiments (Supplementary Table 7). Among these genes, four were targeted by five or more drugs, with KIT being the focus of seven weight loss drugs. Among the 47 weight loss-associated drugs, fluoxetine [72] and citalopram [73] are commonly prescribed antidepressants for treating binge eating disorder linked to obesity. Conversely, topiramate [74] and zonisamide [75] are antiepileptic medications capable of suppressing appetite and increase energy expenditure, leading to weight loss. Additionally, metformin [76] has shown effectiveness in promoting weight loss by reducing glucose production in the liver and improving insulin sensitivity. Mesalamine [77] reduces fasting glucose levels and BMI while increasing HDL-cholesterol. Of the 78 candidate drugs listed, SIDER reported 19 drugs with common side effects including asthenia, headache, nausea, fatigue, dermatitis, musculoskeletal discomfort, vomiting, decreased appetite, and diarrhea (Supplementary Fig. 1). While not life-threatening, they can considerably affect the health and well-being of patients, leading to discontinuation of treatment or additional medical attention. We proposed that drugs with high reported side effects (e.g., ribavirin, n = 52), may not be suitable for repurposing. Conversely, drugs with low reported side effects, (e.g., duloxetine and paliperidone, n = 1; quetiapine, n = 2; amisulpride, n = 7; oxcarbazepine, n = 8), targeting genes with obesity-related knockout abnormalities in mice could be considered for repurposing. However, it is important to note that drug repurposing is a complex process and requires careful evaluation.
Our study has strengths and weaknesses. We translated biological data into functional knowledge and treatment interventions, suggesting promising key obesity genes as targets for new obesity therapeutics. However, the specificity of computational tools and inadequate specific information on biological processes and pathways remained challenging to establish causality. To address this, the integration of multiple credible biological resources and statistical tools could compensate for the specificity limitation of each resource, further enhancing the prioritization of candidate genes and markers [78]. Additionally, our approach explored anti-obesity therapy and uncovered novel repurposed applications and adverse drug reaction information for the key obesity genes. However, we lack knowledge of the interactions between these drugs and the genes in the context of obesity. Furthermore, adverse drug reactions may differ among diverse populations [79]. As a follow-up to our study, we proposed the integration of genome editing techniques, such as CRISPR-Cas9 [80], to validate our prioritized key obesity genes in animal experiments. Supplementing our findings with empirical experiments would improve our comprehension of regulatory gene interactions and their role in obesity, bringing us closer to effective obesity treatment.
Conclusions
In conclusion, our study provides valuable contributions to the obesity research field by utilizing a systematic data-driven in silico approach to identify and predict novel regulatory genes and potential therapeutic targets for obesity through the translation of GWAS results. Firstly, we prioritized key obesity genes from multiple knowledgebases and identified novel genes which had not been functionally validated in regard to obesity. These genes were involved in various biological pathways, such as lipid and energy metabolism, insulin secretion, adipogenesis, and neural development, adding insights into the underlying mechanism of obesity. Secondly, we identified promising drug-related genes and repurposing drug candidates for novel obesity management. These drugs are capable of regulating energy metabolism and expenditure, appetite control, glucose homeostasis, and insulin sensitivity, offering promising avenues for the development of effective treatments.
Code availability
The codes used in this analysis are available on our GitHub page at https://github.com/angmiayang/integrative_obesity_analysis.git. These codes are freely available, enabling reproducibility and further exploration of our findings.
References
Keramat SA, Alam K, Rana RH, Chowdhury R, Farjana F, Hashmi R, et al. Obesity and the risk of developing chronic diseases in middle-aged and older adults: Findings from an Australian longitudinal population survey, 2009–2017. PLoS One. 2021;16:e0260158.
Hales CM, National Center for Health S. Prevalence of obesity and severe obesity among adults: United States, 2017–2018. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics; 2020.
Dietrich MO, Horvath TL. Limitations in anti-obesity drug development: the critical role of hunger-promoting neurons. Nat Rev Drug Discov. 2012;11:675–91.
Pushpakom S, Iorio F, Eyers PA, Escott KJ, Hopper S, Wells A, et al. Drug repurposing: progress, challenges and recommendations. Nat Rev Drug Discov. 2019;18:41–58.
Gjermeni E, Kirstein AS, Kolbig F, Kirchhof M, Bundalian L, Katzmann JL, et al. Obesity-An Update on the Basic Pathophysiology and Review of Recent Therapeutic Advances. Biomolecules. 2021;11:1426.
Tam V, Patel N, Turcotte M, Bosse Y, Pare G, Meyre D. Benefits and limitations of genome-wide association studies. Nat Rev Genet. 2019;20:467–84.
Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, et al. Genetic studies of body mass index yield new insights for obesity biology. Nature. 2015;518:197–206.
Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, et al. Meta-analysis of genome-wide association studies for height and body mass index in approximately 700000 individuals of European ancestry. Hum Mol Genet. 2018;27:3641–9.
Kamali Z, Keaton JM, Haghjooy Javanmard S, International Consortium Of Blood P, Million Veteran P, e QC. Large-scale multi-omics studies provide new insights into blood pressure regulation. Int J Mol Sci. 2022;23:7557.
Cheng M, Mei B, Zhou Q, Zhang M, Huang H, Han L, et al. Computational analyses of obesity associated loci generated by genome-wide association studies. PLoS One. 2018;13:e0199987.
Yu Y, Zhang YH, Liu L, Yu LL, Li JP, Rao JA, et al. Bioinformatics analysis of candidate genes and potential therapeutic drugs targeting adipose tissue in obesity. Adipocyte 2022;11:1–10.
Dong S, Zhao N, Spragins E, Kagda MS, Li M, Assis P, et al. Annotating and prioritizing human non-coding variants with RegulomeDB v.2. Nat Genet. 2023;55:724–6.
Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun. 2017;8:1826.
Mancuso N, Shi H, Goddard P, Kichaev G, Gusev A, Pasaniuc B. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet. 2017;100:473–87.
Szklarczyk D, Kirsch R, Koutrouli M, Nastou K, Mehryary F, Hachilif R, et al. The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023;51:D638–46.
Chin CH, Chen SH, Wu HH, Ho CW, Ko MT, Lin CY. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 2014;8:S11.
Evangelista JE, Xie Z, Marino GB, Nguyen N, Clarke DJB, Ma’ayan A. Enrichr-KG: bridging enrichment analysis across multiple libraries. Nucleic Acids Res. 2023;51:W168–79.
Freshour SL, Kiwala S, Cotto KC, Coffman AC, McMichael JF, Song JJ, et al. Integration of the Drug-Gene Interaction Database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res 2021;49:D1144–51.
Kuhn M, Letunic I, Jensen LJ, Bork P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016;44:D1075–9.
Yang J, Ferreira T, Morris AP, Medland SE, Genetic Investigation of ATC, Replication DIG. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet. 2012;44:369–75.
Nica AC, Dermitzakis ET. Expression quantitative trait loci: present and future. Philos Trans R Soc Lond B Biol Sci. 2013;368:20120362.
Consortium GT. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020;369:1318–30.
de Leeuw CA, Mooij JM, Heskes T, Posthuma D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput Biol. 2015;11:e1004219.
Gusev A, Ko A, Shi H, Bhatia G, Chung W, Penninx BW, et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat Genet. 2016;48:245–52.
Loos RJF, Yeo GSH. The genetics of obesity: from discovery to biology. Nat Rev Genet. 2022;23:120–33.
von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature 2002;417:399–403.
Koschutzki D, Schreiber F. Centrality analysis methods for biological networks and their application to gene regulatory networks. Gene Regul Syst Bio. 2008;2:193–201.
Maleki F, Ovens K, Hogan DJ, Kusalik AJ. Gene set analysis: challenges, opportunities, and future research. Front Genet. 2020;11:654.
Spear BB, Heath-Chiozzi M, Huff J. Clinical application of pharmacogenetics. Trends Mol Med. 2001;7:201–4.
Coleman JJ, Pontefract SK. Adverse drug reactions. Clin Med (Lond). 2016;16:481–5.
Lipsitch M, Tchetgen Tchetgen E, Cohen T. Negative controls: a tool for detecting confounding and bias in observational studies. Epidemiology. 2010;21:383–8.
Munoz-Fuentes V, Cacheiro P, Meehan TF, Aguilar-Pimentel JA, Brown SDM, Flenniken AM, et al. The International Mouse Phenotyping Consortium (IMPC): a functional catalogue of the mammalian genome that informs conservation. Conserv Genet. 2018;19:995–1005.
Ndiaye FK, Huyvaert M, Ortalli A, Canouil M, Lecoeur C, Verbanck M, et al. The expression of genes in top obesity-associated loci is enriched in insula and substantia nigra brain regions involved in addiction and reward. Int J Obes (Lond). 2020;44:539–43.
Zeltser LM. Feeding circuit development and early-life influences on future feeding behaviour. Nat Rev Neurosci. 2018;19:302–16.
Grill HJ, Hayes MR. Hindbrain neurons as an essential hub in the neuroanatomically distributed control of energy balance. Cell Metab. 2012;16:296–309.
Timshel PN, Thompson JJ, Pers TH. Genetic mapping of etiologic brain cell types for obesity. Elife 2020;9:e55851.
O’Brien PD, Hinder LM, Callaghan BC, Feldman EL. Neurological consequences of obesity. Lancet Neurol. 2017;16:465–77.
Niu Y, Chang P, Liu T, Shen X, Zhao H, Zhang M, et al. Obese mice induced by high-fat diet have differential expression of circular RNAs involved in endoplasmic reticulum stress and neuronal synaptic plasticity of hippocampus leading to obesity-associated cognitive impairment. Front Mol Neurosci. 2022;15:1000482.
Seok H, Park HJ, Lee BW, Kim JW, Jung M, Lee SR, et al. Association of annexin A5 polymorphisms with obesity. Biomed Rep. 2013;1:654–8.
Bagchi DP, Nishii A, Li Z, DelProposto JB, Corsa CA, Mori H, et al. Wnt/beta-catenin signaling regulates adipose tissue lipogenesis and adipocyte-specific loss is rigorously defended by neighboring stromal-vascular cells. Mol Metab. 2020;42:101078.
Flanagan SE, Vairo F, Johnson MB, Caswell R, Laver TW, Lango Allen H, et al. A CACNA1D mutation in a patient with persistent hyperinsulinaemic hypoglycaemia, heart defects, and severe hypotonia. Pediatr Diabetes. 2017;18:320–3.
Shen H, Huang X, Zhao Y, Wu D, Xue K, Yao J, et al. The Hippo pathway links adipocyte plasticity to adipose tissue fibrosis. Nat Commun. 2022;13:6030.
Fan F, Wu Y, Hara M, Rizk A, Ji C, Nerad D, et al. Dynamin deficiency causes insulin secretion failure and hyperglycemia. Proc Natl Acad Sci USA 2021;118:e2021764118.
Nishimura Y, Sasagawa S, Ariyoshi M, Ichikawa S, Shimada Y, Kawaguchi K, et al. Systems pharmacology of adiposity reveals inhibition of EP300 as a common therapeutic mechanism of caloric restriction and resveratrol for obesity. Front Pharm. 2015;6:199.
Wu L, Zhao X, Shen Y, Zhang MX, Yan Y, Hou D, et al. Promoter methylation of fas apoptotic inhibitory molecule 2 gene is associated with obesity and dyslipidaemia in Chinese children. Diab Vasc Dis Res. 2015;12:217–20.
Zhu H, Ge K, Lu J, Jia C. Downregulation of GNAI3 Promotes the Pathogenesis of Methionine/Choline-Deficient Diet-Induced Nonalcoholic Fatty Liver Disease. Gut Liver. 2020;14:492–9.
Kochetova OV, Avzaletdinova DS, Korytina GF, Morugova TV, Mustafina OE. The association between eating behavior and polymorphisms in GRIN2B, GRIK3, GRIA1 and GRIN1 genes in people with type 2 diabetes mellitus. Mol Biol Rep. 2020;47:2035–46.
Lappas M. GSK3beta is increased in adipose tissue and skeletal muscle from women with gestational diabetes where it regulates the inflammatory response. PLoS One. 2014;9:e115854.
Zhu H, Guariglia S, Li W, Brancho D, Wang ZV, Scherer PE, et al. Role of extracellular signal-regulated kinase 5 in adipocyte signaling. J Biol Chem. 2014;289:6311–22.
Yang SB, Tien AC, Boddupalli G, Xu AW, Jan YN, Jan LY. Rapamycin ameliorates age-dependent obesity associated with increased mTOR signaling in hypothalamic POMC neurons. Neuron. 2012;75:425–36.
Williams MJ, Eriksson A, Shaik M, Voisin S, Yamskova O, Paulsson J, et al. The obesity-linked gene Nudt3 Drosophila homolog Aps is associated with insulin signaling. Mol Endocrinol. 2015;29:1303–19.
Koch N, Koch D, Krueger S, Troger J, Sabanov V, Ahmed T, et al. Syndapin I Loss-of-Function in Mice Leads to Schizophrenia-Like Symptoms. Cereb Cortex. 2020;30:4306–24.
Huang W, Bansode RR, Bal NC, Mehta M, Mehta KD. Protein kinase Cbeta deficiency attenuates obesity syndrome of ob/ob mice by promoting white adipose tissue remodeling. J Lipid Res. 2012;53:368–78.
Georgescu T, Lyons D, Heisler LK. Role of serotonin in body weight, insulin secretion and glycaemic control. J Neuroendocrinol. 2021;33:e12960.
Tomassoni-Ardori F, Fulgenzi G, Becker J, Barrick C, Palko ME, Kuhn S, et al. Rbfox1 up-regulation impairs BDNF-dependent hippocampal LTP by dysregulating TrkB isoform expression levels. Elife. 2019;8:e49673.
Nordquist N, Gokturk C, Comasco E, Eensoo D, Merenakk L, Veidebaum T, et al. The transcription factor TFAP2B is associated with insulin resistance and adiposity in healthy adolescents. Obes (Silver Spring). 2009;17:1762–7.
Pers TH, Karjalainen JM, Chan Y, Westra HJ, Wood AR, Yang J, et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat Commun. 2015;6:5890.
Catrysse L, Maes B, Mehrotra P, Martens A, Hoste E, Martens L, et al. A20 deficiency in myeloid cells protects mice from diet-induced obesity and insulin resistance due to increased fatty acid metabolism. Cell Rep. 2021;36:109748.
Guard DB, Swartz TD, Ritter RC, Burns GA, Covasa M. NMDA NR2 receptors participate in CCK-induced reduction of food intake and hindbrain neuronal activation. Brain Res. 2009;1266:37–44.
Miki T, Liss B, Minami K, Shiuchi T, Saraya A, Kashima Y, et al. ATP-sensitive K+ channels in the hypothalamus are essential for the maintenance of glucose homeostasis. Nat Neurosci. 2001;4:507–12.
Wang GJ, Volkow ND, Thanos PK, Fowler JS. Imaging of brain dopamine pathways: implications for understanding obesity. J Addict Med. 2009;3:8–18.
Carlin J, Hill-Smith TE, Lucki I, Reyes TM. Reversal of dopamine system dysfunction in response to high-fat diet. Obes (Silver Spring). 2013;21:2513–21.
Fritz BM, Munoz B, Yin F, Bauchle C, Atwood BK. A High-fat, High-sugar ‘Western’ Diet Alters Dorsal Striatal Glutamate, Opioid, and Dopamine Transmission in Mice. Neuroscience. 2018;372:1–15.
Herman AM, Ortiz-Guzman J, Kochukov M, Herman I, Quast KB, Patel JM, et al. A cholinergic basal forebrain feeding circuit modulates appetite suppression. Nature. 2016;538:253–6.
Martinelli I, Tomassoni D, Roy P, Amenta F, Tayebati SK. Altered Brain Cholinergic and Synaptic Markers in Obese Zucker Rats. Cells 2021;10:2528.
Rasmussen MH. Obesity, growth hormone and weight loss. Mol Cell Endocrinol. 2010;316:147–53.
Huang Z, Wang W, Huang L, Guo L, Chen C. Suppression of Insulin Secretion in the Treatment of Obesity: A Systematic Review and Meta-Analysis. Obes (Silver Spring). 2020;28:2098–106.
Xu W, Wang X, Liu D, Lin X, Wang B, Xi C, et al. Identification and validation of hub genes and potential drugs involved in osteoarthritis through bioinformatics analysis. Front Genet. 2023;14:1117713.
Graham SE, Clarke SL, Wu KH, Kanoni S, Zajac GJM, Ramdas S, et al. The power of genetic diversity in genome-wide association studies of lipids. Nature. 2021;600:675–9.
Villa E, Naves R, Bezares K, Cobena K, Villarroel AC, Guevara C. Pearls & Oy-sters: Managing Cholesterol in a Patient With Statin Intolerance Due to Anti-HMG-CoA Reductase-Associated Myopathy. Neurology. 2022;99:909–13.
Parikh RR, Breve F, Magnusson P, Behzadi P, Pergolizzi J. The Use of Monoclonal Antibody-Based Proprotein Convertase Subtilisin-Kexin Type 9 (PCSK9) Inhibitors in the Treatment of Hypercholesterolemia. Cureus. 2022;14:e25641.
Serralde-Zuniga AE, Gonzalez Garay AG, Rodriguez-Carmona Y, Melendez G. Fluoxetine for adults who are overweight or obese. Cochrane Database Syst Rev. 2019;10:CD011688.
McElroy SL, Hudson JI, Malhotra S, Welge JA, Nelson EB, Keck PE Jr. Citalopram in the treatment of binge-eating disorder: a placebo-controlled trial. J Clin Psychiatry. 2003;64:807–13.
McElroy SL, Hudson JI, Capece JA, Beyers K, Fisher AC, Rosenthal NR, et al. Topiramate for the treatment of binge eating disorder associated with obesity: a placebo-controlled study. Biol Psychiatry. 2007;61:1039–48.
Apovian CM, Aronne LJ. Zonisamide for weight reduction in obese adults. JAMA. 2013;310:637–8.
Hui F, Zhang Y, Ren T, Li X, Zhao M, Zhao Q. Role of metformin in overweight and obese people without diabetes: a systematic review and network meta-analysis. Eur J Clin Pharm. 2019;75:437–50.
Luck H, Tsai S, Chung J, Clemente-Casares X, Ghazarian M, Revelo XS, et al. Regulation of obesity-related insulin resistance with gut anti-inflammatory agents. Cell Metab. 2015;21:527–42.
Jia P, Zhao Z. Network assisted analysis to prioritize GWAS results: principles, methods, and perspectives. Hum Genet. 2014;133:125–38.
Wilson JF, Weale ME, Smith AC, Gratrix F, Fletcher B, Thomas MG, et al. Population genetic structure of variable drug response. Nat Genet. 2001;29:265–9.
Fellmann C, Gowen BG, Lin PC, Doudna JA, Corn JE. Cornerstones of CRISPR-Cas in drug discovery and therapy. Nat Rev Drug Discov. 2017;16:89–100.
Acknowledgements
We gratefully acknowledge the invaluable contributions of members of the National Center for Global Health and Medicine (NCGM), who have provided their expertise, guidance, and support throughout this study.
Funding
Open access funding provided by The University of Tokyo.
Author information
Authors and Affiliations
Contributions
MYA and FT conceived the idea of the study. MYA performed data analysis and interpretation of the results. FT and NK supervised the conduct of this study. All authors reviewed and revised the manuscript draft. All authors approved the final version of the manuscript to be published.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ang, M.Y., Takeuchi, F. & Kato, N. Deciphering the genetic landscape of obesity: a data-driven approach to identifying plausible causal genes and therapeutic targets. J Hum Genet 68, 823–833 (2023). https://doi.org/10.1038/s10038-023-01189-3
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s10038-023-01189-3