Introduction

Sarcopenia is the age-related loss of skeletal muscle mass and strength, accompanied by functional impairment. As such, it is associated with disability, poor quality of life, and increased mortality1,2,3. Considering the difficulties posed by frailty, and the healthcare costs associated with age-related conditions, such as sarcopenia4,5, it is necessary to identify meaningful disease phenotypes and biomarkers. Several studies have suggested various criteria for defining sarcopenia6,7,8,9,10. Muscle mass is thought to be an important factor for the diagnosis of sarcopenia; of the parameters related to muscle mass, lean body mass (LBM) is frequently used to predict sarcopenia. In addition, the Asian working group for sarcopenia 2019 (AWGS 2019) recently reached the consensus that skeletal muscle index (SMI) and appendicular skeletal muscle (ASM) may also be reliable parameters11. In this respect, in addition to LBM, it is necessary to analyse SMI and ASM to understand the complex aetiology of sarcopenia, which can be attributed to a variety of factors, including oxidative stress, inflammation, mitochondrial dysregulation, and genetic factors12,13.

Muscle mass has a genetic trait phenotype, with a heritability estimate of over 50%14. Studies have investigated the genetic factors of LBM using the associations of single nucleotide polymorphisms (SNPs)15,16,17,18,19. Moreover, as osteoporosis and sarcopenia share a common risk factor (ageing), several studies have conducted joint genome-wide association study (GWAS) analyses on overlapping genetic variants20,21. Notably, a GWAS of osteoporosis revealed 64 loci. In contrast, fewer loci were identified by a GWAS of muscle-related phenotypes, thus providing fewer biological insights into pathways regarding sarcopenia20. To address this issue, a large GWAS meta-analysis was conducted using 20 cohorts of European ancestry, identifying a set of five loci (HSD17B11, VCAN, ADAMTSL3, IRS1 and FTO) for total LBM, and SNPs related to IRS1, ADAMTSL3, and VCAN for appendicular LBM22. However, as this study analysed the entire cohort, irrespective of age, its findings regarding genes associated with sarcopenia as a senile disease were limited. In addition, the GWAS was based on European ancestry, and little is known regarding genetic determinants in elderly East Asians. Furthermore, few genetic studies have utilised the new indices for sarcopenia (released by the AWGS in 2019), namely ASM and SMI11. Thus, a need exists for the investigation of genetic components associated with sarcopenia using multiple cohorts comprising elderly East Asians. The current study conducted a GWAS meta-analysis on sarcopenia phenotypes using relatively aged Korean cohorts, combining the Veterans Health Service Medical Center (VHSMC) and Korean Association Resource (KARE) cohorts.

Results

Characteristics of the study participants

A total of 7753 eligible subjects were included in this study (2518 subjects from the VHSMC cohort and 5235 from the KARE cohorts). However, 792 were excluded due to the exclusion criteria (Fig. 1), leaving a remainder of 6961 participants (1781 subjects from the VHSMC cohort and 5180 from the KARE cohort) that were included in analyses. The mean age of the VHSMC cohort was higher than that of the KARE cohort (69.10 ± 7.83 years vs 62.79 ± 8.33 years, P < 0.001, Table 1). No significant difference was observed in mean height (1.59 ± 0.08 m in VHSMC vs 1.59 ± 0.09 m in KARE, P = 1.000) between the two cohorts. The mean weight (63.24 ± 10.51 kg in VHSMC vs 62.64 ± 10.37 kg in KARE, P = 0.037) and BMI (24.74 ± 3.21 kg/m2 in VHSMC vs 24.53 ± 3.15 kg/m2 in KARE, P = 0.016) were statistically different between the cohorts. The LBM of the VHSMC cohort was lower than that of the KARE cohort (40.10 ± 7.83 kg vs 42.03 ± 8.27 kg, respectively, P < 0.001), whereas the body fat mass (BFM) of the VHSMC cohort was higher than that of the KARE cohort (20.60 ± 6.23 kg vs 18.28 ± 5.85 kg, respectively, P < 0.001). Descriptive statistics for subgroups according to sex are also presented in Table 1. The mean values of SMI and ASM, which could only be calculated for the VHSMC cohort, were 6.77 ± 1.00 kg/m2 and 17.49 ± 4.06 kg, respectively.
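The text does not state which test produced the between-cohort P-values. As a rough plausibility check only, the following sketch compares two means from summary statistics, assuming an unequal-variance comparison with a normal approximation (reasonable for samples of this size) and assuming the post-exclusion sample sizes (1781 and 5180) apply. The function name `welch_z_test` is hypothetical and is not part of the study's pipeline.

```python
import math


def welch_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Compare two means from summary statistics (unequal variances).

    Returns (z, two_sided_p) using a normal approximation, which is an
    assumption made here for illustration, not the study's stated method.
    """
    # Standard error of the difference in means without pooling variances.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


if __name__ == "__main__":
    # Summary values taken from the text: VHSMC vs KARE.
    for label, args in {
        "age (years)": (69.10, 7.83, 1781, 62.79, 8.33, 5180),
        "height (m)": (1.59, 0.08, 1781, 1.59, 0.09, 5180),
        "weight (kg)": (63.24, 10.51, 1781, 62.64, 10.37, 5180),
    }.items():
        z, p = welch_z_test(*args)
        print(f"{label}: z = {z:.2f}, p = {p:.3g}")
```

Because the inputs are rounded means and standard deviations, the recomputed values should only be expected to agree approximately with those in Table 1.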

Figure 1
Schematic of study population.

Table 1 Baseline characteristics of study populations.

GWAS meta-analysis of lean body mass and body fat mass

A total of 2,360,975 SNPs were used for the GWAS meta-analysis of LBM and BFM. Quantile–quantile (Q-Q) and Manhattan plots for LBM are shown in Fig. 2. The Q-Q plot revealed no evidence of test statistic inflation (variance inflation factor [VIF] = 1.044). The top ten variants for LBM are listed in Table 2; two of these were genome-wide significant loci. The most significant variant was rs1187118 (effect = 0.720, standard error [SE] = 0.117, P = 1.09 × \({10}^{-9}\), HetPVal = 0.199) near Glutamate Metabotropic Receptor 4 (GRM4) and High Mobility Group AT-Hook 1 (HMGA1), followed by rs3768582 (effect = 0.554, SE = 0.100, P = 4.09 × \({10}^{-8}\), HetPVal = 0.537) near Neutrophil Cytosolic Factor 2 (NCF2). The remaining eight variants are presented as candidate loci in Table 2. The Q-Q and Manhattan plots for BFM are shown in Supplementary Fig. S1. The Q-Q plot revealed no evidence of test statistic inflation (VIF = 1.037). The GWAS meta-analysis for BFM showed no genome-wide significant loci; the variant with the smallest P-value was rs1592269 (effect = 0.753, SE = 0.148, P = 3.43 × \({10}^{-7}\), HetPVal = 0.379) near GRM4 and HMGA1. The top ten candidate loci associated with BFM with P-values < 1.00 × \({10}^{-5}\) are listed in Supplementary Table S1. As the GWAS results for the LBM and BFM phenotypes pointed to the same region (GRM4 and HMGA1), linkage disequilibrium (LD) analysis was performed. A high LD (\(r^{2}\) = 0.935) was observed between rs1187118 and rs1592269, indicating that the two variants are strongly correlated rather than independent signals.
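To make the relationship between the reported effect sizes, standard errors, and the genome-wide threshold concrete, the following sketch recomputes a two-sided Wald P-value from an effect and its SE. It assumes a normal approximation for the test statistic; the helper names are hypothetical. Because the published effect and SE values are rounded, the recomputed P-values are expected to be close to, but not identical with, those in the text.

```python
import math

# Genome-wide significance threshold stated in the Methods.
GENOME_WIDE_ALPHA = 5.0e-8


def wald_p_value(effect, se):
    """Two-sided P-value for effect / se under a normal approximation."""
    z = effect / se
    return math.erfc(abs(z) / math.sqrt(2))


def is_genome_wide_significant(effect, se, alpha=GENOME_WIDE_ALPHA):
    """True when the Wald P-value falls below the genome-wide threshold."""
    return wald_p_value(effect, se) < alpha


if __name__ == "__main__":
    # (effect, SE) pairs as reported in the text.
    variants = {
        "rs1187118 (LBM)": (0.720, 0.117),
        "rs3768582 (LBM)": (0.554, 0.100),
        "rs1592269 (BFM)": (0.753, 0.148),
    }
    for name, (effect, se) in variants.items():
        p = wald_p_value(effect, se)
        print(f"{name}: p = {p:.2e}, significant = {p < GENOME_WIDE_ALPHA}")
```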

Figure 2
Manhattan and quantile–quantile plots for lean body mass in the meta-analysis. (A) Manhattan plot of the P-values in the genome-wide association study (GWAS) meta-analysis for lean body mass. (B) Quantile–quantile (Q-Q) plot showing expected vs. observed \(-\log_{10}(P)\) values. The expected line is shown in red and confidence bands are shown in grey.

Table 2 Results of GWAS meta-analysis for lean body mass (leading SNPs, top 10).

GWAS of appendicular skeletal muscle and skeletal muscle index

A total of 2,804,834 SNPs were used for GWAS analyses of ASM and SMI, using only the VHSMC cohort. The Q-Q and Manhattan plots for ASM are shown in Fig. 3; the Q-Q plot did not exhibit evidence of test statistic inflation (VIF = 1.031). The top ten variants for ASM are listed in Table 3; the only genome-wide significant variant was rs6772958 (effect = −0.456, SE = 0.081, P = 2.30 × \({10}^{-8}\)) near zinc finger protein 860 (ZNF860) and Glycerol-3-Phosphate Dehydrogenase 1 Like (GPD1L). The Q-Q and Manhattan plots for SMI are shown in Supplementary Fig. S2 and revealed no evidence of test statistic inflation (VIF = 1.034). However, the GWAS for SMI exhibited no genome-wide significant loci; the variant with the smallest P-value was rs6772958 (effect = −0.121, SE = 0.023, P = 1.72 × \({10}^{-7}\), HetPVal = 0.379) near ZNF860 and GPD1L. The top ten candidate loci with P-values < 1.00 × \({10}^{-5}\) are listed in Supplementary Table S2.

Figure 3
Manhattan and quantile–quantile plots for appendicular skeletal muscle in the genome-wide association analysis. (A) Manhattan plot of the P-values in the genome-wide association study (GWAS) for appendicular skeletal muscle. (B) Quantile–quantile (Q-Q) plot showing expected vs. observed \(-\log_{10}(P)\) values. The expected line is shown in red and confidence bands are shown in grey.

Table 3 Results of GWAS for appendicular skeletal muscle (leading SNPs, top 10).

Regional analysis and functional annotation

For the genome-wide significant variants of each phenotype (LBM and ASM), the regional plots with the lead SNPs are displayed in Figs. 4 and 5. The first phenotype of interest was LBM. For the most significant SNP, rs1187118, eQTL analyses from the GTEx Project (V7) showed that Ribosomal Protein S10 (RPS10) was highly expressed in sun-exposed lower-leg skin tissue (P = 1.40 × \({10}^{-7}\)). For a variant in LD with rs1187118, an eQTL association with Nudix Hydrolase 3 (NUDT3) was also found in skeletal muscle tissue (P = 4.30 × \({10}^{-21}\)). For the second genome-wide significant SNP, rs3768582, eQTL analyses showed that NCF2, SMG7 Nonsense-Mediated mRNA Decay Factor (SMG7), and Actin Related Protein 2/3 Complex Subunit 5 (ARPC5) were highly expressed in artery (P = 1.50 × \({10}^{-9}\)), heart (P = 2.60 × \({10}^{-5}\)), and cultured fibroblast tissue (P = 7.30 × \({10}^{-5}\)), respectively. In the differentially expressed gene (DEG) analysis with GSE38718, RPS10 (P = 8.00 × \({10}^{-4}\)), NUDT3 (P = 1.19 × \({10}^{-3}\)), NCF2 (P = 1.26 × \({10}^{-2}\)), SMG7 (P = 1.03 × \({10}^{-3}\)) and ARPC5 (P = 4.26 × \({10}^{-2}\)) were more highly expressed in the elderly group than in the young group (Table 4).

Figure 4
LocusZoom plot of genome-wide significantly associated SNPs for lean body mass.

Figure 5
LocusZoom plot of genome-wide significantly associated SNPs for appendicular skeletal muscle.

Table 4 Results of differentially expressed genes from Gene Expression Omnibus (GEO) databases.

The second phenotype of interest was ASM. For the only genome-wide significant SNP, rs6772958, eQTL analysis showed that GPD1L was highly expressed in thyroid tissue (P = 8.10 × \({10}^{-15}\)). However, in the DEG analysis of the transcriptome study (GSE38718), the difference in GPD1L expression was not significant (P = 0.159).

In addition, gene ontology (GO) analyses of biological processes revealed that the term ‘mRNA destabilisation (GO:0061157)’ (FDR-adjusted P = 0.090), which involves skeletal muscle-related genes, was enriched. The term contains a pathway of alpha-ketoglutarate-dependent dioxygenase FTO (U6 small nuclear RNA [2′-O-methyladenosine-N(6)-]-demethylase FTO), which is involved in the regulation of fat mass, adipogenesis, and body weight. Thus, it contributes to the regulation of body size and body fat accumulation23.

Discussion

This study discovered novel genetic biomarkers of LBM (rs1187118) and ASM (rs6772958) from the VHSMC and KARE cohorts, which comprise relatively aged (mean age: 69.10 vs. 62.79, respectively) Koreans. Their related genes for LBM, such as RPS10, NUDT3, NCF2, SMG7, and ARPC5, were expressed in skeletal muscle tissue. In addition, in the biological process, the term ‘mRNA destabilisation (GO: 0061157)’ (FDR-adjusted P = 0.090) was enriched for sarcopenia. This process contains alpha-ketoglutarate-dependent dioxygenase FTO. These results suggest that the pathogenesis of sarcopenia requires further investigation using a metabolic pathway linked to mRNA.

The aetiology of sarcopenia is complex and includes oxidative stress, inflammation, inadequate diets, a sedentary lifestyle, and genetic factors13. A previous study on genetic markers for sarcopenia identified the loci near FTO, ESR1, NOS3, KLF5, and HLA-DQA1 to be associated with physical phenotypes, such as low handgrip strength and decreased LBM24,25,26. Nonetheless, these identified loci can only explain a small portion of phenotypic variations; thus, additional genetic loci should be identified. A recent large meta-analysis of the Cohorts for Heart and Ageing Research in Genome Epidemiology (CHARGE) Consortium and various other cohorts identified only a few loci, such as FTO and VCAN for LBM22. Therefore, assuming that identifying genetic variants for sarcopenia is challenging, we conducted GWAS analysis on a cohort comprising elderly subjects. The findings revealed that several genetic variants related to metabolism could be of importance in determining the pathogenesis of sarcopenia. Previous sarcopenia GWAS for European descendants showed association with FTO22,27,28 and several loci, including TGFA and HLA-DRB129.

Our meta-analysis for both LBM and BFM showed significant differences in the intergenic area of GRM4 and HMGA1, with a high LD between rs1187118 and rs1592269. HMGA1 is overexpressed in adipose tissue, impairs adipogenesis, and prevents diet-induced obesity and insulin resistance30. The top loci for LBM and BFM were similar, as were those for ASM and SMI, since LBM was calculated as body weight minus BFM, and SMI was calculated as ASM/height\(^{2}\). Hence, it would be useful to calculate the correlation and genetic correlation for each pair of parameters. In the VHSMC cohort, the correlation and genetic correlation between LBM and BFM were 0.078 and 0.078, respectively, whereas those between SMI and ASM were 0.948 and 0.948, respectively. In the KARE cohort, the correlation between LBM and BFM was −0.02, with a genetic correlation of 0.349.

The eQTL analysis for muscle mass using GTEx datasets showed that RPS10, NUDT3, NCF2, SMG7, and ARPC5 were differentially expressed in the muscle tissue for sarcopenia. However, this finding requires further validation. As HMGA1, RPS10, and SIMM29 are located upstream of NUDT3 and may have a regulatory role in the association of NUDT3 with sarcopenia, further focus should be directed towards NUDT3. A previous study by Singh et al. suggested that NUDT3 was a candidate target locus, and emphasised the need for real-world validation using transcriptome-wide association study (TWAS) approaches that combine GWAS and eQTL summary data24. In the current study, NUDT3 was found to be related to LBM in an elderly cohort. NUDT3 belongs to the MutT or Nudix protein families, which act as homeostatic checkpoints at important stages in inositol phosphate metabolic pathways. These pathways, such as phosphatidyl-1D-myo-inositol and glycerophospholipid metabolism, from the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (https://www.kegg.jp/pathway/map00564)31 may, therefore, be related to LBM. For these reasons, it is necessary to understand the metabolic aspects of sarcopenia. A study of DEGs in skeletal muscle tissues from patients with cachexia32 identified NCF2 in signalling pathways related to inflammation. These findings are consistent with the findings of the present study. SMG7 encodes a protein that is essential for nonsense-mediated mRNA decay and, according to the GWAS catalogue (https://www.ebi.ac.uk/gwas/), is related to body height and BMI-adjusted waist circumference. As SMG7 is linked with telomerase reverse transcriptase (TERT), sarcopenia may be related to muscle cell senescence via microRNA-19533. In addition, ARPC5 encodes a subunit of the actin-related protein 2/3 complex, which exhibited a negative fold change in expression related to the cytoskeleton in muscle tissue34. As these findings may represent secondary changes, or may be postulated from bioinformatics analysis, further studies are needed.

Additionally, the present study found that GPD1L is a significant genetic marker for ASM and SMI in the VHSMC cohort. Although this is a novel finding for GWAS using ASM as a parameter for sarcopenia, it requires further validation by studies from several cohorts. A tissue-based study of rat muscle identified GPD1L as a candidate locus for sarcopenia35. These findings were also observed in a previous study that investigated the sarcopenic muscle tissue of elderly women36, in which GPD1L was found to be downregulated via cytoplasmic energy metabolism. In addition, a systemic genetic approach identified that GPD1L and its molecular mechanism for obesity in human adipose tissue were associated with energy metabolism37. GPD1L expression was found to be negatively correlated with microRNA-210 (miR-210) levels, and was consistently downregulated in obese subjects37. The authors of that study hypothesised that decreased miR-210 levels increased GPD1L, thus inhibiting hypoxia-inducible factor-1α (HIF-1α) activity. A previous study of circulating miRNAs in plasma revealed that miR-210 is significantly downregulated in elderly patients with sarcopenia, compared to patients without sarcopenia38. Combined with the results of previous studies37,38, the findings presented here suggest that GPD1L could be a genetic biomarker for sarcopenia, based on both the miR-210 and HIF-1α pathways. Hence, an additional biomarker for sarcopenia may be postulated from this metabolic research. Recent studies of plasma biomarkers for sarcopenia have identified higher levels of amino acids and lower levels of phosphatidylcholines (PCs) and lysophosphatidylcholine (lysoPC)39,40. The association between GPD1L, PCs or lysoPC, and sarcopenia may involve (1) dysregulation of GPD1L related to decreased PCs and lysoPC from previous lipid biomarkers39,40, or (2) an increase in the glycerol-3-phosphate pathway inducing changes in glycolysis via GPD1L. However, the results of the present study can only be used to suggest a genetic hypothesis; thus, further follow-up studies are needed.

Analysis of the enriched biological processes identified via GO analysis of the cohorts revealed that alpha-ketoglutarate-dependent dioxygenase FTO is related to sarcopenia. This finding is consistent with those of a previous study on the influences of FTO and muscle phenotypes27. In addition, alpha-ketoglutarate is a component of the tricarboxylic acid cycle, which is related to the HIF-1α pathway. This evidence suggests that a simultaneous understanding of both genes and gene-metabolic pathways is necessary to understand the pathogenesis of sarcopenia.

One of the primary strengths of this study is the utilisation of a relatively elderly cohort sample, which provides a better sarcopenic phenotype. Here, NUDT3 and RPS10 were replicated using a real cohort, which was an approach suggested by a previous study using the TWAS of muscle tissue24. Furthermore, our study focused on East Asian subjects, which have not been fully evaluated, unlike other ethnic groups. In this regard, we conducted phenome-wide association studies (pheWAS) using the “Common metabolic disease knowledge portal” (https://hugeamp.org), indicating that SNPs such as rs1187118, rs3768582, and rs6772958 are related to metabolic conditions such as waist-hip ratio, lipid metabolism, and body fat percentage in the European population (Supplementary Table S3).

Nevertheless, certain limitations were noted in this study. First, although novel signals for LBM and ASM were discovered with genome-wide significance, our results were based on bioinformatics analysis and, therefore, must be replicated in other Asian cohorts or multi-ethnic samples. A larger number of samples for phenotypes such as ASM and SMI would improve the study’s validity. Hence, further studies, including replication or meta-analysis, are needed in other cohorts of the Asian population. Moreover, the number of SNPs (2,804,834) in the VHSMC cohort was limited as we set the imputation accuracy to 0.9. These points should be considered in the interpretation of the results. Second, the difference in ageing biology between sexes further hinders the identification of meaningful biomarkers for age-related conditions. Although GWAS analysis was conducted according to sex, the results did not show loci with genome-wide significance. It is expected that a metabolite-GWAS, considering sex as a factor, could help address this problem. Third, bioinformatics analysis revealed that genetic variants and metabolic pathways were related to sarcopenia; however, the causality of this hypothesis requires further investigation. Moreover, previous studies on genetic variants in sarcopenia have shown that these variants may be associated with the effects of genetic, metabolic, and environmental factors22,27,28. Fourth, we used bioelectrical impedance analysis (BIA) for LBM and BFM examinations, as it is a non-invasive method for measuring body composition. However, dual-energy X-ray absorptiometry (DXA) is the standard method for measuring muscle mass. BIA and DXA have different limitations for studies using body composition measurements. A previous study that compared these two methods found that BIA overestimated ASM compared to DXA41.
In addition, BIA devices differed between the two cohorts (InBody 3.0 for the KARE cohort, InBody 770 for the VHSMC cohort), which may be a confounding factor. In a technical review of BIA for people with high body fat, InBody 3.0 tended to give lower values, with a difference of about 2% in an extreme case (unpublished data). A previous study showed that different BIA devices were reliable, as indicated by high intraclass correlation coefficients and low standard errors42. Since the focus of our study was on muscle mass rather than fat mass, and we analysed each cohort separately with its own principal components, differences associated with the BIA device between the two cohorts would not significantly influence the LBM values and analysis results presented in this study. However, these differences should be considered when interpreting the results.

In conclusion, sarcopenia can result in adverse outcomes, such as an increased risk of falls, a decreased quality of life, and mortality. Thus, it is necessary to identify a biomarker for this condition. Here, the loci near genes such as RPS10, NCF2, SMG7, ARPC5, and NUDT3 were identified as significant biomarkers for LBM. In addition, the loci near GPD1L were identified as significant biomarkers for ASM and SMI, which serve as novel indices for sarcopenia. These genes are related to metabolic pathways, such as glycerophospholipid pathways, energy metabolic pathways, the inositol phosphate and HIF-1α pathways, and alpha-ketoglutarate-dependent dioxygenase FTO. Further studies are required to evaluate the aetiology of sarcopenia.

Methods

Study subjects

A schematic of the analytical study design is shown in Fig. 1. Data were obtained from two cohorts: the VHSMC (n = 2518) and KARE (Ansan/Ansung study: from Korean Genome and Epidemiology Cohort, n = 5235) cohorts. Each cohort has its own distinct characteristics. The VHSMC cohort is a hospital-based elderly cohort that includes many patients with various diseases. The KARE cohort is a nationwide representative cohort for genome research in Korea; it is a longitudinal cohort of the Ansan and Ansung communities in Korea. This study included subjects from the KARE and VHSMC cohorts for whom microarray data were available. Patients who had functional declines or limitations, or who had chronic diseases that may affect primary sarcopenia according to AWGS 201911, were excluded. After exclusion, 6961 participants were enrolled across both cohorts (Fig. 1). The institutional review boards of the Veterans Health Service Medical Center approved this study protocol and informed consent waiver (IRB No. 2020-02-015 and IRB No. 2021-05-005), since this study was performed in a retrospective manner, and the study was conducted in compliance with the Helsinki Declaration. The committee of VHS Biobank (VBP-2020-03) and the National Biobank of Korea (KBN-2021-041) approved the use of bioresources for this study.

Muscle mass measurement

BIA measurements were performed using InBody 770 (Biospace Co., LTD, Seoul, Korea) in the VHSMC cohort and using InBody 3.0 (Biospace Co., LTD, Seoul, Korea) in the KARE cohort. Each subject stood on the footplate and held both of the hand electrodes. The screen automatically displayed measurements of LBM (kg), skeletal muscle mass (kg), BFM (kg), and body fat percentage (%). LBM and BFM data were available for both cohorts and were used as initial phenotypes for analysis. Subgroup analysis was conducted using ASM or SMI, which were derived from BIA; these data were available only for the VHSMC cohort. The parameters were defined according to the consensus of the AWGS 201911.
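The Discussion describes LBM as body weight minus BFM and SMI as ASM divided by height squared. A minimal sketch of these two derived quantities is given here; the function names and the example values are hypothetical and are not taken from either cohort.

```python
def lean_body_mass(weight_kg, body_fat_mass_kg):
    """LBM (kg) as body weight minus body fat mass."""
    if body_fat_mass_kg > weight_kg:
        raise ValueError("body fat mass cannot exceed body weight")
    return weight_kg - body_fat_mass_kg


def skeletal_muscle_index(asm_kg, height_m):
    """SMI (kg/m^2) as appendicular skeletal muscle mass over height squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return asm_kg / height_m ** 2


if __name__ == "__main__":
    # Illustrative (made-up) subject: 65 kg, 20 kg fat mass, 18 kg ASM, 1.60 m.
    print(lean_body_mass(65.0, 20.0))
    print(skeletal_muscle_index(18.0, 1.60))
```

Because SMI is a deterministic rescaling of ASM by height, the two phenotypes are expected to be strongly correlated, which is consistent with the similar top loci reported for them.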

Genotyping and imputation

Genomic DNA was extracted from venous blood samples, and 100 ng DNA was genotyped using Korea Biobank Array Affymetrix Axiom 1.1 (Affymetrix, Santa Clara, CA), which was designed by the Korean National Institute of Health43. Genotypes were identified with a K-medoid clustering-based algorithm to minimise the batch effect44. The PLINK (version 1.9, Boston, MA)45 and ONETOOL46 software packages were used for quality control procedures and association analyses. Samples matching any of the following criteria were excluded: (1) sex inconsistencies or (2) a call rate of 97% or lower. SNPs were filtered out if they had a low call rate or failed the Hardy–Weinberg equilibrium (HWE) test (P < 1 × \({10}^{-5}\)). The genotype imputation was conducted using the Michigan imputation server (https://imputationserver.sph.umich.edu). Only ‘non-European’ or ‘mixed’ populations from Haplotype Reference Consortium release v1.147 were used for reference purposes. Pre-phasing and imputation were performed using Eagle v2.448 and Minimac449, respectively. After imputation, imputed SNPs were removed if the R-squared (i.e., imputation accuracy) was less than 0.9, if they were duplicated, if the missing genotype rate exceeded 0.05, if the P-value for HWE was less than 1 × \({10}^{-5}\), or if the minor allele frequency (MAF) was less than 0.05. The MAF was compared with a reference such as Korean reference data (Kref) (http://coda.nih.go.kr) or the Genome Aggregation Database (GnomAD) with East Asian subjects (https://gnomad.broadinstitute.org/). Finally, 2422 subjects (and their 2,804,834 SNPs) from the VHSMC cohort and 5235 subjects (and their 3,423,819 SNPs) from the KARE cohort were used for analysis.
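The post-imputation filters listed above can be summarised as a simple predicate. The sketch below is illustrative only: the `ImputedSnp` record and its field names are hypothetical, the thresholds are those stated in the text, and the handling of duplicates (keeping the first occurrence) is an assumption, since the text does not specify it. In practice such filters are applied with dedicated tools rather than hand-written code.

```python
from dataclasses import dataclass


@dataclass
class ImputedSnp:
    """Hypothetical per-SNP summary record used only for this illustration."""
    snp_id: str
    r_squared: float     # imputation accuracy
    missing_rate: float  # fraction of missing genotypes
    hwe_p: float         # Hardy-Weinberg equilibrium test P-value
    maf: float           # minor allele frequency


def passes_qc(snp, min_r2=0.9, max_missing=0.05, min_hwe_p=1e-5, min_maf=0.05):
    """Apply the thresholds stated in the text to a single SNP."""
    return (
        snp.r_squared >= min_r2
        and snp.missing_rate <= max_missing
        and snp.hwe_p >= min_hwe_p
        and snp.maf >= min_maf
    )


def filter_snps(snps):
    """Keep SNPs that pass QC; duplicates beyond the first copy are dropped
    (an assumption made for this sketch)."""
    seen = set()
    kept = []
    for snp in snps:
        if snp.snp_id in seen:
            continue
        seen.add(snp.snp_id)
        if passes_qc(snp):
            kept.append(snp)
    return kept


if __name__ == "__main__":
    example = [
        ImputedSnp("snpA", 0.95, 0.01, 0.50, 0.20),
        ImputedSnp("snpA", 0.95, 0.01, 0.50, 0.20),  # duplicate
        ImputedSnp("snpB", 0.85, 0.01, 0.50, 0.20),  # low imputation accuracy
        ImputedSnp("snpC", 0.99, 0.01, 0.50, 0.01),  # rare variant
    ]
    print([s.snp_id for s in filter_snps(example)])
```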

Statistical analyses

Baseline characteristics of the study population are presented as means with standard deviations (SD) for continuous variables, and as numbers and proportions for categorical variables. Genome-wide analyses were conducted within each cohort using a linear model in PLINK. Age, sex, and ten principal component scores were included as covariates. Meta-analyses of the VHSMC and KARE cohorts were performed using the METAL software (http://csg.sph.umich.edu/abecasis/meta). Cochran’s Q-test for heterogeneity was conducted; its P-value is reported as ‘HetPVal’50, where HetPVal < 0.05 indicates heterogeneity between the two datasets51. The dense regional association result of each GWAS was plotted using the LocusZoom software52. The threshold for statistical significance in this model was P < 5.0 × \({10}^{-8}\), which is conventionally considered to reflect genome-wide significance.
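To illustrate how two cohort-level estimates are combined and how heterogeneity is assessed, the following sketch implements a fixed-effect, inverse-variance-weighted meta-analysis with Cochran's Q for two studies. The text does not state which weighting scheme was used in METAL, so inverse-variance weighting is an assumption here; the function names and input values are hypothetical.

```python
import math


def fixed_effect_meta(effects, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    Returns (pooled_effect, pooled_se, two_sided_p, cochran_q).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return pooled, pooled_se, p, q


def het_p_two_studies(q):
    """Heterogeneity P-value for two studies (chi-square with 1 degree of
    freedom), using the identity P(X > q) = erfc(sqrt(q / 2))."""
    return math.erfc(math.sqrt(q / 2.0))


if __name__ == "__main__":
    # Made-up per-cohort estimates for a single SNP.
    pooled, se, p, q = fixed_effect_meta([0.80, 0.65], [0.20, 0.15])
    print(f"effect = {pooled:.3f}, SE = {se:.3f}, P = {p:.2e}, "
          f"HetPVal = {het_p_two_studies(q):.3f}")
```

With only two cohorts, Q has a single degree of freedom, so the heterogeneity test has limited power; a non-significant HetPVal does not by itself establish that the cohort effects are equal.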

Functional annotation analyses

Expression quantitative trait loci (eQTL) analyses were performed using the Genotype-Tissue Expression (GTEx) dataset (https://gtexportal.org/home/), which provides expression data for a variety of human tissues from densely genotyped donors, allowing genetic variation within their genomes to be assessed. Genes related to metabolites were analysed using KEGG pathway analysis31. Associated genes were further investigated for DEGs in the skeletal muscles of subjects 19 to 28 and 65 to 76 years of age from the Gene Expression Omnibus (GEO) dataset (GSE38718)53. In addition, biological process, cellular component, and molecular function GO analyses were performed using gene set enrichment analysis. A Benjamini–Hochberg false discovery rate (FDR)-adjusted significance level of 0.1 was applied to correct for multiple hypothesis testing54.
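The Benjamini–Hochberg adjustment mentioned above can be sketched as follows. This is a generic implementation of the step-up procedure, not the code used in the study; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.5]  # made-up P-values for four GO terms
    adjusted = benjamini_hochberg(raw)
    # Terms with an adjusted P-value below 0.1 would be reported as enriched.
    print([round(a, 4) for a in adjusted])
    print([a < 0.1 for a in adjusted])
```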

Ethics declarations

The institutional review boards of the Veterans Health Service Medical Center approved this study protocol and informed consent waiver (IRB No. 2020-02-015 for the VHSMC cohort and IRB No. 2021-05-005 for the KARE cohort) since this study was performed in a retrospective manner, and the study was conducted in compliance with the Helsinki Declaration.

Consent to participate

An informed consent waiver was approved by the institutional review boards of the Veterans Health Service Medical Center since this study was performed in a retrospective manner and anonymised and de-identified data were used for the analyses. The KARE cohort and VHSMC cohort obtained informed consent from participants.