Abstract
Length and quality of life are important to us all, yet identification of promising drug targets for human aging using genetics has had limited success. In the present study, we combine six European-ancestry genome-wide association studies of human aging traits—healthspan, father and mother lifespan, exceptional longevity, frailty index and self-rated health—in a principal component framework that maximizes their shared genetic architecture. The first principal component (aging-GIP1) captures both length of life and indices of mental and physical wellbeing. We identify 27 genomic regions associated with aging-GIP1, and provide additional, independent evidence for an effect on human aging for loci near HTT and MAML3 using a study of Finnish and Japanese survival. Using proteome-wide, two-sample, Mendelian randomization and colocalization, we provide robust evidence for a detrimental effect of blood levels of apolipoprotein(a) and vascular cell adhesion molecule 1 on aging-GIP1. Together, our results demonstrate that combining multiple aging traits using genetic principal components enhances the power to detect biological targets for human aging.
Main
Aging affects us all, from the personal, progressive loss of health to the collective burden of chronic age-related disease and frailty on society. In humans, the body undergoes a systemic functional decline after reaching adulthood, which manifests itself as age-related disease, infirmity and eventually death1. The factors determining the rate of aging and death are complex and interlinked, and include genetics, lifestyle, environmental exposures and chance.
Quantifying the aging process is not straightforward. A variety of aging-related phenotypes has been studied as proxies, from chronological measurements such as the length of time from birth until the occurrence of a major disease (healthspan)2 or death (lifespan)3,4, to cellular deterioration measurements such as telomere attrition5 and loss of the Y chromosome6,7, to holistic measurements such as the frailty index8, encompassing multiple functional impairment indicators9. Although the genetic component of these aging-related traits tends to be estimated at <15%2,8,10, recent progress has been made on characterizing this component using large genome-wide association studies (GWASs), and combining similar GWASs to increase statistical power11,12.
The benefit of combining GWASs of several aging phenotypes, especially in different populations, is the ability to detect biological mechanisms that influence multiple core components of aging, while downweighing population- and trait-specific features. For example, a recent multivariate analysis of healthspan, parental lifespan and longevity GWASs found that genetic loci that were not shared between traits often associated with population-specific, behavioral risk factors such as smoking and skin cancer11. On the other hand, that analysis showed that genetic loci shared between traits were associated with biological pathways such as cellular homeostasis and heme metabolism11.
However, to date, large aging-related trait GWASs have been combined only using multivariate analysis of variance, which detects a genetic variation that is either shared between multiple traits or strongly associated with a single trait13,14. This mixture generates heterogeneous SNP effect sizes, complicating the downstream analysis11. An alternative is to perform principal component analysis on the genetic covariance between traits and use the component loadings to construct new, genetically independent phenotypes (GIPs)15. As their name implies, GIPs capture the genetic covariance between phenotypes while being genetically uncorrelated to each other. In practice; this means that the first principal component (GIP1) maximizes the genetic overlap between all traits, whereas each subsequent GIP contains genetic variation distinguishing the traits from each other15.
In the present study, we cluster 11 large aging-related trait GWASs by genetic similarity, and explicitly quantify their common and unique genetic architecture using the GIP methodology. We then characterize this common genetic aging phenotype, identify robust genomic loci and highlight proteins that may be potential drug targets for improving the length and quality of life.
Results
Aging-related traits cluster based on genetic correlations
We gathered publicly available GWAS summary statistics on aging-related traits measured in at least 10,000 European-ancestry individuals. These included self-rated health16, healthspan2, father lifespan4, mother lifespan4, exceptional longevity17, frailty index8, perceived age18, Hannum epigenetic age acceleration19, Horvath epigenetic age acceleration19, telomere length5 and mosaic loss of the Y chromosome7 (Supplementary Table 1). A variety of UK and European individuals is represented between these studies, from children (aged 10+) to centenarians (aged 100+), with birth years spanning the twentieth and early twenty-first century. The largest sample consists of UK Biobank participants and their parents (see Supplementary Note for details of each GWAS, and Extended Data Fig. 1 for correlations between studies due to sample overlap).
Calculating genetic correlations (rg) from summary statistics, and performing hierarchical clustering based on the magnitude of these correlations, we find that the first six traits—mostly chronological and holistic measures of aging—form a cluster of high genetic similarity (|rg| ≥ 0.5; P < 5× 10−15). In contrast, the cellular measures of aging show low or no correlations with other traits (|rg| ≤ 0.3), although the epigenetic age acceleration phenotypes correlate strongly with each other (rg = 0.5; 95% confidence interval (CI) = 0.2–0.8; Fig. 1 and Supplementary Data 1).
Diagonal values show the observed-scale SNP heritability of each aging-related phenotype and off-diagonal circles show the genetic correlation among traits, with both values calculated using HDL21. Blank squares did not pass multiple testing corrections (FDR > 5%). The bottom dendrogram shows the hierarchical relationship between traits, based on the magnitude of their genetic correlations. The black box highlights the first cluster of aging-related phenotypes used in follow-up analyses. EAA, epigenetic age acceleration; mLOY, mosaic loss of Y chromosome.
Testing the 6 aging-related traits in the main cluster for correlations with 728 other traits from the GWAS-MAP platform20, we find that their genetic similarity may be largely explained through strong, shared genetic correlations (meta |rg| ≥ 0.5, false discovery rate (FDR) < 0.05, Phet > 0.05, I2 < 0.50) with chest pain, cardiovascular disorders, smoking-related disease, type 2 diabetes, and general illness or medication use in UK Biobank (Supplementary Data 2). However, each core aging trait also has several genetic correlations that differ substantially (Methods) from the other aging-related phenotypes and may reflect population- or trait-specific risk factors. For example, self-rated health correlates more strongly with physical fitness, body mass index and a noisy work environment, healthspan correlates more strongly with skin and breast cancers, father lifespan correlates more strongly with hypertension, mother lifespan uniquely correlates with lower childhood height, longevity uniquely lacks a negative correlation with chronic knee pain, and frailty correlates more strongly with hearing aid usage, daytime napping and allergic disease (eczema/dermatitis and hayfever/rhinitis) (Supplementary Data 2).
Aging-GIPs capture distinct elements of wellbeing
We combined the six GWASs in the main correlation cluster using the loadings from the principal components of the genetic correlation matrix (Fig. 2), yielding association summary statistics for six aging-GIPs (available at https://doi.org/10.7488/ds/2972). As expected, high-definition likelihood (HDL)21 estimated aging-GIP1 to be the most heritable of the aging-GIPs (h2SNP = 0.20; standard error (s.e.) = 0.005), capturing >70% of the genetics of healthspan, parental lifespan and longevity, and 90% of the genetics of self-rated health and the inverse of frailty (henceforth referred to as ‘resilience’) (Fig. 3). A leave-one-out analysis confirms that the GIP analysis is highly robust to the selection of GWAS, with the genetic architecture of aging-GIP1 remaining largely the same, after excluding any one of the six chronological and holistic aging-related trait GWASs (estimates range from rg = 0.950 (s.e. = 0.024) when excluding resilience, to rg = 0.996 (0.023) when excluding healthspan; Supplementary Table 2).
Resilience is the inverse of the frailty index. The error bars represent 95% CIs of the loading estimate, inferred using 1,000 rounds of Monte Carlo simulations (see Supplementary Note for details). GIP, genetically independent phenotype; PCA, principal component analysis.
a, Table showing the estimated sample size (Nest) and SNP heritability (h2SNP) of each aging-GIP GWAS, with s.e. in parentheses. b, Genetic correlations between aging-GIP GWASs and the six aging-related trait GWASs used to construct them. Blank values failed to pass nominal significance (P ≥ 0.05). c, GWASs of 402 phenotypes measured in European-ancestry individuals from the GWAS-MAP platform with strong and significant associations with GIP1 (P < 1.5 × 10−5 and |rg| > 0.25). These 402 traits are summarized here in 25 groups, based on hierarchical clustering of the magnitude of pairwise genetic correlations. Each group is manually annotated with a label describing the traits within the cluster, and displays the values of the most informative trait (that is, the trait with the highest total z-score across aging-GIPs). Values failing to pass nominal significance (P ≥ 0.05) are grayed out. The dendrogram displayed here shows the hierarchical relationship between the most informative trait in each cluster. HDL, high-density lipoprotein; VLDL, very-low-density lipoprotein. See Supplementary Data 3 for the full list of correlations. Genetic correlation P values are derived from two-sided Wald’s test with one degree of freedom.
Apart from the genetic correlations with its component traits, aging-GIP1 shows significant correlations, with 459 of 729 traits tested (Padj < 0.05, Bonferroni-adjusted for 6 GIPs and 729 traits). These suggest that, in addition to length of life, aging-GIP1 also captures mental and physical wellbeing. For example, among the largest negative aging-GIP1 correlations are mental illness, taking medications and diseases of old age (such as cardiometabolic disorders, cancers and osteoarthritis; rg ≤ –0.5; Padj < 1 × 10−9). Inversely, fitness and education are among the largest positive correlations (rg ≥ 0.5; Padj < 1 × 10−10). Aging-GIP1 also shows moderate negative correlations with infectious diseases, including N39 (International Classification of Diseases, 10th revision22) urinary tract infections (rg = –0.66; 95% CI = –0.42 to –0.90; Padj = 3 × 10−4), coughing on most days (rg = –0.39; 95% CI = –0.30 to –0.49; Padj = 1 × 10−11) and severe COVID-19 hospitalization (that is, resulting in respiratory support or death; rg = –0.33; 95% CI = –0.20 to –0.46; Padj = 0.004).
However, aging-GIP1 also retains some strong correlations with socioeconomic factors, such as smoking behaviors (for example, current tobacco smoking: rg = –0.50; 95% CI = –0.45 to –0.55; Padj = 4 × 10−92) and having a job involving manual or physical work (rg = –0.49; 95% CI = –0.44 to –0.54; Padj = 2 × 10−89; Fig. 3 and Supplementary Data 3). Although such socioeconomic risk factors are partly heritable23, they could also reflect social stratification or geographic confounding shared between the original six studies used to construct aging-GIP1 (refs. 24,25). Therefore, as a sensitivity analysis, we used the GIP methodology to remove all genetic correlations with UK Biobank GWASs of household income and socioeconomic deprivation from the aging-GIPs—recognizing that this may remove some of the true signal as well (Extended Data Fig. 2). The SNP heritability of the adjusted aging-GIP1 phenotype is substantially attenuated (h2SNP = 0.09; s.e. = 0.003) and, consequently, genetic correlations with all phenotypes are reduced. However, most traits that were significantly associated at 5% FDR remain associated with the residual phenotype (Extended Data Fig. 3). As expected, the largest reductions in genetic correlations are with socioeconomic, educational and smoking/drinking traits, which are reduced by ≥0.30 (Supplementary Data 3).
The remaining aging-GIPs have much lower heritability (h2SNP < 0.065) and are of less interest to the present study because they capture genetic variance shared between fewer traits and do so with higher degrees of uncertainty. In short, aging-GIP2 correlates mainly with nonlethal causes of poor self-rated health, such as chronic pain (neck, back, stomach), negative emotions and hearing problems (59 traits; Padj < 0.05); aging-GIP3 correlates mainly with measures of socioeconomic deprivation and body composition (73 traits); and aging-GIP4 captures healthspan-specific correlations with cancer and diabetes (7 traits). Although aging-GIP5 and aging-GIP6 are underpowered due to their low heritabilities, they appear to distinguish parental lifespan from longevity (possibly through educational attainment), and between father lifespan and mother lifespan (possibly through risk taking and cardiovascular factors), respectively (Fig. 3).
Characterizing the genomics of aging-GIP1
Across the genome, 27 loci in the aging-GIP1 GWAS pass the genome-wide significance threshold, Bonferroni’s adjusted for 6 GIPs (P < 8.3 × 10−9). The strongest lead SNPs in these loci are rs429358 (P = 2 × 10−40) and rs660895 (P = 2 × 10−22), located nearest to the APOE and HLA-DRB1/DQA1 genes, respectively (Fig. 4 and Table 1). Genetic fine-mapping with SuSiE26 highlights three independent 95% credible sets of causal variants for the APOE locus: the APOE ε4 and ε2 alleles at single variant resolution, and a set of three variants intronic or near to APOC1. Three other loci have a credible set containing only one variant. These variants are rs9271300 (intergenic between HLA-DRB1 and HLA-DQA1), rs34811474 (a nonsynonymous ANAPC4 exon variant) and rs2165702 (an intergenic variant near PHB; Supplementary Table 3 and Supplementary Data 4).
a, Manhattan plot showing associations of SNPs across the genome with aging-GIP1, with the y axis showing the strength of the association and the x axis the genomic position of the SNP. The line represents the genome-wide significance threshold, Bonferroni’s adjusted for 6 GIPs (P = 8.3 × 10–9). The lead SNP of each independent locus is marked with a cross and annotated with the nearest (upstream) gene, or the cytogenetic band if there are no genes within 250 kb. The y axis has been capped at P = 1 × 10−20. The following loci exceed this cap and are represented as triangles: APOE (P = 1.5 × 10−40) and HLA-DRB1 (P = 1.4 × 10−22). b, Significance of the association of each lead SNP with the aging-related trait GWASs used to construct aging-GIP1. Bonf-GWS, Bonferroni’s adjusted genome-wide significance (P < 8.3 × 10−9); GWS, genome-wide significance (P < 5 × 10−8); Bonferroni’s, P < 0.05 adjusted for 6 traits and 27 SNPs (P < 3 × 10−4); nominal, P < 0.05; NS, not significant. P values are derived from two-sided Wald’s test with one degree of freedom.
Most lead SNPs are strongly associated with self-rated health and resilience, in line with the large loadings of these traits in the construction of aging-GIP1 (Fig. 4 and Supplementary Table 4). Seventeen aging-GIP1 loci overlap with genome-wide significant (P < 5 × 10−8) loci from the original aging-related trait GWAS, showing evidence of a shared causal variant (coloc posterior probability (PP) ≥ 80%; Supplementary Table 5). For the other loci, the GIP framework either increases power (n = 4) or indicates that there may be a different causal variant (coloc PP < 80%; n = 6; Supplementary Data 5).
Loci near APOE, HLA-DRB1/DQA1, LPA and CDKN2B/-AS1 have previously been validated using the same trait in an external cohort2,3,4. For the remaining 23 loci, we measured lead SNP effects on participant survival in FinnGen (release 5; n = 203,244; 6.94% deceased) and BioBank Japan (n = 135,983; 24.1% deceased), to provide additional evidence of their association with human aging traits in independent samples. Combining both cohorts to achieve adequate power, we find that aging-GIP1-increasing alleles of lead SNPs near HTT and MAML3 have a protective effect on survival in these cohorts (one-sided P < 0.0022, that is, taking into account the 23 loci tested), increasing average lifespan by around 2.53 (95% CI = 0.91–4.15; one-sided P = 0.001) and 2.51 (95% CI = 0.79–4.23; one-sided P = 0.002) months per allele, respectively. Although we are underpowered to confirm the remaining 21 loci individually, we find that, collectively, their aging-GIP1-increasing alleles are also associated with increased Finnish and Japanese survival (one-sided P = 0.044; Supplementary Table 6).
Again, we find aging-GIP1 genetics are stable when performing a leave-one-out analysis: lead SNP effects of most aging-GIP1 loci do not change significantly when excluding one of the six aging traits. The exceptions are for previously replicated loci and SLC22A1/A2, where exclusion of one of the traits can reduce the aging-GIP1 effect size, although loci near HLA-DRB1/DQA1, LPA and CDKN2B/-AS1 remain genome-wide significant in all leave-one-out scenarios (Supplementary Data 6). Similarly, when completely removing genetic correlations with household income and socioeconomic deprivation, we find that aging-GIP1 effect sizes are attenuated but, in all cases, remain associated with the residual trait (P < 0.0019, that is, taking into account the 27 loci tested; Extended Data Fig. 4 and Supplementary Data 7).
We further looked up all lead aging-GIP1 SNPs and close proxies (r2EUR ≥ 0.8) in PhenoScanner27 and the GWAS catalog28, excluding associations discovered solely in UK Biobank, which showed that 24 of 27 aging-GIP1 loci had previously been associated with one or more traits at genome-wide significance. Most of these loci were associated with cardiometabolic, immune-related or neuropsychiatric disorders, although several were also associated with measures of educational attainment and household income. Of specific interest are loci near APOE, HLA-DRB1/DQA1, CDKN2B/-AS1 and ZNF652/PHB, which show aging-GIP1-increasing alleles are associated with a reduction in multiple diseases, but do not appear to associate with socioeconomic factors, suggesting that these loci largely capture intrinsic sources of aging (Supplementary Data 8).
Aggregating SNP association statistics across the genome into gene scores using the Pathway Scoring Algorithm (PASCAL)29, we find that high-scoring genes for aging-GIP1 (Supplementary Data 9) appear to be overrepresented in the heme metabolism Hallmark gene set, as well as 300 gene ontology (GO) pathways (FDR = 5%), regardless of adjustment for socioeconomic correlations. These GO pathways cluster into 20 groups, related to: neuronal development, organization and function; transcriptional regulation; chemical homeostasis; cellular growth, differentiation and apoptosis; proteolysis; protein phosphorylation; intracellular signaling and transport; immune system development; the muscle system; and lipoprotein metabolism (Supplementary Data 10). Similarly, aging-GIP1 heritability appears to be enriched in genomic regions containing histone marks associated with the central nervous system (Supplementary Table 7).
Causal inference of blood protein levels on aging-GIP1
Mendelian randomization (MR) uses genetic variation as instrumental variables to estimate the casual effect of exposures of interest on an outcome30. We used a set of well-validated blood protein quantitative trait loci (pQTLs) for 857 proteins30 as genetic instruments in a two-sample MR31 and colocalization32 framework to infer putative causal links between protein levels and aging-GIP1 (Supplementary Data 11). We find robust evidence for a detrimental effect on aging-GIP1 (FDR < 5%) for the levels of four proteins in blood (Table 2), with pQTL instruments passing sensitivity and causal directionality tests, and additionally colocalizing with the aging-GIP1 signal (Methods). Three of these proteins—apolipoprotein(a) (LPA), olfactomedin-1 (OLFM1) and low-density lipoprotein (LDL) receptor-related protein 12 (LRP12)—were instrumented by a cis-pQTL and encoded by genes that appeared significantly enriched in the gene score analysis. The remaining protein, vascular cell adhesion molecule 1 (VCAM1), was instrumented by a trans-pQTL shared with β2-microglobulin (β2M); however, only VCAM1 protein levels colocalized with the signal at this locus (Supplementary Data 11). When performing the same MR analysis on aging-GIP1 adjusted for household income and socioeconomic deprivation, protein effects remained significant (Extended Data Fig. 5).
Among the discovered proteins, LPA shows the most significant effect on aging-GIP1 (PMR = 2 × 10−8), with an increase of 1 s.d. in genetically predicted blood protein levels, causing a decrease of 0.035 (95% CI = 0.025–0.045) s.d. in aging-GIP1. This significance appears to be driven by a consistent detrimental effect across all six aging-GIP1 component traits (βMR range 0.013–0.035; all nominal P < 0.05; Fig. 5). For a sense of scale, when performing the same MR analysis on the unstandardized parental lifespan GWAS, this equates to a loss of approximately 7 months of life (95% CI = 5–9 months) per s.d. increase in LPA blood levels (Supplementary Table 8). The effects of the remaining proteins on aging-GIP1 are larger in magnitude but appear unequally distributed across aging-GIP1 component traits. We estimate that a decrease of 1 s.d. in genetically predicted VCAM1 blood protein levels is associated with an increase of approximately 18 months of life (13–23 months); however, comparing VCAM1 effects on standardized aging-GIP1 components, we find its effect to be largest on mid-to-late-life aging traits (lifespan, longevity, resilience) compared with early life aging traits (healthspan). Similarly, the effects of OLFM1 and LRP12 proteins appear mostly mediated through late-life aging traits (resilience and/or longevity; Fig. 5).
Exposures that showed causal evidence of an effect on aging-GIP1 were tested for association with each of the aging-related traits used to construct aging-GIP1. The x axis shows the MR effect estimate, with lines representing 95% CIs, as estimated by TwoSampleMR3. The sample sizes of exposure GWASs were 6,861 for LPA, 3,301 for VCAM1, and 3,200 for OLFM1 and LRP12.
Discussion
We combined European-ancestry GWASs of healthspan, father lifespan, mother lifespan, longevity, frailty and self-rated health using a framework that maximized power to detect associations with their shared genetic component while downweighing trait-specific genetic associations. The resulting aging-GIP1 trait captured the genetics underlying physical and mental wellbeing, and showed strong inverse genetic correlations with cardiovascular, inflammatory and neuropsychiatric disease traits. We highlight 27 loci with genome-wide significant effects on aging-GIP1, including two new loci near HTT and MAML3 which showed directionally consistent evidence of an effect on survival in two additional, independent samples. Across the genome, we found genes enriched for association with aging-GIP1 to be overrepresented in heme metabolism and pathways related to (among others) neurogenesis, homeostasis, proteolysis, immunity and the muscle system in human aging. Last, we performed MR of predicted blood protein levels on aging-GIP1, which revealed that the levels of LPA, VCAM1, OLFM1 and LRP12 proteins may be detrimental to multiple indices of healthy aging.
LPA is a glycoprotein making up the main component of large lipoprotein(a) particles. It is a well-known risk factor for atherosclerotic disease33,34 and a target of ongoing clinical trials with regard to cardiovascular outcomes35. A recent MR study used 27 genetic instruments for LPA and found a link between genetically elevated LPA levels and reduced healthspan and parental lifespan36, in line with our results. Our analysis suggested that the detrimental effect of LPA may apply to aging more generally, and stringent colocalization and reverse MR tests provided additional evidence for causality. It is interesting that human endothelial cell culture experiments show that the addition of LPA increases cell-surface expression of VCAM1 (ref. 37) and can increase endothelial cell contraction and permeability38.
VCAM1 is a cell adhesion glycoprotein localized predominantly on endothelial cell surfaces and its expression is upregulated in response to inflammatory signals, which mediate adhesion and transduction of leukocytes across endothelial walls39. The link that we established between VCAM1 and human aging relied on a trans-pQTL instrument shared with β2M and is therefore more susceptible to horizontal pleiotropy, that is, the genetic variant may influence VCAM1 levels indirectly, and its effect on human aging traits could be caused by factors independent of VCAM1 levels. However, only VCAM1 colocalized with the pQTL signal, and experimental evidence from mouse studies suggests that the effect of VCAM1 on aging is likely to be causal. Specifically, VCAM1 levels in blood are known to increase with age in both humans and mice40, and treatment with anti-VCAM1 antibodies or an inducible deletion of Vcam1 improves cognitive performance of aged mice40. Of note, similar results have been found for β2M abundance and mouse knockouts41 and, as such, identification of robust genetic instruments for β2M levels is also warranted.
We were unable to robustly assess colocalization and reverse causality for the LRP12 and OLFM1 signals because genome-wide association summary statistics were not available, so the effects of these proteins on human aging should be interpreted with additional caution. LRP12 belongs to the LDL-receptor superfamily and may play a role in brain development42 and both tumor proliferation and suppression, depending on the tissue43,44. Similarly, OLFM1 is a glycoprotein involved in neuronal development and maintenance45, which has also been shown to suppress colorectal tumor metastasis46. In mouse models, Olfm1 knockouts showed reduced cerebral infarction and fertility47. However, it is unclear whether reduction of LRP12 or OLFM1 blood levels will have a beneficial effect in humans.
Importantly, our MR analysis was restricted to blood pQTLs, precluding detection of causal effects of protein levels in other tissues. Although it is likely that there are proteins with tissue-specific effects on aging, particularly in the brain, the samples needed for detection of such pQTLs are less readily available and therefore more difficult to study at scale (although progress is being made48). Still, blood may be a particularly suitable tissue to identify aging-related proteins. Connecting the circulatory systems of two mice of different ages has been shown to accelerate signs of aging in the brain, muscle and liver of the young mouse, and reverse similar signs in the old mouse49,50. Likewise, the detrimental cognitive effect of injecting old blood in mice is counteracted when anti-VCAM1 antibodies are concomitantly injected40. As such, the blood currently remains one of the most promising tissues to detect aging-related proteins.
Heme metabolism and iron levels were previously hypothesized to play a role in human aging11, and in the present study we identify the same pathway using new methods and additional data. It is of interest that both the heme pathway and the proteins we uncovered using MR are strongly linked to vascular and endothelial damage51. Across the genome, we also found an enrichment for brain tissues and pathways related to neuronal integrity. Given that endothelial cells are central to both the cardiovascular system and the blood–brain barrier52,53, and that endothelial function declines with age54, progressive endothelial dysfunction may manifest itself as an age-related disease. Indeed, recent findings suggest that the detrimental effects of the APOE ε4 allele—the largest genetic determinant of human aging—are mediated by an accelerated breakdown of the blood–brain barrier, independent of amyloid-β and tau accumulation55. We therefore speculate that molecules involved in maintaining or repairing endothelial integrity may be key to avoiding both age-related cardiovascular injury and neurodegeneration, and recommend further research into this area.
Our study demonstrates that GIP analysis of genetically correlated GWASs can increase power to detect shared genetic architecture and can identify genetic loci that would have been missed by any individual GWAS. For example, regions near AFF3, MAML3, USP28/HTR3B and MIR6074 reached genome-wide significance only in the combined analysis, and validation of MAML3 in an external survival cohort suggests that these findings may hold across populations and aging measurements. In addition, USP28 within the USP28/HTR3B locus resembles CDKN2A within the well-known CDKN2B/-AS1 locus, because both have been implicated in cellular proliferation56,57 and senescence58,59. In fact, short hairpin RNA knockdown of USP28 in vitro results in decreased CDKN2A expression58. However, additional validation and functional investigation of aging-GIP1 loci are warranted.
A secondary advantage of the GIP method is the high stability of the resultant aging-GIP1 GWAS, which appears to be largely robust to the selection of component traits. Nevertheless, we note that our analysis focused on chronological and disease measures of aging and did not include the cellular measures of aging due to their modest genetic correlations and lack of power. Whether aging-GIP1 loci and pathways accurately reflect the aging process overall, or only capture specific domains of aging, is unknown. Inclusion of future aging-related GWASs performed on Biobank-scale samples should provide insight into how well aging-GIP1 captures human aging.
An important limitation of the aging-GIP1 GWASs, and GWASs of aging-related traits more generally, is the potential confounding of aging genetics with social stratification. Socioeconomic factors exhibiting geographical clustering can induce gene–environment correlations that could inflate trait heritability and may bias results, especially for UK Biobank studies25. In our study, we found that genetic correlations of the six aging-related GWASs with socioeconomic deprivation were moderately high (up to 61% for self-rated health). This confounding can be inadvertently propagated and even amplified in the shared genetic component captured by aging-GIP1. Indeed, when explicitly removing genetic correlations with two UK Biobank GWASs of socioeconomic factors, aging-GIP1 heritability is halved. After adjustment, genetic correlations with mental and physical health traits are also somewhat reduced, although the overall patterns remain stable. Similarly, effect sizes of blood protein levels and aging-GIP1 loci—including loci that have been confidently linked to aging-related traits previously—were only slightly reduced after adjustment.
The shift of these effect sizes after adjustment was almost equal for all SNPs and MR signals. Together with the highly decreased heritability and stable genetic correlations, this suggests that the adjustment removed the unspecific polygenic background induced by UK Biobank microstructure and social confounding. We therefore consider this procedure to be a suitable approach for confounder correction, although it may be somewhat conservative for also removing the genetic background of the adjusting traits, which may genuinely be associated with aging. An alternative solution to the confounding problem could be the inclusion of a more ancestrally diverse set of aging-related GWASs from a larger variety of populations.
Despite these limitations, modeling the shared genetic component of human aging proxies has allowed us to downweigh nonbiological features, and propose pathways and proteins that may causally influence the human aging process. We share the full aging-GIP1 summary statistics without restrictions60 to encourage further MR analysis using other biomarkers, and accelerate the discovery of drug targets able to prolong mental and physical wellbeing throughout life.
Methods
Data sources
We searched PubMed and Google Scholar in April 2020 for GWASs of aging measures, including only studies for which we could obtain autosomal genome-wide summary statistics measured in at least 10,000 European-ancestry individuals. If multiple studies were performed on similar traits, we kept the study with the largest sample size. GWASs meeting inclusion criteria included extreme longevity17 (survival past 90th percentile), father and mother lifespan4, healthspan61, self-reported health16, frailty index8, epigenetic age acceleration19, telomere length5, mosaic loss of the Y chromosome7 and perceived age18. As the self-reported health GWAS used the first release of UK Biobank data (n = ~150,000), we looked up the same phenotype in the Neale Lab GWAS collection (n = ~500,000), which had a larger sample size but was otherwise measured identically62. The original derivation of each set of summary statistics is briefly described in the Supplementary Note.
For each set of summary statistics, we discarded poorly imputed (INFO < 40%), rare (minor allele frequency (MAF) < 0.5%) or poorly measured SNPs (n individuals <1% of total). The remaining SNPs were aligned to genome build GRCh37 and harmonized to match UK Biobank SNP IDs (discarding any duplicates). We then estimated the phenotypic variance of the trait (residuals) from a selection of independent SNPs provided by the MultiABEL R package13 using equation (1) from Winkler et al.63
where \({\mathrm{Var}}(Y)\) is the phenotypic variance, p and q are the major and minor allele frequencies, n is the total sample size (cases + controls, if applicable) and \({\mathrm{Var}}\left( {\widehat {\beta}} \right)\) is the variance of the effect size estimate. If n differed by SNP, we calculated the phenotypic variance separately for quintiles of n and took the mean estimate. SNP statistics were then standardized by dividing effect sizes and s.e. by \(\sqrt {{\mathrm{Var}}(Y)}\).
Estimation of genetic and nongenetic correlations
The HDL R package21 v.1.3.4 was used to calculate SNP heritabilities of aging-related GWASs and their genetic correlations with each other, using the default European-ancestry linkage disequilibrium (LD) reference panel and non-major histocompatibility complex (MHC) SNPs with MAF ≥ 0.01. All GWASs had ≥99.9% of the SNPs in this reference panel, except for the Hannum and Horvath epigenetic age acceleration GWASs, which had 96.29%. Pearson’s correlations between GWASs due to phenotypic similarity and sample overlap were calculated from z-scores of independent SNPs provided by the MultiABEL R package13 which were nonsignificant in both studies (|z| < 1.96). The number of SNPs used for this calculation is reported in Supplementary Data 1.
LD-score (LDSC) regression64 v.1.0.0 was used to calculate genetic correlations between selected GWAS summary statistics from the GWAS-MAP platform20 and the aging-related GWAS and aging-GIP summary statistics used in our study, where the longer computation time of the HDL software would have been impractical. The GWAS-MAP platform contains summary statistics for 1,329,912 complex traits and gene expression levels, and 3,642 binary traits, derived from UK Biobank62,65 and large European-ancestry consortia (for example, MAGIC, CARDIoGRAM, SSGAC and GIANT; for an up-to-date list of phenotypes, see https://phelige.com)66. We included GWASs with ≥1 million SNPs, measured in ≥10,000 individuals (if continuous) or ≥2,000 cases and controls (if binary). We further excluded the healthspan and self-rated health GWASs from the GWAS-MAP platform to avoid duplication, after which 728 traits remained. Out of specific interest67, we also calculated the genetic correlation between aging-GIPs and a case–control GWAS of COVID-19 hospitalization, with cases defined as laboratory-confirmed COVID-19 patients experiencing a severe outcome and controls defined as the rest of the population (A2_ALL excluding 23andMe, release 4)68. GWASs with counterintuitive effect directions were reversed (for example, satisfaction traits are coded from high to low in UK Biobank), and each set of GWAS summary statistics was filtered before analysis by excluding SNPs in the MHC, SNPs with MAF < 1% and, if measured, SNPs with INFO < 90%. A full list of 729 GWASs and references can be found in Supplementary Data 3. P values were adjusted for multiple testing using Bonferroni’s correction (729 traits and 6 GIPs).
The same software was used to calculate pairwise correlations between GWAS-MAP statistics, which allowed traits to be clustered based on the magnitude of their genetic similarity. For computational tractability, we included only GWASs that showed large and significant effects on aging-GIP1 (|rg| > 0.25; Padj < 0.05). Clusters were identified hierarchically by maximizing the Bayesian information criterion using the mclust R package69 v.5.4.1, up to a maximum of 100 clusters.
Identification of shared and unique genetic correlations
For each selected GWAS-MAP phenotype, genetic correlations with aging-related traits were meta-analyzed using fixed-effect, inverse-variance weighting, with heterogeneity quantified by Cochran’s Q and I2 statistics, implemented in the meta R package70 v.4.15-1. GWAS-MAP phenotypes that were significantly correlated with all six aging-related GWAS at FDR of 5% and lacking substantial heterogeneity (Phet > 0.05 and I2 < 50%) were considered to be shared. For GWAS-MAP phenotypes with evidence of heterogeneity, we performed a leave-one-out sensitivity analysis to assess which aging-related trait(s) contributed most to this heterogeneity. If heterogeneity could be completely removed by excluding a single GWAS (I2 = 0%), and exclusion of any other GWAS did not substantially reduce heterogeneity (I2 ≥ 50%), the aging-related trait outlier was considered to have a unique genetic correlation with the GWAS-MAP phenotype.
Genetically independent phenotype analysis of aging GWAS
Principal component loadings for six aging-GIPs were estimated from the genetic covariance matrix of the six chronological and holistic aging GWASs, analogous to a principal component analysis of phenotypic correlations. Specifically, eigenvectors from the genetic covariance matrix were transformed into loadings by dividing them by the square root of the phenotypic variance of the GIP. This phenotypic GIP variance was calculated as follows:
where ai is the eigenvector of the ith aging-GIP and Σph is the phenotypic variance–covariance matrix of the core aging traits. As GWASs were standardized, Σph is equivalent to the phenotypic correlation matrix, with off-diagonal correlations estimated from the correlation observed between independent null z statistics (described above).
The six chronological and holistic aging GWAS statistics were then combined on an SNP-by-SNP basis using the principal component loadings to construct genome-wide summary statistics for the six aging-GIPs. GIP effect estimates were calculated by summing effect estimates from the individual aging-related trait GWAS, each multiplied by their corresponding loading, and standardized using the expected GIP variance (equation (2)). The s.e.s of the GIP effect estimate were calculated by performing the equivalent calculation using variance arithmetic, also taking into account the phenotypic covariance between GWASs to adjust for sample overlap:
where s.e.(B) is the vector of SNP s.e.s of the core aging trait effect estimates. Effective sample sizes were then estimated based on the median z statistic and allele frequencies, that is, solving equation (1) for n. Full technical details of the GIP method are described in Supplementary Note.
Aging-GIP summary statistics were calculated for the 7,324,133 SNPs shared between quality-controlled GWAS statistics, of which 5,353,660 were common (MAF ≥ 5%) and 1,970,474 were rare (MAF < 5%). Finally, s.e.s of each aging-GIP GWAS were adjusted to account for the LDSC regression intercept, which ranged from 0.99 (GIP1) to 1.03 (GIP4).
Adjusting for genetic correlations with socioeconomic factors
UK Biobank GWAS summary statistics for household income and Townsend’s deprivation index were downloaded from the Neale Lab GWAS collection62 and subjected to the same SNP filters used for aging-related trait quality control. Genetic correlations between GWASs of aging-GIPs and the two socioeconomic GWASs were calculated using the HDL R package21 v.1.3.4. As before, phenotypic correlations were estimated from the correlation between independent null z statistics.
For each aging-GIP, adjustment loadings were then calculated as:
where \(\beta\) refers to \(\beta = {\mathrm{Cov}}({{{\mathbf{X}}}})^{ - 1} \times {\mathrm{Cov}}({\mathrm{GIP}},{{{\mathbf{X}}}})\). Here, \({\mathrm{Cov}}({{{\mathbf{X}}}})\) is the genetic covariance matrix of UK Biobank GWAS of household income and Townsend’s deprivation index, and \({\mathrm{Cov}}\left( {{\mathrm{GIP}},{{{\mathbf{X}}}}} \right)\) is the genetic covariance matrix between the aging-GIP and the two GWAS.
These loadings were used as weights in the linear combination framework to calculate new aging-GIP summary statistics, which—by definition—were uncorrelated to the UK Biobank GWAS of household income and Townsend’s deprivation index. Last, s.e.s were adjusted to account for the LDSC regression intercept, which ranged from 0.99 (GIP3) to 1.02 (GIP1).
Characterization of genome-wide significant aging-GIP1 loci
Genome-wide significant loci were defined as 500-kb regions centered on a lead genome-wide significant SNP (P < 8.3 × 10−9) in linkage equilibrium (r2 < 0.1) with other lead locus SNPs. The LD between SNPs was calculated using a random 10,000 subset of unrelated, genomically British individuals from UK Biobank71. The susieR package26 v.0.11.8 was used to perform genetic fine-mapping of the association signals within each aging-GIP1 locus, allowing for up to ten causal variants. We report posterior inclusion probabilities for SNPs within 95% credible sets. Positional annotations of credible set SNPs were retrieved using ANNOVAR72 v.2019Oct24.
SNPs within each 500-kb locus were looked up in the six aging-related trait GWASs used to construct the aging-GIPs. Any region containing an SNP associated with an aging-related trait at genome-wide significance (P < 5 × 10−8) was tested for colocalization between the trait and the aging-GIP1 signal. For this, we used the coloc R package32 v.4.0-4 with its default parameters, denoting a posterior colocalization probability (PP) of 80% as evidence of a shared GWAS signal.
Association of loci with Finnish and Japanese survival
Loci were considered to be previously replicated if they had been associated at genome-wide significance with one of the core aging traits and also had evidence of an effect in an independent cohort on the same trait (P < 0.05). We attempted to find additional evidence for an effect on aging for the aging-GIP1 loci, which had not been previously replicated.
Effects were first looked up in a GWAS of survival of FinnGen study participants73. The FinnGen study associated SNPs across the genome with the survival of 218,396 Finnish-ancestry individuals (203,244 censored, 15,152 deceased) using Genetic Analysis of Time-to-Event phenotypes (GATE) v.0.40. SNP effects were log(hazard ratios), calculated from a mixed-effect frailty model that adjusted for sex, genotyping batch, birth year and the first ten genomic principal components as fixed effects, and cryptic relatedness using the genetic relatedness matrix as random effects.
Analogously, the same SNPs (if polymorphic) were regressed against the survival of 135,983, unrelated, Japanese-ancestry individuals (97,365 censored, 30,976 deceased) from BioBank Japan74,75. In this analysis, a fixed-effect Cox’s proportional hazards model was fitted using the survival R package v.2.41:
where h(t) is the hazard at time t, given that the subject is alive at time t and h0(t) is the baseline hazard at time t; X1, X2, …, Xn are the vectors of covariates with fixed effects β1, β2, …, βn; and γ is the effect of the vector of SNP dosages G. Covariates were sex, disease status and the first 20 principal components, where disease status refers to 1 of 47 common diseases in Japan used to recruit the individuals. Each SNP was tested in a separate model.
The SNP effects from both studies were converted from log(hazard ratios) to approximate years of life by inverting the sign and multiplying the effect estimate and s.e.s by ten3. For each SNP, a combined effect was calculated by meta-analyzing the cohort-specific effects in a fixed-effect framework (weighted using inverse variance), implemented in the meta R package70 v.4.15-1. One-sided P values were adjusted for multiple testing of 23 loci using Bonferroni’s correction. The collective effect of the 21 remaining loci was calculated using a random-effect framework, to allow for heterogeneity in effect size estimates.
Aging-GIP1 leave-one-out sensitivity analyses
Leave-one-out sensitivity analyses were performed for aging-GIP1, where, one at a time, a core aging trait was excluded and aging-GIP1 loadings and summary statistics were recalculated using the remaining five traits. Genetic and nongenetic correlations were calculated between the original aging-GIP1 and each leave-one-out GIP using HDL inference and null SNP z statistics, as described above.
To test for heterogeneity in genome-wide significant aging-GIP1 loci, we estimated the difference between the lead SNP effect in aging-GIP1 and the effect in the leave-one-out GIP1 GWAS, taking into account null correlations between traits. The s.e. of the difference in effects was calculated as follows:
where s.e. (β1) and s.e. (β2) are the s.e.s of the SNP for aging-GIP1 and the leave-one-out GIP1, respectively, and r is the nongenetic correlation between GWASs. Statistical significance of the difference was assessed using Wald’s test and adjusted for multiple testing of 27 loci using Bonferroni’s correction.
Lookup of known SNP associations
Lead SNP and close proxies (r2EUR ≥ 0.8) of the aging-GIP1 loci were looked up in PhenoScanner27 and the GWAS catalog28 (accessed 3 December 2020), keeping only the traits with genome-wide significance (P < 5 × 10−8). Triallelic SNPs and associations with treatments or medications were discarded, before converting associations with the lack of a phenotype into the phenotype itself by inverting the sign (for example, ‘Qualifications: none’ to ‘Qualifications’). We then further grouped the traits based on similarities in trait names, keeping the strongest association in the group. This grouping was done by partial matching of trait names—verified manually—and keeping the shortest name. For example, ‘Melanoma’, ‘Malignant melanoma’ and ‘Malignant melanoma of skin’ were grouped and renamed to ‘Melanoma’.
Tissue enrichment
Stratified LDSV regression v.1.0.0 was used to stratify aging-GIP1 SNPs into categories and test whether the proportion of SNP heritability in a category exceeded that expected from the proportion of SNPs in the category64. We kept only HapMap3 SNPs, excluding the MHC region and SNPs with MAF < 0.05, and used the 1000 Genomes Phase 3 LDSC reference as weights. Categories tested included the 10 groups summarizing 220 cell-type specific annotations from Finucane et al.64, adjusting for the baseline model (v.1.2).
Gene and pathway enrichment
PASCAL29 was used to aggregate aging-GIP1 SNP-level P values (with and without adjustment for socioeconomic correlations) into gene scores and test these scores for enrichment against predefined gene sets. Gene sets were Hallmark (C1) and Gene Ontology Biological Process (C5.BP) sets from v.7.2 of the Molecular Signatures Database76.
Aging-GIP1 summary statistics were first aligned to the 1000 Genomes SNP build (matching the PASCAL LD reference), before being tested with default PASCAL parameters, which includes discarding SNPs with MAF < 5% and SNPs in the MHC region. Gene results passing a 5% FDR threshold were considered to be significant. For each pathway in the C1 and C5.BP datasets, PASCAL calculated two measures of significance based on χ2 and permutation statistics. We separately adjusted C1 and C5.BP for multiple testing and considered a pathway with both χ2 and permutation statistics passing a 5% FDR threshold to be significant.
Significant C5.BP pathways (with and without adjustment for socioeconomic correlations) were clustered based on their Jaccard’s similarity coefficient (that is, the size of the intersection of genes divided by the size of the union of genes). The mclust R package69 v.5.4.1 was used to maximize the Bayesian information criterion of Jaccard’s similarity matrix to identify the optimal number of clusters (up to a maximum of 100). We used the number of clusters selected by most mclust models to group the pathways.
MR of blood protein levels
Genetic instruments for blood protein levels (pQTLs) were retrieved from Zheng et al.30. We included all tier 1 instruments: cis- and trans-pQTLs shown to influence ≤5 proteins (specificity) with no evidence of heterogeneity in effect sizes between multiple protein expression studies (consistency). A total of 857 proteins had nonpalindromic SNP instruments (898 total pQTLs) present in the aging-GIP1 summary statistics. Two-sample MR of blood protein levels as exposures and aging-GIP1 as outcome was performed using the TwoSampleMR R package31 v.0.5.5. If multiple pQTL instruments were available for a protein, heterogeneity and MR-Egger sensitivity tests were also performed. This analysis was repeated with the six standardized aging-GIP1 component traits as outcome, as well as the unstandardized, combined parental lifespan GWAS from Timmers et al.4 to provide an intuitive measure of the effect. The MR effects and s.e.s from the latter were multiplied by 10 to convert them from units of negative log(hazard ratio) to approximate years of life3.
Aging-GIP1 MR results passing a 5% FDR threshold and sensitivity tests (PHet > 0.05 and PEgger > 0.05, if applicable) were taken forward for follow-up colocalization tests to rule out LD linkage. For proteins for which we had access to genome-wide summary statistics (for example, LPA, β2M and VCAM1), we used the coloc R package32 v.4.0-4 to perform colocalization analysis using the default parameters and denoted PP ≥ 80% as evidence of a shared signal. For instruments without summary statistics, we performed an LD check as described in Zheng et al.30, which involved checking whether any of the 30 strongest aging-GIP1 SNPs in a 1-Mb region centered on each pQTL were in high LD (r2EUR ≥ 0.8) with that pQTL.
Finally, proteins passing sensitivity and colocalization tests were subjected to a reverse-causality test using MR-Steiger77 and, if genome-wide summary statistics were available, a bidirectional MR analysis, both implemented in TwoSampleMR. For the bidirectional MR, we used up to 27 genome-wide significant lead SNPs from aging-GIP1 (shared between GWASs and replacing missing or palindromic SNPs with the next most significant SNP) as instruments and the protein expression statistics as outcome. Proteins significant for the MR-Steiger test (P < 0.05) and showing no evidence of reverse causality in the bidirectional MR (P > 0.05), if applicable, were considered to have robust causal effects on aging-GIP1.
Ethical oversight
BioBank Japan participants provided written informed consent and survival study protocols were approved by BioBank Japan Project ethical review boards from the Institute of Medical Sciences, University of Tokyo and the RIKEN Center for Integrative Medical Sciences. All other data were publicly available and approved by ethical committees as described in their respective publications. See Supplementary Note for FinnGen ethical approval.
Statistics and reproducibility
The statistical methods used to analyze the data are described fully in Methods, with basic data processing done using R v.3.6.0 unless specified otherwise. We used the independent FinnGen and BioBank Japan cohorts to successfully replicate 2 of the 23 new aging-related loci. Predetermination of study sample size, randomization of experiments and blinding of investigators to experiments were not applicable for our type of study.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
Aging-related trait GWAS summary statistics can be retrieved or requested from the study authors at http://www.nealelab.is/uk-biobank (self-rated health; field 2178), https://www.longevitygenomics.org/downloads (longevity), and the following: https://doi.org/10.5281/zenodo.1302861 (healthspan), https://doi.org/10.6084/m9.figshare.9204998.v3 (frailty index), https://doi.org/10.5523/bris.21crwsnj4xwjm2g4qi8chathha (perceived age), https://doi.org/10.7488/ds/2463 (parental lifespan), https://doi.org/10.7488/ds/2631 (epigenetic age acceleration), https://doi.org/10.1016/j.ajhg.2020.02.006 (telomere length) and https://doi.org/10.1038/ng.3821(mLOY). Summary statistics for aging-GIP1 calculated in the present study have been deposited in the Edinburgh DataShare repository, available without restrictions at https://doi.org/10.7488/ds/2972. Summary statistics of the GWAS-MAP phenotypes used to calculate phenome-wide genetic correlations are available from GeneAtlas (http://geneatlas.roslin.ed.ac.uk), NealeLab (http://www.nealelab.is/uk-biobank) or their respective publications. The COVID-19 GWAS summary statistics have been made available by the COVID-19 Host Genetics Initiative at https://www.covid19hg.org/results. GWAS catalog and PhenoScanner associations can be found at https://www.ebi.ac.uk/gwas and http://www.phenoscanner.medschl.cam.ac.uk, respectively. Curated gene sets (Hallmark and GO) are available from the Molecular Signatures Database (https://www.gsea-msigdb.org). Source data are provided with this paper, or are available in the supplementary documents and upon request from the corresponding author.
Code availability
The following codes are available: HDL: https://github.com/zhenin/HDL; LDSC: https://github.com/bulik/ldsc; SuSiE: https://stephenslab.github.io/susieR; ANNOVAR: https://annovar.openbioinformatics.org; PASCAL: https://www2.unil.ch/cbg/index.php?title=Pascal; TwoSampleMR: https://mrcieu.github.io/TwoSampleMR. This code can be accessed without restrictions.
References
López-Otín, C., Blasco, M. A., Partridge, L., Serrano, M. & Kroemer, G. The hallmarks of aging. Cell 153, 1194–1217 (2013).
Zenin, A. et al. Identification of 12 genetic loci associated with human healthspan. Commun. Biol. 2, 41 (2019).
Joshi, P. K. et al. Genome-wide meta-analysis associates HLA-DQA1/DRB1 and LPA and lifestyle factors with human longevity. Nat. Commun. 8, 910 (2017).
Timmers, P. R. H. J. et al. Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances. eLife 8, e39856 (2019).
Li, C. et al. Genome-wide association analysis in humans links nucleotide metabolism to leukocyte telomere length. Am. J. Hum. Genet. 106, 389–404 (2020).
Forsberg, L. A. et al. Mosaic loss of chromosome Y in peripheral blood is associated with shorter survival and higher risk of cancer. Nat. Genet. 46, 624–628 (2014).
Wright, D. J. et al. Genetic variants associated with mosaic Y chromosome loss highlight cell cycle genes and overlap with cancer susceptibility. Nat. Genet. 49, 674–679 (2017).
Atkins, J. et al. A genome-wide association study of the frailty index highlights synaptic pathways in aging. Aging Cell 20, e13459 (2019).
Mitnitski, A. B., Mogilner, A. J. & Rockwood, K. Accumulation of deficits as a proxy measure of aging. Sci. World J. 1, 323–336 (2001).
Ruby, J. G. et al. Estimates of the heritability of human longevity are substantially inflated due to assortative mating. Genetics 210, 1109–1124 (2018).
Timmers, P. R. H. J., Wilson, J. F., Joshi, P. K. & Deelen, J. Multivariate genomic scan implicates novel loci and haem metabolism in human ageing. Nat. Commun. 11, 3570 (2020).
Melzer, D., Pilling, L. C. & Ferrucci, L. The genetics of human ageing. Nat. Rev. Genet. 21, 88–101 (2020).
Shen, X. et al. Multivariate discovery and replication of five novel loci associated with immunoglobulin G N-glycosylation. Nat. Commun. 8, 447 (2017).
Ning, Z. et al. Beyond power: multivariate discovery, replication, and interpretation of pleiotropic loci using summary association statistics. Preprint at bioRxiv https://doi.org/10.1101/022269 (2015).
Tsepilov, Y. A. et al. Analysis of genetically independent phenotypes identifies shared genetic factors associated with chronic musculoskeletal pain conditions. Commun. Biol. 3, 329 (2020).
Harris, S. E. et al. Molecular genetic contributions to self-rated health. Int. J. Epidemiol. 46, 994–1009 (2017).
Deelen, J. et al. A meta-analysis of genome-wide association studies identifies multiple longevity genes. Nat. Commun. 10, 3669 (2019).
Roberts, V., Main, B., Timpson, N. J. & Haworth, S. Genome-wide association study identifies genetic associations with perceived age. J. Invest. Dermatol. 140, 2380–2385 (2020).
Gibson, J. et al. A meta-analysis of genome-wide association studies of epigenetic age acceleration. PLoS Genet. 15, e1008104 (2019).
Shashkova, T. I. et al. The GWAS-MAP platform for aggregation of results of genome-wide association studies and the GWAS-MAP| homo database of 70 billion genetic associations of human traits. Vavilov J. Genetics Breeding 24, 876–884 (2020).
Ning, Z., Pawitan, Y. & Shen, X. High-definition likelihood inference of genetic correlations across human complex traits. Nat. Genet. 52, 859–864 (2020).
ICD-10: International Statistical Classification of Diseases and Related Health Problems: Tenth Revision 2nd edn (World Health Organization, 2004); https://apps.who.int/iris/handle/10665/42980
Hill, W. D. et al. Molecular genetic contributions to social deprivation and household income in UK Biobank. Curr. Biol. 26, 3083–3089 (2016).
Abdellaoui, A., Verweij, K. J. H. & Nivard, M. G. Geographic confounding in genome-wide association studies. Preprint at bioRxiv https://doi.org/10.1101/2021.03.18.435971 (2021).
Abdellaoui, A. et al. Genetic correlates of social stratification in Great Britain. Nat. Hum. Behav. 3, 1332–1342 (2019).
Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. Ser. B Stat. Methodol. 82, 1273–1300 (2020).
Kamat, M. A. et al. PhenoScanner V2: an expanded tool for searching human genotype–phenotype associations. Bioinformatics 35, 4851–4853 (2019).
Buniello, A. et al. The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
Lamparter, D., Marbach, D., Rueedi, R., Kutalik, Z. & Bergmann, S. Fast and rigorous computation of gene and pathway scores from SNP-based summary statistics. PLoS Comput. Biol. 12, e1004714 (2016).
Hemani, G. et al. The MR-base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).
Zheng, J. et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 52, 1122–1131 (2020).
Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
Boffa, M. B., Marcovina, S. M. & Koschinsky, M. L. Lipoprotein(a) as a risk factor for atherosclerosis and thrombosis: mechanistic insights from animal models. Clin. Biochem. 37, 333–343 (2004).
Nordestgaard, B. G. et al. Lipoprotein(a) as a cardiovascular risk factor: current status. Eur. Heart J. 31, 2844–2853 (2010).
Viney, N. J. et al. Antisense oligonucleotides targeting apolipoprotein(a) in people with raised lipoprotein(a): two randomised, double-blind, placebo-controlled, dose-ranging trials. Lancet 388, 2239–2253 (2016).
Arsenault, B. J. et al. Association of long-term exposure to elevated lipoprotein(a) levels with parental life span, chronic disease-free survival, and mortality risk: a Mendelian randomization analysis. JAMA Netw. Open 3, e200129 (2020).
Allen, S. et al. Expression of adhesion molecules by Lp(a): a potential novel mechanism for its atherogenicity. FASEB J. 12, 1765–1776 (1998).
Cho, T., Jung, Y. & Koschinsky, M. L. Apolipoprotein(a), through its strong lysine-binding site in KIV 10, mediates increased endothelial cell contraction and permeability via a rho/rho kinase/MYPT1-dependent pathway. J. Biol. Chem. 283, 30503–30512 (2008).
Kong, D. H., Kim, Y. K., Kim, M. R., Jang, J. H. & Lee, S. Emerging roles of vascular cell adhesion molecule-1 (VCAM-1) in immunological disorders and cancer. Int. J. Mol. Sci. 19, 13–17 (2018).
Yousef, H. et al. Aged blood impairs hippocampal neural precursor activity and activates microglia via brain endothelial cell VCAM1. Nat. Med. 25, 988–1000 (2019).
Smith, L. K. et al. Β2-microglobulin Is a systemic pro-aging factor that impairs cognitive function and neurogenesis. Nat. Med. 21, 932–937 (2015).
Grote, A. et al. LRP12 silencing during brain development results in cortical dyslamination and seizure sensitization. Neurobiol. Dis. 86, 170–176 (2016).
Bethge, N. et al. A gene panel, including LRP12, is frequently hypermethylated in major types of B-cell lymphoma. PLoS ONE 9, e104249 (2014).
Garnis, C., Coe, B. P., Zhang, L., Rosin, M. P. & Lam, W. L. Overexpression of LRP12, a gene contained within an 8q22 amplicon identified by high-resolution array CGH analysis of oral squamous cell carcinomas. Oncogene 23, 2582–2586 (2004).
Nakaya, N., Sultana, A., Lee, H. S. & Tomarev, S. I. Olfactomedin 1 interacts with the Nogo A receptor complex to regulate axon growth. J. Biol. Chem. 287, 37171–37184 (2012).
Shi, W. et al. Olfactomedin 1 negatively regulates NF-κB signalling and suppresses the growth and metastasis of colorectal cancer cells. J. Pathol. 240, 352–365 (2016).
Cheng, A. et al. Pancortin-2 interacts with WAVE1 and Bcl-xL in a mitochondria-associated protein complex that mediates ischemic neuronal death. J. Neurosci. 27, 1519–1528 (2007).
Yang, C. et al. Genomic atlas of the proteome from brain, CSF and plasma prioritizes proteins implicated in neurological disorders. Nat. Neurosci. 24, 1302–1312 (2021).
Rebo, J. et al. A single heterochronic blood exchange reveals rapid inhibition of multiple tissues by old blood. Nat. Commun. 7, 13363(2016).
Katsimpardi, L. et al. Vascular and neurogenic rejuvenation of the aging mouse brain by young systemic factors. Science 344, 630–634 (2014).
Balla, J. et al. Haem, haem oxygenase and ferritin in vascular endothelial cell injury. Nephrol. Dial. Transplant. 18, v8–v12 (2003).
Higashi, Y. & Yoshizumi, M. Endothelial function. Jpn. J. Clin. Med. 61, 1138–1144 (2003).
Engelhardt, B. Development of the blood–brain barrier. Cell Tissue Res. 314, 119–129 (2003).
Lakatta, E. G. & Levy, D. Arterial and cardiac aging: major shareholders in cardiovascular disease enterprises. Part I: aging arteries: a ‘set up’ for vascular disease. Circulation 107, 139–146 (2003).
Montagne, A. et al. APOE4 leads to blood–brain barrier dysfunction predicting cognitive decline. Nature 581, 71–76 (2020).
Popov, N. et al. The ubiquitin-specific protease USP28 is required for MYC stability. Nat. Cell Biol. 9, 765–774 (2007).
Serrano, M. The tumor suppressor protein p16INK4a. Exp. Cell. Res. 237, 7–13 (1997).
Li, F. et al. USP28 regulates deubiquitination of histone H2A and cell proliferation. Exp. Cell. Res. 379, 11–18 (2019).
Baker, D. J. et al. Naturally occurring p16Ink4a-positive cells shorten healthy lifespan. Nature 530, 184–189 (2016).
Timmers, P. R. H. J. et al. Mendelian randomization of genetically independent aging phenotypes identifies LPA and VCAM1 as biological targets for human aging [dataset]. University of Edinburgh https://doi.org/10.7488/ds/2972 (2020).
Zenin, A. et al. Genome-wide association summary statistics for human healthspan (Version 1) [dataset]. Zenodo https://doi.org/10.5281/zenodo.1302861 (2018).
Neale, B. UK Biobank—Neale lab. Neale Lab http://www.nealelab.is/uk-biobank/%0A (2018).
Winkler, T. W. et al. Quality control and conduct of genome-wide association meta-analyses. Nat. Protoc. 9, 1192–1212 (2014).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Canela-Xandri, O., Rawlik, K. & Tenesa, A. An atlas of genetic associations in UK Biobank. Nat. Genet. 50, 1593–1599 (2018).
Shashkova, T. I. et al. PheLiGe: an interactive database of billions of human genotype–phenotype associations. Nucleic Acids Res. 49, D1347–D1350 (2021).
Ying, K. et al. Genetic and phenotypic analysis of the causal relationship between aging and COVID-19. Commun. Med. 1, 35 (2021).
The COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. Eur. J. Hum. Genet. 28, 715–718 (2020).
Scrucca, L., Fop, M., Murphy, T. B. & Raftery, A. E. Mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R J. 8, 289–317 (2016).
Schwarzer, G., Carpenter, J. R. & Rücker, G. Meta-Analysis with R (Springer International Publishing, 2015).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164–e164 (2010).
Dey, R. et al. An efficient and accurate frailty model approach for genome-wide survival association analysis controlling for population structure and relatedness in large-scale biobanks. Preprint at bioRxiv https://doi.org/10.1101/2020.10.31.358234 (2020).
Hirata, M. et al. Overview of BioBank Japan follow-up data in 32 diseases. J. Epidemiol. 27, S22–S28 (2017).
Sakaue, S. et al. Trans-biobank analysis with 676,000 individuals elucidates the association of polygenic risk scores of complex traits with human lifespan. Nat. Med. 26, 542–548 (2020).
Liberzon, A. et al. The molecular signatures database Hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Hemani, G., Tilling, K. & Smith, G. D. Orienting the causal relationship between imprecisely measured traits using genetic instruments. Preprint at bioRxiv https://doi.org/10.1101/117101 (2017).
Acknowledgements
We thank the UK Biobank Resource, approved under application 19655. We thank the Medical Research Council Human Genetics Unit for their funding (to P.R.H.J.T. and J.F.W.: grant no. MC_UU_00007/10), and also the University of Edinburgh (to P.K.J.), the Russian Ministry of Education and Science 5-100 Excellence Programme (to E.S.T. and Y.A.T.) and the Ministry of Education and Science of the RF via the Institute of Cytology and Genetics SB RAS (to E.S.T. and Y.A.T.: grant no. 0259-2021-0009/AAAA-A17-117092070032-4). The FinnGen project and authors (T.T.J.K., W.Z. and A.G.) acknowledge funding from Business Finland (nos. HUS 4685/31/2016 and UH 4386/31/2016) as well as the industry partners AbbVie Inc., AstraZeneca UK Ltd., Biogen MA Inc., Celgene Corporation, Celgene International II Sàrl, Genentech Inc., Merck Sharp & Dohme Corp, Pfizer Inc., GlaxoSmithKline Intellectual Property Development Ltd., Sanofi US Services Inc., Maze Therapeutics Inc., Janssen Biotech Inc. and Novartis AG. The following biobanks are acknowledged for their samples in the FinnGen project: Auria Biobank (www.auria.fi/biopankki), THL Biobank (www.thl.fi/biobank), Helsinki Biobank (www.helsinginbiopankki.fi), Biobank Borealis of Northern Finland (https://www.ppshp.fi/Tutkimus-ja-opetus/Biopankki/Pages/Biobank-Borealis-briefly-in-English.aspx), Finnish Clinical Biobank Tampere (www.tays.fi/en-US/Research_and_development/Finnish_Clinical_Biobank_Tampere), Biobank of Eastern Finland (www.ita-suomenbiopankki.fi/en), Central Finland Biobank (www.ksshp.fi/fi-FI/Potilaalle/Biopankki), Finnish Red Cross Blood Service Biobank (www.veripalvelu.fi/verenluovutus/biopankkitoiminta) and Terveystalo Biobank (www.terveystalo.com/fi/Yritystietoa/Terveystalo-Biopankki/Biopankki). All Finnish Biobanks are members of BBMRI.fi infrastructure (www.bbmri.fi) and FinBB (https://finbb.fi). The Framingham Heart Study is funded by a National Institutes of Health (NIH) contract (nos. N01-HC-25195, HHSN268201500001I and 75N92019D00031; Boston University). This project was funded in part by the Division of Intramural Research, National Heart, Lung and Blood Institute (NHLBI), NIH, Bethesda, MD. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the NHLBI, the NIH or the US Department of Health and Human Services. None of the funders had any role in study design, data collection and analysis, decision to publish or preparation of the manuscript. For the purpose of Open Access, the authors have applied a CC-BY public copyright license to any author accepted manuscript version arising from this submission.
Author information
Authors and Affiliations
Consortia
Contributions
P.R.H.J.T. conceived the project, provided the methodology, software, validation, formal analysis and investigation, wrote the original draft and reviewed and edited it, and visualized and administered the project. E.S.T. and J.D. provided formal analysis, and reviewed and edited the manuscript. S.S., M.A., T.T.J.K. and W.Z. provided validation and formal analysis. S.-J.H. and C.Y. provided formal analysis. D.L. provided resources, reviewed and edited the manuscript, and supervised the project. A.G. and Y.O. provided resources and supervised the project. Y.K. provided resources and curated the data. P.K.J. reviewed and edited the manuscript, and supervised the project. J.F.W. provided resources, reviewed and edited the manuscript, supervised the project and acquired funding. Y.A.T. conceived the project, provided the methodology, software and formal analysis, reviewed and edited the manuscript, and supervised the project.
Corresponding author
Ethics declarations
Competing interests
P.R.H.J.T. is a salaried employee of BioAge Labs Inc. P.K.J. is a paid consultant for Humanity Inc. and Global Gene Corporation. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Aging thanks Xiao Dong and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Non-genetic correlations between ageing-related GWASs.
Off-diagonal values show the phenotypic correlation between GWASs observed due to sample overlap. Correlations were calculated using the Z scores of independent SNPs (provided by the MultiABEL R package)13 that were non-significant in both studies (|z| < 1.96). The number of SNPs used for each pairwise correlation is reported in Supplementary Data 1. EAA, Epigenetic Age Acceleration; mLOY, mosaic Loss of Y chromosome.
Extended Data Fig. 2 Adjustment of aging-GIPs to remove genetic correlations with socioeconomic status and household income.
a) Genetic correlations of aging-GIPs with UK Biobank GWASs of Townsend’s deprivation index and household income. In black are the original correlations before adjustment, in red are the correlations after adjustment. Error bars represent 95% confidence intervals based on 1,027,336 SNPs (99.75%) from the HDL R package2 LD reference panel. b) LD score plot showing the relationship between LD scores and χ2 statistics of aging-GIP1 SNPs, before and after adjustment. Each dot represents an LD score percentile. The x-axis shows the mean LD score of variants in each LD score percentile, and the y-axis shows the mean χ2 statistics of SNPs in that LD score percentile. The weighted slope of the regression can be used as a measure of the trait heritability.
Extended Data Fig. 3 Phenome-wide genetic correlations of aging-GIPs adjusted for household income and Townsend’s deprivation index.
Shown are the genetic correlations of the 25 GWAS-MAP clusters (defined in Fig. 3) with aging-GIPs adjusted for socioeconomic status and household income (as measured in UK Biobank). Values failing to pass nominal significance (two-sided Wald test P ≥ 0.05) are greyed out. See Supplementary Data 3 for the full list of correlations.
Extended Data Fig. 4 Effect sizes of lead aging-GIP1 SNPs adjusted for household income and Townsend’s deprivation index.
The x-axis shows the mean effect size estimate of the lead SNPs associated with aging-GIP1 at Bonferroni-corrected genome-wide significance (P < 8.3 × 10−9), with 95% confidence intervals represented by lines. The estimated sample sizes are 154,478 and 195,015 for the original and adjusted aging-GIP1 GWAS, respectively. The y-axis is annotated with the nearest gene(s) or cytogenetic band to each locus, with the lead SNP and the aging-GIP1-increasing allele in parentheses. In black are the original aging-GIP1 estimates, while in red are the estimates after removing all genetic correlations with GWASs of UK Biobank socioeconomic status and household income.
Extended Data Fig. 5 MR of blood protein levels on aging-GIP1 adjusted for household income and Townsend’s deprivation index.
Exposures significant for a causal effect on aging-GIP1 were tested for their effect on the aging-GIP1 residual phenotype, which was adjusted to remove all genetic correlations with GWASs of socioeconomic status and household income in UK Biobank. The x-axis shows the MR effect estimates, with lines representing 95% confidence intervals, as estimated by TwoSampleMR31. The sample sizes of exposure GWASs were 6,861 for LPA; 3,301 for VCAM1; and 3,200 for OLFM1 and LRP12. In red is the original aging-GIP1 causal effect estimate, while in red is the effect with the adjusted aging-GIP1 phenotype. LPA—apolipoprotein(a). VCAM1—vascular cell adhesion molecule 1. OLFM1—olfactomedin 1. LRP12— LDL receptor related protein 12.
Supplementary information
Supplementary Information
Supplementary Tables 1–8 and Note.
Supplementary Data 1
Genetic and non-genetic correlations between 11 ageing-related trait GWAS. Sample Size—Total sample size reported in the original study (cases + controls, if applicable). SNPs in panel—Number of SNPs shared between the GWAS and the High-definition likelihood inference of genetic correlations (HDL) reference panel (maximum of 1,029,876 SNPs). h2SNP—Observed scale SNP heritability estimated by HDL. SE—standard error of the estimate. Beta—Correlation estimate. Phenotypic correlation—correlation between Z scores of independent null SNP (Z < 1.96), capturing both correlations due to phenotypic similarity and sample overlap. nSNP—Number of independent null SNPs used in the phenotypic correlation calculation.
Supplementary Data 2
Genetic correlations of core ageing-related trait GWAS and GWAS-MAP phenotypes. [Shared] Genetic correlations showing no evidence of heterogeneity between ageing-related traits. rg_meta—fixed-effect inverse-variance genetic correlation of all six core ageing-related trait GWAS. se_meta—standard error of the meta-analysed genetic correlation. p—p value of the genetic correlation estimate. p_het—Cochran’s Q heterogeneity P value. i2—I2 heterogeneity statistic. pmid—PubMed identifier. [Unique] Genetic correlations which are uniquely strong or weak for a single ageing-related trait. outlier—ageing-related trait showing the uniquely strong or weak genetic correlation with the GWAS-MAP trait, compared to the other ageing-related traits. rg_outlier—genetic correlation estimate of the outlier trait with the GWAS-MAP trait. se_outlier—standard error of the estimate. rg_meta—fixed-effect inverse-variance-weighted meta-analysis of the genetic correlation estimates of the remaining ageing-related traits (that is excluding the outlier). se_meta—standard error of the meta-analysed genetic correlation estimate. i2_omit—I2 heterogeneity statistic after excluding the specified trait, where “fathlife” refers to Father lifespan, “resilience” refers to the inverse of Frailty index, “healthspan” refers to Healthspan, “longevity” refers to Exceptional longevity, “mothlife” refers to Mother lifespan, and “selfhealth” refers to Self-reported health. [All correlations] All genetic correlation data. [All heterogeneity] All statistics on heterogeneity between ageing-related trait GWAS.
Supplementary Data 3
Genetic correlations of the six Ageing-GIPs and GWAS-MAP phenotypes. Original GWAS correlations—Genetic correlations of the unadjusted Ageing-GIP1 GWAS. Adjusted GWAS correlations—Genetic correlations of Ageing-GIP1 after adjustment to remove all correlations with UK Biobank GWAS of household income and Townsend deprivation index. Beta—Genetic correlation estimate. SE—Standard error for the estimate. P—Two-sided nominal Wald test P value for the correlation estimate. Pbonf—Two-sided Wald test P value, Bonferroni-adjusted for testing 729 traits and six GIPs. Collection—Collection of GWAS the tested trait belongs to, as noted by GWAS-MAP. PMID—PubMed identifier.
Supplementary Data 4
SuSiE fine-mapping of genome-wide significant GIP1 loci. SNPs contained within each 95% credible set for Ageing-GIP1 loci are listed, along with their inclusion probability and functional annotation. Locus—Nearest gene(s) to the lead variant in the Ageing-GIP1 locus within 250 kb. rsID—SNP identifier of the variant in the 95% credible set. Chr—Chromosome. Position—Base-pair position (GRCh37). Set ID—SuSiE identifier for the set. PIP—Posterior inclusion probability of the variant. Annotation—Functional annotation of the variant based on its position. Gene(s)—Genes nearest to the best variant. Gene distance—Respective distance of each gene to the variant.
Supplementary Data 5
Colocalisation of genetic signals of Ageing-GIP1 loci with the original ageing-related trait GWAS. Ageing-GIP1 loci were tested for colocalisation with GWAS that had at least one SNP reaching genome-wide significance (P < 5×10−8) for the ageing related trait in the locus. Nearest gene(s)— Nearest gene or cytogenetic band to the lead Ageing-GIP1 SNP in the locus. Lead SNP—rsID of the SNP with the lowest P value. P—P value of the lead SNP. NSNP—Number of SNPs in the locus used in the colocalisation analysis. PP.H0.abf to PP.H4.abf —Posterior probability calculated from approximate Bayes factors in favour of the hypothesis H0 to H4, where H0: no association with either trait, H1: association with GIP1 only, H2: association with the trait only, H3: association with both traits, but with different causal variants, H4: association with both traits with a shared causal variant. Traits with PP.H4.abf > 0.8 are listed in bold.
Supplementary Data 6
Changes in lead Ageing-GIP1 SNP effects when excluding one of the core ageing-related trait GWAS. Nearest gene(s)—Nearest gene or cytogenetic band to the lead Ageing-GIP1 SNP in the locus. rsID—lead SNP of the locus in the main Ageing-GIP1 GWAS. Difference—Difference in effect size of the SNP in the main Ageing-GIP1 GWAS and the Ageing-GIP1 GWAS constructed from five core ageing-related trait GWAS (excluding the listed trait). SE—Standard error. P—Nominal two-sided Wald test P value for the difference. Loci with differences passing Bonferroni correction for 27 tests are marked in bold. N Het—Number of scenarios in which the SNP effect size is significantly different when excluding one of the traits.
Supplementary Data 7
Effect sizes and P values of Ageing-GIP1 loci before and after removing genetic correlations with UK Biobank GWAS of household income and Townsend deprivation index. Original—Ageing-GIP1 GWAS statistics before adjustment. Adjusted—Ageing-GIP1 GWAS statistics after adjustment. Nearest gene(s)— Nearest gene or cytogenetic band to the lead Ageing-GIP1 SNP in the locus. rsID—lead SNP of the locus. Chr—Chromosome. Position—Base-pair position (GRCh37). A1—Effect allele, increasing Ageing-GIP1. A0—Non-effect allele. Freq1—Allele frequency of the A1 allele. Beta1—Effect size of the A1 allele. SE—Standard Error. P—Nominal Wald P value.
Supplementary Data 8
Lookup of Ageing-GIP1 loci in the GWAS catalog and PhenoScanner. Locus—Nearest gene or cytogenetic band to the lead Ageing-GIP1 SNP in the locus. Trait—Shortened trait name. rsID—lead Ageing-GIP1 SNP of the locus. A1—Effect allele, associated with increased Ageing-GIP1. Proxy—SNP in linkage disequilibrium (r2EUR ≥ 0.8) with the lead SNP, for which the trait association statistics are reported. P.A1—Effect allele for the proxy, matching the lead SNP A1 allele. R2—Linkage disequilibrium r2 value. Beta1—Reported effect size associated with carrying one copy of the P.A1 allele. SE—Standard error. P—Reported P value. Units—Reported units for the effect size. PMID—PubMed identifier.
Supplementary Data 9
PASCAL gene enrichment analysis. Original GIP1—Gene scores calculated from unadjusted Ageing-GIP1 statistics. Adjusted GIP1—Gene scores calculated from Ageing-GIP1 statistics adjusted to remove all genetic correlations with UK Biobank GWAS of household income and Townsend deprivation index. Gene—HGNC gene symbol. Entrez ID—NCBI Entrez identifier for the gene. Chr—chromosome. Start—Base-pair position of the start of the coding region of the gene (GRCh37). End—Base-pair position of the end of the coding region of the gene (GRCh37). Nsnps—Number of SNPs within the gene region shared between the PASCAL reference and the GIP1 GWAS. P—Enrichment P value (based on weighted χ2 distribution). FDR—Benjamini-Hochberg-adjusted P value. Genes have been ordered by enrichment P value derived from original GIP1 statistics, and where tied, the enrichment P value derived from adjusted GIP1 statistics.
Supplementary Data 10
PASCAL gene set enrichment analysis. [Hallmark] Hallmark (C1) gene sets from the Molecular Signatures Database. Pathway—Gene set name. Nominal—Chi-square-based enrichment statistics. Empirical—Permutation-based enrichment statistics. P—P value for enrichment, not yet corrected for multiple testing. FDR—Benjamini-Hochberg-adjusted P value. GIP1 (adj) refers to pathway enrichment using gene scores based on the Ageing-GIP1 GWAS adjusted to remove genetic correlations with UK Biobank GWAS of household income and Townsend deprivation index. [GO clusters] Gene ontology (C5.BP) pathways reaching FDR < 5% in both original and adjusted Ageing-GIP1 analyses, clustered based on pathway Jaccard similarity. Cluster—Pathway grouping based on the maximised BIC criterion of the majority of mclust models (see accompanying figure). [GO all] All Gene ontology (C5.BP) pathways tested by PASCAL.
Supplementary Data 11
Technical details and sensitivity tests of the proteome-wide Mendelian randomisation and colocalisation analysis. [MR] Statistics for all 857 exposures. Exposure—Blood protein levels used as exposure. NSNP—Number of SNPs used as instrumental variable for the exposure. Method—method used to calculate the MR effect: Wald ratio for single instrument SNPs, inverse-variance-weighted test for multiple instruments. Beta—MR effect estimate. SE—Standard error of the MR effect estimate. P—Nominal, two-sided P value for the MR effect. FDR—Benjamini-Hochberg-adjusted P value. Phet—P value for heterogeneity (requires at least two SNPs). MR-Egger intercept estimate (Beta), standard error (Se), and P value (P), if enough instruments were available. [LD Check] Lookup of the 30 most significant Ageing-GIP1 SNPs surrounding the lead pQTL instrument. Lead—the pQTL used as instrument for the exposure. rsID—SNP identifier of the 30 most significant Ageing-GIP1 SNPs near the lead pQTL. Chr—chromosome. Position—Base-pair position (GRCh37). P—Nominal two-sided P value of the SNP association with Ageing-GIP1. R2—linkage disequilibrium r2 of the SNP with the pQTL. Check—check whether the SNP is at least r2 > 0.8 with the pQTL. Pass—does the locus pass the LD check? [Colocalisation] coloc test for the probability that the Ageing-GIP1 GWAS and pQTL GWAS signals colocalise. NSNP—number of SNPs used for the coloc test. PP—Posterior probability of the hypothesis, where H0 is the hypothesis of no association in either GWAS, H1 is the hypothesis that the signal is present in the pQTL GWAS only, H2 is the hypothesis that the signal is present in the Ageing-GIP1 GWAS only, H3 is the hypothesis that the signal is present in both GWAS, but is not shared, and H4 is the hypothesis that the signal is shared between GWAS. coloc—Posterior probability of colocalisation (H4) of more than 80%? [MR Steiger] Steiger causal directionality test. r.exposure—SNP variance explained by the exposure. r.outcome—SNP variance explained by the outcome. steiger_pval—P value that the exposure and outcome have the correct causal direction. [Reverse MR] MR analysis of Ageing-GIP1 on the exposure. NSNP—Number of instrumental variables shared between GWAS and used in the analysis. See [MR] for remaining header details.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 5
Statistical source data.
Rights and permissions
About this article
Cite this article
Timmers, P.R.H.J., Tiys, E.S., Sakaue, S. et al. Mendelian randomization of genetically independent aging phenotypes identifies LPA and VCAM1 as biological targets for human aging. Nat Aging 2, 19–30 (2022). https://doi.org/10.1038/s43587-021-00159-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s43587-021-00159-8
This article is cited by
-
Mendelian randomization analyses reveal causal relationships between the human microbiome and longevity
Scientific Reports (2023)
-
Locus for severity implicates CNS resilience in progression of multiple sclerosis
Nature (2023)
-
Unravelling genetic components of longevity
Nature Aging (2022)