Abstract
Machine learning models based on DNA methylation data can predict biological age but often lack causal insights. By harnessing large-scale genetic data through epigenome-wide Mendelian randomization, we identified CpG sites potentially causal for aging-related traits. Neither the existing epigenetic clocks nor age-related differential DNA methylation is enriched in these sites. These CpGs include sites that contribute to aging and sites that protect against it, yet their combined contribution negatively affects age-related traits. We established a new framework to introduce causal information into epigenetic clocks, resulting in DamAge and AdaptAge, clocks that track detrimental and adaptive methylation changes, respectively. DamAge correlates with adverse outcomes, including mortality, while AdaptAge is associated with beneficial adaptations. These causality-enriched clocks exhibit sensitivity to short-term interventions. Our findings provide a detailed landscape of CpG sites with putative causal links to lifespan and healthspan, facilitating the development of aging biomarkers, the assessment of interventions and the study of the reversibility of age-associated changes.
Data availability
All analyses in this study were conducted using publicly available data. The datasets used (Table 1) include longevity GWAS summary statistics (https://www.longevitygenomics.org/downloads/), parental lifespan GWAS summary statistics (https://datashare.ed.ac.uk/handle/10283/3209/), healthspan GWAS summary statistics (https://www.gwasarchive.org/), frailty index GWAS summary statistics (https://figshare.com/articles/dataset/Genome-Wide_Association_Study_of_the_Frailty_Index_-_Atkins_et_al_2019/9204998/), epigenetic age acceleration GWAS summary statistics (https://datashare.ed.ac.uk/handle/10283/3645/) and Gene Expression Omnibus datasets (GSE107143, GSE127985, GSE192918, GSE193795, GSE210245, GSE51954, GSE94876, GSE98056, GSE101673, GSE78773, GSE90117, GSE79257 and GSE42865). Any other data generated in this study upon which conclusions are based are available in Supplementary Tables 1–3. Source data are provided with this paper.
Code availability
MR analyses were conducted using the R packages TwoSampleMR version 0.5.6 (https://mrcieu.github.io/TwoSampleMR/) and MendelianRandomization version 0.7.0 (https://cran.r-project.org/web/packages/MendelianRandomization/index.html). Genetic correlation analysis was performed using LDSC software v1.0.1 (https://github.com/bulik/ldsc/). Mediation analysis was performed using smr-ivw v1.0 (https://github.com/masadler/smrivw/). Colocalization analysis was performed using PWCoCo v1.0 (https://github.com/jwr-git/pwcoco/). The elastic net model was trained using glmnet v4.1 (https://cran.r-project.org/web/packages/glmnet/index.html). The custom code is available in the Supplementary Information. Algorithms of CausAge, DamAge and AdaptAge are available in the Supplementary Information, as well as through ClockBase (https://www.clockbase.org/) and the bio-learn Python package (https://bio-learn.github.io/) (refs. 62,63).
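For readers unfamiliar with two-sample MR, the following is a minimal sketch of the kind of estimators that packages such as TwoSampleMR provide: a per-SNP Wald ratio and a fixed-effect inverse-variance-weighted (IVW) estimate computed from summary statistics. It is not the authors' code. The function names and the toy numbers are hypothetical, and the first-order standard errors (which ignore uncertainty in the SNP-exposure effects) are a simplifying assumption.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate with a first-order standard error.

    Assumes the SNP-exposure effect is measured without error.
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate across independent instruments.

    Each SNP's Wald ratio is weighted by beta_exposure^2 / se_outcome^2.
    """
    weights = [bx ** 2 / so ** 2 for bx, so in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    return estimate, se


# Hypothetical summary statistics for three independent instruments.
bx = [0.2, 0.4, 0.1]      # SNP effect on methylation (exposure)
by = [0.1, 0.2, 0.05]     # SNP effect on the aging-related trait (outcome)
so = [0.02, 0.02, 0.02]   # standard error of the outcome effect
ivw_beta, ivw_se = ivw_estimate(bx, by, so)
```

In this toy input every SNP implies the same ratio, so the IVW estimate coincides with each Wald ratio; with real data the weights favor instruments with stronger exposure effects and more precise outcome estimates.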
References
Gladyshev, V. N. et al. Molecular damage in aging. Nat. Aging 1, 1096–1106 (2021).
Sziráki, A., Tyshkovskiy, A. & Gladyshev, V. N. Global remodeling of the mouse DNA methylome during aging and in response to calorie restriction. Aging Cell 17, e12738 (2018).
Moqri, M. et al. PRC2 clock: a universal epigenetic biomarker of aging and rejuvenation. Preprint at bioRxiv https://doi.org/10.1101/2022.06.03.494609 (2022).
Petkovich, D. A. et al. Using DNA methylation profiling to evaluate biological age and longevity interventions. Cell Metab. 25, 954–960 (2017).
Meer, M. V., Podolskiy, D. I., Tyshkovskiy, A. & Gladyshev, V. N. A whole lifespan mouse multi-tissue DNA methylation clock. eLife 7, e40675 (2018).
Urban, A. et al. Precious1GPT: multimodal transformer-based transfer learning for aging clock development and feature importance analysis for aging and age-related disease target discovery. Aging 15, 4649–4666 (2023).
Marioni, R. E. et al. The epigenetic clock is correlated with physical and cognitive fitness in the Lothian Birth Cohort 1936. Int. J. Epidemiol. 44, 1388–1396 (2015).
Moqri, M. et al. Biomarkers of Aging Consortium. Biomarkers of aging for the identification and evaluation of longevity interventions. Cell 186, 3758–3775 (2023).
Ying, K. et al. Genetic and phenotypic analysis of the causal relationship between aging and COVID-19. Commun. Med. 1, 35 (2021).
Kapur, K. Chapter 14 - Principles of Biostatistics. in Clinical and Translational Science (Second Edition) (eds. Robertson, D. & Williams, G. H.) 243–260 (Academic Press, 2017). https://doi.org/10.1016/B978-0-12-802101-9.00014-4
Davey Smith, G. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98 (2014).
Min, J. L. et al. Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation. Nat. Genet. 53, 1311–1321 (2021).
Richardson, T. G. et al. Systematic Mendelian randomization framework elucidates hundreds of CpG sites which may mediate the influence of genetic variants on disease. Hum. Mol. Genet. 27, 3293–3304 (2018).
Emdin, C. A., Khera, A. V. & Kathiresan, S. Mendelian randomization. JAMA 318, 1925–1926 (2017).
Timmers, P. R. et al. Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances. eLife 8, e39856 (2019).
Deelen, J. et al. A meta-analysis of genome-wide association studies identifies multiple longevity genes. Nat. Commun. 10, 3669 (2019).
Zenin, A. et al. Identification of 12 genetic loci associated with human healthspan. Commun. Biol. 2, 41 (2019).
Atkins, J. L. et al. A genome‐wide association study of the frailty index highlights brain pathways in ageing. Aging Cell 20, e13459 (2021).
Timmers, P. R. H. J. et al. Mendelian randomization of genetically independent aging phenotypes identifies LPA and VCAM1 as biological targets for human aging. Nat. Aging 2, 19–30 (2022).
Timmers, P. R. H. J., Wilson, J. F., Joshi, P. K. & Deelen, J. Multivariate genomic scan implicates novel loci and haem metabolism in human ageing. Nat. Commun. 11, 3570 (2020).
Hemani, G., Tilling, K. & Davey Smith, G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 13, e1007081 (2017).
Robinson, J. W. et al. An efficient and robust tool for colocalisation: pair-wise conditional and colocalisation (PWCoCo). Preprint at bioRxiv https://doi.org/10.1101/2022.08.08.503158 (2022).
Herzig, S. et al. CREB regulates hepatic gluconeogenesis through the coactivator PGC-1. Nature 413, 179–183 (2001).
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Saldi, T., Cortazar, M. A., Sheridan, R. M. & Bentley, D. L. Coupling of RNA polymerase II transcription elongation with pre-mRNA splicing. J. Mol. Biol. 428, 2623–2635 (2016).
Sadler, M. C., Auwerx, C., Lepik, K., Porcu, E. & Kutalik, Z. Quantifying the role of transcript levels in mediating DNA methylation effects on complex traits and diseases. Nat. Commun. 13, 7559 (2022).
Battram, T. et al. The EWAS Catalog: a database of epigenome-wide association studies. Wellcome Open Res. 7, 41 (2022).
McCartney, D. L. et al. An epigenome-wide association study of sex-specific chronological ageing. Genome Med. 12, 1 (2020).
Horvath, S. DNA methylation age of human tissues and cell types. Genome Biol. 14, R115 (2013).
Ohnuki, M. et al. Dynamic regulation of human endogenous retroviruses mediates factor-induced reprogramming and differentiation potential. Proc. Natl Acad. Sci. USA 111, 12426–12431 (2014).
Hannum, G. et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell 49, 359–367 (2013).
Levine, M. E. et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging 10, 573–591 (2018).
Lu, A. T. et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging 11, 303–327 (2019).
Belsky, D. W. et al. DunedinPACE, a DNA methylation biomarker of the pace of aging. eLife 11, e73420 (2022).
Istas, G. et al. Identification of differentially methylated BRCA1 and CRISP2 DNA regions as blood surrogate markers for cardiovascular disease. Sci. Rep. 7, 5120 (2017).
Toth, R. et al. Random forest-based modelling to detect biomarkers for prostate cancer progression. Clin. Epigenetics 11, 148 (2019).
Bai, C. et al. Oviductal glycoprotein 1 promotes hypertension by inducing vascular remodeling through an interaction with MYH9. Circulation 146, 1367–1382 (2022).
Jessen, W. J., Borgerding, M. F. & Prasad, G. L. Global methylation profiles in buccal cells of long-term smokers and moist snuff consumers. Biomarkers 23, 625–639 (2018).
Heyn, H., Moran, S. & Esteller, M. Aberrant DNA methylation profiles in the premature aging disorders Hutchinson-Gilford progeria and Werner syndrome. Epigenetics 8, 28–33 (2013).
Vandiver, A. R. et al. Age and sun exposure-related widespread genomic blocks of hypomethylation in nonmalignant skin. Genome Biol. 16, 80 (2015).
López-Otín, C., Blasco, M. A., Partridge, L., Serrano, M. & Kroemer, G. The hallmarks of aging. Cell 153, 1194–1217 (2013).
Lescai, F., Marchegiani, F. & Franceschi, C. PON1 is a longevity gene: results of a meta-analysis. Ageing Res. Rev. 8, 277–284 (2009).
Declerck, K. et al. Interaction between prenatal pesticide exposure and a common polymorphism in the PON1 gene on DNA methylation in genes associated with cardio-metabolic disease risk—an exploratory study. Clin. Epigenetics 9, 35 (2017).
Clement, J. et al. Umbilical cord plasma concentrate has beneficial effects on DNA methylation GrimAge and human clinical biomarkers. Aging Cell 21, e13696 (2022).
Tremblay, B. L. et al. Epigenetic changes in blood leukocytes following an omega-3 fatty acid supplementation. Clin. Epigenetics 9, 43 (2017).
Estill, M. S. et al. Assisted reproductive technology alters deoxyribonucleic acid methylation profiles in bloodspots of newborn infants. Fertil. Steril. 106, 629–639 (2016).
Bens, S. et al. Phenotypic spectrum and extent of DNA methylation defects associated with multilocus imprinting disturbances. Epigenomics 8, 801–816 (2016).
Lin, D. et al. Characterization of cross-tissue genetic-epigenetic effects and their patterns in schizophrenia. Genome Med. 10, 13 (2018).
Porcu, E. et al. Differentially expressed genes reflect disease-induced rather than disease-causing changes in the transcriptome. Nat. Commun. 12, 5647 (2021).
Ogrodnik, M. & Gladyshev, V. N. The meaning of adaptation in aging: insights from cellular senescence, epigenetic clocks and stem cell alterations. Nat. Aging 3, 766–775 (2023).
Walther, D. M. et al. Widespread proteome remodeling and aggregation in aging C. elegans. Cell 161, 919–932 (2015).
Burgess, S., Zuber, V., Valdes-Marquez, E., Sun, B. B. & Hopewell, J. C. Mendelian randomization with fine-mapped genetic data: choosing from large numbers of correlated instrumental variables. Genet. Epidemiol. 41, 714–725 (2017).
Yavorska, O. O. & Burgess, S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int. J. Epidemiol. 46, 1734–1739 (2017).
Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).
Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 53, 1300–1310 (2021).
Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67, 301–320 (2005).
Lehne, B. et al. A coherent approach for analysis of the Illumina HumanMethylation450 BeadChip improves data quality and performance in epigenome-wide association studies. Genome Biol. 16, 37 (2015).
Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
Zheng, J. et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 52, 1122–1131 (2020).
Bell, B., Rose, C. L. & Damon, A. The Veterans Administration Longitudinal Study of Healthy Aging. Gerontologist 6, 179–184 (1966).
Ying, K. et al. Biolearn, an open-source library for biomarkers of aging. Preprint at bioRxiv https://doi.org/10.1101/2023.12.02.569722 (2023).
Ying, K. et al. ClockBase: a comprehensive platform for biological age profiling in human and mouse. Preprint at bioRxiv https://doi.org/10.1101/2023.02.28.530532 (2023).
Acknowledgements
We thank the Genetics of DNA Methylation Consortium (GoDMC) for releasing the summary statistics of meQTLs. We also thank C. Kerepesi, M. Mariotti, D. L. McCartney and R. E. Marioni for their help and advice during the initial stages of this study. We thank B. Manning, D. A. Sinclair, S. Sunyaev and A. Zhavoronkov for their advice. We especially thank Y. Fang for the artwork design. This study is supported by the National Institute on Aging, Impetus grants and the Michael Antonov Foundation. The FHS is funded by National Institutes of Health contracts N01-HC-25195 and HHSN268201500001I. The US Department of Veterans Affairs (VA) NAS is supported by the National Institute of Environmental Health Sciences (NIEHS; R01ES015172 and R01ES021733) as well as by the Cooperative Studies Program/ERIC, US Department of Veterans Affairs, and is a research component of the Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC). Additional support to the VA NAS was provided by the US Department of Agriculture, Agricultural Research Service (contract 53-K06-510).
Author information
Contributions
K.Y. initiated the study and performed data collection and analyses. V.N.G. supervised this research and provided funding. H.L., M.C.S. and A.T. were involved in data analysis. A.T.L., M.M. and S.H. contributed to data interpretation. Z.K. and X.S. assisted in methodology refinement and statistical analysis. All authors contributed to writing and revising the manuscript and approved the final version for publication.
Ethics declarations
Competing interests
K.Y. and V.N.G. are inventors on a patent application related to the research reported.
Peer review
Peer review information
Nature Aging thanks Daniel Belsky, Matthew Suderman, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Genetic correlation between 9 lifespan-related phenotypes.
Genetic correlations were calculated using LDSC regression. Areas of the squares represent absolute values of corresponding genetic correlations. Genetic correlations that could not be estimated are shown as blanks. Two-sided test was performed. P values are corrected using Bonferroni correction for the number of tests, * P nominal < 0.05, ** P adjusted < 0.05, *** P adjusted < 0.01.
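The star annotation described in this legend can be sketched as follows. This is an illustrative reading of the legend, not the authors' plotting code; the function name and the example P values are hypothetical.

```python
def significance_stars(p_values):
    """Label each P value using the thresholds in the legend.

    Bonferroni adjustment multiplies each nominal P value by the
    number of tests and caps the result at 1.
    """
    n_tests = len(p_values)
    labels = []
    for p in p_values:
        p_adjusted = min(1.0, p * n_tests)
        if p_adjusted < 0.01:
            labels.append("***")   # adjusted P < 0.01
        elif p_adjusted < 0.05:
            labels.append("**")    # adjusted P < 0.05
        elif p < 0.05:
            labels.append("*")     # nominal P < 0.05 only
        else:
            labels.append("")
    return labels


# Hypothetical P values from four pairwise genetic correlations.
stars = significance_stars([0.0001, 0.01, 0.04, 0.5])
```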
Extended Data Fig. 2 Relationship between meSNPs and causal CpGs.
Forest plot shows enrichment of meSNPs (n = 161,860 sites) among causal CpGs. Error bars show the 95% confidence interval. P-values of significant results are annotated (top). Two-sided Fisher’s exact test was used to calculate the enrichment P-value. P-values were corrected using FDR for multiple comparisons. Scatter plot shows Pearson’s correlation between the effect of a single CpG site estimated by MR and a single meSNP (bottom). The correlation coefficient and P-value are annotated at the top.
Extended Data Fig. 3 Relationship between estimated causal effects and evolutionary conservation.
Box plot shows the distribution of conservation scores in causal and non-causal CpG sites. Conservation scores were obtained by Learning Evidence of Conservation from Integrated Functional genomic annotations (LECIF), phastCons, and phyloP. Scatter plot shows the relationship between the absolute adjusted effect size and conservation score. * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001. From left to right, the exact P-values for the top three panels are 1.23e-0, 0.00392 and 0.00213.
Extended Data Fig. 4 Enrichment analysis.
a. Bar plot shows enrichment of causal CpG sites in genomic annotations. Y-axis shows -log10(FDR) based on Fisher’s exact test, signed by log2(Odds Ratio). Causal CpG sites identified for different traits are annotated with different colors. Two dotted horizontal lines show the FDR threshold of 0.05. b. Enrichment of causal CpG sites for Aging-GIP1 with positive or negative effect size against transcription-factor-binding sites. Each horizontal bar represents an enriched term. The X-axis shows the -log10(P-value), signed by log2 (Odds ratio). The top 10 enriched terms that passed the FDR threshold of 0.05 for each direction are annotated.
Extended Data Fig. 5 Enrichment of causal CpG sites in EWAS hits.
Enrichment of putative causal CpG sites for 12 aging-related traits against EWAS hits. Each horizontal bar represents an enriched term. The X-axis shows the -log10(P-value), signed by log2 (Odds ratio). The top 10 enriched terms that passed the FDR threshold of 0.05 for each direction are annotated. Two-sided Fisher’s exact test was used to calculate the enrichment P-value. P-values were corrected using FDR for multiple comparisons.
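The enrichment statistics named in these legends (two-sided Fisher's exact test, FDR correction and a -log10 score signed by the direction of the odds ratio) can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the authors' pipeline: the function names are hypothetical, the FDR procedure is assumed to be Benjamini-Hochberg, and the exact test as written is only practical for small counts.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables that are no
    more likely than the observed one. Returns (odds_ratio, p_value).
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return (math.comb(col1, x) * math.comb(n - col1, row1 - x)
                / math.comb(n, row1))

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c > 0 else math.inf
    return odds_ratio, min(1.0, p_value)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # from largest P to smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def signed_log10(value, odds_ratio):
    """-log10(value), negated when the odds ratio indicates depletion.

    value must be greater than zero.
    """
    sign = 1.0 if odds_ratio >= 1 else -1.0
    return sign * -math.log10(value)


# Hypothetical table: causal/non-causal CpGs inside/outside an annotation.
odds, p = fisher_exact_two_sided(3, 1, 1, 3)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```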
Extended Data Fig. 6 Distribution of the effect ratio of strongest meQTL and age (per year).
Scatter plot shows the top 50 age-related CpG sites. The dotted line annotates the mean and median values of the ratio.
Extended Data Fig. 7 Relationship between MR-estimated causal effects (X-axis) and age-related methylation change (Y-axis) for each significant causal CpG identified in aging-related phenotypes.
The color scheme highlights the expected impact of age-related methylation change on aging. Error bars show the standard error of b. Point size reflects the PP-H4. Only CpG sites with adjusted P-values < 0.05 after correction for multiple comparisons and relative PP-H4 > 0.7 are plotted. The Pearson’s correlation coefficient and P-value based on a two-sided test are annotated at the top.
Extended Data Fig. 8 Enrichment of causal CpG sites among CpG sites that show age-related changes.
Error bars indicate the 95% confidence interval. The center indicates the estimated log2 odds ratio (the number of sites used for each trait is shown in Fig. 1c). Bar plot shows the signed -log10(P-value) of Spearman’s correlation between age-related change and causal effect size. The orange dotted line shows the threshold of P < 0.05.
Extended Data Fig. 9 Relationship between DamAge acceleration (X-axis) and AdaptAge acceleration (Y-axis) in the test set.
The age acceleration term is age-adjusted by regressing out the chronological age. Pearson’s correlation coefficient and P value based on two-sided test are annotated.
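The age adjustment described here, regressing out chronological age and keeping the residual, can be sketched with a simple least-squares fit. This is a minimal illustration, not the authors' implementation; the function names and the toy ages are hypothetical, and a simple linear model with no further covariates is assumed.

```python
import math


def age_acceleration(chronological_age, predicted_age):
    """Residuals of predicted age after a least-squares fit on age."""
    n = len(chronological_age)
    mean_x = sum(chronological_age) / n
    mean_y = sum(predicted_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chronological_age)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(chronological_age, predicted_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_age, predicted_age)]


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical test-set ages and clock predictions.
ages = [20.0, 40.0, 60.0, 80.0]
clock_a = [25.0, 38.0, 66.0, 79.0]
clock_b = [18.0, 45.0, 57.0, 83.0]
accel_a = age_acceleration(ages, clock_a)
accel_b = age_acceleration(ages, clock_b)
r = pearson_r(accel_a, accel_b)
```

By construction the residuals average to zero and are uncorrelated with chronological age, so a correlation between two clocks' accelerations is not driven by their shared dependence on age.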
Extended Data Fig. 10 Residual plot in the test set.
These plots show the residuals of the three clocks in the test set across age range. R-square and MSE are reported. Relationship between the number of sites in the clock model and Tau. X-axis shows the choice of causality factor Tau, and the Y-axis shows the number of CpG sites selected in elastic net models.
Supplementary information
Supplementary Information
Supplementary Methods and Supplementary Cohort Information.
Supplementary Table 1
Statistical source data for Extended Data Fig. 1.
Supplementary Code 1
The custom R code used for EWMR analysis.
Supplementary Table 2
Statistical source data for Data Fig. 1, which contains the output putative causal CpG sites from EWMR analysis.
Supplementary Table 3
The model weight for CausAge, DamAge and AdaptAge.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 5
Statistical source data.
Source Data Fig. 6
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ying, K., Liu, H., Tarkhov, A.E. et al. Causality-enriched epigenetic age uncouples damage and adaptation. Nat Aging 4, 231–246 (2024). https://doi.org/10.1038/s43587-023-00557-0
DOI: https://doi.org/10.1038/s43587-023-00557-0