Common genetic associations between age-related diseases


Age is a common risk factor in many diseases, but the molecular basis for this relationship is elusive. In this study we identified four disease clusters from 116 diseases in UK Biobank data, defined by their age-of-onset profiles, and found that diseases with the same onset profile are genetically more similar, suggesting a common etiology. This similarity was not explained by disease categories, co-occurrences or disease cause–effect relationships. Two of the four disease clusters had an increased risk of occurrence from ages 20 and 40 years, respectively. They both showed an association with known aging-related genes, yet differed in functional enrichment and evolutionary profiles. Moreover, they both had age-related expression and methylation changes. We also tested mutation accumulation and antagonistic pleiotropy theories of aging and found support for both.

Fig. 1: Age-of-onset profiles clustered by the PAM algorithm, using dissimilarities calculated with temporal correlation measure (CORT).
Fig. 2: Genetic similarities and mediated pleiotropy across diseases.
Fig. 3: Enrichment of disease-associated genes in known longevity modulators and GO categories.
Fig. 4: Risk allele frequencies for diseases associated with different age-of-onset clusters.
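The clustering named in the Fig. 1 caption can be sketched as follows. This is a minimal illustration and not the authors' code, which per the Code availability section used the R packages TSclust and cluster. It assumes the CORT dissimilarity takes the form described by Chouakria and Nagabhushan (ref. 22), with tuning parameter k = 2 as an assumed setting, and it uses a simplified PAM that accepts the first improving swap rather than searching for the best one. The four toy profiles are hypothetical.

```python
import math

def cort(x, y):
    """Temporal correlation of two equal-length series, based on first differences."""
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    denom = math.sqrt(sum(d * d for d in dx)) * math.sqrt(sum(d * d for d in dy))
    # Assumption: a flat series (zero differences) is treated as uncorrelated.
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0

def cort_dissimilarity(x, y, k=2.0):
    """Euclidean distance scaled by 2 / (1 + exp(k * cort)); k = 2 is an assumed setting."""
    return 2.0 / (1.0 + math.exp(k * cort(x, y))) * math.dist(x, y)

def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)

def pam(dist, n_clusters):
    """Simplified PAM: greedy BUILD, then first-improvement SWAP until no swap helps."""
    n = len(dist)
    medoids = []
    # BUILD: repeatedly add the point that lowers the total cost the most.
    while len(medoids) < n_clusters:
        best = min((i for i in range(n) if i not in medoids),
                   key=lambda i: total_cost(dist, medoids + [i]))
        medoids.append(best)
    # SWAP: replace a medoid with a non-medoid whenever that lowers the cost.
    improved = True
    while improved:
        improved = False
        current = total_cost(dist, medoids)
        for pos in range(n_clusters):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < current - 1e-12:
                    medoids, current, improved = trial, cost, True
    labels = [min(range(n_clusters), key=lambda c: row[medoids[c]]) for row in dist]
    return medoids, labels

# Hypothetical age-of-onset profiles: two rising with age, two falling with age.
profiles = [
    [1, 2, 4, 7, 11],
    [1, 2, 5, 8, 12],
    [11, 7, 4, 2, 1],
    [12, 8, 5, 2, 1],
]
dist = [[cort_dissimilarity(a, b) for b in profiles] for a in profiles]
medoids, labels = pam(dist, n_clusters=2)
print(medoids, labels)
```

On this toy input the two rising profiles share one label and the two falling profiles share the other, because CORT shrinks the distance between series that move in the same direction and inflates it for series that move in opposite directions.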

Data availability

The primary data source used in the study was the UKBB resource (ref. 20), which requires an application for access. This study was conducted under application number 30688. The UKBB GWAS summary statistics provided by the Neale laboratory were downloaded for Townsend Deprivation Index and diet regimes. GTEx v.8 eQTL and expression data were accessed on 20 October 2020 via the GTEx data portal. The GWAS Catalog v.1.0.2 e96 (ref. 70) dataset was accessed on 30 July 2019. The gene lists available in ‘Human Ageing Genomic Resources’ (refs. 34,35) were downloaded, and CellAge data were kindly made available before the data release on 2 October 2019 by Avelar et al. (ref. 36). We accessed ChEMBL and PubChem using their APIs, and UniChem mappings were used to map PubChem CIDs to ChEMBL IDs. DGIdb was used to compile drug–target gene interactions. Results of the Adelman et al. (ref. 37) and Marttila et al. (ref. 38) age-related methylation studies were downloaded as article supplementary files. We accessed 1000 Genomes Project (ref. 41) allele frequencies using the vcf file provided on the 1000 Genomes Project website.

The full set of GWAS results from this study can be accessed using BioStudies (S-BSST407) and all other results generated in the analysis are provided as Supplementary Datasets and Tables.

Code availability

BOLT-LMM (v.2.3.2), PLINK (v.1.90b6.4) and VarMap software were used. All remaining analyses were performed using R (ref. 81; v.3.5.0) in the RStudio IDE (v.1.1.453). The following R packages were used: TSclust (ref. 55; v.1.2.4), cluster (v.), HDL (ref. 31; v.1.3.8), VariantAnnotation (ref. 63; v.1.28.13), TxDb.Hsapiens.UCSC.hg19.knownGene (ref. 64; v.3.2.2), GenomicRanges (ref. 65; v.1.32.3), biomaRt (ref. 66; v.2.36.1), RCurl (ref. 82; v.), jsonlite (ref. 83; v.1.7.1), rtracklayer (ref. 67; v.1.40.3), liftOver (ref. 68; v.1.12.0), goseq (ref. 75; v.1.40.0), preprocessCore (ref. 80; v.1.50.0) and the LCV method (ref. 32) implemented in R. The following packages were used for data handling: tidyverse (ref. 84; v.1.3.0) and data.table (ref. 85; v.1.12.4). The following packages were used for data visualization: igraph (ref. 86; v.1.2.1), ggnetwork (ref. 87; v.0.5.8), ggforce (ref. 88; v.), ggpubr (ref. 89; v.0.4.0), ggrepel (ref. 90; v.0.8.2), GGally (ref. 91; v.2.0.0), RColorBrewer (ref. 92; v.1.1.2), scales (ref. 93; v.1.1.1), ggthemes (ref. 94; v.4.2.0) and pheatmap (ref. 95; v.1.0.12).

All other analyses were performed using custom code written in bash (v.4.2) or R (v.3.5.0), which is available on GitHub.


  1. López-Otín, C., Blasco, M. A., Partridge, L., Serrano, M. & Kroemer, G. The hallmarks of aging. Cell 153, 1194–1217 (2013).

  2. Crimmins, E. M. Lifespan and healthspan: past, present, and promise. Gerontologist 55, 901–911 (2015).

  3. Partridge, L., Deelen, J. & Slagboom, P. E. Facing up to the global challenges of ageing. Nature 561, 45–56 (2018).

  4. Niccoli, T. & Partridge, L. Ageing as a risk factor for disease. Curr. Biol. 22, R741–R752 (2012).

  5. Flatt, T. & Partridge, L. Horizons in the evolution of aging. BMC Biol. 16, 93 (2018).

  6. Medvedev, Z. A. An attempt at a rational classification of theories of ageing. Biol. Rev. Camb. Philos. Soc. 65, 375–398 (1990).

  7. Medawar, P. B. Unsolved problem of biology. Med. J. Aust. 1, 854–855 (1953).

  8. Williams, G. C. Pleiotropy, natural selection, and the evolution of senescence. Evolution 11, 398–411 (1957).

  9. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  10. Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).

  11. Cross-Disorder Group of the Psychiatric Genomics Consortium et al. Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs. Nat. Genet. 45, 984–994 (2013).

  12. Cortes, A., Albers, P. K., Dendrou, C. A., Fugger, L. & McVean, G. Identifying cross-disease components of genetic risk across hospital data in the UK Biobank. Nat. Genet. 52, 126–134 (2020).

  13. Solovieff, N., Cotsapas, C., Lee, P. H., Purcell, S. M. & Smoller, J. W. Pleiotropy in complex traits: challenges and strategies. Nat. Rev. Genet. 14, 483–495 (2013).

  14. Ellinghaus, D. et al. Analysis of five chronic inflammatory diseases identifies 27 new associations and highlights disease-specific patterns at shared loci. Nat. Genet. 48, 510–518 (2016).

  15. Parkes, M., Cortes, A., van Heel, D. A. & Brown, M. A. Genetic insights into common pathways and complex relationships among immune-mediated diseases. Nat. Rev. Genet. 14, 661–673 (2013).

  16. Bien, S. A. & Peters, U. Moving from one to many: insights from the growing list of pleiotropic cancer risk genes. Br. J. Cancer 120, 1087–1089 (2019).

  17. Johnson, S. C., Dong, X., Vijg, J. & Suh, Y. Genetic evidence for common pathways in human age-related diseases. Aging Cell 14, 809–817 (2015).

  18. Fernandes, M. et al. Systematic analysis of the gerontome reveals links between aging and age-related diseases. Hum. Mol. Genet. 25, 4804–4818 (2016).

  19. Wang, J., Zhang, S., Wang, Y., Chen, L. & Zhang, X.-S. Disease-aging network reveals significant roles of aging genes in connecting genetic diseases. PLoS Comput. Biol. 5, e1000521 (2009).

  20. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

  21. Data-Field 41262 (UK Biobank, accessed 1 December 2020);

  22. Chouakria, A. D. & Nagabhushan, P. N. Adaptive dissimilarity index for measuring time series proximity. Adv. Data Anal. Classif. 1, 5–21 (2007).

  23. Loh, P.-R. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284–290 (2015).

  24. Anderson, C. A. et al. Data quality control in genetic case-control association studies. Nat. Protoc. 5, 1564–1573 (2010).

  25. Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).

  26. Pe’er, I., Yelensky, R., Altshuler, D. & Daly, M. Estimation of the multiple testing burden for genomewide association studies of common variants. Nat. Preced. (2007).

  27. Panagiotou, O. A. & Ioannidis, J. P. A. What should the genome-wide significance threshold be? Empirical replication of borderline genetic associations. Int. J. Epidemiol. 41, 273–286 (2012).

  28. Human genome region MHC (Genome Reference Consortium, accessed June 2019);

  29. Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

  30. Berisa, T. & Pickrell, J. K. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32, 283–285 (2016).

  31. Ning, Z., Pawitan, Y. & Shen, X. High-definition likelihood inference of genetic correlations across human complex traits. Nat. Genet. (2020).

  32. O’Connor, L. J. & Price, A. L. Distinguishing genetic correlation from causation across 52 diseases and complex traits. Nat. Genet. 50, 1728–1734 (2018).

  33. Gamazon, E. R. et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 50, 956–967 (2018).

  34. Tacutu, R. et al. Human ageing genomic resources: new and updated databases. Nucleic Acids Res. 46, D1083–D1090 (2018).

  35. Barardo, D. et al. The DrugAge database of aging-related drugs. Aging Cell 16, 594–597 (2017).

  36. Avelar, R. A. et al. A multidimensional systems biology analysis of cellular senescence in aging and disease. Genome Biol. 21, 91 (2020).

  37. Adelman, E. R. et al. Aging human hematopoietic stem cells manifest profound epigenetic reprogramming of enhancers that may predispose to leukemia. Cancer Discov. 9, 1080–1101 (2019).

  38. Marttila, S. et al. Ageing-associated changes in the human DNA methylome: genomic locations and effects on gene expression. BMC Genomics 16, 179 (2015).

  39. Sun, L.-Q. et al. Growth retardation and premature aging phenotypes in mice with disruption of the SNF2-like gene, PASG. Genes Dev. 18, 1035–1046 (2004).

  40. Rodríguez, J. A. et al. Antagonistic pleiotropy and mutation accumulation influence human senescence and disease. Nat. Ecol. Evol. 1, 55 (2017).

  41. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  42. Bitarello, B. D. et al. Signatures of long-term balancing selection in human genomes. Genome Biol. Evol. 10, 939–955 (2018).

  43. Kosiol, C. et al. Patterns of positive selection in six mammalian genomes. PLoS Genet. 4, e1000144 (2008).

  44. Nielsen, R. et al. A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 3, e170 (2005).

  45. Shultz, A. J. & Sackton, T. B. Immune genes are hotspots of shared positive selection across birds and mammals. eLife 8, e41815 (2019).

  46. Fisher, R. A. The Genetical Theory of Natural Selection (Clarendon Press, 1930).

  47. Wolfson, M., Budovsky, A., Tacutu, R. & Fraifeld, V. The signaling hubs at the crossroad of longevity and age-related disease networks. Int. J. Biochem. Cell Biol. 41, 516–520 (2009).

  48. Kanchi, K. L. et al. Integrated analysis of germline and somatic variants in ovarian cancer. Nat. Commun. 5, 3156 (2014).

  49. Khurana, E. et al. Role of non-coding sequence variants in cancer. Nat. Rev. Genet. 17, 93–108 (2016).

  50. Poduri, A., Evrony, G. D., Cai, X. & Walsh, C. A. Somatic mutation, genomic variation, and neurological disease. Science 341, 1237758 (2013).

  51. Zhang, L. & Vijg, J. Somatic mutagenesis in mammals and its implications for human disease and aging. Annu. Rev. Genet. 52, 397–419 (2018).

  52. Finkel, T., Serrano, M. & Blasco, M. A. The common biology of cancer and ageing. Nature 448, 767–774 (2007).

  53. Sánchez-Valle, J. et al. Interpreting molecular similarity between patients as a determinant of disease comorbidity relationships. Nat. Commun. 11, 2854 (2020).

  54. Gutiérrez-Sacristán, A. et al. comoRbidity: an R package for the systematic analysis of disease comorbidities. Bioinformatics 34, 3228–3230 (2018).

  55. Montero, P. & Vilar, J. TSclust: an R package for time series clustering. J. Stat. Software, Articles 62, 1–43 (2014).

  56. Berndt, D. J. & Clifford, J. Using dynamic time warping to find patterns in time series. In Proc. 3rd International Conference on Knowledge Discovery and Data Mining 359–370 (AAAI Press, 1994).

  57. Kaufman, L. & Rousseeuw, P. J. in Finding Groups in Data Ch. 2 (John Wiley & Sons, Inc., 1990).

  58. Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M. & Hornik, K. cluster: cluster analysis basics and extensions. Version (2019).

  59. Tibshirani, R., Walther, G. & Hastie, T. Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc. B 63, 411–423 (2001).

  60. Loh, P.-R. BOLT-LMM v2.3.1 User Manual (2017).

  61. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).

  62. Stephenson, J. D., Laskowski, R. A., Nightingale, A., Hurles, M. E. & Thornton, J. M. VarMap: a web tool for mapping genomic coordinates to protein sequence and structure and retrieving protein structural annotations. Bioinformatics (2019).

  63. Obenchain, V. et al. VariantAnnotation: a Bioconductor package for exploration and annotation of genetic variants. Bioinformatics 30, 2076–2078 (2014).

  64. Carlson, M. TxDb.Hsapiens.UCSC.hg19.knownGene: annotation package for TxDb object(s). R package version 3.2.2 (Bioconductor Package Maintainer, 2015).

  65. Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS Comput. Biol. 9, e1003118 (2013).

  66. Durinck, S., Spellman, P. T., Birney, E. & Huber, W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 4, 1184–1191 (2009).

  67. Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009).

  68. liftOver: Changing genomic coordinate systems with rtracklayer::liftOver. Version 1.12.0 (Bioconductor Package Maintainer, 2020).

  69. Lage, K. et al. A large-scale analysis of tissue-specific pathology and gene expression of human disease genes and complexes. Proc. Natl Acad. Sci. USA 105, 20870–20875 (2008).

  70. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).

  71. Kim, S. et al. PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 47, D1102–D1109 (2019).

  72. Chambers, J. et al. UniChem: a unified chemical structure cross-referencing and identifier tracking system. J. Cheminform. 5, 3 (2013).

  73. Gaulton, A. et al. The ChEMBL database in 2017. Nucleic Acids Res. 45, D945–D954 (2017).

  74. Cotto, K. C. et al. DGIdb 3.0: a redesign and expansion of the drug–gene interaction database. Nucleic Acids Res. 46, D1068–D1073 (2018).

  75. Young, M. D., Wakefield, M. J., Smyth, G. K. & Oshlack, A. Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol. 11, R14 (2010).

  76. Ashburner, M. et al. Gene ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).

  77. Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 47, D330–D338 (2019).

  78. Benjamini, Y. & Yekutieli, D. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29, 1165–1188 (2001).

  79. GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).

  80. Bolstad, B. preprocessCore: a collection of pre-processing functions. Version 1.50.0 (2020).

  81. R Core Team. R: A Language and Environment for Statistical Computing. Version 3.5.0 (2018).

  82. Temple Lang, D. RCurl: General Network (HTTP/FTP/…) Client Interface for R. Version (2020).

  83. Ooms, J. The jsonlite package: a practical and consistent mapping between JSON data and R objects. Preprint at arXiv (2014).

  84. Wickham, H. et al. Welcome to the tidyverse. J. Open Source Softw. 4, 1686 (2019).

  85. Dowle, M. & Srinivasan, A. data.table: extension of data.frame. Version 1.12.4 (2020).

  86. Csardi, G. & Nepusz, T. The igraph software package for complex network research. Version 1.2.1 (2006).

  87. Briatte, F. ggnetwork: geometries to plot networks with ‘ggplot2’. Version 0.5.8 (2020).

  88. Pedersen, T. L. ggforce: accelerating ‘ggplot2’. Version (2020).

  89. Kassambara, A. ggpubr: ‘ggplot2’ based publication ready plots. Version 0.4.0 (2020).

  90. Slowikowski, K. ggrepel: automatically position non-overlapping text labels with ‘ggplot2’. Version 0.8.2 (2020).

  91. Schloerke, B. et al. GGally: extension to ‘ggplot2’. Version 2.0.0 (2020).

  92. Neuwirth, E. RColorBrewer: colorbrewer palettes. Version 1.1.2 (2014).

  93. Wickham, H. & Seidel, D. scales: scale functions for visualization. Version 1.1.1 (2020).

  94. Arnold, J. B. ggthemes: extra themes, scales and geoms for ‘ggplot2’. Version 4.2.0 (2019).

  95. Kolde, R. pheatmap: pretty heatmaps. Version 1.0.12 (2019).


This research has been conducted using the UK Biobank Resource (application no. 30688). The authors thank the GWAS Catalog team for providing the list of studies using UK Biobank data; J. Stephenson and R. Laskowski for their help in running the VarMap tool; and M. Somel, S. Ozanne, P. Beltrao and W. Huber for fruitful discussions. This research was funded in whole, or in part, by the Wellcome Trust (098565/Z/12/Z). For the purpose of Open Access, the authors have applied a CC BY public copyright licence to any author accepted manuscript version arising from this submission. H.M.D., D.K.F., L.P. and J.M.T. were funded by this Wellcome Trust grant. The work was also supported by the European Molecular Biology Laboratory (J.M.T.), the EMBL International PhD Programme (H.M.D.) and Comisión Nacional de Investigación Científica y Tecnológica - Government of Chile (CONICYT scholarship; M.F.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information




H.M.D. conceived and designed the study with contributions from L.P. and J.M.T. H.M.D. analyzed the data with the help of D.K.F. and M.F.V. H.M.D. interpreted the results and wrote the manuscript with contributions from all authors. All authors read, revised and approved the final version of this manuscript.

Corresponding authors

Correspondence to Handan Melike Dönertaş or Janet M. Thornton.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Aging thanks Luke O’Connor, Luke Pilling and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Disease categories and co-occurrences.

a, Disease hierarchy for the 116 diseases included in the analysis. The nodes are colored by disease category, as indicated in the legend. b, Disease co-occurrence matrix summarizing relative risk scores and correlations. Rows and columns denote diseases ordered by hierarchical clustering of risk scores. The color is defined by relative risk scores while the size is determined by ϕ value, indicating the robustness of the association (see Methods). The diagonal tiles are colored by the UK Biobank’s disease hierarchy to visualize whether diseases from the same category cluster together. Associations are plotted for the 62 diseases that have at least one relative risk ratio higher than four (log2RR ≥ 2) or lower than one quarter (log2RR ≤ −2).
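As a rough illustration of the two co-occurrence measures in panel b, the sketch below computes a relative risk and a ϕ value from counts. The exact definitions are in the Methods, which are not part of this excerpt, so the formulas here are an assumption: they are the commonly used comorbidity definitions, with RR as observed over expected co-occurrence and ϕ as the Pearson correlation of two binary disease indicators. The counts are hypothetical.

```python
import math

def relative_risk(n_both, n_a, n_b, n_total):
    """Observed co-occurrence divided by the count expected under independence."""
    return (n_both * n_total) / (n_a * n_b)

def phi_coefficient(n_both, n_a, n_b, n_total):
    """Pearson correlation between two binary (has disease / does not) indicators."""
    numerator = n_both * n_total - n_a * n_b
    denominator = math.sqrt(n_a * n_b * (n_total - n_a) * (n_total - n_b))
    return numerator / denominator

# Hypothetical cohort: 10,000 people, 200 with disease A, 400 with disease B, 64 with both.
n_total, n_a, n_b, n_both = 10_000, 200, 400, 64

rr = relative_risk(n_both, n_a, n_b, n_total)
log2_rr = math.log2(rr)
phi = phi_coefficient(n_both, n_a, n_b, n_total)

# Plotting filter from the caption: keep pairs with |log2RR| >= 2.
passes_filter = abs(log2_rr) >= 2

print(f"RR={rr:.1f} log2RR={log2_rr:.1f} phi={phi:.3f} plotted={passes_filter}")
```

With these counts, 8 co-occurrences are expected under independence and 64 are observed, so RR is 8 and log2RR is 3, which clears the plotting threshold.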

Extended Data Fig. 2 Distribution of median age-of-onset across disease categories.

Points show diseases grouped by categories (individual boxplots). Categories are ordered by the median value of the median age-of-onset. The boxplots show the first and third quartiles, the median (dark line), and the whiskers extend from the quartiles to the last point in 1.5xIQR distance to the quartiles.
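The boxplot convention described in this caption (and repeated in later captions) can be sketched as follows. This is an illustration under assumptions: quartiles are computed with the inclusive linear-interpolation method, and the ages are hypothetical. The plotting software used for the figures may compute hinges slightly differently.

```python
import statistics

def boxplot_stats(values):
    """Quartiles plus whiskers that reach the last point within 1.5 x IQR of the box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # most extreme point still inside the lower fence
        "whisker_high": max(inside),   # most extreme point still inside the upper fence
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Hypothetical median ages-of-onset for the diseases in one category.
ages = [42, 45, 47, 50, 52, 55, 58, 60, 85]
stats = boxplot_stats(ages)
print(stats)
```

Here the box spans 47 to 58, the upper fence is 74.5, and the whiskers therefore stop at 42 and 60 while 85 is drawn as a separate point.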

Extended Data Fig. 3 The number of significant variants across diseases, age-of-onset clusters, and disease categories.

a, Number of diseases by number of significant variants (BOLT-LMM p ≤ 5e-8). Diseases with the highest number of associations (N ≥ 10,000) are given as an inset table. b, Comparison of the number of significant associations (y-axis, on a log scale) across age-of-onset clusters (x-axis) (ANOVA after excluding cluster 4, p = 0.06). Since the y-axis is on a log scale, diseases with zero significant associations are not shown on the graph. c, The same as (b) but for disease categories. Categories are ordered by the median number of significant SNPs. The boxplots (b-c) show the first and third quartiles, the median (dark line), and the whiskers extend from the quartiles to the last point within 1.5 × IQR of the quartiles.

Extended Data Fig. 4 The raw and corrected values of genetic similarities within and across age-of-onset clusters.

a, The difference between genetic similarity within and across age-of-onset clusters, calculated between 47 diseases. Y-axis shows the genetic similarity (see Methods). b, The same as (a) but the y-axis is corrected for disease category and co-occurrence using a linear model. This panel is the same as Figure 2b and given here only for easier comparison. The boxplots show the first and third quartiles, the median (dark line), and the whiskers extend from the quartiles to the last point in 1.5xIQR distance to the quartiles. P-values are calculated using F-test on a linear model between genetic similarity scores and different/same age of onset clusters for panel a and including different/same disease category and disease co-occurrence (risk ratio) as covariates in panel b.

Extended Data Fig. 5 Genetic similarities calculated using the high-definition likelihood (HDL) inference method.

a, The correlation between the genetic similarity scores calculated using the SNP overlap-based odds ratio (x-axis) and HDL (y-axis). Blue points show the similarities calculated between diseases in different age of onset clusters and red points show the similarities calculated between diseases in the same age of onset cluster. The correlation coefficient and p-value are calculated using a two-sided Spearman correlation test. The linear regression line (blue) and 95% confidence interval (gray shaded area) are shown. b, The difference between genetic similarity within and across age-of-onset clusters, calculated between 59 diseases. Y-axis shows the genetic similarity calculated using HDL. The difference between different and same age clusters is tested using a two-sided Wilcoxon test. The boxplots show the first and third quartiles, the median (dark line), and the whiskers extend from the quartiles to the last point in 1.5xIQR distance to the quartiles.

Extended Data Fig. 6 The overlap between genes associated with selected aging-related traits and genes associated with diseases in different clusters.

The x-axis shows the log2 enrichment score, and the y-axis shows the age-of-onset clusters. The numbers of genes in each cluster (for both multidisease and multicategory genes) are given. The size of the points shows the statistical significance based on a one-sided permutation test (large points show nominal p-value ≤ 0.05, small ‘x’ indicates non-significant overlaps – none of the associations are significant after multiple testing correction), and the color shows different aging-related GWAS Catalog traits. The colored numbers near the points show the numbers of overlapping genes.
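A one-sided permutation test for gene-set overlap, of the kind this caption refers to, might look like the sketch below. The null model is an assumption, since the excerpt does not describe it: random gene sets of the same size are drawn from a background, and the log2 enrichment score is taken as observed overlap over mean null overlap. All gene names and set sizes are hypothetical.

```python
import math
import random

def enrichment_permutation_test(gene_set, reference_set, background, n_perm=2000, seed=0):
    """Return (observed overlap, log2 enrichment, one-sided permutation p-value)."""
    rng = random.Random(seed)           # fixed seed so the sketch is reproducible
    pool = sorted(background)           # sorted list gives a stable sampling order
    observed = len(gene_set & reference_set)
    # Null distribution: overlap of random same-sized gene sets with the reference set.
    null = [
        len(set(rng.sample(pool, len(gene_set))) & reference_set)
        for _ in range(n_perm)
    ]
    expected = sum(null) / n_perm
    if observed and expected:
        log2_enrichment = math.log2(observed / expected)
    else:
        log2_enrichment = float("nan")
    # One-sided p-value with the usual +1 correction so it is never exactly zero.
    p_value = (1 + sum(o >= observed for o in null)) / (1 + n_perm)
    return observed, log2_enrichment, p_value

# Hypothetical inputs: 200 background genes, 20 aging-trait genes, 10 cluster genes.
background = {f"G{i}" for i in range(200)}
aging_trait_genes = {f"G{i}" for i in range(20)}
cluster_genes = {f"G{i}" for i in range(8)} | {"G100", "G101"}

observed, log2_enrichment, p_value = enrichment_permutation_test(
    cluster_genes, aging_trait_genes, background
)
print(observed, round(log2_enrichment, 2), p_value)
```

A nominal p-value from such a test would still need multiple testing correction across traits and clusters, which is why the caption notes that none of the overlaps remain significant after correction.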

Extended Data Fig. 7 Drug-target gene interaction network for the drugs specifically targeting multicategory genes in age-dependent clusters.

‘Drug-target gene’ interaction network for the drugs that specifically target multicategory cluster 1, cluster 2, or cluster ‘1 & 2’ genes as determined by Fisher’s exact test. Blue diamonds show the drugs with a significant association or targeting only one gene in these gene groups. Diamonds without labels are drugs that appear in the datasets only under their ChEMBL IDs and have no names. Drug labels written in bold are drugs approved for different conditions. Circles represent the genes targeted by the significant hits, colored by their age-of-onset cluster. Gray circles show genes that are targeted by these drugs but are not in the gene set of interest.

Supplementary information

Supplementary Information

Reporting Summary

Supplementary Tables

Supplementary Tables 1–10. Descriptions of each table are available within the file.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 7

Statistical source data.

About this article

Cite this article

Dönertaş, H.M., Fabian, D.K., Fuentealba, M. et al. Common genetic associations between age-related diseases. Nat Aging 1, 400–412 (2021).
