Abstract
Extreme longevity in humans has a strong genetic component, but whether this involves genetic variation in the same longevity pathways as found in model organisms is unclear. Using whole-exome sequences of a large cohort of Ashkenazi Jewish centenarians to examine enrichment for rare coding variants, we found that most longevity-associated rare coding variants converge upon the conserved insulin/insulin-like growth factor 1 signaling and AMP-activated protein kinase signaling pathways. Centenarians carry a similar number of pathogenic rare coding variants to control individuals, suggesting that rare variants detected in the conserved longevity pathways are protective against age-related pathology. Indeed, we detected a pro-longevity effect of rare coding variants in the Wnt signaling pathway on individuals harboring the known common risk allele APOE4. The genetic component of extreme human longevity constitutes, at least in part, rare coding variants in pathways that protect against aging, including those that control longevity in model organisms.
Data availability
All summary statistics for the longevity association of rare coding variants in our Ashkenazi Jewish longevity cohort are available at http://zdzlab.einsteinmed.org/1/longevity.html. Due to privacy concerns for our research participants, individual-level genetic data from the Einstein longevity study are not publicly available; however, anonymized data will be shared upon request from a qualified academic investigator, provided that the data transfer is approved by the Institutional Review Board and regulated by a material transfer agreement. The German longevity cohort data are part of the PopGen Biobank (Schleswig-Holstein, Germany) and can be accessed through a Material Data Access Form (http://www.uksh.de/p2n/Information+for+Researchers.html). Sequence and phenotype data of the UK Biobank and ADSP cohorts are available at https://bbams.ndph.ox.ac.uk/ams/ and https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000572, respectively. All software used in our analyses was open source and is described in Methods.
References
Lopez-Otin, C., Blasco, M. A., Partridge, L., Serrano, M. & Kroemer, G. The hallmarks of aging. Cell 153, 1194–1217 (2013).
Kenyon, C. J. The genetics of ageing. Nature 464, 504–512 (2010).
Kirkwood, T. B. Understanding the odd science of aging. Cell 120, 437–447 (2005).
Campisi, J. et al. From discoveries in ageing research to therapeutics for healthy ageing. Nature 571, 183–192 (2019).
Kenyon, C., Chang, J., Gensch, E., Rudner, A. & Tabtiang, R. A C. elegans mutant that lives twice as long as wild type. Nature 366, 461–464 (1993).
Ayyadevara, S., Alla, R., Thaden, J. J. & Shmookler Reis, R. J. Remarkable longevity and stress resistance of nematode PI3K-null mutants. Aging Cell 7, 13–22 (2008).
Tatar, M. et al. A mutant Drosophila insulin receptor homolog that extends life-span and impairs neuroendocrine function. Science 292, 107–110 (2001).
Clancy, D. J. et al. Extension of life-span by loss of CHICO, a Drosophila insulin receptor substrate protein. Science 292, 104–106 (2001).
Holzenberger, M. et al. IGF-1 receptor regulates lifespan and resistance to oxidative stress in mice. Nature 421, 182–187 (2003).
Johnson, S. C., Rabinovitch, P. S. & Kaeberlein, M. mTOR is a key modulator of ageing and age-related disease. Nature 493, 338–345 (2013).
Herskind, A. M. et al. The heritability of human longevity: a population-based study of 2872 Danish twin pairs born 1870-1900. Hum. Genet. 97, 319–323 (1996).
Vijg, J. & Suh, Y. Genetics of longevity and aging. Annu. Rev. Med. 56, 193–212 (2005).
Christensen, K., Johnson, T. E. & Vaupel, J. W. The quest for genetic determinants of human longevity: challenges and insights. Nat. Rev. Genet. 7, 436–448 (2006).
Perls, T. T., Bubrick, E., Wager, C. G., Vijg, J. & Kruglyak, L. Siblings of centenarians live longer. Lancet 351, 1560 (1998).
Melzer, D., Pilling, L. C. & Ferrucci, L. The genetics of human ageing. Nat. Rev. Genet. 21, 88–101 (2020).
Zhang, Z. D. et al. Genetics of extreme human longevity to guide drug discovery for healthy ageing. Nat. Metab. 2, 663–672 (2020).
Deelen, J. et al. Gene set analysis of GWAS data for human longevity highlights the relevance of the insulin/IGF-1 signaling and telomere maintenance pathways. Age 35, 235–249 (2013).
Broer, L. et al. GWAS of longevity in CHARGE consortium confirms APOE and FOXO3 candidacy. J. Gerontol. A 70, 110–118 (2015).
Deelen, J. et al. A meta-analysis of genome-wide association studies identifies multiple longevity genes. Nat. Commun. 10, 3669 (2019).
Cash, T. P. et al. Exome sequencing of three cases of familial exceptional longevity. Aging Cell 13, 1087–1090 (2014).
Nygaard, H. B. et al. Whole-exome sequencing of an exceptional longevity cohort. J. Gerontol. A Biol. Sci. Med. Sci. 74, 1386–1390 (2018).
Shindyapina, A. V. et al. Germline burden of rare damaging variants negatively affects human healthspan and lifespan. eLife 9, e53449 (2020).
Guha, S. et al. Implications for health and disease in the genetic signature of the Ashkenazi Jewish population. Genome Biol. 13, R2 (2012).
Hunt, R. C., Simhadri, V. L., Iandoli, M., Sauna, Z. E. & Kimchi-Sarfaty, C. Exposing synonymous mutations. Trends Genet. 30, 308–321 (2014).
Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).
Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
Sundaram, L. et al. Predicting the clinical impact of human mutation with deep neural networks. Nat. Genet. 50, 1161–1170 (2018).
Fafian-Labora, J. et al. FASN activity is important for the initial stages of the induction of senescence. Cell Death Dis. 10, 318 (2019).
Brosh, R. M. Jr. & Bohr, V. A. Human premature aging, DNA repair and RecQ helicases. Nucleic Acids Res. 35, 7527–7544 (2007).
Lin, J. R., Zhang, Q., Cai, Y., Morrow, B. E. & Zhang, Z. D. Integrated rare variant-based risk gene prioritization in disease case-control sequencing studies. PLoS Genet. 13, e1007142 (2017).
Tasan, M. et al. Selecting causal genes from genome-wide association studies via functionally coherent subnetworks. Nat. Methods 12, 154–159 (2015).
Lupton, M. K. et al. The role of ABCA1 gene sequence variants on risk of Alzheimer’s disease. J. Alzheimers Dis. 38, 897–906 (2014).
Sims, R. et al. Rare coding variants in PLCG2, ABI3, and TREM2 implicate microglial-mediated innate immunity in Alzheimer’s disease. Nat. Genet. 49, 1373–1384 (2017).
Liu, D. J. & Leal, S. M. Replication strategies for rare variant complex trait association studies via next-generation sequencing. Am. J. Hum. Genet. 87, 790–801 (2010).
Beecham, G. W. et al. The Alzheimer’s Disease Sequencing Project: study design and sample selection. Neurol. Genet. 3, e194 (2017).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Mahley, R. W. Apolipoprotein E: from cardiovascular disease to neurodegenerative disorders. J. Mol. Med. 94, 739–746 (2016).
Erikson, G. A. et al. Whole-genome sequencing of a healthy aging cohort. Cell 165, 1002–1011 (2016).
Eichner, J. E. et al. Apolipoprotein E polymorphism and cardiovascular disease: a HuGE review. Am. J. Epidemiol. 155, 487–495 (2002).
Liu, H. et al. Augmented Wnt signaling in a mammalian model of accelerated aging. Science 317, 803–806 (2007).
Kirkwood, T. B. & Finch, C. E. Ageing: the old worm turns more slowly. Nature 419, 794–795 (2002).
Kirkwood, T. B. et al. What accounts for the wide variation in life span of genetically identical organisms reared in a constant environment? Mech. Ageing Dev. 126, 439–443 (2005).
Caruso, A. et al. Inhibition of the canonical Wnt signaling pathway by apolipoprotein E4 in PC12 cells. J. Neurochem. 98, 364–371 (2006).
Klaus, A. & Birchmeier, W. Wnt signalling and its impact on development and cancer. Nat. Rev. Cancer 8, 387–398 (2008).
Palomer, E., Buechler, J. & Salinas, P. C. Wnt signaling deregulation in the aging and Alzheimer’s brain. Front. Cell Neurosci. 13, 227 (2019).
Ng, L. F. et al. WNT signaling in disease. Cells 8, 826 (2019).
Timmers, P. R. et al. Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances. eLife 8, e39856 (2019).
Yang, J. et al. MiR-34 modulates Caenorhabditis elegans lifespan via repressing the autophagy gene atg9. Age 35, 11–22 (2013).
Piazzesi, A. et al. Replication-independent histone variant H3.3 controls animal lifespan through the regulation of pro-longevity transcriptional programs. Cell Rep. 17, 987–996 (2016).
Reichwald, K. et al. High tandem repeat content in the genome of the short-lived annual fish Nothobranchius furzeri: a new vertebrate model for aging research. Genome Biol. 10, R16 (2009).
Course, M. M. et al. Evolution of a human-specific tandem repeat associated with ALS. Am. J. Hum. Genet. 107, 445–460 (2020).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Dewey, F. E. et al. Inactivating variants in ANGPTL4 and risk of coronary artery disease. N. Engl. J. Med. 374, 1123–1133 (2016).
Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016).
McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).
Euesden, J., Lewis, C. M. & O’Reilly, P. F. PRSice: polygenic risk score software. Bioinformatics 31, 1466–1468 (2015).
Choi, S. W. & O’Reilly, P. F. PRSice-2: polygenic risk score software for biobank-scale data. Gigascience 8, giz082 (2019).
Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 51, 404–413 (2019).
Nikpay, M. et al. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).
Malik, R. et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat. Genet. 50, 524–537 (2018).
Schumacher, F. R. et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat. Genet. 50, 928–936 (2018).
Michailidou, K. et al. Association analysis identifies 65 new breast cancer risk loci. Nature 551, 92–94 (2017).
Wolpin, B. M. et al. Genome-wide association study identifies multiple susceptibility loci for pancreatic cancer. Nat. Genet. 46, 994–1000 (2014).
Wang, X. Firth logistic regression for rare variant association tests. Front. Genet. 5, 187 (2014).
Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011).
Flannick, J. et al. Exome sequencing of 20,791 cases of type 2 diabetes and 24,440 controls. Nature 570, 71–76 (2019).
Chen, J., Bardes, E. E., Aronow, B. J. & Jegga, A. G. ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res. 37, W305–W311 (2009).
Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).
Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Oakes, D. & Peterson, D. R. Survival methods: additional topics. Circulation 117, 2949–2955 (2008).
Chen, H. et al. Comprehensive assessment of computational algorithms in predicting cancer driver mutations. Genome Biol. 21, 43 (2020).
Flachsbart, F. et al. Immunochip analysis identifies association of the RAD50/IL13 region with human longevity. Aging Cell 15, 585–588 (2016).
Muller, N. et al. IL-6 blockade by monoclonal antibodies inhibits apolipoprotein (a) expression and lipoprotein (a) synthesis in humans. J. Lipid Res. 56, 1034–1042 (2015).
Van Hout, C. V. et al. Exome sequencing and characterization of 49,960 individuals in the UK Biobank. Nature 586, 749–756 (2020).
Acknowledgements
This work was supported by NIH grant nos. R01 HG008153 (Z.D.Z.), R01 AG061155 (S.M.), R01 AG060747 (M.D.G.), R01 AG057909 (N.B. and Z.D.Z.), P01 AG017242 (J.V.) and U19 AG056278 (J.V., P.D.R., L.J.N., Y.S. and W.C.L.) and a Career Scientist Award from the Irma T. Hirschl Trust to Z.D.Z. We thank the Popgen Biobank and the Popgen 2.0 Network at Kiel University for help with recruitment of some of the long-lived individuals. G.G.T. was supported by the Deutsche Forschungsgemeinschaft (German Research Foundation) through project no. 390870439 (EXC 2150 – ROOTS). We thank T. Wang (Albert Einstein College of Medicine) for comments and suggestions. We thank the Management and Leadership Team at RGC for contributing to securing funding, study design and oversight and reviewing the manuscript (G. Abecasis, A. Baras, M. Cantor, G. Coppola, A. Deubler, A. Economides, L. A. Lotta, J. D. Overton, J. G. Reid and A. Shuldiner). We thank Sequencing and Lab Operations at RGC for performing and being responsible for sample sequencing (J. Marcovici, E. Weihenig, A. Lopez and J. D. Overton); for performing and being responsible for exome sequencing (A. DeVito, J. LaRosa, L. Widom, C. Beechert, C. Forsythe, E. D. Fuller, M. Lattari, M. Sotiropoulos Padilla, S. E. Wolf, A. Lopez and J. D. Overton); for conceiving and being responsible for laboratory automation (T. D. Schleicher, Z. Gu, A. Lopez and J. D. Overton); and for being responsible for sample tracking and the library information management system (M. Pradhan, K. Manoochehri, R. H. Ulloa and J. D. Overton). We thank Genome Informatics at RGC for performing, and being responsible for, the analysis needed to produce exome and genotype data (X. Bai, A. Hawes, W. Salerno and J. G. Reid); for providing computing infrastructure development and operational support (G. Eom and J. G. Reid); for providing variant and gene annotations and their functional interpretation of variants (S. Balasubramanian and J. G. Reid); and for conceiving and being responsible for creating, developing and deploying analysis platforms and computational methods for analysis of genomic data (E. K. Maxwell, J. C. Staples, L. Habegger and J. G. Reid). We thank Research Program Management at RGC for contributing to the management and coordination of all research activities, planning, execution and reviewing of the manuscript (M. B. Jones and L. J. Mitnaul).
Author information
Contributions
J.-R.L. and Z.D.Z. conceived the formal analysis. J.-R.L. executed the formal analysis. P.S.-C. and A.S. obtained the study resources. Z.D.Z., P.S.-C., J.M., Q.Z. and T.G. performed data curation. Z.W. performed variant imputation. V.N., G.G.T., M.D.G., A.F., A.N. and S.G. participated in replication analysis. Z.D.Z. and N.B. conceived of the research goals and acquired funding. J.-R.L. and Z.D.Z. wrote the original draft. J.V., Y.S., S.M., P.D.R., L.J.N., W.C.L., V.G., K.Y., G.A., M.L., M.R.J. and N.N. participated in review and editing. The R.G.C. performed WES and SNP array genotyping.
Ethics declarations
Competing interests
J.V. is a founder of Singulomics Corp. P.D.R. and L.J.N. are cofounders of NRTK Biosciences. All other authors declare no competing interests.
Additional information
Peer review information Nature Aging thanks George Martin and the other, anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 The replication study of gene-set longevity association using the WES data of the German longevity cohort.
The longevity case-control study consists of 1,265 longevity cases and 4,195 longevity controls. P* denotes P-value corrected for 12 categories of rare variants using the minimal-P value test from Flannick et al. (ref. 67; Methods). The text for the significant association denotes the lowest raw P-value among different groups of tested rare variants and FDR. (a) Full longevity cohort. (b) APOE4 stratified cohorts.
Extended Data Fig. 2 The replication study of gene-set longevity association using the UK Biobank WES data.
The longevity case-control study consists of 104 cases with at least one parent with age at death ≥ 100 years and 23,405 controls with both parents’ age at death < 95 years. P* denotes P-value corrected for 12 categories of rare variants using the minimal-P value test from Flannick et al. (ref. 67; Methods). The text for the significant association denotes the lowest raw P-value among different groups of tested rare variants and FDR. (a) Full longevity cohort. (b) APOE4 stratified cohorts.
Extended Data Fig. 3 The replication study of gene-set longevity association using the ADSP WES data.
The longevity case-control study consists of 1,121 non-AD cases with age ≥ 90 years and 38 non-AD controls with age < 75 years. P* denotes P-value corrected for 12 categories of rare variants using the minimal-P value test from Flannick et al. (ref. 67; Methods). The text for the significant association denotes the lowest raw P-value among different groups of tested rare variants and FDR.
Extended Data Fig. 4 Gene-set rare variant association in the APOE4-stratified cohorts of the discovery (Ashkenazi Jewish) longevity cohort.
P* denotes P-value corrected for 6 categories of tested variants using the minimal-P value test from Flannick et al. (ref. 67; Methods). The text for the significant association denotes the lowest raw P-value among different groups of tested rare variants and FDR.
Extended Data Fig. 5 Lifespan analysis of protective variants in WNT signaling genes for noncentenarians.
P denotes uncorrected P-value derived from linear regression with the log-transformed age at death as the outcome and gender as a covariate (see Methods). ‘WNT low’ and ‘WNT high’ represent an alternative allele count of rare variants in WNT signaling genes of ≤ 1 and > 1 (the median), respectively. In parentheses are the numbers of individuals. MD stands for ‘median difference’. The asterisk denotes FDR < 0.05. (a) The lifespan difference between individuals carrying a high and a low burden of protective rare variants in WNT signaling genes. (b) Negative effects of APOE4 on lifespan with high and low burden of protective rare variants in WNT signaling for noncentenarians.
Extended Data Fig. 6 Lifespan analysis of protective variants in WNT signaling genes for centenarians.
P denotes uncorrected P-value derived from linear regression with the log-transformed age at death as the outcome and gender as a covariate (see Methods). ‘WNT low’ and ‘WNT high’ represent an alternative allele count of rare variants in WNT signaling genes of ≤ 1 and > 1 (the median), respectively. In parentheses are the numbers of individuals. MD stands for ‘median difference’. (a) The lifespan difference between individuals carrying a high and a low burden of protective rare variants in WNT signaling genes. (b) Negative effects of APOE4 on lifespan with high and low burden of protective rare variants in WNT signaling for centenarians.
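The regression described for Extended Data Figs. 5 and 6 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `solve` and `fit_log_lifespan`, the 0/1 coding of gender and the example inputs are hypothetical, and only the coefficient estimates are computed (a P-value would additionally require the residual variance and a t distribution).

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_log_lifespan(ages_at_death, wnt_allele_counts, female):
    """Ordinary least squares of log(age at death) on WNT burden group and gender.

    Returns [intercept, effect of 'WNT high', effect of gender].
    'WNT high' means an alternative allele count > 1, as in the caption.
    """
    y = [math.log(a) for a in ages_at_death]
    wnt_high = [1.0 if c > 1 else 0.0 for c in wnt_allele_counts]
    X = [[1.0, w, float(s)] for w, s in zip(wnt_high, female)]
    k = 3
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)
```

Because the outcome is log-transformed, a 'WNT high' coefficient of b corresponds to a multiplicative change of exp(b) in age at death.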
Extended Data Fig. 7 Disease-PRS analyses for centenarians and controls.
This shows the results of PRS analyses for age-related diseases in the centenarian cohort. In the boxplots, points represent individuals, and horizontal lines represent upper fence (maximum in Q3 + 1.5×IQR), upper quartile (Q3), median, lower quartile (Q1), lower fence (minimum in Q1 − 1.5×IQR), sequentially from top to bottom; IQR: interquartile range (25th to the 75th percentile). n = 910 biologically independent samples in the boxplots on the right panels for coronary artery disease, type 2 diabetes, stroke, and pancreatic cancer. n = 339 and 571 biologically independent samples in the boxplots on the right panels for prostate cancer and breast cancer, respectively. Above the boxplot on the right are raw and adjusted (in parentheses) P-values for the best prediction in the Nagelkerke’s R2 plot on the left, which were calculated based on logistic regression and the permutation test in PRSice-2, respectively. For stroke, breast cancer, prostate cancer, and pancreatic cancer, no robust association was observed between their PRS and the longevity status as originally defined in our cohort. (a) Coronary artery disease. (b) Coronary artery disease without considering SNPs within 1 Mbp of rs7412 or rs429358 (SNPs for the APOE haplotype). (c) Type 2 diabetes. (d) Stroke. (e) Prostate cancer. Only males are considered. (f) Breast cancer. Only females are considered. (g) Pancreatic cancer.
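The boxplot elements named in this caption can be computed as in the sketch below. It is an illustration only: the function name `boxplot_summary` is hypothetical, and the quartile interpolation used here (Python's "inclusive" method) is an assumption, since the plotting software behind the figures may use a different rule.

```python
import statistics


def boxplot_summary(values):
    """Return the five horizontal lines of a Tukey-style boxplot, plus outliers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    inside = [v for v in values if low_limit <= v <= high_limit]
    return {
        "upper_fence": max(inside),  # largest observation within Q3 + 1.5*IQR
        "q3": q3,
        "median": median,
        "q1": q1,
        "lower_fence": min(inside),  # smallest observation within Q1 - 1.5*IQR
        "outliers": sorted(v for v in values if v < low_limit or v > high_limit),
    }
```

Note that the fences are observed data values, not the limits Q3 + 1.5×IQR and Q1 − 1.5×IQR themselves, which matches the "maximum in" and "minimum in" wording of the caption.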
Extended Data Fig. 8 Basic statistics of the lifespan cohort.
(a) Lifespan distribution of 553 individuals. (b) Survival curves of the 202 males and 351 females composing the analyzed cohort. Females have a significantly higher survival rate than males based on a Cox regression model (P = 1.71E-07; coxph function in R).
Extended Data Fig. 9 Correlation between lifespan and common-variant genetic risk of age-related diseases.
P-values were based on the result of linear regression (regress log lifespan on genetic disease risk) corrected for gender. (a) Alzheimer’s disease. The plots on the left and right show the boxplot and survival curves of APOE4+ and APOE4−, respectively. MD stands for ‘median difference’. In the boxplots, points represent individuals, and horizontal lines represent upper fence (maximum in Q3 + 1.5×IQR), upper quartile (Q3), median, lower quartile (Q1), lower fence (minimum in Q1 − 1.5×IQR), sequentially from top to bottom; IQR: interquartile range (25th to the 75th percentile). n = 553 biologically independent samples. (b) Coronary artery disease. r represents ‘correlation coefficient’. (c) Type 2 diabetes.
Extended Data Fig. 10 Flowcharts of sample collection for different analyses.
(a) Flowchart of sample collection for PRS analyses and lifespan analyses of rare variants and disease PRS. Refer to the ‘Rare variant association analysis’ subsection for the strategy of removing kinship for PRS analyses that involve longevity status. The strategy for removing kinship in lifespan analyses is to randomly exclude one individual from each pair of individuals with a proportion of alleles shared identity-by-descent (IBD) > 0.4. (b) Flowchart of sample collection for rare variant association tests, network-integrated analyses, and lifespan analyses of rare variants (and APOE4).
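The kinship-removal rule for lifespan analyses described in (a) can be sketched as below. This is a hedged illustration rather than the authors' pipeline: the function name `prune_related`, the `(sample_a, sample_b, ibd)` tuple layout and the fixed random seed are assumptions made for the example.

```python
import random


def prune_related(samples, ibd_pairs, threshold=0.4, seed=0):
    """Randomly drop one member of each pair whose IBD proportion exceeds threshold.

    samples   : list of sample identifiers
    ibd_pairs : iterable of (sample_a, sample_b, ibd_proportion) tuples
    """
    rng = random.Random(seed)  # seeded so that the exclusion is reproducible
    kept = set(samples)
    for a, b, ibd in ibd_pairs:
        # Only act if both members are still present; an earlier exclusion
        # may already have broken up this pair.
        if ibd > threshold and a in kept and b in kept:
            kept.discard(rng.choice((a, b)))
    return [s for s in samples if s in kept]
```

Pairs are processed in the order given, so with overlapping pairs the set of retained individuals can depend on both the order and the seed.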
Cite this article
Lin, JR., Sin-Chan, P., Napolioni, V. et al. Rare genetic coding variants associated with human longevity and protection against age-related diseases. Nat Aging 1, 783–794 (2021). https://doi.org/10.1038/s43587-021-00108-5
DOI: https://doi.org/10.1038/s43587-021-00108-5