Senescence has long been a public health challenge as well as a fascinating evolutionary problem. There is neither a universally accepted theory for its ultimate causes, nor a consensus about what may be its impact on human health. Here we test the predictions of two evolutionary explanations of senescence—mutation accumulation and antagonistic pleiotropy—which postulate that genetic variants with harmful effects in old ages can be tolerated, or even favoured, by natural selection at early ages. Using data from genome-wide association studies (GWAS), we study the effects of genetic variants associated with diseases appearing at different periods in life, when they are expected to have different impacts on fitness. Data fit theoretical expectations. Namely, we observe higher risk allele frequencies combined with large effect sizes for late-onset diseases, and detect a significant excess of early–late antagonistically pleiotropic variants that, strikingly, tend to be harboured by genes related to ageing. Beyond providing systematic, genome-wide evidence for evolutionary theories of senescence in our species and contributing to the long-standing question of whether senescence is the result of adaptation, our approach reveals relationships between previously unrelated pathologies, potentially contributing to tackling the problem of an ageing population.
Senescence, the biological process of organismic decay with ageing, is coupled with an increased risk of certain diseases. With an estimated threefold increase in the number of people above age 80 yr in the next half century 1 , age-related diseases pose a global public health challenge. Knowledge of the evolutionary causes of senescence could contribute new strategies for managing age-related diseases. While many evolutionary hypotheses on the causes of senescence have been proposed 2 , the most established ones are the mutation accumulation (MA) theory 3 , the antagonistic pleiotropy (AP) theory of senescence 4 and the disposable soma (DS) theory 5 . The first two hypotheses rely on the reduced efficiency of natural selection with increased age. The MA theory proposes that deleterious mutations with effects expressed later in life should be more difficult for natural selection to eliminate 6 . The AP theory adds an adaptive aspect: mutations that are damaging for the organism later in life (and hence contribute to senescence) could actually be favoured by natural selection if they are advantageous early in life, resulting in increased reproductive success of their carriers 7,8 . Finally, the DS theory suggests that organisms face a trade-off between dedicating energy to reproduction and investing it in the maintenance and growth of their somas. The AP and DS theories both suggest that senescence is simply a by-product of an investment early in life and, indeed, many authors agree that DS is a particular instance of AP 9,10 . While AP specifies that genetic variants favoured in the fertile stages may cause ageing or physiological decay later in life, DS specifies that senescence occurs because of genetic variants favoured when fostering reproduction at the cost of impairing the growth and maintenance of the somatic parts of the organism, which will eventually lead to the accumulation of molecular and cellular damage.
Besides these three, other evolutionary hypotheses have also been proposed (Supplementary Information section 1). As senescence is a highly complex phenomenon, ideas about it are better understood as complementing than as excluding each other. Still, each theory makes predictions that suggest ways of testing them. For instance, the MA and the AP hypotheses both predict that specific mutations in particular genes will cause senescence, while the DS theory is based on the general failure of repair mechanisms, which will lead to stochastic accumulation of molecular and cellular damage.
The efforts carried out so far to assess the three main hypotheses have focused on non-human organisms (particularly involving Drosophila) and have obtained a variety of sometimes contradictory results
The strong relationship between senescence and age-related diseases, together with the current abundance of genome-phenome information
Information on the effects of genetic variants associated with complex disease is abundant, particularly thanks to the GWAS that have accumulated over the past decade. However, this information is indirect in two senses. First, the vast majority of studies do not include measures of the reproductive success of participants, and, thus, although a link between disease and fitness is clear, current data preclude making it quantitative. Second, the approach of GWAS is based on genetic markers tagging the true causal variants. Still, genetic associations are known to reflect the frequencies and effect sizes of causal variants and, therefore, genetic markers associated simultaneously with several diseases are likely to indicate the effects of underlying pleiotropic causal variants. Excluding infectious diseases (which lack a specific age of onset and thus cannot be used for our purposes) and focusing on people with Eurasian ancestry (for whom most data are available), we gathered from the NHGRI-EBI GWAS Catalog a total of 2,559 unique single nucleotide polymorphisms (SNPs) associated with 120 different diseases (Supplementary Data 1 and 2) with P values < 10⁻⁵. Here, we classify these associations as linked to early or late-onset diseases according to information on their age of onset from Medscape. Despite archaeological and life history evidence to suggest that 40–50 yr constitutes an acceptable threshold separating early from late age in humans
Results and discussion
Our first observation is that risk alleles from SNPs associated with diseases with late ages of onset tend to have significantly higher frequencies than risk alleles associated with diseases that manifest themselves earlier in life (Fig. 1a). Also, late-onset variants tend to explain a larger proportion of the genetic variance (Fig. 1b). Differences between the two groups of diseases stop being significant when ages 36 yr and beyond are used as a threshold separating early and late onset. This makes sense if one considers that using a higher age threshold means that diseases with biologically meaningful late onsets are misclassified as ‘early onset’. For instance, Alzheimer’s disease would be mistakenly considered as ‘early onset’ if an age threshold of 60 yr was used. These results are strongly suggestive of natural selection allowing late-onset genetic variants with large effects on disease risk to reach higher frequencies, an observation consistent with the MA theory. Note that these results do not reflect any particular late-life disease, but the aggregation of data from all complex diseases for which information on their genetic architecture is available.
The case for an evolutionary explanation of age-related disease becomes even stronger when considering patterns of pleiotropy. Of the 2,559 variants analysed, 80 SNPs have been associated with two or more diseases. In addition, 158 SNPs have been associated with different pathologies by different studies but present high linkage disequilibrium (LD) with others in this set (r2 ≥ 0.8, in people of European ancestry). Eliminating redundant variants, this adds up to a total of 266 disease pairs (involving 219 SNPs) (Supplementary Data 3, Supplementary Table 2 and Supplementary Information section 3), in which a single pleiotropic variant might be mediating the risk of two or more diseases. The excess of antagonistic early–late pleiotropies that is predicted by the AP hypothesis can be tested by a simple 2 × 2 contingency table (example table in Fig. 1c). The test compares the numbers of antagonistic versus agonistic pleiotropies for diseases appearing in the same period of life (early–early and late–late) to the numbers of these pleiotropies between diseases that appear in different periods of life (early–late pairs). Again, we performed all tests in two-year steps for each age between 10 and 60 yr, defining an early–late onset threshold (Supplementary Data 4). The basic theoretical expectation is clearly fulfilled: relative to antagonistic pleiotropies related to diseases that belong to the same period in life, there is a highly significant excess of antagonistic pleiotropies when age thresholds from 40 to 50 yr are considered, peaking at ages 46–50 yr (Fig. 1c). The same trend was observed with several stricter LD thresholds (r2 ≥ 0.9, r2 = 1) for the set of pleiotropies defined through high LD between disease-associated SNPs (Supplementary Fig. 1). Interestingly, this age barrier seems to reflect the biology of our species. For instance, menopause typically sets in at between 45 and 55 yr in current societies, and it would have started earlier in the past
The results above constitute the first evidence that both the accumulation of deleterious mutations due to the weakening of purifying selection with age and the action of positive selection in favour of mutations that have protective early-onset (but deleterious late-onset) pleiotropic effects help to explain patterns of age-related disease in our species. However, these observations in themselves do not constitute a formal test of a relationship between these disease variants and senescence 27 . Such a test can be performed by ascertaining whether the set of genetic variants (219 SNPs) that have been linked to pleiotropies between diseases in the present study tend to map to genes that have been related to senescence by alternative methods. We identified four independent datasets of senescence genes. They include a set whose expression is altered between pre-senescent and senescent states in an ageing mouse model 28 , a set that has been linked to ageing phenotypes in humans and curated in the GenAge database 29 , a set whose expression profiles change with age in humans 30 , and finally a set whose methylation profiles change between newborns and centenarians 31 . In all datasets we observe a highly significant excess of association between senescence genes and pleiotropic SNPs (Table 1 and Supplementary Information section 5). Together these results constitute the first systematic evidence of senescence genes being associated with pleiotropies and suggest a fundamental role of pleiotropy in extant variation in human ageing patterns.
An important aspect of the AP model is that it is an adaptive theory of senescence, involving the action of positive selection on variants that increase survival or fertility at early ages. This immediately suggests using molecular evolution techniques to check for the signature of natural selection in genes and genomic regions involved in early–late antagonistic pleiotropies. Of course, we cannot expect this to be a dominant factor in the adaptive history of our species, as selection scans conducted so far have shown that the most outstanding cases of adaptation in our lineage are related to the immune system, perception or fertility
The example of CDKN2A is particularly interesting, as this locus is associated with four antagonistic pleiotropies involving five SNPs and five diseases. The T allele (CEU frequency = 54%, AFR frequency = 100%) of the intronic SNP rs2157719 is protective for glioma while increasing the risk of type 2 diabetes, glaucoma, coronary heart disease and nasopharyngeal cancer. We suggest that protection for glioma, a relatively frequent early onset and often fatal cancer, has been favoured at the price of increased risk of the four later-onset conditions. Indeed, population differentiation and a hierarchical boosting method 35 link this variant to a genomic region influenced by selection. We elaborate on the biology and selective footprint for this and the other two loci in Supplementary Information section 7.
Finally, our approach relating disease, pleiotropy, senescence and adaptation may have a practical application. As it implies a shared genetic architecture between different phenotypes, pleiotropy may help to explain disease comorbidities 36 , a phenomenon that has been highlighted by the study of electronic health records 37 . We gathered comorbidities from the largest study to date, performed on the Danish population 37 , and noted that ten of our disease pairs involved in pleiotropies do present comorbidities (Supplementary Data 6). This is a clear underestimation, as the Danish dataset, even if comprehensive, contains only data from patients followed throughout 14 yr, so comorbidities involving diseases with ages of onset separated by decades could not have been detected. Still, the overlap between comorbidities and pleiotropies constitutes a further validation of our results and suggests that pleiotropies detected by analysis of genome-phenome information can guide the future study of comorbidity within the increasingly large pool of electronic health records available worldwide.
Here we have tested, and provide evidence for, some of the most influential models of senescence. We show, first, that senescence partly results from adaptive processes; second, that it is linked to both early and late-onset diseases; and, third, that current variation of ageing patterns in our species can be partially explained by ongoing evolutionary processes. Greater efforts are needed, not only to clarify the adaptive history of the genes harbouring early–late antagonistic pleiotropies, but also to understand how previously unsuspected links between early and late-onset diseases can inform the biology of disease and, perhaps, the medical and societal decisions that are required by an ageing population.
GWAS database construction
The full NHGRI GWAS Catalog 18 was downloaded from www.ebi.ac.uk/gwas and filtered to keep only binary disease traits, thus excluding anthropometric quantitative characters such as body mass index or cholesterol levels, as well as those disease traits or particular studies whose marker-associated effect sizes were reported on a quantitative scale (rather than as odds ratio). Moreover, we also excluded infectious diseases (such as tuberculosis, malaria and leprosy, among others) because they appear at any age and thus cannot be properly assigned as early or late-onset diseases. The final list consisted of 120 diseases (see Supplementary Data 1).
We considered only GWAS performed and/or replicated in Eurasian populations. Notably, as common causal variants are shared between populations from European and Asian origins, most GWAS results are replicated in both populations 20 . For those instances in which the GWAS Catalog does not indicate the risk allele, we used the original bibliographic sources. The final dataset consisted of 2,559 unique SNPs, associated with one or more conditions and adding up to a total of 2,774 records (Supplementary Data 2).
Ages of onset
Each disease was assigned an approximate age of onset. For this, we mainly used the electronic database eMedicine (Medscape; http://reference.medscape.com/) 21 , a continuously updated and widely cited medical peer-reviewed database written by physicians specialized in each field, as reported elsewhere 22,38 . In most cases, we assigned the average or median age of onset reported in the ‘Epidemiology’ section of the Medscape entry, where information on age, ethnicity, sex, mortality and morbidity is gathered. When an interval for ages of onset was reported, we extracted the lowest age of the interval. Moreover, all analyses involving ages of onset (Fig. 1) were replicated using the median age of onset instead of the lower bound of the interval and, in all cases, results were fully consistent with those obtained under the original classification criterion.
Multiple variations on the definition of pleiotropy have been proposed 39 , including the application of the ‘pleiotropic’ adjective to a molecule, an allele, a genetic marker, a gene, or any other entity related to diseases or complex traits. Here, we focus on the classical definition of the action of a single genetic variant on more than one biological process 40 . We collected cases of putative pleiotropies through two different approaches: first, we note cases in which the same SNP has been associated with two or more conditions by different studies; second, we also consider cases in which SNPs in LD with each other have been associated with different pathologies. This second approach is necessary because the various commercial arrays used in GWAS contain distinct sets of SNPs, so the same causal variants might be tagged by different SNPs in different studies. Hence, when two SNPs are in high LD in the relevant population and each is associated with a different disease, we also consider them a pleiotropy. In particular, for each one of the 2,559 SNPs defined as ‘seed SNPs’, we searched for potential SNPs in high LD (r2 ≥ 0.8, based on 85 CEU individuals from the 1000 Genomes Project, phase 1 41 ) within 1 Mb (500 kb up- and downstream) physical distance. The markers above the specified LD threshold were then searched for in the GWAS Catalog, and only when one was found to be associated with a condition different from that of the seed SNP did we count the pair as one pleiotropy. Often, more than one marker is found with this procedure, and in such cases we reported as many pleiotropies as pairwise combinations of markers were found in LD. To avoid repetitions, pleiotropies involving the same pair of diseases and falling within 200 kb 20 of any of the SNPs in another already-called pleiotropy were filtered out, and only a single pair, or pleiotropy, was kept.
Additionally, three other LD thresholds of r2 ≥ 0.7, r2 ≥ 0.9 and r2 = 1 were used to ensure consistency (Supplementary Fig. 1).
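As a minimal sketch of the LD criterion described above, the following Python computes r2 (the squared allelic correlation) between two biallelic SNPs from phased haplotypes and checks whether a marker qualifies as an LD partner of a seed SNP. The function names, the 0/1 allele coding and the example inputs are illustrative assumptions, not the pipeline used in the study; only the r2 ≥ 0.8 threshold and the 500 kb up- and downstream window are taken from the text.

```python
def ld_r2(haplotypes):
    """Squared allelic correlation (r2) between two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs, one per phased chromosome,
    with alleles coded 0/1 at SNP A and SNP B (hypothetical coding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                   # at least one SNP is monomorphic
    return d * d / denom


def is_ld_partner(seed_pos, marker_pos, r2, r2_min=0.8, window=500_000):
    """True if the marker lies within the window around the seed SNP
    and reaches the LD threshold (defaults follow the text)."""
    return abs(seed_pos - marker_pos) <= window and r2 >= r2_min
```

Rerunning the search with `r2_min` set to 0.7, 0.9 or 1.0 would correspond to the consistency checks mentioned above.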
Direction of effect sizes in pleiotropies: agonistic versus antagonistic
To ascertain whether pleiotropies display agonistic or antagonistic effects, we compared their effect sizes and the risk alleles involved. We first considered all pleiotropies based on a single SNP, that is, every pair of conditions with which a given SNP had been associated. If the risk alleles were different, we classified the pleiotropy as antagonistic, and as agonistic otherwise. When pairs of diseases associated with different SNPs in high LD (r2 ≥ 0.8) were suggestive of a pleiotropy, we used haplotypes for classification. Antagonistic pleiotropies were called when a risk and a protective allele were linked in the same haplotype; otherwise, we classified the pleiotropies as agonistic. In some cases, when LD between members of the pleiotropy is not perfect (r2 < 1), low-frequency haplotypes may exist that harbour the remaining combinations of risk and protective alleles. To avoid ambiguity in the assignment of agonistic and antagonistic effects, we filtered out any putative pleiotropies in which any minor haplotype with an inconsistent combination of alleles reached 10% frequency. Interestingly, no pleiotropies were discarded by this filter when using r2 ≥ 0.8 to detect LD-based pleiotropies.
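The classification rules above can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' code: the function names, the dictionary of haplotype frequencies and the tie-breaking rule are assumptions made for illustration, and the alleles in the usage example are invented.

```python
def classify_single_snp(risk_allele_1, risk_allele_2):
    """Same SNP, two diseases: different risk alleles mean antagonistic."""
    return "agonistic" if risk_allele_1 == risk_allele_2 else "antagonistic"


def classify_ld_pair(hap_freqs, risk_a, risk_b, max_minor=0.10):
    """Two SNPs in high LD, each associated with a different disease.

    `hap_freqs` maps (allele_at_A, allele_at_B) to haplotype frequency;
    `risk_a` and `risk_b` are the risk alleles for the two diseases.
    Returns "agonistic", "antagonistic", or None when a minor haplotype
    with the inconsistent combination reaches `max_minor`.
    """
    same, opposite = {}, {}
    for (a, b), freq in hap_freqs.items():
        # Both risk or both protective on one haplotype: same direction.
        if (a == risk_a) == (b == risk_b):
            same[(a, b)] = freq
        else:
            opposite[(a, b)] = freq
    # Assumption: the class carrying most of the frequency gives the call;
    # ties are arbitrarily resolved as agonistic.
    if sum(same.values()) >= sum(opposite.values()):
        call, minority = "agonistic", opposite
    else:
        call, minority = "antagonistic", same
    if any(freq >= max_minor for freq in minority.values()):
        return None  # ambiguous pleiotropy, filtered out
    return call
```

For instance, with hypothetical haplotypes `{("T", "G"): 0.55, ("C", "A"): 0.42, ("T", "A"): 0.03}` and risk alleles `"T"` and `"A"`, the risk allele of one disease travels mostly with the protective allele of the other, so the pair is called antagonistic.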
Time of manifestation of pleiotropic effects: same period versus early–late
Pleiotropies can also be classified according to the time of onset of the pairs of diseases they involve. Diseases with an age of onset lower than or equal to a given age threshold will be considered as early onset, whereas diseases with an age of onset above that threshold will be considered as late onset. With this classification, and given any age threshold, we can distinguish two classes of pleiotropy: those that manifest themselves in the same period of life (‘same period pleiotropies’, when both diseases are either early onset or late onset) and those with manifestations at different periods of life (‘early–late pleiotropies’, when they involve one early onset and one late-onset disease).
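A minimal Python sketch of this timing classification is given below. The function names and the ages in the usage note are hypothetical; the rule that an age of onset lower than or equal to the threshold counts as early onset is taken from the text.

```python
def onset_period(age_of_onset, threshold):
    """Early onset if the age of onset is at or below the threshold."""
    return "early" if age_of_onset <= threshold else "late"


def timing_class(age_1, age_2, threshold):
    """Classify a disease pair as 'same period' or 'early-late'."""
    if onset_period(age_1, threshold) == onset_period(age_2, threshold):
        return "same period"
    return "early-late"
```

With a threshold of 46 yr, a pair of diseases with illustrative ages of onset of 20 and 65 yr would be an early–late pleiotropy, whereas the same pair would fall in the same period with a threshold of 70 yr.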
At each age threshold, the two classifications above were cross-tabulated in a 2 × 2 contingency table, in which rows contain information regarding the times of manifestation of pleiotropies (same period versus early–late) and columns inform about the direction of the effects (antagonistic versus agonistic), and then subsequently tested for independence with a chi-squared test (Supplementary Data 4).
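The cross-tabulation and test described above can be sketched as follows, using only the Python standard library. The data are invented toy values, the function names are hypothetical, and the statistic is the uncorrected Pearson chi-squared; the text does not state whether a continuity correction was applied, so that choice is an assumption. For one degree of freedom the P value equals erfc(sqrt(chi2 / 2)). The sweep from 10 to 60 yr in two-year steps follows the thresholds mentioned in the Results.

```python
import math


def build_table(pleiotropies, threshold):
    """Rows: same period, early-late. Columns: antagonistic, agonistic.

    Each pleiotropy is (age_of_onset_1, age_of_onset_2, direction).
    """
    table = [[0, 0], [0, 0]]
    for age_1, age_2, direction in pleiotropies:
        same_period = (age_1 <= threshold) == (age_2 <= threshold)
        row = 0 if same_period else 1
        col = 0 if direction == "antagonistic" else 1
        table[row][col] += 1
    return table


def chi2_2x2(table):
    """Pearson chi-squared test of independence, no continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return float("nan"), float("nan")  # an empty row or column
    chi2 = n * (a * d - b * c) ** 2 / margins
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function, 1 d.f.
    return chi2, p_value


# Toy input (not data from the study).
toy = [
    (20, 30, "agonistic"), (20, 60, "antagonistic"), (55, 70, "agonistic"),
    (15, 65, "antagonistic"), (25, 50, "agonistic"), (30, 35, "antagonistic"),
]

# One table and one test per age threshold, 10 to 60 yr in two-year steps.
results = {}
for threshold in range(10, 61, 2):
    table = build_table(toy, threshold)
    results[threshold] = (table, *chi2_2x2(table))
```

Keeping one entry per threshold mirrors the layout of one table per age in Supplementary Data 4, although the exact format of that file is not assumed here.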
The authors declare that data supporting the findings of this study are available within the paper and its Supplementary Information files.
How to cite this article: Rodríguez, J. A. et al. Antagonistic pleiotropy and mutation accumulation influence human senescence and disease. Nat. Ecol. Evol. 1, 0055 (2017).
This work was supported by Ministerio de Ciencia e Innovación, Spain (SAF2011-29239 to E.B., and BFU2012-38236 and BFU2015-68649-P to A.N.), by Direcció General de Recerca, Generalitat de Catalunya (2014SGR1311 and 2014SGR866), by the Spanish National Institute of Bioinformatics of the Instituto de Salud Carlos III (PT13/0001/0026) and by FEDER (Fondo Europeo de Desarrollo Regional)/FSE (Fondo Social Europeo). U.M.M. is supported by Project 3 of NIGMS P01 GM099568 (B. Weir, University of Washington). E.B. is the recipient of an ICREA Academia Award. The authors thank J. Bertranpetit, P. Muñoz-Cánoves, R. Nesse and B. Charlesworth for helpful comments and advice. We also thank H. Laayouni, F. Casals and F. Calafell for comments on the manuscript, and the Navarro Lab members, especially D. Hartasánchez and M. Brasó, for discussion and comments.
Estimated age of onset for the diseases used in the present study.
List of the associations SNP-disease retrieved from the GWAS Catalog and used for the present study.
List of the 266 pleiotropies found in the present study. These include both those involving the same SNP in two diseases and those involving pairs of SNPs with r2 ≥ 0.8.
Chi-squared 2 × 2 tables for the number of pleiotropies in each defined category, considering early–late thresholds from 10 to 60 yr.
Antagonistic early–late pleiotropies (r2 ≥ 0.8) (n = 26) for an age threshold of 46 years as the early–late transition.
Pleiotropy and comorbidity overlap.
Excess of pleiotropy in different gene sets at different levels, compared to genome-wide.
Genes in the Sousa-Victor et al. ageing gene set 28 .
Disease–SNP associations reported after crossing ageing genes from Sousa-Victor et al. 28 with the GWAS Catalog used in the present study.
Pleiotropies found in the Sousa-Victor et al. ageing gene set 28 .
Genes in the Magalhães et al. ageing gene set 29 .
Disease–SNP associations reported after crossing genes from Magalhães et al. 29 with the GWAS Catalog used in the present study.
Pleiotropies found in the Magalhães et al. ageing gene set 29 .
Genes in the Harries et al. age expression changing gene set 30 .
Disease–SNP associations reported after crossing ageing genes from Harries et al. 30 with the GWAS Catalog used in the present study.
Pleiotropies found in the Harries et al. age expression changing gene set 30 .
Number of SNPs, diseases and average risk allelic frequency for early and late at each age threshold (10 to 60 years) from Fig. 1a.
Number of SNPs, diseases and average genetic variance for early and late at each age threshold (10 to 60 years) from Fig. 1b.