The unusually long lifespans of humans and the persistence of post-reproductive lifespans in women represent evolutionary puzzles because natural selection cannot directly favour continued living in post-menopausal women or elderly men. Suggested sources of indirect selection require genetic correlations between fitness and survival or reproduction at younger ages, reproduction in the opposite sex, or late-life contributions to offspring or grandoffspring fitness. Here we apply quantitative genetic analyses to data from a historical human population to explicitly test these evolutionary genetic hypotheses. Total genetic selection increased the male post-50 lifespans by 0.138 years per generation; 94% of this arose from indirect selection acting to favour early-life fitness in both sexes. These results argue strongly against life-history models of ageing that depend on trade-offs between reproduction and late-life survival. No source of indirect selection for female post-50 lifespan was detected, deepening the mystery of why female post-reproductive survival persists. This result is probably due to recent changes in the genetic architecture of female lifespan, and it highlights the need for similar quantitative genetic analyses of human populations at other points along demographic transitions.
Natural selection favours increased lifespans whenever continued living is expected to yield future reproductive dividends, and this expectation declines with advancing age in humans of both sexes1. In males, the prevailing assumption is that late-life reproduction selects for late-life survival2, but this hypothesis remains untested and males often succeed in living long beyond the last ages of male reproduction3. In females, the late-life attenuation of phenotypic selection is more extreme, as menopause reduces this selective force to zero in middle age. Nevertheless, even women in primitive hunter–gatherer and horticultural populations can live many decades post-menopause4,5,6 in apparent violation of simple evolutionary predictions that late-acting deleterious mutations should accumulate unchecked7 (or even aided8) by natural selection. While there is controversy regarding the precise mechanisms for the genesis of post-reproductive lifespans9,10,11, evolutionary theory requires that its continued persistence must be explained by selection for traits with which it is genetically correlated (indirect selection).
Three evolutionary mechanisms have been suggested to explain the maintenance of post-reproductive lifespans. The ‘inter-age correlation model’ proposes that genes for early-age survival or reproductive function also benefit late-age survival12. The ‘inter-sex correlation model’ proposes that late-life survival genes are shared between the sexes2,13. As males do not experience menopause, selection for these genes in men can favour post-menopausal survival in females. The ‘(grand)maternal models’ suggest that prolonged lifespans of maternal or grandmaternal caregivers convey a fitness advantage to the related recipients of that care. When the same genes affect caregiving and late-life survival, care may generate indirect selection for late-age survival in females7,14,15. To describe this mechanism in quantitative genetic terms, we must invoke the concept of ‘indirect genetic effects’ (IGEs)16; these are the effects that genes have on the phenotypes of social partners. IGEs differ from ‘direct genetic effects’, the influence that one’s own genes have on one’s own phenotype. The (grand)maternal models require a positive genetic correlation between the caregiver-derived IGEs for fitness and the direct genetic effects for lifespan. By predicting an evolutionary response of late-life survival to selection to counter the deleterious effects of new mutations, all three models assume that positive genetic correlations between late-age lifespan and fitness have arisen and are maintained by recurrent mutation. While some studies have demonstrated phenotypic correlations and associations that are consistent with grandmaternal effects4,17, evidence for a positive genetic correlation represents the ‘smoking gun’ necessary to demonstrate the true efficacy of an evolutionary pathway to maintain post-reproductive survival.
We applied ‘animal model’ quantitative genetic analyses18 to estimate genetic correlations between post-reproductive lifespan and sex-specific fitness components. ‘Animal models’ have been used in the past to infer evolution by natural selection of life-history traits in other historical human populations19,20. We used these genetic correlations in conjunction with estimates of phenotypic selection gradients21 to quantitatively compare the importance of candidate evolutionary pathways to explain the persistence of post-50 lifespans in both sexes. We chose 50 years of age as it approximates the age at menopause in humans22 and it has been used previously as a reference age for describing post-reproductive lifespans in humans17. Our human phenotypic and pedigree data came from a subset of the Utah Population Database23,24, which derives from a population of pioneers of the American west that colonized the Utah Territory from 1847. The primary subject cohort comprised all individuals born between 1860 and 1889 and their siblings (n = 128,129). This population was chosen as it was recent enough to present sufficient data to permit powerful statistical analyses while being old enough to exhibit natural fertility and other features of a less-modern environment23. Pre-historical, hunter–gatherer, and modern populations are each lacking in one or more respect.
Each individual was associated with values for relative fitness (w, the relative contribution of an individual to the next generation, which is properly defined in the context of an age-structured population as the individual reproductive value at birth; see Methods) and the following sex-specific traits: the number of years survived beyond 50 (LS50), fitness accumulated before 50 (w1), survival to 50 (P50), and fitness accumulated at 50 and beyond (w2). We took a three-part approach to investigating genetic selection for late-life survival. First, we estimated the genetic covariation between fitness and sex-specific LS50. This predicts a response to selection and provides an estimate of the total selection acting to increase genetic values for late-life lifespan. We then investigated on a finer scale the degree to which specific hypothesized sources of selection act to favour (or disfavour) post-50 survival genes in both sexes. This required a careful articulation of the various evolutionary models put into a quantitative genetic perspective. This was the motivation for the second part of our study, the aim of which was to provide a unified conceptual model for the genetic selection of post-reproductive female lifespan that (1) generalizes across all previous evolutionary genetic hypotheses and (2) parameterizes these hypotheses in terms of estimable quantitative genetic values. To distinguish among these evolutionary models, we then estimated the parameters from this conceptual quantitative genetic model and thus estimated the degree to which selection for post-50 lifespan genes is driven by direct or indirect selection via inter-age, inter-sex or (grand)maternal effects.
Estimating net genetic selection for late-life lifespan
Significant heritability was found for w; this was within the range of other estimates of the heritability of fitness in natural animal populations; for example, the collared flycatcher (females: 21 ± 6%; males: 7 ± 6%)25, red deer (females: 8.6 ± 2.3%; males: 3.5 ± 3.1%)26, female Soay sheep (2.6 ± 1.5%)27, and great tits (females: 0.2 ± 3%; males: 2 ± 4%)28. Sex-specific lifespan was also heritable in both sexes (estimates in Table 1). These estimates appear to be slightly lower than the results from Danish twin studies (0.26–0.33)29,30, one of which reported slightly lower heritability in females29, which its authors attributed to higher environmental variance. In our study, we found greater environmental variance (105.09 versus 93.04 yr²) but lower genetic variance (23.83 versus 29.0 yr²) in females than in males. An ‘animal model’ study of a preindustrial Finnish population estimated the heritability of female and male post-15 lifespans to be 0.175 and 0.167, respectively20. These estimates were similar to our estimates for post-50 lifespans, but with associated standard errors five to ten times greater.
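As a rough check on scale, heritability can be read as the share of phenotypic variance that is additive genetic. The sketch below applies that ratio to the variance components quoted above, under the simplifying assumption (not stated in the text) that additive genetic and environmental variance are the only components. The fitted estimates in Table 1 may differ, and the function name is illustrative.

```python
def narrow_sense_h2(v_additive, v_environment):
    """Additive genetic variance as a share of total phenotypic variance.

    Assumes the phenotypic variance is just the sum of the two components
    passed in (a simplification made for this illustration only).
    """
    return v_additive / (v_additive + v_environment)


# Variance components for post-50 lifespan quoted in the text (yr^2).
h2_female = narrow_sense_h2(v_additive=23.83, v_environment=105.09)
h2_male = narrow_sense_h2(v_additive=29.0, v_environment=93.04)

print(f"female LS50: {h2_female:.3f}")
print(f"male LS50:   {h2_male:.3f}")
```

Under this two-component assumption the female ratio comes out lower than the male ratio, which matches the direction described in the text.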
These significant heritability estimates indicate that post-50 lifespan had the potential to evolve by natural selection in both sexes. However, while there was significant net selection acting to increase the genetic values for lifespan in males (by 0.138 years per generation), the estimate of genetic selection for post-50 lifespan in females was not significant (Table 2): fitness was genetically correlated with ♂LS50 (r_g = 0.110, s.e. = 0.031) but not with ♀LS50 (r_g = −0.050, s.e. = 0.034). Genes tended to have the same effects on post-50 lifespan in both sexes, but this tendency was not absolute: the inter-sex genetic correlation for LS50 was high (r_g = 0.817, s.e. = 0.032) but significantly less than one (P < 0.0001; Table 2 and Supplementary Table 4). This suggests that at least some sex-independent lifespan genes that have beneficial effects on male fitness are neutral or deleterious in females, and this allows for the difference in genetic selection for lifespan between males and females.
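The reported genetic correlations and standard errors can be sanity-checked with a simple Wald-type z-test. This is only an approximation chosen for illustration; the text does not say which test produced the reported P values, and mixed-model analyses often use likelihood-ratio tests instead. The helper names are hypothetical.

```python
import math


def wald_z(estimate, std_error, null_value=0.0):
    """Distance of the estimate from the null value, in standard errors."""
    return (estimate - null_value) / std_error


def two_sided_p(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def lower_tail_p(z):
    """P(Z <= z) for a standard normal Z (one-sided, 'less than' alternative)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# Genetic correlations between fitness and LS50 (estimates and s.e. from the text).
p_male = two_sided_p(wald_z(0.110, 0.031))
p_female = two_sided_p(wald_z(-0.050, 0.034))

# Inter-sex genetic correlation for LS50, tested against a null value of one.
p_below_one = lower_tail_p(wald_z(0.817, 0.032, null_value=1.0))

print(f"male r_g vs 0:      p = {p_male:.2g}")
print(f"female r_g vs 0:    p = {p_female:.2g}")
print(f"inter-sex r_g vs 1: p = {p_below_one:.2g}")
```

Under this approximation the male correlation lies more than three standard errors from zero, the female correlation lies within two, and the inter-sex correlation lies more than five standard errors below one, which agrees with the significance pattern described above.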
Conceptual quantitative genetic model for the evolution of late lifespan
The pathways by which selection might act to increase lifespan beyond age Y (after which there is no female reproduction) are illustrated in Fig. 1. Each pathway is identified individually in the figure as a product of a phenotypic selection gradient (straight arrows A–C), genetic variance (straight arrows D–J) and genetic correlations (curved arrows K–S). Genetic correlations include relatedness between social partners (R), as this is the within-trait genetic correlation among individuals. Direct genetic selection for male lifespan at age Y is BISH, and indirect selection for male lifespan may derive from a genetic correlation with early female (ADKH) or early male (CJQH) fitness. Direct selection for post-Y lifespan genes in females is impossible (as Y is defined as the age beyond which females do not reproduce), but indirect selection can come from the inter-age correlation model (a genetic correlation with early female fitness (ADLE) or early male fitness (CJME)), the inter-sex correlation model (a genetic correlation with late male fitness (BINE)) and the (grand)maternal model via early female fitness (AFRPE) or male fitness (CGROE). Pathways that connect genetic values of lifespan to fitness through other identified intermediates are possible (and these may, in principle, contribute to a response to selection). However, some are not highlighted explicitly here because they either have not been suggested elsewhere to be important or have been found in this study to be insignificant contributors to the genetic selection of post-Y lifespan.
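Each pathway named above is a product of the coefficients along it, so the bookkeeping can be sketched in a few lines. The numeric values below are placeholders invented for this example, not estimates from the study; only the path labels (BISH, ADKH, and so on) come from the text.

```python
from math import prod


def path_strength(path, coefficients):
    """Multiply the coefficients named by each letter in a path such as 'BISH'."""
    return prod(coefficients[label] for label in path)


# Placeholder coefficients for arrows A-S (hypothetical values, illustration only).
coefficients = {label: 0.5 for label in "ABCDEFGHIJKLMNOPQRS"}

male_paths = ["BISH", "ADKH", "CJQH"]
female_paths = ["ADLE", "CJME", "BINE", "AFRPE", "CGROE"]

# If the (grand)maternal IGE terms F and G are zero, every pathway that runs
# through them contributes nothing, whatever the other coefficients are.
no_ige = dict(coefficients, F=0.0, G=0.0)

for path in male_paths + female_paths:
    print(path, path_strength(path, coefficients), path_strength(path, no_ige))
```

The point of the design is that a pathway is only as strong as its weakest link: setting any single coefficient on a path to zero removes that path's contribution entirely.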
To understand the (grand)maternal model in more detail, imagine an allele that improves female survival post-Y in a focal individual (contributes to the genetic value ♀GLSY), where the focal individual is the (grand)offspring. Under the (grand)maternal effect model, for this allele to be selected to increase in frequency in focal individuals, it must be genetically correlated (paths P or O, depending on the sex of the affected (grand)offspring) with an allele that causes (grand)mothers to improve the fitness of their (grand)offspring (the latter is an IGE that contributes to the indirect genetic value for fitness in either female or male (grand)offspring). This allele has no direct effect on the fitness of the focal individual when expressed in the (grand)offspring, but it has an indirect effect because it is more likely to be present in the (grand)mother of the focal individual as a result of relatedness (path R). (Grand)mothers affect the fitness of focal individuals via the action of the indirect effect allele (path F or G), and this allows indirect selection both for the indirect effect allele16 and for the allele that improves post-Y survival. From an inclusive fitness perspective31, kin selection for the (grand)maternal effect genes (those affecting ♀ or ♂ (grand)offspring) derives from the product of relatedness between the (grand)mother and (grand)offspring (R) and the fitness benefit of the effect to (grand)offspring fitness genes (AF or CG). As we are interested in selection for post-Y lifespan genes, ♀GLSY, through (grand)mother effects, we find the correlated response to selection by multiplying kin selection for these (grand)maternal effect genes by the correlation (P or O) between them and the post-Y lifespan genes. Finally, changes in genetic values for lifespan are manifested on this phenotype in proportion to its amount of genetic variation, E.
Distinguishing evolutionary models of late-life lifespan
We parameterized all of the relevant pathways in Fig. 1. All traits were significantly heritable, with the exception of ♀w2 (Table 1). This confirms the expected lack of genetic variance in female late-life fitness and thus no potential for direct selection for female late-life survival genes. Therefore, this trait was not considered in the subsequent analyses. No traits had significant IGE variation derived from mothers. Four traits (w, ♀w1, ♂w1 and ♂P50) did have significant maternal effects (Supplementary Table 1), but the lack of IGE variation means that, while mothers influenced the phenotypes of their children, this influence was not heritable. Total fitness had significant maternal and grandmaternal effects arising through both the maternal and paternal grandmothers, but these also had no significant genetic basis (Table 1 and Supplementary Table 1) and were therefore not heritable. As genetic covariance cannot exist in the absence of genetic variation, there was no evidence to support either the maternal or grandmaternal effects models of indirect selection for late-life lifespan in either sex (that is, neither path P in AFRPE nor path O in CGROE can exist).
We estimated genetic correlations between male and female LS50 and all remaining fitness-related traits (Table 3). All showed positive genetic covariance with ♂LS50, but only ♀P50 and ♂P50 covaried with ♀LS50. Phenotypic selection gradients for the fitness-related traits were estimated by multiple regression32 (Table 4). Each of these gradients multiplied by the genetic covariance between that trait and sex-specific LS50 (Table 3) defines that trait’s independent contribution to the per-generation evolutionary change in sex-specific LS50 (Fig. 1 and Table 5). The sum of estimates of genetic selection for male LS50 (+0.156 years per generation) was within half of the standard error of the estimate of total selection for male LS50 genes (+0.138 years per generation), indicating that both methods agreed (the estimated total covariance equalled the sum of estimated partial covariances). Direct selection for male late-life lifespan was relatively weak: late-life fitness explained only 4.6% of the genetic selection for late-life survival. Selection for male late-life survival genes was almost entirely driven by indirect selection for male and female early life fitness (w1), explaining 38 and 55% of the selection for male late-life lifespan genes, respectively.
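The decomposition described here, in which each trait's contribution is its phenotypic selection gradient multiplied by its genetic covariance with LS50, can be sketched as follows. The two gradients marked "from the text" are the values reported for ♂w2 and ♂P50; every other number is a hypothetical placeholder chosen for illustration, so the output does not reproduce Table 5.

```python
def selection_contributions(gradients, genetic_covariances):
    """Per-trait contribution to the response: gradient x genetic covariance with LS50."""
    return {
        trait: gradients[trait] * genetic_covariances[trait]
        for trait in gradients
    }


def shares(contributions):
    """Each contribution as a fraction of the summed response."""
    total = sum(contributions.values())
    return {trait: value / total for trait, value in contributions.items()}


gradients = {
    "female_w1": 0.5,   # hypothetical placeholder
    "male_w1": 0.5,     # hypothetical placeholder
    "male_P50": 0.011,  # from the text
    "male_w2": 0.371,   # from the text
}

# Genetic covariances of each trait with male LS50: all hypothetical placeholders.
genetic_covariances = {
    "female_w1": 0.10,
    "male_w1": 0.08,
    "male_P50": 0.20,
    "male_w2": 0.05,
}

contributions = selection_contributions(gradients, genetic_covariances)
total_response = sum(contributions.values())

for trait, share in shares(contributions).items():
    print(f"{trait}: {contributions[trait]:+.4f} ({share:.1%} of total)")
print(f"summed response: {total_response:+.4f}")
```

Comparing the summed response with an independently estimated total, as the text does for +0.156 versus +0.138 years per generation, is a consistency check on the decomposition.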
Antagonistic selection for different components of fitness could not explain the lack of overall genetic selection for late-life female lifespan, as no component source was independently significant: indirect selection for female late-life lifespan was weak as a result of either weak genetic correlations (as with early fitness in both sexes and late male fitness) or weak phenotypic selection for the correlated traits (survival to 50 in both sexes). Thus, reweighting the relative strength of phenotypic selection on different fitness components, such as might happen by shifts in the mating system that emphasize phenotypic selection for late-age male fertility2, cannot result in net selection to favour ♀LS50 genes.
In this study, we have attempted to quantify the relative importance of alternative evolutionary models of late-life survival in humans. We found a very high genetic correlation between female and male lifespans (+0.817), a degree of association that usually implies strictly constrained evolutionary pathways. Nevertheless, we found very different predicted responses to selection for female and male lifespans. While our results show that natural selection favours LS50 genes in males, we found no selection for late-life survival genes in females. This is very surprising because women appear to live at least as long as men in hunter–gatherer populations33,34,35, and a simple evolutionary explanation for this relationship requires that selection for late-life survival genes in females must be at least as strong as that in males. While we cannot say whether this null relationship is general to all recent human populations, we can suggest possible explanations for the apparent disassociation between fitness and female post-50 lifespan genes in the Utah population.
One possibility is that genetic (grand)maternal effects on fitness existed in the population, but our pedigree was too shallow to detect them. This seems unlikely given that the pedigree for the 1860–1889 cohort was four generations deep, and the detection of maternal and grandmaternal genetic effects requires three and four generations, respectively. However, to investigate this possibility, we extended our analysis to subsequent cohorts and searched for genetic maternal and maternal grandmother effects on survival to 16 years of age, P16. We focused on survival rather than fitness for two reasons. First, relevant human evolution models emphasize early (grand)child survival as a focus of (grand)maternal care (for example, refs 4,15,17). Second, we could enlarge our pool of phenotyped individuals because we were not restricted to using only those individuals with complete reproductive records. These cohorts were collections of individuals born between 1860 and 1889, 1890 and 1899, 1900 and 1909, 1910 and 1919, 1920 and 1929, 1930 and 1939, and 1940 and 1949. The pedigree grew in breadth and depth with each subsequent decadal cohort (presumably the power to detect maternal and grandmaternal genetic effects grew accordingly). The largest contained 18,339 unique maternal sibships and 5,746 unique maternal grandmother sibships (Supplementary Table 4). We found no evidence for either genetic maternal or maternal grandmaternal effects on P16 (Supplementary Table 6). We conclude that this population was truly devoid of meaningful genetic (grand)maternal effects for survival.
Another possibility is that recent demographic changes eliminated or mitigated the influence of ancestral care. The study population began to migrate to the Utah Territory in the 1840s and while many of the individuals in the 1860–1889 cohorts can be associated with grandmothers in the pedigree, there is no guarantee that these ancestors co-migrated and provided care. Furthermore, fertility in this population was very high (married women born between 1870 and 1874, for example, produced an average of 7.0 live births each23) and infant mortality was relatively low compared with previous and contemporaneous populations36. This suggests that maternal, and especially grandmaternal, genetic effects might have been diluted by unusually large family sizes. However, this dilution should have been lessened over subsequent decadal cohorts because fertility was reduced23 and the frequency of resident grandmothers probably increased as the colonization event receded into the past, but the genetic effects were still absent (see above).
The persistence of the post-reproductive lifespan in women4,5,6 remains puzzling, as we have exhaustively investigated all proposed evolutionary pathways and found no evidence for any source of response to selection for late-life female lifespan. We believe that the most likely explanation for this absence is that one or more genetic correlations involving late-life female survival were positive in the past. Genetic correlations are known to switch sign as environments change37, although the existence of any general pattern for the direction of these changes is unclear38. Our results suggest that recent evolutionary processes are insufficient to explain the persistence of the female post-reproductive lifespan. This interpretation highlights how little we understand about how changes in human ecology may have altered the relationships between genes, lifespan and fitness. More quantitative genetic analyses such as this should be applied to other human populations to better understand the among-population distributions of relevant genetic correlations. The quantitative genetic approach introduced here provides a conceptual framework for future studies of human evolutionary demography, a field that has yet to embrace an indirect genetic perspective to understanding ageing in a social context39. If genetic correlations are shown to vary among populations, new theory is needed to link these differences to changes in human ecology.
In contrast with the female results, we found strong evidence for genetic selection to favour late-life male lifespan; this may provide some explanation for the unusually long lifespan of humans compared with other primates and most other animals40,41,42. Counter to models of human lifespan that emphasize the evolutionary role of late-life male reproduction2,13, direct selection was not the main cause of its genetic selection; instead, indirect selection via early fitness in both males and females explained the majority of genetic selection. Future applications of the ‘animal model’ to this population may succeed in identifying on a finer scale which ages before 50 are the most important contributors. Our result has important consequences for evolutionary theories of ageing. The ‘antagonistic pleiotropy’ model8 and its mechanistically detailed application ‘disposable soma theory’43,44 argue that the attenuation of the strength of selection with increasing age will cause genes with advantageous early-life fitness effects but deleterious late-life mortality costs to spread through a population. This is expected to cause negative genetic correlations across early and late fitness traits45, which we did not observe. ‘Mutation accumulation’ models7 instead view ageing as a strictly maladaptive phenomenon where late-acting deleterious mutations are allowed to accumulate due to relaxed selection. While traditional mutation accumulation models assume that gene effects on survival rates are completely age dependent1,46, observations that mortality rates may not always increase with age in the very old47 have prompted the development of mutation accumulation models that assume positive genetic correlations between early function and late survival12. Our results provide evidence for the existence of these positive genetic correlations in humans that suppress the evolution of senescence and promote longer life48,49,50.
We used individual human records collected by the Utah Population Database, a descendant-based genealogical database. These records included information on years of birth and death and the identities of mothers and fathers for individuals born up to 1950 and their descendants (1,732,394 unique individuals). In our primary analyses, we constrained our focal population to those individuals with mothers who reproduced between 1860 and 1889 to limit the effects of secular trends and to restrict our sample size to allow computationally tractable analyses. We also included all of the siblings of these individuals who were born outside of this time window. Only those individuals who were indicated by the database to have known years of death and complete reproductive histories were analysed. Individuals with insufficient information to describe all fixed effects (see Fixed effects below) were excluded from the study. This left 128,129 informative individuals in the phenotyped generation. Phenotypes were assigned only to these individuals, but the pedigree used in the analyses was generated from the union of the focal population and all individuals born before 1890. This pedigree contained 179,759 individuals with a depth of up to four generations (enough to detect any grandmaternal genetic effect variance). Fitness (see below) followed from birth records contained in the complete database.
This was a pre-contraception population with large family sizes23. For individuals born in the years 1860–1889, the expected number of offspring ranged between four and six, depending on the year of birth51. Survival rates were generally high (Supplementary Fig. 1). For these reasons, Malthusian growth rates were positive for all birth year cohorts, ranging from 0.007 to 0.024 (Supplementary Fig. 2). The predominant residency pattern at this time was neolocal, with first-degree relatives living in close proximity to newly married individuals52. In a study of one Utah county from 1880, Mineau and Anderton53 estimated that within the first five years of marriage involving a man 22.5 years or younger, 71% of couples lived in the same community or county as at least one of his parents, while 13% of these couples lived in the same house as his parent.
As selection is defined in terms of the covariance between relative fitness and traits of interest, we calculated the relative fitness, w, of individuals using the individual reproductive value at birth. This is an index trait defined as one-half of the number of children born (lifetime reproductive success), with each annual contribution discounted by the Malthusian growth rate characteristic for each birth-year cohort54. Thus, for any individual, i, born in year k, its relative fitness is $w_{ik} = \tfrac{1}{2}\sum_{j} e^{-r_k j} M_{ijk}$, where $M_{ijk}$ is the number of offspring born to individual i at age j, and $r_k$ is its cohort’s Malthusian growth rate. The individual lifetime reproductive success divided by the cohort-specific mean was also calculated, but the results were not considered further as they correlated extremely well with w (r = +0.986). Relative fitness was also calculated over two age ranges (up to 50 years of age and over 50) by summing annual fitness contributions over the appropriate age intervals to arrive at w1 and w2. The threshold age of 50 was chosen because it represented an age beyond which female reproduction was negligible and therefore unlikely to be under meaningful direct selection. Other threshold ages could be reasonably used for other applications of the animal model to understanding the genetics of lifespan. The heritability analysis (see below) revealed no evidence for heritable variation for post-50 female fitness. Survival to 50 years, P50, was defined as a binary trait (1 = success, 0 = failure). Years lived beyond 50 (LS50) and the aforementioned fitness traits were sex-specific: trait values of ‘NA’ were assigned to all individuals of the alternative sex. We defined the traits w2 and LS50 to be conditioned on successful survival to 50. Those individuals who failed to survive to 50 were assigned values of ‘NA’ for these traits.
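The verbal definition of fitness above (one-half of the offspring count, with each annual contribution discounted by the cohort's Malthusian growth rate) can be sketched as a short function. The exponential discount form, the function and argument names, and the choice to place births at exactly age 50 in w2 are assumptions made for this illustration; the birth schedule is invented.

```python
import math


def discounted_fitness(births_by_age, growth_rate, min_age=0, max_age=None):
    """Half the offspring born in [min_age, max_age), each discounted by exp(-r * age).

    births_by_age maps parental age (years) to number of offspring born at that age.
    The exponential discount is an assumed form of the verbal definition in the text.
    """
    total = 0.0
    for age, n_offspring in births_by_age.items():
        if age < min_age:
            continue
        if max_age is not None and age >= max_age:
            continue
        total += math.exp(-growth_rate * age) * n_offspring
    return 0.5 * total


# Hypothetical reproductive history and cohort growth rate, for illustration only.
births = {25: 2, 30: 2, 52: 1}
r_k = 0.02

w = discounted_fitness(births, r_k)
w1 = discounted_fitness(births, r_k, max_age=50)   # fitness accumulated before 50
w2 = discounted_fitness(births, r_k, min_age=50)   # fitness accumulated at 50 and beyond

print(f"w = {w:.4f}, w1 = {w1:.4f}, w2 = {w2:.4f}")
```

Because the two age windows do not overlap and together cover every age, w1 and w2 sum to w for any individual.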
Phenotypic selection gradient estimation
Sex and sex-specific P50, w1 and w2 collectively explained all variation for relative fitness (R² = 1) in a multiple regression when ‘NA’ values were treated as non-existent data55. This approach has been used previously to re-derive Hamilton’s indicators from a multiple regression perspective24,54 and to estimate phenotypic selection gradients for other traits in this population56. Estimates for phenotypic selection gradients are given in Table 4. Although there was selection for sex (a partial covariance between relative fitness and sex holding other traits constant), this trait was not analysed further as it could not contribute to an indirect response to selection for any trait because it has no genetic variance. Viewed from a quantitative genetic perspective, there is no variation for direct genetic effects on sex because the sex of offspring cannot be predicted by the sex of the parents (every individual invariably has only one parent of each sex). While maternal genetic effects on sex ratio can exist, in principle, the absence of maternal genetic effect variance for relative fitness in this case (see Results) indicates that any such IGEs present in this population do not contribute to trait evolution.
This approach to estimating phenotypic selection gradients imputes nominal trait values for individuals for whom trait values are not logically permitted to be expressed (for example, male-limited traits in females or late-acting traits in individuals who died early)55. The imputed values are equal to the mean trait values of the fraction of the population that is allowed to express the trait. Indicator, or ‘dummy’, variables signal whether or not imputed values are used for particular individuals. Multiple indicator variables can be used simultaneously, and individual indicators can themselves be imputed if their expression is also logically precluded from some portion of the population. Consider the post-50 contribution to male relative fitness, ♂w2, for example. As only males who survive to 50 are exposed to direct phenotypic selection for this trait, three variables must be considered: ‘sex’, ♂P50 (male survival to 50) and ♂w2. The trait ‘sex’ acts as an indicator for ♂P50: males are either ‘0’ or ‘1’, and all females are given a nominal value of 0.743 as this is the fraction of male births that survive to age 50. The trait ♂P50 acts as an indicator for ♂w2: male survivors are awarded trait values according to the amount and timing of post-50 reproduction, and all females and males who die before 50 are assigned the nominal value 0.0216 because this is the mean value for ♂w2 among the male survivors.
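The imputation scheme for the ♂w2 example can be sketched as below. The record layout, field names and toy data are hypothetical; only the rules (females receive the male mean for ♂P50, and everyone who cannot express ♂w2 receives the survivor mean) follow the description above.

```python
def impute_conditioned_traits(records):
    """Build regression rows for sex, male P50 and male w2 with mean imputation.

    Each record is a dict with keys 'sex' ('M' or 'F'), 'survived_50' (bool)
    and 'w2' (float for male survivors, None otherwise).
    """
    males = [r for r in records if r["sex"] == "M"]
    survivors = [r for r in males if r["survived_50"]]
    mean_p50 = len(survivors) / len(males)                       # fraction of males reaching 50
    mean_w2 = sum(r["w2"] for r in survivors) / len(survivors)   # mean w2 among male survivors

    rows = []
    for r in records:
        is_male = r["sex"] == "M"
        expresses_w2 = is_male and r["survived_50"]
        rows.append({
            "sex": 1.0 if is_male else 0.0,                                    # indicator for male P50
            "male_P50": float(r["survived_50"]) if is_male else mean_p50,      # indicator for male w2
            "male_w2": r["w2"] if expresses_w2 else mean_w2,
        })
    return rows


# Toy data, invented for illustration.
records = [
    {"sex": "M", "survived_50": True, "w2": 0.00},
    {"sex": "M", "survived_50": True, "w2": 0.03},
    {"sex": "M", "survived_50": True, "w2": 0.06},
    {"sex": "M", "survived_50": False, "w2": None},
    {"sex": "F", "survived_50": True, "w2": None},
    {"sex": "F", "survived_50": False, "w2": None},
]
rows = impute_conditioned_traits(records)
for row in rows:
    print(row)
```

Imputing the group mean leaves individuals who cannot express a trait with no deviation from that mean, so they carry no information about selection on it.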
All traits and indicator variables are included in the multiple regression. Multivariate selection gradients follow from the estimated partial regression coefficients, with each gradient weighted by the proportion of the population that has the trait. For our example above, the partial regression coefficient for ♂w2 is 1 (because late-age derived relative fitness is, by definition, still relative fitness), but the phenotypic selection gradient for ♂w2 is 0.371 because only 74.3% of born males survived to 50, and only half of all births are male. Phenotypic selection gradients for ‘conditioned’ traits (those traits that are expressed only by individuals who have particular values for other traits) provide correct predictions for the multivariate response to selection when applied to a multivariate breeder’s equation21, but care should be taken to understand the conditional nature of these traits when interpreting these phenotypic selection gradients on their own. For example, variation in early male fitness, ♂w1, and late male fitness, ♂w2, does not collectively explain all of the fitness variation in males because there is a mean total relative fitness difference between males who do and do not survive to 50. This difference is not derived from post-50 differences (because the imputation strategy equates the expected ♂w2 values of the two groups). Fitness variance derived from selection for ♂P50 is also needed to completely describe total male fitness variance, and this selection follows from the difference between survivors and non-survivors for mean ♂w1 values. In this example, phenotypic selection for ♂P50 is small but positive (+0.011) because individuals who survive to 50 generate slightly more fitness before 50 than those who do not survive. A negative phenotypic selection gradient for this trait would not have been illogical. Indeed, this might be expected when early fitness is associated with large costs to mid-life survival.
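The weighting step in the ♂w2 example is simple arithmetic: the partial regression coefficient is scaled by the fraction of the whole population that expresses the trait. A minimal sketch, using the figures quoted in the text and an illustrative function name:

```python
def conditioned_gradient(partial_coefficient, fraction_expressing):
    """Selection gradient for a trait expressed by only part of the population."""
    return partial_coefficient * fraction_expressing


fraction_male = 0.5            # half of all births are male
male_survival_to_50 = 0.743    # fraction of male births surviving to age 50

beta_male_w2 = conditioned_gradient(
    partial_coefficient=1.0,
    fraction_expressing=fraction_male * male_survival_to_50,
)
print(f"{beta_male_w2:.4f}")
```

The product of the rounded inputs is 0.3715, which agrees with the reported gradient of 0.371 to within rounding of the survival fraction.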
Genetic and environmental variance and covariance estimation
Human studies of lifespan heritability have traditionally used either twin-based29,30,57 or family clustering58,59,60 approaches. Twin-based approaches account for the otherwise misleading effects of shared environments, but appropriate datasets are rare. In contrast, family clustering approaches are applicable to a wider range of datasets, but they are vulnerable to confounding by common environments. Neither approach uses all available information efficiently when large pedigrees contain individuals with many different degrees of relatedness. ‘Animal models’, a form of linear mixed-effects model, offer an alternative approach to decomposing phenotypic variances and covariances into additive genetic and environmental components18,61. This approach uses pedigrees to construct matrices containing the pairwise relatedness between all individuals, which allows the most efficient possible use of all available phenotypes.
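As an illustration of how a pedigree yields a pairwise relatedness matrix, the following minimal sketch uses the standard tabular method for the additive relationship matrix. It is not the code used in this study (the analyses here were run in ASReml); the function name, the toy pedigree and the individual labels are hypothetical.

```python
import numpy as np

def additive_relationship_matrix(pedigree):
    """Tabular method. `pedigree` is a list of (id, father, mother) tuples
    ordered so that parents precede offspring; unknown parents are None."""
    index = {ind: i for i, (ind, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    A = np.zeros((n, n))
    for i, (_, father, mother) in enumerate(pedigree):
        f = index.get(father)  # None when the parent is unknown
        m = index.get(mother)
        for j in range(i):
            # Relatedness to an earlier individual is the average of the
            # parents' relatedness to that individual.
            a = 0.0
            if f is not None:
                a += 0.5 * A[j, f]
            if m is not None:
                a += 0.5 * A[j, m]
            A[i, j] = A[j, i] = a
        # The diagonal exceeds 1 only when the parents are related (inbreeding).
        inbreeding = 0.5 * A[f, m] if f is not None and m is not None else 0.0
        A[i, i] = 1.0 + inbreeding
    return A, index

# Hypothetical pedigree: two grandparents, two of their children (M1, M2),
# two unrelated spouses (S1, S2) and three grandchildren.
pedigree = [
    ("G1", None, None), ("G2", None, None),
    ("S1", None, None), ("S2", None, None),
    ("M1", "G1", "G2"), ("M2", "G1", "G2"),
    ("C1", "S1", "M1"), ("C2", "S2", "M2"), ("C3", "S1", "M1"),
]
A, idx = additive_relationship_matrix(pedigree)
```

In this toy pedigree the full siblings C1 and C3 have relatedness 0.5 and the first cousins C1 and C2 have relatedness 0.125. A matrix of this kind supplies the covariance structure of the additive genetic random effect in an ‘animal model’.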
The mixed-effects approach allows simultaneous estimation of fixed effects that contribute to phenotypic variance and that would confound estimates of genetic (co)variation if left out of the model. The random effects generally include additive genetic and environmental effects (residuals), but when the models are specified to include effects associated with shared mothers, they can partition the residual variance further into maternal effect variance and a new residual effect variance. It should be emphasized that while these maternal effects can include the influence that the mothers have on the phenotype of their offspring beyond the genes that they transmit, they will also include other aspects of the environment that are shared by individuals with the same mothers (for example, socioeconomic status shared among siblings). An important feature of this study is that the mixed model can be specified so as to partition the maternal effect variance into two more components: (1) the maternal IGE variance (the part of the maternally produced environmental variance that is heritable) and (2) the maternal indirect environmental effect variance (the part of the maternally produced variance that is not heritable). Following this same logic, models can be further specified to include the grandmaternal effect, and these can be likewise partitioned into grandmaternal IGEs and grandmaternal indirect environmental effects. For these models, the grandmaternal indirect environmental effect variance includes the effects of environment common to all individuals who share the same grandmother. The residual variance is generated by environmental variance due to effects that are not shared by siblings (in all models) or by cousins (in the models that include grandmaternal effect terms).
Associations between genetic relatedness and common environmental effects have the potential to bias estimates of additive genetic variance if the source of the common environment is not specified in the mixed models. For example, individuals who live in common areas but happen to share a great-grandparent might resemble each other more than would be expected from sharing 1/32 of their genes. If the effects of area are not included in the model, estimates of genetic variance are inflated unless great-grandparental effects are fit. The pedigree depth of four generations used here is sufficient to discriminate between genetic and non-genetic causes of phenotypic similarity among first cousins, but common environments shared between more distantly related individuals could, in principle, bias our results. However, our models find very small and statistically non-significant grandparental and maternal genetic effects. Failing to include these in the models had no material effect on our estimates of additive genetic variance. Given that common environmental effects between first cousins are unimportant, it seems unlikely that a common environment shared between more distantly related individuals would bias our results in meaningful ways.
Grandpaternal effects were not modelled in this study for two reasons. First, grandmaternal and grandpaternal effects are likely to be conflated in tractable mixed models, and the grandmaternal effect variance that we estimated already accounts for these sources of phenotypic variation. Second, what we identify as ‘grandmaternal’ effects are both very small and lacking evidence for a heritable basis, and therefore decomposing this variance was unlikely to reveal any interesting genetic covariation. Any environmental effects common to individuals with a shared grandfather contribute to the grandmaternal indirect environmental effect variance or the maternal indirect environmental effect variance (for models with and without fitted grandmaternal effects, respectively).
Our candidate fixed effects were year of birth, age of mother at birth and three parameters that described the birth order among siblings with a shared mother (that is, the number of older siblings, number of individuals born in the same year and number of younger siblings). For sex-specific LS50 and all fitness traits, we used ASReml 4.0 (ref. 62) to fit univariate fixed-effects models using the set of candidate fixed effects as factors. This software implements restricted maximum likelihood to estimate the fixed and random effects jointly. While not all fixed effects had a significant effect on all traits (as determined by Wald tests), every fixed effect had a significant effect on at least one trait (Supplementary Table 2). As random-effect variances estimated from mixed models are conditioned on the fixed effects, we used the entire set of candidate fixed effects in all subsequent models to simplify the interpretation of genetic architecture.
Univariate ‘animal models’
For all analysed traits, variance components and random-effect structures were first investigated using univariate ‘animal models’ of the general form

$$\mathbf{y} = \mu + \mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{u} + \mathbf{e} \qquad (1)$$
where y is a vector of phenotypes, μ is the mean, b is a vector of the fixed effects described in the previous section, u is a vector of random effects, X and Z are design matrices linking individual records to the appropriate fixed and random effects, and e is a vector of residual errors. For each trait, models were fit with one random effect corresponding to additive genetic effects (in addition to the residual effects), and the Akaike Information Criterion (AIC) value was measured from this fit. If this AIC value was lower than the AIC value derived from the model with residuals defined as the only random effect, then a new model was fit that added maternal effects as an additional random effect. If this yielded an even lower AIC value, the maternal effect term was replaced with a maternal genetic and a maternal residual term, and a new model was fit. For each trait, we used the AIC values to define the best model (Supplementary Table 1), and the random effects included within these were incorporated into the subsequent multivariate analyses. Because, as expected, the best model of ♀w2 had no additive genetic effect variance, this trait was not included in further analyses. For w, the model with maternal genetic effect variance fit slightly better than the model with unspecified maternal effects. In this case, an additional model was considered in which paternal grandmother and maternal grandmother effects replaced maternal genetic effects. This dual-grandmother effect model provided the best AIC values, but follow-up models that sought to partition these into genetic and non-genetic grandmother effects failed to produce meaningful results. AIC and likelihood ratio tests were used to select best models and to test for significant variance terms in all univariate analyses.
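The stepwise model progression described above can be sketched as a simple decision rule over AIC values. The function name and the model labels are hypothetical, and the rule assumes that the final, most complex model is retained only when its AIC is lower than that of the model before it; the AIC values themselves would come from the fitted mixed models.

```python
def select_random_effects(aic):
    """Return the label of the preferred random-effect structure.

    `aic` maps hypothetical model labels to AIC values (lower is better):
      'residual_only' : residuals as the only random effect
      'A'             : additive genetic + residual
      'A+M'           : 'A' plus an unspecified maternal effect
      'A+Mg+Me'       : 'A' plus maternal genetic and maternal residual terms
    A more complex model is consulted only if the previous step improved AIC.
    """
    if aic["A"] >= aic["residual_only"]:
        return "residual_only"
    if aic["A+M"] >= aic["A"]:
        return "A"
    if aic["A+Mg+Me"] >= aic["A+M"]:
        return "A+M"
    return "A+Mg+Me"

# Illustrative (made-up) AIC values: maternal effects do not improve the fit,
# so the additive genetic model is retained and 'A+Mg+Me' is never needed.
chosen = select_random_effects({"residual_only": 100.0, "A": 95.0, "A+M": 96.0})
```

Because each step is reached only after the previous one has lowered the AIC, the more complex models need not be fitted for traits that fail an earlier step.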
Multivariate ‘animal models’—total genetic selection for LS50
We estimated genetic covariances between sex-specific LS50 and w using trivariate equivalents of the ‘animal models’ represented in equation (1). In these models, y is a matrix of phenotypes for the traits of interest, and μ is a vector of means, one for each phenotypic trait. Each model included fixed, additive genetic and residual effects for all traits. Maternal effects for fitness were also included. Three models were compared: (1) unconstrained genetic covariances; (2) genetic covariance between fitness and female lifespan constrained to be zero; and (3) genetic covariance between fitness and male lifespan constrained to be zero (Supplementary Table 3). The comparison of models 1 and 2 tests whether the genetic covariance between fitness and female lifespan is greater than zero, and the comparison of models 1 and 3 provides the same test for male lifespan.
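One common way to compare an unconstrained model with a nested model in which a single covariance is fixed at zero is a likelihood ratio test. The sketch below assumes a chi-squared reference distribution with one degree of freedom; this paragraph does not state which reference distribution was used for these comparisons, and the function name and log-likelihood values are hypothetical.

```python
import math

def lrt_one_df(loglik_unconstrained, loglik_constrained):
    """Likelihood ratio test for one constrained parameter.
    Assumes the statistic follows a chi-squared distribution with 1 d.f."""
    stat = 2.0 * (loglik_unconstrained - loglik_constrained)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # For 1 d.f., the chi-squared upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Made-up log-likelihoods for an unconstrained and a zero-covariance model.
stat, p_value = lrt_one_df(-998.08, -1000.0)
```

With these illustrative values the test statistic is 3.84, which sits at the conventional 5% threshold for one degree of freedom.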
Multivariate ‘animal models’—components of genetic selection for LS50
Genetic covariances between sex-specific LS50 and w were explored on a finer scale by replacing w with heritable fitness determinants in a multivariate ‘animal model’. As the univariate analyses found significant maternal effect contributions only for ♂w1, ♀w1 and ♂P50, maternal effects were fitted only for these traits in the multivariate analyses. The full multivariate model failed to converge when ♂P50 was included, but it successfully converged when this trait was removed. Thus, to estimate genetic covariances between ♂P50 and LS50 in both sexes and all sex-specific heritable fitness components, we used six independent bivariate analyses. The results from all seven models are shown in Table 3.
As the larger multivariate model took many weeks to converge, we judged that complete hypothesis testing involving all constrained versions of this model was impractical. Pairwise bivariate models were used instead. For each trait pair, three ‘animal models’ were fit: (1) a model with unconstrained genetic covariances; (2) a model with genetic covariance constrained to be zero and (3) a model with genetic correlations constrained to be ± 0.9999 (depending on the direction of the genetic correlation estimated by the unconstrained model). The AIC values for all of these models were compared (Supplementary Table 4).
Decadal cohort analyses
Decadal cohort analyses were performed as described in Univariate ‘animal models’, except that survival to 16 years of age (P16) was the only trait considered, and univariate models were applied independently to individuals born in each of the decades between 1890 and 1949 (plus siblings). As before, individuals with insufficient data to fit fixed effects or to define exact age at death were excluded from the analyses, but complete reproductive histories were not required for inclusion in the decadal cohort analyses of survival to 16. Relevant sample sizes for all cohorts are given in Supplementary Table 5. Also, once the model progression indicated the presence of additive genetic effect variance for P16 (as it did for all cohorts), two new models were fit. The first replaced the maternal term with maternal genetic and environment terms. The second additional model kept the maternal term and added a maternal grandmother term. If the second model was preferred (this happened only once), a final model was fit that replaced the maternal grandmother term with maternal grandmother genetic and environment terms. AIC and likelihood ratio tests were used to select best models for each cohort (Supplementary Table 6).
Method for predicting a univariate response to selection
Two methods are widely used for predicting evolutionary change caused by natural selection over one generation. The univariate ‘breeder’s equation’63,64 uses the product of a selection gradient and the additive genetic variance associated with a trait of interest. The ‘Robertson–Price identity’65,66 instead uses the genetic covariance between relative fitness and the trait of interest. Under ideal circumstances, when the trait of interest does not correlate with another trait that has a causal relationship with fitness, the two approaches yield the same result. However, under more realistic conditions, such as when one wishes to predict a response to selection in a wild or otherwise uncontrolled population, the ‘breeder’s equation’ can yield misleading results67, and the ‘Robertson–Price identity’ is recommended68. There appears to be no advantage to using the univariate ‘breeder’s equation’ when the means exist to estimate the genetic covariance between relative fitness and the trait of interest. The present study adopts the ‘Robertson–Price’ approach to estimate a response to selection for post-50 lifespan (in the section ‘Estimating net genetic selection for late-life lifespan’) because genetic correlations between lifespan and traits with causal effects on fitness are central to evolutionary models of post-reproductive lifespan (see Fig. 1). No such issues exist with our application of the ‘multivariate breeder’s equation’ in the section ‘Distinguishing evolutionary models of late-life lifespan’ because all possible fitness traits are considered simultaneously in our estimate of selection gradients (the multiple coefficient of determination for the regression of fitness on all fitness traits is one).
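For reference, the predictors discussed in this section can be written in their standard textbook forms. The notation is introduced here for illustration and is not taken from the main text: z is the trait, w is relative fitness, β is a selection gradient, σ²_A is an additive genetic variance, σ_A(·,·) is an additive genetic covariance and G is the additive genetic (co)variance matrix.

```latex
% Univariate breeder's equation: selection gradient times additive genetic variance
\Delta\bar{z} = \sigma^{2}_{A}(z)\,\beta_{z}

% Robertson–Price identity: additive genetic covariance between trait and relative fitness
\Delta\bar{z} = \sigma_{A}(z, w)

% Multivariate breeder's equation: vector of responses from G and the gradient vector
\Delta\bar{\mathbf{z}} = \mathbf{G}\,\boldsymbol{\beta}
```

The first two expressions agree when no unmeasured trait that is genetically correlated with z also affects fitness; otherwise the covariance form captures indirect selection that the univariate gradient form misses.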
Human research participants
This study complied with all relevant ethical regulations. An ethical review of this study was provided by an institutional review board and administered through the office of the Vice President for Research at the University of Utah. Informed consent was impossible as all subjects were deceased.
The data that support the findings of this study are available from the Pedigree and Population Resource of the Huntsman Cancer Institute, University of Utah. Restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. However, data are available from the authors upon reasonable request and with permission from the Huntsman Cancer Institute.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank the Pedigree and Population Resource of the Huntsman Cancer Institute, University of Utah (funded in part by the Huntsman Cancer Foundation) for its role in the ongoing compilation, maintenance and support of the Utah Population Database. We also thank K. Smith for providing the data used in this study. C.A.W. was funded by a Natural Environment Research Council postdoctoral fellowship (NE/I020245/1) and a University of Edinburgh Chancellor’s fellowship. We thank A. Gilmour, J. Hadfield and A. Wilson for helpful technical advice. Comments from J. Pemberton, L. Kruuk, D. Nussey, P. Smiseth and B. Whittaker greatly improved the paper.
Electronic supplementary material
Supplementary figures and tables