Abstract
For species with overlapping generations, the most widely used method to calculate effective population size (N_{e}) is Hill’s, the key parameter for which is lifetime variance in offspring number (\({V}_{k\bullet }\)). Hill’s model assumes a stable age structure and constant abundance, and sensitivity to those assumptions has been evaluated previously. Here I evaluate the robustness of Hill’s model to extreme patterns of reproductive success, whose effects have not been previously examined: (1) very strong reproductive skew; (2) strong temporal autocorrelations in individual reproductive success; and (3) strong covariance of individual reproduction and survival. Genetic drift (loss of heterozygosity and increase in allele frequency variance) was simulated in age-structured populations using methods that generated no autocorrelations or covariances (Model NoCor); or created strong positive (Model Positive) or strong negative (Model Negative) temporal autocorrelations in reproduction and covariances between reproduction and survival. Compared to Model NoCor, the other models led to greatly elevated or reduced \({V}_{k\bullet }\), and hence greatly reduced or elevated N_{e}, respectively. A new index is introduced (ρ_{α},_{α+}), which is the correlation between (1) the number of offspring produced by each individual at the age at maturity (α), and (2) the total number of offspring produced during the rest of their lifetimes. Mean ρ_{α},_{α+} was ≈0 under Model NoCor, strongly positive under Model Positive, and strongly negative under Model Negative. Even under the most extreme reproductive scenarios in Models Positive and Negative, when \({V}_{k\bullet }\) was calculated from the realized population pedigree and used to calculate N_{e} in Hill’s model, the result accurately predicted the rate of genetic drift in simulated populations. These results held for scenarios where age-specific reproductive skew was random (variance ≈ mean) and highly overdispersed (variance up to 20 times the mean). Collectively, these results are good news for researchers as they demonstrate the robustness of Hill’s model even in extreme reproductive scenarios.
Introduction
The majority of populations in nature are age structured, but most evolutionary theory was originally developed for models that assume discrete generations. Considerable progress has been made in adapting discrete-generation theory to account for age structure (e.g., Charlesworth 1994; Cushing 1994; Lande et al. 2003), but this process remains challenging. One of the most important parameters in evolutionary biology is effective population size (N_{e}), which, in addition to determining the rate of genetic drift, modulates the effectiveness of natural selection and hence the rate of evolutionary adaptation. Wright’s original concept of N_{e} (1931, 1938) assumed that generations were discrete. Of the various methods researchers have proposed for incorporating age structure into the concept of effective size, that of Hill (1972) is the most widely used. Hill showed that if the age structure is stable and the population produces a constant number (N_{1}) of offspring in each cohort, N_{e} per generation is given by
\({N}_{e}=\frac{4{N}_{1}T}{{V}_{k\bullet }+2},\) (1)
where T is generation length (average age of parents of the newborn cohort) and \({V}_{k\bullet }\) is the variance in lifetime reproductive success (LRS), measured as the variance in lifetime number of offspring among the N_{1} individuals in a cohort.
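As a quick numerical check of Eq. (1), the sketch below evaluates N_{e} = 4N_{1}T/(\({V}_{k\bullet }\) + 2), the form implied by the worked arithmetic reported in the Results for Scenario LowSkew (N_{1} = 400 counting both sexes, T = 4.84, \({V}_{k\bullet }\) = 10.18). The function name `hill_ne` is illustrative and is not taken from the original study.

```python
def hill_ne(n1: float, gen_length: float, vk_lifetime: float) -> float:
    """Hill's (1972) effective size per generation: 4 * N1 * T / (Vk + 2)."""
    return 4.0 * n1 * gen_length / (vk_lifetime + 2.0)


# Scenario LowSkew inputs quoted in the Results (both sexes combined)
ne_low_skew = hill_ne(n1=400, gen_length=4.84, vk_lifetime=10.18)
print(round(ne_low_skew))
```

With these inputs the function reproduces the value of 636 given in the text. Because T appears in the numerator and \({V}_{k\bullet }\) in the denominator, a longer generation length raises N_{e} and a larger lifetime variance lowers it.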
The assumption in Hill’s model of stable age structure and constant N has been extensively evaluated with diverse life histories and found to be robust to random demographic variability (Waples et al. 2011; 2014); these evaluations also found Hill’s model to be robust to skewed adult sex ratios and to modestly overdispersed variance in reproductive success. However, the models used in those evaluations assumed independence of reproduction and survival, which means that the robustness of Hill’s model to temporal autocorrelation in reproduction and covariance of reproduction and survival has not been rigorously tested. This is an important gap, for two major reasons.
First, temporal autocorrelations and covariances are common in many species. The theory of life-history evolution is based on the premise that biological constraints impose intrinsic trade-offs between reproduction now and reproduction later, and between reproduction and subsequent survival (Williams 1966; Bell 1980; Reznick 1992; Roff 1992). These trade-offs imply a negative temporal autocorrelation in individual reproduction and a negative covariance between reproduction and survival. In the real world, however, the opposite patterns also can be found, and these patterns can be explained (and even expected) when one accepts the possibility that individuals are not interchangeable (Van Noordwijk and De Jong 1986). Persistent individual differences in reproductive success (Lee et al. 2011) are generally taken to reflect individual differences in “quality”, which is a rather slippery concept but is generally thought to be positively correlated with fitness (Wilson and Nussey 2010). Individual differences in quality can be influenced by both genetic and environmental factors (Byholm et al. 2007) and can persist across many reproductive seasons [e.g., owing to long-lasting maternal effects (Mousseau and Fox 1998; Kruuk 2004) or mating dynamics (McElligott and Hayden 2000; Pelletier et al. 2006)]. In species for which fecundity increases with age (as applies to most ectotherms with indeterminate growth), persistent individual differences are likely the rule rather than the exception: individuals that are large for their age early in life will generally have relatively high reproductive success, and they also are likely to be large for their age in later years (Waples and Feutry 2022). Likewise, if individual quality is high for both reproduction and survival, that can offset any costs of reproduction and lead to a positive covariance (Smith 1981; Pelletier et al. 2006), a result that also can occur if mortality is anthropogenically modulated (e.g., if hunters avoid killing female moose with calves; Lee et al. 2020).
The second factor is that although monitoring evolutionary dynamics in age-structured populations remains logistically challenging, especially for long-lived species, the recent genomics revolution has greatly increased our ability to reconstruct population pedigrees using noninvasive genetic samples. Consider, then, the following scenario:

A researcher is conducting a long-term pedigreed study of their focal species.

The pedigree data are sufficient to robustly estimate generation length and variance in LRS.

The biology of the focal species is such that it generates substantial autocorrelation of reproductive success and/or covariance of reproduction and survival.
A key question then becomes, “If the empirical estimates of T and \({V}_{k\bullet }\) are used in Eq. (1) to estimate N_{e}, will the result accurately reflect the rate of genetic drift in the population?”
The goal of this study is to answer this question. Using hypothetical vital rates for populations with overlapping generations, multiyear population pedigrees are simulated according to three models that (a) simulate independence of reproduction and survival over time (Model “NoCor”); (b) generate strong positive autocorrelations and covariances (Model “Positive”); and (c) generate strong negative autocorrelations and covariances (Model “Negative”). Each model is simulated under three levels of reproductive skew: low, medium, and high. Along each simulated pedigree, genetic variation is tracked at a number of loci, and the rates of genetic drift are quantified by two common metrics: rate of decline in heterozygosity, and rate of increase in allele frequency variance. These observed rates are compared with expected rates based on N_{e} calculated from the population pedigree using Eq. (1).
Methods
Population demography
The core evaluations modeled reproduction and genetic change in a hypothetical population with age at maturity α = 3 and maximum age ω = 10, which produced a maximum adult lifespan of AL = 8 years (ages 3–10, inclusive; see core vital rates in Table 1). Population dynamics followed the discrete-time, birth-pulse model of Caswell (2001), where individuals that reach age x produce on average b_{x} offspring and then survive to age x + 1 with probability s_{x}. Because previous work had evaluated sensitivity to the constant-size assumption, to limit the number of potentially confounding variables, age structure was fixed and defined by the vector of cumulative survival through age x, l_{x}. Offspring were enumerated at age 1, so setting l_{1} = 1 and letting N_{1} be the number of offspring in each cohort that reach age 1, the numbers in each successive age class (x = 2, …, ω) are given by N_{x} = N_{1}l_{x}. For the core life table in Table 1, N_{1} = 200 and the full vector of age-class abundance was (rounded to the nearest integer): N_{x} = [200, 140, …, 8]. Total abundance was ΣN_{x} = 649, of which 340 were juveniles, so the adult census size was N_{Adult} = 309. These numbers apply to a single sex; sex ratio was 1:1 in the core analyses, so the total population size was twice as large.
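The relationship N_{x} = N_{1}l_{x} can be illustrated with a short sketch. Table 1 is not reproduced here, so the survival schedule below is an assumption: a constant annual survival of 0.7, chosen because it matches the first two entries of the abundance vector (200, 140) and happens to reproduce the reported totals. The function name is hypothetical.

```python
def age_class_numbers(n1: int, survival: float, max_age: int) -> list[int]:
    """N_x = N_1 * l_x, with l_1 = 1 and l_x = survival**(x - 1), rounded."""
    return [round(n1 * survival ** (x - 1)) for x in range(1, max_age + 1)]


ALPHA, OMEGA = 3, 10                       # age at maturity, maximum age
n_x = age_class_numbers(200, 0.7, OMEGA)   # 0.7 is an ASSUMED survival rate
juveniles = sum(n_x[:ALPHA - 1])           # ages 1 and 2
adults = sum(n_x[ALPHA - 1:])              # ages 3 through 10
print(n_x, sum(n_x), juveniles, adults)
```

Under this assumed schedule the totals agree with those in the text (649 individuals per sex, 340 juveniles, 309 adults). The actual Table 1 survival rates may still differ from a constant 0.7.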
The core life table allowed patterns of age-specific fecundity and reproductive skew to differ between the sexes. In all analyses, fecundity was constant with age in females and reproduction followed a Poisson process (all ϕ = 1), so every year all females behaved like a single Wright-Fisher population with essentially random variation in offspring number. In males, three reproductive-skew levels were modeled. In LowSkew, b_{x} and ϕ_{x} were identical to values for females. In ModerateSkew, fecundity for males was proportional to age, and reproductive skew was moderately strong for all ages (ϕ_{x} = 5, indicating that variance in offspring number was 5 times the mean). In HighSkew, male b_{x} was also proportional to age, and reproductive skew was very strong (all ϕ_{x} = 20). Fecundities were scaled to values that would produce a stable population by ensuring that Σb_{x}l_{x} = 2. With fecundities so scaled, the expected number of offspring produced in each time period by each age class within each sex was B_{x} = b_{x}N_{x}, with ΣB_{x} = 2N_{1} = 400 total offspring, of which half are male and half female.
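The scaling step Σb_{x}l_{x} = 2 and the resulting generation lengths can be sketched as follows. The sketch assumes a constant annual survival of 0.7 (an assumption, since Table 1 is not shown here) and takes generation length as the mean age of parents weighted by l_{x}b_{x}. All function names are illustrative.

```python
def cumulative_survival(survival: float, max_age: int) -> list[float]:
    """l_x for ages 1..max_age, with l_1 = 1 (ASSUMED constant annual survival)."""
    return [survival ** (x - 1) for x in range(1, max_age + 1)]


def scale_fecundity(relative_b, l_x, target=2.0):
    """Rescale relative fecundities so that sum(b_x * l_x) equals target."""
    total = sum(b * l for b, l in zip(relative_b, l_x))
    return [b * target / total for b in relative_b]


def generation_length(b_x, l_x):
    """T = sum(x * l_x * b_x) / sum(l_x * b_x), with ages starting at 1."""
    num = sum(x * l * b for x, (l, b) in enumerate(zip(l_x, b_x), start=1))
    den = sum(l * b for l, b in zip(l_x, b_x))
    return num / den


ALPHA, OMEGA = 3, 10
ages = range(1, OMEGA + 1)
l_x = cumulative_survival(0.7, OMEGA)

flat = [1.0 if x >= ALPHA else 0.0 for x in ages]         # constant with age
by_age = [float(x) if x >= ALPHA else 0.0 for x in ages]  # proportional to age

b_const = scale_fecundity(flat, l_x)
b_prop = scale_fecundity(by_age, l_x)
t_const = generation_length(b_const, l_x)
t_prop = generation_length(b_prop, l_x)
print(round(t_const, 2), round(t_prop, 2))
```

Under the assumed survival schedule, constant fecundity gives T ≈ 4.84 and age-proportional fecundity gives T ≈ 5.59. These agree with the sex-specific generation lengths reported in the Results.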
Modeling reproduction
To ensure that the realized distribution of offspring numbers closely approximated the parametric values in the life table, an algorithm was developed (NegBinom) that used a negative-binomial simulator to generate random vectors of offspring numbers (k) expected to have the desired mean and variance. A disadvantage of this approach is that the realized mean of the simulated distribution is a random variable; as a consequence, the total number of offspring produced (Σk) also varies randomly and only by chance equals the target number. To maintain a constant total number of offspring in each cohort, the following procedure was used:

For each age x, the rnbinom function in R was used to generate N_{x} random k values, using the parameterization mu = b_{x} and size = b_{x}^{2}/(V_{k(x)} − b_{x}), with V_{k(x)} = ϕ_{x}b_{x}. This function requires V_{k(x)} > b_{x}, so for LowSkew scenarios with V_{k(x)} = b_{x} the rpois function was used instead.

The total length of all the age-specific k vectors was compared with the target cohort size = 2N_{1}. This process was repeated until the total length fell in the range [2N_{1}, 1.05 × 2N_{1}]. At that point, the vectors of offspring numbers were converted to a list of parental IDs and this list was randomly downsampled (if necessary) to reach exactly 2N_{1} offspring.

This entire process was repeated for the opposite sex.
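The procedure above can be sketched in Python using only the standard library. This is an approximation of the approach, not the author's R code: the negative binomial is drawn as a gamma-Poisson mixture, which gives the same mean and variance as the `mu`/`size` parameterization, and a simple Poisson sampler stands in for `rpois`. The abundance vector, the flat fecundity of 1.3, and all function names are illustrative assumptions.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold, k, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def offspring_number(mu: float, phi: float, rng: random.Random) -> int:
    """Draw k with mean mu and variance phi * mu (Poisson when phi == 1)."""
    if mu <= 0.0:
        return 0
    if phi <= 1.0:
        return poisson(mu, rng)
    size = mu / (phi - 1.0)                  # equals mu**2 / (phi * mu - mu)
    lam = rng.gammavariate(size, mu / size)  # gamma with shape=size, scale=mu/size
    return poisson(lam, rng)


def draw_parents(n_x, b_x, phi_x, target, rng, tol=0.05):
    """Return exactly `target` parent IDs (age, index) for one sex."""
    while True:
        k = {(x, i): offspring_number(b, phi, rng)
             for x, (n, b, phi) in enumerate(zip(n_x, b_x, phi_x), start=1)
             for i in range(n)}
        total = sum(k.values())
        if target <= total <= (1.0 + tol) * target:
            break                            # accept this realization
    parents = [pid for pid, count in k.items() for _ in range(count)]
    return rng.sample(parents, target)       # random downsampling to the target


rng = random.Random(1)
n_x = [round(200 * 0.7 ** (x - 1)) for x in range(1, 11)]  # ASSUMED survival 0.7
b_x = [0.0, 0.0] + [1.3] * 8                               # illustrative fecundity
phi_x = [5.0] * 10                                         # ModerateSkew-like
parents = draw_parents(n_x, b_x, phi_x, target=400, rng=rng)
print(len(parents))
```

Rejecting realizations outside [2N_{1}, 1.05 × 2N_{1}] and then downsampling keeps the cohort size fixed. The few offspring removed are chosen at random, so the realized distribution of k is only slightly altered.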
Two related issues require consideration:

1. Does reproduction in one time period affect reproduction in any subsequent time period?

2. Does reproduction in one time period affect the probability of survival to subsequent time periods?
To see the consequences of a positive answer to either of these questions, note that the expected lifetime reproductive success [E(k•)] of an individual is a simple additive function of its age-specific fecundity and its age at death (q):
\(E\left({k{\bullet }}_{q}\right)={\sum }_{x=1}^{q}{b}_{x}.\) (2)
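The variance of a sum, however, also depends on the covariance structure of the terms being added: var(A + B) = var(A) + var(B) + 2cov(A,B). More generally, when applied to the variance of a sum of q terms:
\(\mathrm{var}\left({k{\bullet }}_{q}\right)={\sum }_{x=1}^{q}\mathrm{var}\left({k}_{x}\right)+2{\sum }_{x=1}^{q-1}{\sum }_{y=x+1}^{q}\mathrm{cov}\left({k}_{x},{k}_{y}\right).\)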
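Since \(\mathrm{var}\left({k}_{x}\right)={{{\phi }}_{x}b}_{x}\), the above can be written as follows:
\(\mathrm{var}\left({k{\bullet }}_{q}\right)={\sum }_{x=1}^{q}{{\phi }}_{x}{b}_{x}+2{\sum }_{x=1}^{q-1}{\sum }_{y=x+1}^{q}\mathrm{cov}\left({k}_{x},{k}_{y}\right).\) (3)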
Equation (3) gives the variance in offspring number among individuals that die at a single age (q). The lifetime variance in offspring number among all individuals in a cohort can be obtained using the definition of variance as E(x^{2}) – [E(x)]^{2}. In the current notation,
where D_{q} is the number of individuals that die after reaching age q and \({{SS}}_{q}={D}_{q}[\mathrm{var}\left({k{\bullet }}_{q}\right)+{\left({\sum }_{x=1}^{q}{b}_{x}\right)}^{2}]\) is the sum of the squared offspring numbers for these D_{q} individuals. If we ignore individuals that die before reaching age at maturity, \(\varSigma \left({D}_{q}\right)\) is the number of individuals in each cohort that reach adulthood. The total sum of squares is obtained by summation: \({{SS}}_{T}=\Sigma {({SS}}_{q})\), and the lifetime variance in offspring number is calculated as follows:
\({V}_{k\bullet }=\frac{{{SS}}_{T}}{\varSigma \left({D}_{q}\right)}-{\left({\bar{k}}_{\bullet }\right)}^{2},\) (4)
where \({\bar{k}}_{\bullet }\) is the mean lifetime number of offspring among the same individuals.
The AgeNe model (Waples et al. 2011) calculates \({V}_{k\bullet }\) and N_{e} using agespecific vital rates from an expanded life table and assuming that expected values of all the covariance terms are zero.
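To make the zero-covariance calculation concrete, the sketch below groups individuals by age at death, sums means and variances across ages within each group, and applies the E(x^{2}) – [E(x)]^{2} definition across the whole cohort. It is a simplified stand-in for AgeNe, not the published software. It works with cohort fractions rather than counts, and it assumes a constant annual survival of 0.7 and constant adult fecundity with ϕ = 1 (assumptions chosen to resemble Scenario LowSkew).

```python
def lifetime_vk_nocor(l_x, b_x, phi_x):
    """Lifetime mean and variance of offspring number with all covariances = 0.

    For individuals dying after age q, the mean is sum(b_x) and the variance
    is sum(phi_x * b_x) over ages 1..q; groups are weighted by the fraction of
    the cohort dying after each age.
    """
    omega = len(l_x)
    mean_total = sumsq_total = 0.0
    cum_mean = cum_var = 0.0
    for q in range(1, omega + 1):
        cum_mean += b_x[q - 1]
        cum_var += phi_x[q - 1] * b_x[q - 1]
        next_l = l_x[q] if q < omega else 0.0
        d_q = l_x[q - 1] - next_l            # fraction dying after age q
        mean_total += d_q * cum_mean
        sumsq_total += d_q * (cum_var + cum_mean ** 2)
    return mean_total, sumsq_total - mean_total ** 2


ALPHA, OMEGA = 3, 10
l_x = [0.7 ** (x - 1) for x in range(1, OMEGA + 1)]   # ASSUMED survival 0.7
adult_l = sum(l for x, l in enumerate(l_x, start=1) if x >= ALPHA)
b_x = [2.0 / adult_l if x >= ALPHA else 0.0 for x in range(1, OMEGA + 1)]
phi_x = [1.0] * OMEGA

mean_k, vk = lifetime_vk_nocor(l_x, b_x, phi_x)
print(round(mean_k, 3), round(vk, 2))
```

Under these assumptions the lifetime mean is 2 and the variance is about 10.18. That matches the parametric \({V}_{k\bullet }\) reported for Scenario LowSkew and shows how much of the lifetime variance arises from differences in longevity alone.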
Developing analytical expectations for the covariance terms when they cannot be assumed to be zero is not a simple task, especially in any generalized form. However, the simulation algorithm can be tweaked to generate positive or negative correlations in reproductive success, as can be illustrated with a hypothetical example involving the lifetime reproductive success of a cohort of 20 individuals that is tracked for a 5-year maximum lifespan (Table 2). NAs in the table indicate individuals that were not alive at the specified age. In this example, 8 individuals died after age 1 and before reaching age 2, 4 died after age 2, and 3 each died after ages 3 and 4, leaving just 2 individuals that survived to age 5. Expected fecundity increased linearly with age (b_{x} = [1, 2, 3, 4, 5]), and the parametric within-age variance was 5 times the mean (ModerateSkew; ϕ = 5) for each age. To populate the table, vectors of offspring numbers were randomly generated for each age using the rnbinom function in R (R Core Team 2021) (generating 20 random values for age 1 with mean = 1 and variance = 5; 12 random values for age 2 with mean = 2 and variance = 10; etc.).
Results of one random realization of this process are shown on the left side of Table 2. At age 1, 2 individuals produced exactly 1 offspring, single individuals produced 2, 3, 5, and 6 offspring each, and the remaining 14 individuals produced no offspring. The “LRS” column shows that across the original cohort, lifetime offspring number ranged from 0 (6 individuals) to 31 (individual 5). Three factors contribute to the variance in LRS in this example. (1) Individuals that (by luck or pluck) survive to older ages have more opportunities to add to their LRS. (2) Because fecundity increases with age in this example, longer-lived individuals get an additional bonus because their reproductive success is higher in later years. (3) Variance in reproductive success is overdispersed within each age (ϕ > 1). Collectively, these factors cause lifetime \({V}_{k\bullet }\) (58.1) to be much higher than the mean (4.85). So far, this example has not implemented any covariance of survival and reproduction or any temporal autocorrelations in individual reproductive success over time. We refer to this as Model NoCor.
Temporal correlations in reproductive success are easy to generate by sorting the randomly generated vectors of offspring number before mapping them to individuals. The right side of Table 2 shows the results of sorting the k vector for each age such that the largest value is assigned to individual 1, the next largest to individual 2, and so on. This simple ploy accomplishes two things: (1) it creates persistent individual differences in reproductive success, which manifest as positive correlations between an individual’s reproductive output across time; and (2) it creates positive correlations between reproduction and subsequent survival, which enhance the strength of the persistent individual differences. This is Model Positive. The net result is that \({V}_{k\bullet }\) more than doubles (to 122.9) while the mean remains the same.
In empirical datasets, the pairwise covariance terms can be challenging to deal with because (a) they are very numerous for long-lived species, and (b) sample sizes are generally small for comparisons involving older age classes. Here a new metric is introduced (ρ_{α},_{α+}) that generally can be applied to all individuals in a cohort that reach age at maturity. This metric represents the Pearson correlation coefficient between two vectors: k_{α} = offspring number for all individuals at the age at first reproduction (α), and k_{α+} = LRS − k_{α} = lifetime reproductive success of the same individuals for all subsequent years. In the example for Model NoCor in Table 2, this correlation is slightly positive (ρ_{α},_{α+} = 0.14) but not significantly so (P > 0.5 for a two-tailed test). For the extreme Model Positive, in contrast, this correlation was close to unity (ρ_{α},_{α+} = 0.96; P < 0.001).
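A minimal sketch of how ρ_{α},_{α+} might be computed from per-individual records is given below. The function names and the toy input vectors are hypothetical; only the definition (the Pearson correlation between k_{α} and LRS − k_{α}) comes from the text.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def rho_alpha_plus(k_alpha, lrs):
    """Correlation between offspring number at age alpha and all later offspring."""
    k_plus = [total - first for first, total in zip(k_alpha, lrs)]
    return pearson(k_alpha, k_plus)


# Toy vectors (made up for illustration): later output is twice the first-year
# output in one case, and falls as first-year output rises in the other.
rho_pos = rho_alpha_plus([0, 1, 2, 5], [0, 3, 6, 15])
rho_neg = rho_alpha_plus([3, 2, 1, 0], [3, 3, 3, 3])
print(rho_pos, rho_neg)
```

Individuals that die before a second breeding season contribute k_{α+} = 0 rather than being dropped. This keeps both vectors at the full number of individuals that reached maturity.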
Negative temporal correlations in reproductive success are easy to generate by reversing the sorting process and assigning the largest k value each year to the individual with the highest ID number (Model Negative; see Supplementary Fig. S1). Since individuals with the highest ID numbers are the ones that die each year after reproduction, this ensures that new individuals get to reproduce each year, which in turn reduces \({V}_{k\bullet }\). Applying Model Negative to the simulated data in Table 2 reduced \({V}_{k\bullet }\) sharply (to 17.4) and led to a significantly negative correlation (ρ_{α},_{α+} = −0.43; P < 0.05; Supplementary Table S1).
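The sorting device behind Models Positive and Negative can be sketched as follows, using the cohort structure of the Table 2 example (20 individuals; 20, 12, 8, 5, and 2 alive at ages 1 to 5; b_{x} = [1, 2, 3, 4, 5]; ϕ = 5). The random draws will not reproduce the numbers in Table 2. The gamma-Poisson sampler, the use of the population variance, and the function names are assumptions made for illustration.

```python
import math
import random
from statistics import pvariance


def neg_binom(mu, phi, rng):
    """Gamma-Poisson draw with mean mu and variance phi * mu (needs phi > 1)."""
    size = mu / (phi - 1.0)
    lam = rng.gammavariate(size, mu / size)
    # Knuth's Poisson sampler, adequate for the small means used here
    threshold, k, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def lifetime_totals(k_by_age, n_cohort, model, rng):
    """Assign each age's k values to survivors and sum across ages.

    Survivors at each age are taken to be individuals 0..len(k)-1, so low
    indices live longest and the highest surviving indices die next.
    """
    lrs = [0] * n_cohort
    for k in k_by_age:
        if model == "positive":
            ordered = sorted(k, reverse=True)  # largest k to the longest-lived
        elif model == "negative":
            ordered = sorted(k)                # largest k to those about to die
        else:
            ordered = rng.sample(k, len(k))    # NoCor: random assignment
        for i, value in enumerate(ordered):
            lrs[i] += value
    return lrs


rng = random.Random(2022)
survivors = [20, 12, 8, 5, 2]
b_x = [1, 2, 3, 4, 5]
k_by_age = [[neg_binom(b, 5.0, rng) for _ in range(n)]
            for n, b in zip(survivors, b_x)]

results = {}
for model in ("nocor", "positive", "negative"):
    lrs = lifetime_totals(k_by_age, 20, model, rng)
    results[model] = (sum(lrs) / 20, pvariance(lrs))
    print(model, results[model])
```

All three models reuse the same offspring numbers, so the mean LRS is identical and only the variance changes. Sorting every age in the same direction gives the largest variance obtainable from those numbers, which is why Model Positive is an extreme case.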
To illustrate an alternative way to generate correlations between reproduction and survival, the analyses in the main text were repeated using a second simulation algorithm. TheWeight (Waples 2020, 2022a) is a generalized Wright-Fisher model that allows for unequal parental expectations of reproductive success, specified by a vector of parental weights, W. Details for how this algorithm was implemented are in Supplementary Information.
It is worth noting that Eq. (3) for \(\mathrm{var}\left({k{\bullet }}_{q}\right)\) (and by extension Eq. (4) for \({V}_{k\bullet }\)) does not contain any terms for the covariance of individual reproduction and survival. To the extent these covariances are nonzero, they can provide insights into key evolutionary processes. With respect to variance in reproductive success, however, any effects of these covariances manifest themselves as positive or negative autocorrelations in individual reproductive success over time, so for the analysis of \({V}_{k\bullet }\) it is sufficient to focus on these autocorrelation terms.
Tracking genetic drift
Table 2 illustrates how lifetime \({V}_{k\bullet }\) was calculated from simulated demographic data. To determine whether Hill’s N_{e} based on these \({V}_{k\bullet }\) values accurately predicted the rate of genetic drift, two common genetic metrics were monitored. The expected variance in allele frequency (\({V}_{p(t)}\)) after t generations of genetic drift is (Hedrick 2000):
\({V}_{p(t)}={p}_{0}\left(1-{p}_{0}\right)\left[1-{\left(1-\frac{1}{2{N}_{e}}\right)}^{t}\right],\) (5)
where p_{0} is the initial allele frequency. In isolated populations with no mutation, random changes in allele frequency also cause an increase in homozygosity over time, such that after t generations the expected amount of remaining heterozygosity (H_{t}) is a simple function of initial average heterozygosity (\({H}_{0}\)) and N_{e} (Crow and Kimura 1970):
\({H}_{t}={H}_{0}{\left(1-\frac{1}{2{N}_{e}}\right)}^{t}.\) (6)
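The two drift expectations can be evaluated with a few lines of code. The sketch below assumes the standard discrete-generation forms, \({V}_{p(t)}={p}_{0}(1-{p}_{0})[1-{(1-1/(2{N}_{e}))}^{t}]\) and \({H}_{t}={H}_{0}{(1-1/(2{N}_{e}))}^{t}\). The example inputs (N_{e} = 636, T = 4.84, 500 years) are taken from values quoted elsewhere in the text for Scenario LowSkew, and the function names are illustrative.

```python
def drift_factor(ne: float, t: float) -> float:
    """Fraction of heterozygosity expected to remain after t generations."""
    return (1.0 - 1.0 / (2.0 * ne)) ** t


def expected_heterozygosity(h0: float, ne: float, t: float) -> float:
    """Expected heterozygosity after t generations of drift."""
    return h0 * drift_factor(ne, t)


def expected_allele_freq_variance(p0: float, ne: float, t: float) -> float:
    """Expected variance in allele frequency after t generations of drift."""
    return p0 * (1.0 - p0) * (1.0 - drift_factor(ne, t))


years, gen_length, ne = 500, 4.84, 636
t = years / gen_length                 # elapsed generations, t = y / T
h_t = expected_heterozygosity(0.5, ne, t)
v_p = expected_allele_freq_variance(0.5, ne, t)
print(round(t, 1), round(h_t, 4), round(v_p, 4))
```

With p_{0} = 0.5 and H_{0} = 2p_{0}(1 − p_{0}) = 0.5, the two quantities are linked by V_{p(t)} = p_{0}(1 − p_{0}) − H_{t}/2. Loss of heterozygosity and gain in allele frequency variance therefore measure the same underlying process.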
To monitor these metrics, genetic variation was tracked at L unlinked, diallelic (~SNP) loci. Genotypes were recorded as [0, 1, 2], indicating the number of copies of the focal allele each individual carried. In each replicate simulation, the population was initialized by filling each age class with the appropriate number (N_{x}) of individuals. In year 0, all individuals were designated as heterozygotes (genotype “1”) at every locus, so initial allele frequencies were all p_{0} = 0.5. In year 1 and subsequent years, mean observed H_{t} and \({V}_{p(t)}\) were computed for all members of the newborn cohort. As a single episode of random mating is sufficient to establish Hardy-Weinberg genotypic ratios, mean H at year 1 was on average 0.5. Years were converted into generations using the relationship t = y/T, where y is elapsed time in years (T = generation length was 4.84 for Scenario LowSkew and 5.22 for the other scenarios where male fecundity increased with age).
Observed rates of genetic drift were compared with expected rates calculated in two ways. Under Model NoCor, where probabilities of reproduction and survival are independent over time, the expected variance in LRS and hence Hill’s N_{e} can be calculated from an expanded life table (age-specific survival, fecundity, and ϕ) using the AgeNe model (Waples et al. 2011). The resulting N_{e} was then used in Eqs. (5) and (6) to generate expected values for the two genetic drift indices. The second approach used the population pedigree from the simulations to calculate \({V}_{k\bullet }\) for each annual cohort of offspring, and from this, a “pedigree” N_{e} was calculated every year using Eq. (1). The harmonic mean pedigree N_{e} was then used in Eqs. (5) and (6) to predict expected rates of genetic drift based on the actual population pedigree.
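The second approach can be illustrated with a toy calculation: compute \({V}_{k\bullet }\) from the lifetime offspring counts of each cohort, convert each to an N_{e} with Eq. (1), and take the harmonic mean across cohorts. The two six-member cohorts and the generation length of 5 are made-up values for illustration only. The use of the population variance (dividing by N_{1}) is an assumption.

```python
from statistics import harmonic_mean, pvariance


def pedigree_ne(lrs, gen_length):
    """Hill's N_e for one cohort from its lifetime offspring counts (Eq. 1)."""
    n1 = len(lrs)
    return 4.0 * n1 * gen_length / (pvariance(lrs) + 2.0)


# Two hypothetical cohorts, each with a mean of 2 offspring per individual
cohorts = [
    [0, 0, 1, 2, 3, 6],   # unequal lifetime reproductive success
    [2, 2, 2, 2, 2, 2],   # perfectly equal lifetime reproductive success
]
ne_by_cohort = [pedigree_ne(lrs, gen_length=5.0) for lrs in cohorts]
ne_harmonic = harmonic_mean(ne_by_cohort)
print(ne_by_cohort, ne_harmonic)
```

The harmonic mean gives the most weight to cohorts with low N_{e}. This is appropriate because the rate of drift depends on 1/N_{e}.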
With age structure, the rate of increase in \({V}_{p(t)}\) reaches a steady state only after a burn-in period lasting several generations, which means that Eq. (5) might not be accurate in the early years. To account for this effect, after the burn-in period, the empirical \({V}_{p(t)}\) for year 50 was averaged across replicates to produce \({V}_{p({Burnin})}\), and the subsequent increase in V_{p} with time was calculated as \(\varDelta {V}_{p(t)}={V}_{p(t)}-{V}_{p({Burnin})}\). With this adjustment, \(E(\varDelta {V}_{p\left(t\right)})\) can be calculated from Eq. (5), replacing \({p}_{0}\left(1-{p}_{0}\right)\) with the mean \({p}_{50}\left(1-{p}_{50}\right)\) averaged across loci and replicates at the end of the burn-in period.
Each replicate simulation was run for 500 years. Except as noted, results shown here are averaged across 10 replicate simulations, each tracking genetic variation at L = 100 loci.
Results
The two simulation algorithms were both successful in achieving the desired level of reproductive skew and covariances/autocorrelations. For simplicity, results for NegBinom are presented in the main text and those for TheWeight are in Supplementary Information.
Model NoCor
Population demography
For Model NoCor, analytical expectations for lifetime \({V}_{k\bullet }\) and N_{e} are possible based on the vital rates in Table 1, and these provide a useful reference point for evaluating the results of the simulations. Scenario LowSkew is the simplest as fecundity is constant in both sexes, with ϕ = 1 for all ages. This means that all adults reproducing each year behave like a single Wright-Fisher population with Poisson variation in reproductive success. Under this scenario, the parametric expectation for \({V}_{k\bullet }\) is 10.2 (Table 3). Poisson variance in LRS would lead to \({V}_{k\bullet }\) equal to the mean (which must be \(\bar{k}{\bullet }=\) 2 in a stable population), so most of the total lifetime variance can be attributed to variation in longevity (some individuals live longer than others and have more opportunities to reproduce). In Scenario ModerateSkew, male fecundity increases with age, and at each age, the variance in male offspring number is 5 times the mean (ϕ = 5). Together these factors increase parametric \({V}_{k\bullet }\) to 16.1 (Table 3). In Scenario HighSkew, ϕ takes a rather extreme value of 20, and parametric \({V}_{k\bullet }\) grows to 31.1, which is over 15 times the lifetime mean.
These demographic changes have predictable consequences for effective population size under Model NoCor. In Scenario LowSkew, where both sexes have identical vital rates (leading to T = 4.84 and \({V}_{k\bullet }=10.2\)), parametric N_{e} from Eq. (1) is 4 × 400 × 4.84/(2 + 10.18) = 636. In the scenarios with moderate to high skew, increasing fecundity with age increased generation length in males from 4.84 to 5.59, so across both sexes overall T was 5.21. All else being equal, N_{e} increases linearly with generation length (Eq. (1)). However, overdispersion in male reproductive success substantially increased in these scenarios, and this more than offset the modest increase in generation length. The parametric expectations for N_{e} are 452 for Scenario ModerateSkew and 253 for Scenario HighSkew (Table 3), which are, respectively, 29 and 60% lower than for Scenario LowSkew.
The new correlation metric, ρ_{α},_{α+}, examines the association between an individual’s reproductive success in the first year of sexual maturity (age 3 for the core life table) and the rest of its life. As expected under Model NoCor, mean values of ρ_{α},_{α+} for both sexes for all three scenarios were close to zero, all falling in the range [−0.01, +0.01] (data not shown).
Mean empirical \({V}_{k\bullet }\) in the NoCor simulations agreed well with the parametric expectations (Fig. 1). Harmonic mean N_{e} calculated from the simulated pedigrees was within 1% of the parametric expectation for Scenarios LowSkew and ModerateSkew and ~7% higher for Scenario HighSkew (Table 3). This latter result reflects the difficulty in precisely modeling strongly overdispersed variance in reproductive success, especially when older age classes have few individuals (only 8 of each sex for the oldest age class in the core life table).
Tracking genetic drift
For all three scenarios under Model NoCor, the mean ratio of observed to expected heterozygosity calculated over the last 50 years of each replicate was close to unity (all values within 1% of 1.0; Table 3). A comparable result was found for comparisons of observed and expected variance in allele frequency (Supplementary Table S4). These results are consistent with but extend previously reported results for Hill’s model. Waples et al. (2014) found excellent agreement between observed and expected rates of decline in heterozygosity in simulations based on vital rates for 20 different species, but for most species, it was assumed that ϕ = 1 for all ages. Results for Scenario HighSkew, with very high within-age reproductive skew (ϕ = 20), are therefore new.
Models with correlations
Population demography
The realized annual offspring numbers each year were sorted in Models Positive and Negative. Although these offspring numbers spanned a relatively small range of values for Scenario LowSkew, when sorted before assigning to individuals they had a substantial effect on population demography. For Scenario LowSkew, \({V}_{k\bullet }\) more than doubled under Model Positive and was nearly halved under Model Negative (Fig. 1), with corresponding changes to pedigree N_{e} (Table 3), and ρ_{α},_{α+} was strongly positive in both sexes (0.85) for Model Positive and strongly negative in both sexes (−0.68 to −0.69) for Model Negative (Table 3).
With stronger reproductive skew, results were even more dramatic. For Model Positive, ρ_{α},_{α+} was >0.9 (Table 3). These strong positive correlations concentrated reproduction in just a few individuals, which in turn substantially increased lifetime variance in reproductive success. With ϕ = 5 (Scenario ModerateSkew), lifetime \({V}_{k\bullet }\) increased almost fourfold, and with ϕ = 20 (Scenario HighSkew), \({V}_{k\bullet }\) more than quadrupled, to >200 (Fig. 1). Increases in \({V}_{k\bullet }\) caused corresponding decreases in effective size (Table 3). For Scenario ModerateSkew, N_{e} was less than half of the parametric value expected under Model NoCor, and for Scenario HighSkew realized N_{e} was less than 30% of the value expected under Model NoCor.
In Model Negative, individuals who were assigned the largest numbers of offspring each year all died before reaching the next age. This created a strong negative correlation between initial and subsequent reproductive success: ρ_{α},_{α+} = −0.34 for Scenario ModerateSkew and −0.11 for Scenario HighSkew (Table 3). These negative correlations minimized disparities in lifetime reproductive success and reduced \({V}_{k\bullet }\) compared to expectations under the NoCor model (Fig. 1) and consequently increased effective size (Table 3).
Tracking genetic drift
Even for extreme versions of correlated reproduction, use of the pedigree N_{e} in Hill’s Eq. (1) accurately predicted the rates of loss of heterozygosity and increase in allele frequency variance (Table 3, Supplementary Table S2, and Figs. 2 and 3). For loss of heterozygosity, all deviations from expectations were <1% except for the extremely overdispersed (ϕ = 20) Scenario HighSkew under Model Positive, where mean heterozygosity in the last 50 years was 1.5% higher than expected (Table 3). Stochastic variation in the rate of increase in allele frequency variance was somewhat higher, but for all Scenario × Model combinations, the observed change in \({V}_{p(t)}\) was within a few % of the expected (Supplementary Table S4), with an overall mean observed/expected ratio of 1.007.
Alternate life histories
Results so far have all used variations of the vital rates in Table 1, which apply to a hypothetical species with 10 age classes. Simulations were also conducted for shorter lifespans (5 years, with age at maturity 1) and longer lifespans (20 years, with age at maturity 5) (Supplementary Table S5). In both cases, fecundity was constant with ϕ = 1 in females, and fecundity was proportional to age with ϕ = 5 in males. As shown in Supplementary Fig. S1, Eq. (1) based on N_{e} calculated from the actual pedigrees accurately predicted the rates of loss of heterozygosity and increase in allele frequency variance for these different life histories.
Precision
As the main focus of this paper is to evaluate potential bias in Eq. (1) when it is applied to extreme demographic scenarios, a great deal of replication has been used to smooth out random demographic and genetic stochasticity to produce mean results that are qualitatively repeatable. Empirical datasets, on the other hand, generally are collected from a single realized population pedigree and might include data for a relatively small number of genes. As a reminder to researchers evaluating such empirical datasets, an example is included that generated a single 500-year population pedigree and tracked the decline of heterozygosity in 10 different sets of 50 unlinked diallelic loci (Fig. 4). At year 500 the average heterozygosity across the total 500 loci was close to the expected value from Eq. (6) using the realized pedigree (0.256), but in one set of 50 loci mean (H_{obs}) was >0.3 and in another set it was <0.2. Comparable results for allele frequency variance are shown in Figure 1 of Waples (2022b).
Discussion
Hill’s (1972) method for calculating effective population size is surprisingly robust to extreme reproduction scenarios. As expected, introducing strong autocorrelations in reproduction and covariance between reproduction and survival caused dramatic changes in lifetime variance in reproductive success. For Scenario LowSkew, mean \({V}_{k\bullet }\) was 3.8 times as large under Model Positive (positive autocorrelations and covariances) as it was under Model Negative (negative autocorrelations and covariances). For the scenarios that included substantial overdispersion of within-age reproductive success, the proportional differences were even greater (7.7 and 6.4 times larger for Model Positive for Scenarios ModerateSkew and HighSkew, respectively). The autocorrelations and covariances that arose when implementing Models Positive and Negative did not affect generation length, so when the empirically derived estimates of \({V}_{k\bullet }\) were inserted in Hill’s Eq. (1), they also led to realized effective sizes that differed dramatically among the three models (Table 3).
The most important result from this study is that, when \({V}_{k\bullet }\) is computed from the population pedigree, N_{e} calculated from Eq. (1) accurately predicts the realized rate of genetic drift when inserted in Eq. (5) (for rate of increase in allele frequency variance) and 6 (for rate of loss of heterozygosity). Excellent agreement between observed and predicted rates of genetic drift was found for diverse life histories (5, 10, and 20year lifespans, with age at maturity 1, 3, or 5 years), for identical or different vital rates for males and females, and for extreme skew in reproductive success (variance up to 20 times the mean), all across nearly 100 generations of evolution.
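The chain of calculations summarized in the preceding paragraph can be sketched in a few lines of Python. Because Eqs. (1), (5), and (6) are not reproduced in this section, the sketch assumes their commonly cited forms: N_{e} ≈ 4N_1T/(V_{k•} + 2) for Hill's equation, where N_1 is the number of newborns per cohort and T is generation length; Var(p_t) = p_0(1 − p_0)[1 − (1 − 1/(2N_{e}))^{t/T}] for allele frequency variance; and H_t = H_0(1 − 1/(2N_{e}))^{t/T} for heterozygosity, with t in years. The function names, the use of the population variance, and the toy pedigree summary are hypothetical.

```python
from statistics import pvariance


def hill_ne(lifetime_offspring, cohort_size, generation_length):
    """Assumed form of Hill's equation: Ne = 4 * N1 * T / (Vk + 2).

    lifetime_offspring holds one lifetime offspring count per individual
    in a cohort, as would be tallied from a realized pedigree.
    """
    v_k = pvariance(lifetime_offspring)  # assumption: population variance
    return 4.0 * cohort_size * generation_length / (v_k + 2.0)


def predicted_heterozygosity(h0, ne, years, generation_length):
    """Assumed drift expectation: H_t = H_0 * (1 - 1/(2*Ne))**(t/T)."""
    generations = years / generation_length
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


def predicted_allele_freq_variance(p0, ne, years, generation_length):
    """Assumed expectation: Var(p_t) = p0*(1-p0)*(1 - (1 - 1/(2*Ne))**(t/T))."""
    generations = years / generation_length
    return p0 * (1.0 - p0) * (1.0 - (1.0 - 1.0 / (2.0 * ne)) ** generations)


# Toy cohort summaries with mean lifetime offspring number of 2 (stable size).
random_like = [0, 2, 2, 4]  # variance 2
skewed = [0, 0, 4, 4]       # variance 4

ne_random = hill_ne(random_like, cohort_size=100, generation_length=3.0)
ne_skewed = hill_ne(skewed, cohort_size=100, generation_length=3.0)

print(ne_random, ne_skewed)
print(predicted_heterozygosity(0.5, ne_skewed, years=300, generation_length=3.0))
print(predicted_allele_freq_variance(0.5, ne_skewed, years=300, generation_length=3.0))
```

Under these assumptions, a lifetime variance equal to the mean (V_{k•} = 2) gives N_{e} = N_1T, and any increase in V_{k•}, whatever its cause, lowers N_{e} and speeds the predicted rate of drift.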
These results are good news for researchers. Random processes in agestructured populations create dynamic heterogeneity in survival and reproduction (Vindenes et al. 2008; Tuljapurkar et al. 2009), and these processes are implemented here as Model NoCor. But the biological attributes of many species create autocorrelations of reproduction and/or covariances in reproduction and survival. As implemented here, Models Positive and Negative are more extreme than are likely to be found in most real species, but that was intentional. In the worstcase scenarios found here, observed rates of genetic drift were still within a few percent of those expected, and these scenarios involved very strong reproductive skew within ages. For all realistic applications to natural populations, therefore, Eq. (1) can be considered to be a very reliable predictor of effective population size.
Although pairwise correlations of individual reproductive success in different years can provide valuable and detailed information regarding reproductive trade-offs, the new index introduced here (ρ_{α,α+}) provides what appears to be a robust summary across the full lifespan. ρ_{α,α+} is the correlation between two vectors, one listing the number of offspring produced by each individual at the first age of sexual maturity (α) and the other listing the total number of offspring produced by the same individuals during the rest of their lifetimes. As expected, these correlations averaged close to 0 under Model NoCor and were consistently very high under Model Positive. Under Model Negative these average correlations were consistently negative, with a magnitude that depended on the degree of within-age reproductive skew. An advantage of this summary index compared to pairwise correlations of reproduction at specific ages is that the length of the two vectors considered by ρ_{α,α+} is the number of individuals in the cohort, rather than the inevitably smaller (and variable) number that survive to later ages. This increases statistical power, so ρ_{α,α+} might be used initially for diagnostic purposes before focusing in more detail on specific ages.
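As a concrete illustration of the index, the following Python sketch computes it as a Pearson correlation from per-individual reproductive records. The data layout is an assumption made for the example: each record is a list of offspring counts by age, starting at age α, and an individual that dies early simply has a shorter record, so it contributes zero offspring at later ages. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either vector has no variation
    return sxy / math.sqrt(sxx * syy)


def rho_alpha_alpha_plus(offspring_by_age):
    """Correlate offspring at age alpha with offspring over all later ages.

    offspring_by_age: one list per individual in the cohort; element 0 is
    the offspring count at age alpha, later elements are subsequent ages.
    """
    at_alpha = [record[0] for record in offspring_by_age]
    after_alpha = [sum(record[1:]) for record in offspring_by_age]
    return pearson(at_alpha, after_alpha)


# Toy cohorts: early success predicts later success, or trades off against it.
positive_cohort = [[0, 0, 0], [1, 2, 1], [2, 4, 2]]
negative_cohort = [[2], [1, 1], [0, 2]]  # first individual dies after age alpha

rho_pos = rho_alpha_alpha_plus(positive_cohort)
rho_neg = rho_alpha_alpha_plus(negative_cohort)
print(rho_pos, rho_neg)
```

Both vectors have one entry per individual in the cohort regardless of how many survive, which is the property that gives the index its statistical power relative to age-by-age pairwise correlations.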
Engen et al. (2005, 2009) developed an alternative way to calculate N_{e} when generations overlap and Hill’s (1972, 1979) assumptions of constant N and stable age structure are not met. In their model, the overall variance in N or total reproductive value over time arises from two additive components: an environmental variance \({\sigma }_{e}^{2}\), which quantifies the effects of fluctuating environments over time, and a demographic variance \({\sigma }_{d}^{2}\), which quantifies the effects of random demographic stochasticity within one time period, during which the environment is constant. Engen’s model uses a Leslie matrix to relate \({\sigma }_{d}^{2}\) to a population’s age-specific vital rates (probabilities of reproduction and survival), and this allows formal consideration of the covariance of an individual’s fecundity at time t and that individual’s survival to time t + 1. However, the time horizon for considering demographic stochasticity is only one time period, and Engen’s model does not include a term for lifetime variance in offspring number, so a direct comparison with Hill’s model is only possible for some very simplified scenarios. The focus on a single time step means that Engen’s model cannot explicitly account for persistent individual differences in reproductive success (Lee et al. 2011), nor the effects of reproduction on subsequent survival that last more than one time period.
Data availability
All results presented here were generated by simulations. R code to conduct the simulations is available in Supplementary Information.
References
Bell G (1980) The costs of reproduction and their consequences. Am Nat 116(1):45–76
Byholm P, Nikula A, Kentta J, Taivalmäki J‐P (2007) Interactions between habitat heterogeneity and food affect reproductive output in a top predator. J Anim Ecol 76:392–401
Caswell H (2001) Matrix population models: construction, analysis, and interpretation, 2nd edn. Sinauer Associates, Sunderland, MA
Charlesworth B (1994) Evolution in age-structured populations, 2nd edn. Cambridge University Press, Cambridge
Crow JF, Kimura M (1970) An introduction to population genetics theory. Harper and Row, New York (NY)
Cushing JM (1994) The dynamics of hierarchical age-structured populations. J Math Biol 32:705–729
Engen S, Lande R, Sæther BE (2005) Effective size of a fluctuating age-structured population. Genetics 170:941–954
Engen S, Lande R, Sæther BE, Dobson FS (2009) Reproductive value and the stochastic demography of age-structured populations. Am Nat 174:795–804
Hedrick PW (2000) Genetics of populations, 2nd edn. Jones and Bartlett, Sudbury (MA)
Hill WG (1972) Effective size of population with overlapping generations. Theor Popul Biol 3:278–289
Hill WG (1979) A note on effective population size with overlapping generations. Genetics 92(1):317–322
Kruuk LE (2004) Estimating genetic parameters in natural populations using the ‘animal model’. Philos Trans R Soc B Biol Sci 359(1446):873–890
Lande R, Engen S, Sæther BE (2003) Stochastic population dynamics in ecology and conservation. Oxford University Press, Oxford
Lee AM, Engen S, Sæther BE (2011) The influence of persistent individual differences and age at maturity on effective population size. Proc R Soc B Biol Sci 278:3303–3312
Lee AM, Myhre AM, Markussen SS, Engen S, Solberg EJ, Haanes H, Røed K, Herfindal I, Heim M, Sæther BE (2020) Decomposing demographic contributions to the effective population size with moose as a case study. Mol Ecol 29(1):56–70
McElligott AG, Hayden TJ (2000) Lifetime mating success, sexual selection and life history of fallow bucks (Dama dama). Behav Ecol Sociobiol 48:203–210
Mousseau TA, Fox CW (eds) (1998) Maternal effects as adaptations. Oxford University Press, New York
Pelletier F, Hogg JT, Festa-Bianchet M (2006) Male mating effort in a polygynous ungulate. Behav Ecol Sociobiol 60:645–654
R Core Team (2021) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Reznick D (1992) Measuring the costs of reproduction. Trends Ecol Evol 7(2):42–45
Roff D (1992) Evolution of life histories: theory and analysis. Chapman and Hall, New York
Smith JN (1981) Does high fecundity reduce survival in song sparrows? Evolution 35:1142–1148
Tuljapurkar S, Steiner UK, Orzack SH (2009) Dynamic heterogeneity in life histories. Ecol Lett 12(1):93–106
Van Noordwijk AJ, De Jong G (1986) Acquisition and allocation of resources: their influence on variation in life history tactics. Am Nat 128(1):137–142
Vindenes Y, Engen S, Sæther BE (2008) Individual heterogeneity in vital parameters and demographic stochasticity. Am Nat 171(4):455–467
Waples RS (2020) An estimator of the Opportunity for Selection that is independent of mean fitness. Evolution 74:1942–1953
Waples RS (2022a) TheWeight: a simple and flexible algorithm for simulating non-ideal, age-structured populations. Methods Ecol Evol 13:2030–2041
Waples RS (2022b) What is N_{e}, anyway? J Hered 113:371–379
Waples RS, Do C, Chopelet J (2011) Calculating N_{e} and N_{e}/N in age-structured populations: a hybrid Felsenstein-Hill approach. Ecology 92:1513–1522
Waples RS, Antao T, Luikart G (2014) Effects of overlapping generations on linkage disequilibrium estimates of effective population size. Genetics 197:769–780
Waples RS, Feutry P (2022) Close-kin methods to estimate census size and effective population size. Fish Fish 23:273–293
Williams GC (1966) Natural selection, the costs of reproduction, and a refinement of Lack’s principle. Am Nat 100(916):687–690
Wilson AJ, Nussey DH (2010) What is individual quality? An evolutionary perspective. Trends Ecol Evol 25(4):207–214
Wright S (1931) Evolution in Mendelian populations. Genetics 16(2):97–159
Wright S (1938) Size of population and breeding structure in relation to evolution. Science 87:430–431
Acknowledgements
I am grateful to Bill Hill for many insightful discussions over the years, relating to effective population size as well as other topics. I thank Steinar Engen and Bernt-Erik Sæther for useful discussions. Per Erik Jorde provided comments that substantially improved the manuscript.
Ethics declarations
Competing interests
The author declares no competing interests.
Additional information
Associate editor: Armando Caballero.
About this article
Cite this article
Waples, R.S. Robustness of Hill’s overlapping-generation method for calculating N_{e} to extreme patterns of reproductive success. Heredity 131, 170–177 (2023). https://doi.org/10.1038/s41437-023-00633-6