Introduction

Inbreeding occurs if parents have ancestors in common, which cannot be avoided in a population of finite size. The phenomena associated with inbreeding include inbreeding depression (most often seen in fitness traits), genetic drift of allele frequencies, loss of heterozygosity and loss of genetic variance. Most managed populations have relatively small census sizes and the rate of inbreeding (ΔF) needs to be controlled. An understanding of inbreeding depression is therefore important in developing management strategies for various goals, ranging from conservation to artificial selection. The impact of inbreeding depression (and other phenomena) has frequently been found to depend upon ΔF, with faster rates having more impact (Ehiobu et al., 1989; Wang et al., 1999; Pedersen et al., 2005). The effect of inbreeding rate may be explained by natural selection: slow inbreeding would increase the number of generations for selection to purge the genetic load, leading to a smaller impact for a given inbreeding coefficient (Templeton and Read, 1984; Lacy and Ballou, 1998). A different distinction, which we consider in this paper, is whether an increment of inbreeding was recent in origin (‘new’ inbreeding) or occurred further back in the population history (‘old’ inbreeding). Our perspective contrasts with classical models of inbreeding depression (e.g. Falconer and Mackay, 1996), which would regard the impact of ‘new’ inbreeding as the continuing drift of pre-existing deleterious recessive alleles that have not been fixed. In these classical models, the distribution of fixation time with respect to size is random. A different viewpoint is provided by simulation studies of purging (Wang et al., 1999), which suggest that alleles with large effects will tend to be removed by natural selection in early generations, while alleles of smaller effect continue to segregate over longer periods. 
In addition, ‘new’ inbreeding might represent the impact of newly arisen mutations, whereas ‘old’ deleterious mutations could have left the population. These effects could lead to differences between the effects of ‘new’ and ‘old’ inbreeding, even if occurring at a constant rate.

Obtaining biological data on time-dependent differences is difficult, since it requires populations maintained at relatively constant rates of inbreeding over long periods to avoid confounding the timing of the inbreeding with the rate. Such populations are rare, particularly in the numbers that would provide adequate power for the analysis. However, long-term selection experiments in mice provide such an opportunity. This study therefore analyses the data from a mouse (Mus musculus) population maintained over 125 generations, which was described previously by Holt et al. (2005b). We test the hypothesis of a differential impact of ‘new’ and ‘old’ inbreeding. The trait of interest was the first litter size, which was the selected trait.

Materials and methods

Experimental structure

The animals were from the Norwegian long-term selection experiment for litter size in mice (Mus musculus) that was started in 1972, including the original base and all sublines spanning 124 further generations of selection. The base population originated from a cross of two outbred strains (LAC Grey and CFW), imported into Norway from Great Britain in 1968 (Joakimsen and Baker, 1977). Three different lines were produced from this population. After the start of the experiment a Dutch line (B) was imported and was used to produce a number of different crosses and further sublines. Joakimsen and Baker (1977), Vangen (1993), and Holt et al. (2005b) all describe these lines in more detail; however, since the structure of the experiment is important for understanding the data and outcomes of the analysis, the inter-relationships and nomenclature of the lines are shown in Figure 1. The nomenclature of Holt et al. (2005b) has been retained for consistency.

Figure 1

The different periods and selection objectives of the selection experiment. Period 1: generation 0–44, Period 2: generation 45–70, and Period 3: generation 71 onwards (Reproduced from Holt et al., 2005b).

Line K was the control line and was maintained throughout without selection. Line L was initially selected for low first litter size, but after approximately 20 generations a selection plateau was reached and, at generation 45, the selection was reversed to high first litter size. Line H (and B) was selected for high litter size, as was line X, the cross between them. The new set of sublines (H4, H8, H12, HK, and H1), created in generation 45 by combining B, X and H, were selected for high litter size in different maternal environments. Finally these were recombined once more, to form the H2, again selected for high litter size, with HK2 managed as a control for the new high line (H2).

The selection was based on the phenotypic litter size of the female, with two sons and two daughters selected from the best 50% of first parity litters in the selected lines, and random selection in the K line. Selection within litters was made at random. With the exception of lines created to study different maternal environments, dams were restricted to rearing a maximum of eight pups in all lines and generations, with additional pups removed on the day of birth. Among the excepted lines, dams in H4, H8, and H12 lines were restricted to rearing a maximum of 4, 8, and 12 pups from birth, respectively, while dams in H1 were unrestricted. The dams in the HK line were restricted to a maximum of eight pups.

The highest inbreeding coefficient calculated in relation to the original founders for the whole population was reached in the K line, 0.64, and coefficients of 0.59, 0.42, 0.30, 0.22, 0.37, and 0.30 were reached for L, H, B, X, H2, and HK2 lines respectively. The ultimate level of inbreeding was approximately 0.27 in the HK, H1, H4, H8, and H12 lines. Full-sib, half-sib, and cousin-mating were avoided in the mating plans, so the more extreme points within the data for ‘new’ inbreeding, that is those having greatest leverage on the estimate of depression, will not have been produced from close mating.

Calculation of ‘old’ and ‘new’ inbreeding

For every animal i born in generation u, an inbreeding coefficient can be calculated with respect to a base generation at generation t, Fi(t,u), for all t<u. With this notation, the total inbreeding for individual i, the inbreeding coefficient with reference to the original base, generation 0, is denoted by Fi(0,u); the founders of the B line are also included in this initial base. For every individual, Fi(0,u) can be decomposed into ‘old’ inbreeding and ‘new’ inbreeding with respect to an intermediate generation t using an approach following Wright (1921), using the identity (1−Fi(0,u)) = (1−Fi(0,t))(1−Fi(t,u)). We define Fi,old(t,u) = Fi(0,t), and rearranging gives the following equation for the old inbreeding:

$$F_{i,\mathrm{old}}(t,u) = \frac{F_i(0,u) - F_i(t,u)}{1 - F_i(t,u)} \qquad (1)$$

We define the new inbreeding Fi,new(t,u) by the difference:

$$F_{i,\mathrm{new}}(t,u) = F_i(0,u) - F_{i,\mathrm{old}}(t,u) \qquad (2)$$

Note that Equation (1) expresses Fi,old(t,u) in terms of inbreeding coefficients for individual i with respect to the original base (Fi(0,u)) and the intermediate base (Fi(t,u)), both of which are easily calculated.
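The decomposition can be illustrated numerically. The following is a minimal sketch, assuming only the identity (1−Fi(0,u)) = (1−Fi(0,t))(1−Fi(t,u)) and the definition of the new inbreeding as the difference between total and old inbreeding; the function name and the example values are hypothetical and are not taken from the mouse data.

```python
def decompose_inbreeding(f_total, f_recent):
    """Split total inbreeding into 'old' and 'new' components.

    f_total  : F_i(0,u), inbreeding relative to the original base (generation 0).
    f_recent : F_i(t,u), inbreeding relative to the intermediate base (generation t).
    Returns (f_old, f_new), where f_old + f_new equals f_total.
    """
    if not (0.0 <= f_recent <= f_total < 1.0):
        raise ValueError("expected 0 <= F(t,u) <= F(0,u) < 1")
    # Rearranged identity: F_old = (F(0,u) - F(t,u)) / (1 - F(t,u))
    f_old = (f_total - f_recent) / (1.0 - f_recent)
    # New inbreeding is defined as the remainder of the total
    f_new = f_total - f_old
    return f_old, f_new


# Illustrative values only (hypothetical animal)
f_old, f_new = decompose_inbreeding(0.30, 0.10)
print(round(f_old, 4), round(f_new, 4))
```

Note that, under this definition, the new component is smaller than Fi(t,u) itself: it equals (1−Fi,old(t,u)) Fi(t,u).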

Justification for this decomposition is given in Appendix 1 where it is shown that in the classical model of inbreeding depression the regression coefficients on Fi,old(t,u) and Fi,new(t,u) when fitted simultaneously are equal, and furthermore are equal to the simple regression on total inbreeding Fi(0,u).

Data

To look at the differential impact of ‘old’ and ‘new’ inbreeding, sets of inbreeding coefficients were calculated, classified by the period of ‘new’ inbreeding, measured by m = u − t. The number of generations of inbreeding considered as ‘new’ varied between 5 and 100 in steps of 5, creating up to 20 different pairs of Fi,old(t,u) and Fi,new(t,u) for each animal. Inbreeding coefficients were calculated from the pedigree using the method of Meuwissen and Luo (1992). From this point forward, the dependence of Fi,old(t,u) and Fi,new(t,u) on i, u and t will be left implicit and Fold and Fnew will be used for simplicity.
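To make the idea of moving the base generation concrete, the sketch below computes inbreeding coefficients from a toy pedigree relative to two different bases. It uses a simple recursive kinship calculation rather than the algorithm of Meuwissen and Luo (1992), and it assumes that animals born at or before the chosen base generation are treated as unrelated, non-inbred founders. The pedigree, the animal names and the function name are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy pedigree: animal -> (sire, dam, generation)
PEDIGREE = {
    "A": (None, None, 0),
    "B": (None, None, 0),
    "C": ("A", "B", 1),
    "D": ("A", "B", 1),
    "E": ("C", "D", 2),  # offspring of a full-sib mating
    "F": ("C", "D", 2),
    "G": ("E", "F", 3),  # second consecutive full-sib mating
}


def inbreeding(pedigree, base_generation=0):
    """Return {animal: F} relative to the given base generation."""

    def parents(x):
        sire, dam, gen = pedigree[x]
        # Assumption: animals in or before the base are unrelated founders
        if gen <= base_generation:
            return None, None
        return sire, dam

    @lru_cache(maxsize=None)
    def kinship(x, y):
        if x is None or y is None:
            return 0.0
        if x == y:
            return 0.5 * (1.0 + kinship(*parents(x)))
        # Recurse through the younger animal so an ancestor is never expanded first
        if pedigree[x][2] < pedigree[y][2]:
            x, y = y, x
        sire, dam = parents(x)
        return 0.5 * (kinship(sire, y) + kinship(dam, y))

    # An animal's inbreeding coefficient is the kinship between its parents
    return {a: kinship(*parents(a)) for a in pedigree}


f_from_0 = inbreeding(PEDIGREE, base_generation=0)  # F(0,u)
f_from_1 = inbreeding(PEDIGREE, base_generation=1)  # F(1,u), i.e. m = u - t = 2 for G
print(f_from_0["G"], f_from_1["G"])
```

For animal G this gives F(0,3) = 0.375 and F(1,3) = 0.25. The recursion is adequate only for small pedigrees; a pedigree of the size analysed here would need a more efficient method such as the one cited in the text.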

There were 29 365 animals with observations on first litter size, and the total pedigree file contained 74 630 animals. The data were collected into sets, S(m), where S(m) contains records for all animals having a pair Fold and Fnew with m generations of new inbreeding. A summary of the records available for each set S(m) is shown in Table 1. The numbers decline as m increases from 5 to 100 because, with an increasing number of generations assumed as ‘new’ inbreeding, the number of animals with sufficient data decreases: for example, if one assumes 30 generations as ‘new’ inbreeding, the calculation of Fnew is only possible for the animals born between generation 30 and generation 125. As would be expected, the mean coefficient of ‘new’ inbreeding in Table 1 increases with increasing number of generations assumed as ‘new’ inbreeding while the mean coefficient of ‘old’ inbreeding decreases.

Table 1 The number of records and mean coefficients of ‘new’ and ‘old’ inbreeding for each different set of records S(m)

Analysis

Statistical analyses were carried out with SAS (SAS Institute Inc., 1999), MATLAB, and ASREML (Gilmour et al., 1999).

Overall pattern of depression

Given that the objective of the study was to examine the differential expression of inbreeding depression over time, the relationship between performance and inbreeding coefficient was examined in each of the lines. Our approach extends the analysis of Holt et al. (2005a) since the inbreeding depression was not assumed to be linear and instead a spline model was fitted to the data using ASREML (Gilmour et al., 1999), with separate curves fitted to each line. The model fitted to the data was

$$Y_{ujk} = \mu_u + \pi g_{ujk} + \phi h_{ujk} + \psi r_{ujk} + \alpha F_{ujk} + \beta_j F_{ujk} + \mathrm{spline}(F_{ujk}) + \mathrm{spline}_j(F_{ujk}) + z_{ujk} + e_{ujk} \qquad (3)$$

where Yujk represents the litter size (number of pups born alive) of animal k in line j at generation u; μu is the effect of generation u; π, ϕ and ψ are, respectively, regression coefficients on the additive contribution (gujk), heterosis (hujk), and recombination loss (rujk) associated with the crossing of the Dutch founders in relation to the original Norwegian founders; α and βj are the pooled linear regression on the total inbreeding coefficient Fujk calculated using the original base population, and the estimate within line j, respectively; spline and splinej are the pooled smoothing spline for Fujk and the spline fitted to line j; zujk is the additive genetic effect of animal k; and eujk is the residual error. The effects of generations (μu) were treated as fixed effects, as were ψ, π, ϕ, and α. The interaction between line and Fujk, and zujk, were assumed to be random effects, and the smoothing splines were also fitted as random effects (White et al., 1999). The distribution of zujk was assumed to be multivariate normal, MVN(0, σa²A), where A is the numerator relationship matrix, and the distribution of residuals was assumed to be MVN(0, σe²I). The contributions of the Dutch founders (0 ≤ gujk ≤ 1) were calculated for each individual from the pedigree, and the coefficients for heterosis and recombination loss were calculated by

where gsire and gdam denote the Dutch contribution to the sire and dam of animal k in line j at generation u.

Analysis of ‘old’ and ‘new’ inbreeding

Following this preliminary analysis, more specific models were fitted to examine the effects of ‘old’ and ‘new’ inbreeding. Effects found to be non-significant in model (3) were omitted, which included the heterosis and recombination loss between the Dutch and Norwegian founding populations. The following model was fitted to each of the 20 sets S(5), …, S(100):

To test the statistical significance of the difference between the regression coefficients α and β, the identity Fujk=Fujk,old+Fujk,new was used (see equation 2).
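The way this identity yields a direct test can be sketched as follows. The labels b_new and b_old are introduced here purely for illustration and denote the regression coefficients on Fujk,new and Fujk,old; the remaining terms of the model are unaffected by the substitution.

```latex
\begin{aligned}
b_{\mathrm{new}} F_{ujk,\mathrm{new}} + b_{\mathrm{old}} F_{ujk,\mathrm{old}}
  &= b_{\mathrm{new}} F_{ujk,\mathrm{new}} + b_{\mathrm{old}} \left( F_{ujk} - F_{ujk,\mathrm{new}} \right) \\
  &= b_{\mathrm{old}} F_{ujk} + \left( b_{\mathrm{new}} - b_{\mathrm{old}} \right) F_{ujk,\mathrm{new}}
\end{aligned}
```

Under this reparameterisation, fitting the total inbreeding and the new inbreeding jointly gives a coefficient on the new inbreeding that estimates the difference between the two regressions, so its estimate and standard error provide the test of equality. A negative difference then indicates a more severe depression per unit of ‘new’ inbreeding.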

To test for biases inherent in the data structure, model (4) was fitted to a simulated trait with the inbreeding depression completely described by a linear decline in relation to inbreeding coefficient. In the simulated data the entire pedigree structure, and hence the inbreeding coefficients, were identical to the real data set. The genetic variation in the simulated trait was assumed to follow an infinitesimal model (Bulmer, 1980) with a heritability of 0.17 for first litter size, and an inbreeding depression of 1.3 phenotypic standard deviations when completely inbred. These parameters were based on the analysis of the observed data set with all available information (results not shown). Note that, in principle, models (3) and (4) account for the selection in the mouse lines in an additive model, because of the inclusion of the additive genetic effect of the individual animals, zujk, with a covariance structure described by the relationship matrix A (Henderson, 1976). This model implicitly estimates the effect of zujk as the effect of its sire and its dam plus that of an independent within-family component. Thus, if the sire and the dam both have a high (or low) z due to selection, the expected zujk of the offspring is also high (low).
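The logic of this check can be illustrated with a much simpler simulation than the one described above. The sketch below is not the authors' procedure: it ignores the pedigree and the additive genetic effect, draws hypothetical inbreeding coefficients for independent animals, and generates a trait that declines linearly with total inbreeding. The slope is borrowed from the total-inbreeding estimate reported in the Results; the intercept, residual standard deviation and ranges of inbreeding are arbitrary illustrative choices. Under such a classical model the regressions on Fold and Fnew, fitted simultaneously, should both recover the same slope.

```python
import random


def fit_two_regressors(x1, x2, y):
    """Ordinary least squares of y on x1 and x2 with an intercept; returns (b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    # Solve the 2 x 2 normal equations on centred variables
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


rng = random.Random(1)
TRUE_SLOPE = -4.24  # pups per unit of total inbreeding (illustrative)

f_old, f_new, y = [], [], []
for _ in range(5000):
    old = rng.uniform(0.0, 0.4)     # hypothetical F_old
    recent = rng.uniform(0.0, 0.3)  # hypothetical F(t,u)
    new = (1.0 - old) * recent      # F_new = F(0,u) - F_old
    f_old.append(old)
    f_new.append(new)
    # Classical model: depression is linear in total inbreeding only
    y.append(10.0 + TRUE_SLOPE * (old + new) + rng.gauss(0.0, 0.5))

b_old, b_new = fit_two_regressors(f_old, f_new, y)
print(round(b_old, 2), round(b_new, 2))
```

Both estimates should lie close to the simulated slope, which is the pattern expected when the data structure itself introduces no bias.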

Results

All results presented are based on all available information, that is pooling different lines. In the analysis using the model in Equation (3) there was no evidence of heterosis between the original lines and the Dutch line B and no evidence of recombination loss. Line B had a greater number of pups than the original unselected line, with the difference estimated to be 3.6 (s.e. 1.1) pups.

Magnitude of inbreeding and ΔF

Figure 2 shows log(1−Ft) against time, and shows the relative rates of inbreeding. Since Ft = 1 − (1 − ΔF)^t, log(1−Ft) changes linearly with time when ΔF is constant, with a slope of log(1−ΔF) ≈ −ΔF. Another way to show the relationship between inbreeding and time is to plot the inbreeding coefficient F against time, as was done in the study of Holt et al. (2005b) with a similar data set to the one used in this study; however, this relationship is non-linear at constant ΔF. The trends for each line in Figure 2 were linear, indicating a near constant ΔF within line, although Figure 2 shows only the L, K, and H lines. The slopes are broadly indicative of the similar ΔF across lines: the rates shown vary from 0.0086 to 0.0125, with the mean ΔF over all lines of 0.0086, that is Ne=58.1.
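The relationships used in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes the standard relationship Ne = 1/(2ΔF), which reproduces the value of Ne quoted in the text; the function names are hypothetical.

```python
import math


def inbreeding_after(delta_f, generations):
    """F_t = 1 - (1 - dF)^t for a constant rate of inbreeding dF."""
    return 1.0 - (1.0 - delta_f) ** generations


def effective_size(delta_f):
    """Effective population size implied by dF, assuming Ne = 1 / (2 dF)."""
    return 1.0 / (2.0 * delta_f)


delta_f = 0.0086
slope = math.log(1.0 - delta_f)  # slope of log(1 - F_t) per generation
ne = effective_size(delta_f)

print(round(slope, 5))  # close to -delta_f
print(round(ne, 1))     # 58.1
```

Because log(1 − Ft) = t log(1 − ΔF), a straight line in Figure 2 corresponds to a constant ΔF, whereas F plotted directly against generation curves towards 1.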

Figure 2

The natural logarithm of (1−F) plotted against generation number for the L, K, and H lines. The K line is present from generation 0 to 124. The L line, coincident for much of the time with the K line, is present from generation 0 to 106 and the H line is present from generation 0 to 44.

Figure 3 summarises the fitted relationship between the inbreeding coefficient and the number of pups born alive for the L, K, and H lines, after adjustment for the other terms shown in Model 3. The use of the individual animal model with the numerator relationship matrix in the model accounts for the artificial selection applied. Most lines, some of which are not presented in Figure 3, showed a reduced number of pups born alive with an increasing inbreeding coefficient. This general conclusion did not hold for L, where a downward trend was followed by an upward trend, until in generation 45 the downward trend resumed. A possible reason for this was a failure in the model to accommodate the selection plateau found in this line (Holt et al., 2005b), since the model described by (3) assumes that selection intensity will result in selection progress and, consequently, a reduced litter size: given the flexibility of the model, the selection plateau is interpreted by the model as a balance between the expected selection response and a positive inbreeding depression.

Figure 3

The number of pups with respect to the inbreeding coefficient for the L line (solid grey), the K line (solid black), and the H line (dashed black) based on a spline model with separate curves fitted to each line.

‘Old’ and ‘new’ inbreeding

An inbreeding depression of −4.24 (s.e. 1.00) pups per unit inbreeding was estimated for the total level of inbreeding, using a model that includes only a regression on the total amount of inbreeding. When model (4) was fitted to the different sets, the estimated inbreeding depression for the ‘new’ inbreeding varied between −11.53 and −0.74 pups per unit inbreeding, and that for the ‘old’ inbreeding ranged from −5.22 to 15.51 (Figure 4). The estimated s.e. for Fnew varied from 1.21 to 8.55 and those for Fold varied from 1.12 to 12.33 (also shown in Figure 4). The critical test in the analyses was for the difference between the regression coefficients on Fnew and Fold, and Figure 5 shows the log10 of the P-values for all 20 two-sided tests of the difference between ‘new’ and ‘old’ inbreeding in the different data sets. In 15 out of the 20 sets, the estimates of the difference were negative and statistically different from zero (P<0.05), indicating that the ‘new’ inbreeding had a greater impact than the ‘old’ inbreeding. These sets included all those between S(25) and S(85) inclusive.

Figure 4

Regression coefficients and standard errors for inbreeding depression in the number of pups in the first litter, when dividing the total level of inbreeding into components Fnew (solid line) and Fold (dashed line). The values on the X-axis are equal to the number of generations considered in Fnew, with the generations of ‘old’ inbreeding equal to 124 minus this value. The dotted line shows the estimated inbreeding depression from regression on the total level of inbreeding only using the complete dataset.

Figure 5

Logarithm to base 10 of the statistical significance (P) plotted against m, the number of generations considered in Fnew. The generations of ‘old’ inbreeding are equal to 124 minus this value. The solid horizontal line denotes the Bonferroni-adjusted 5% significance threshold, assuming 20 independent tests, and the dashed horizontal line indicates the nominal 5% value.

The statistically significant negative estimates ranged from −4.86 (s.e. 2.40) to −22.83 (s.e. 5.66). We conclude that the effect was real notwithstanding the multiple tests: (i) if tests are treated as independent the observation of 15 out of 20 is unusual, since only 1 would be expected, yet the different tests have much data and structure in common, considerably reducing the number of ‘independent’ tests; and (ii) again assuming 20 independent tests, the results remain statistically significant even after a Bonferroni correction to the significance threshold. For a Type I error of 0.05 overall, the Bonferroni correction for 20 independent tests gives a significance threshold of 1 − 0.95^0.05 ≈ 2.5 × 10^−3.
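The arithmetic behind these two arguments can be sketched as follows, under the same working assumption of 20 independent tests. The expression 1 − 0.95^(1/20) is the exact form for independent tests (often called the Šidák form) and is numerically very close to the simple Bonferroni value 0.05/20; the variable and function names are hypothetical.

```python
import math

n_tests, alpha = 20, 0.05

bonferroni = alpha / n_tests                      # simple Bonferroni threshold
exact = 1.0 - (1.0 - alpha) ** (1.0 / n_tests)    # 1 - 0.95^(1/20)
expected_by_chance = n_tests * alpha              # significant tests expected under the null


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Chance of 15 or more nominally significant results among 20 independent null tests
p_15_or_more = binomial_tail(15, n_tests, alpha)

print(bonferroni, round(exact, 5), expected_by_chance)
print(p_15_or_more)
```

The tail probability is vanishingly small, but, as noted in the text, the 20 sets share much of their data, so the independence assumption overstates the strength of this argument.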

To examine whether these results were an artefact of the data structure or selection, three further analyses were carried out. First, all analyses were repeated with a data set from which the L line was excluded, to see if the selection plateau and the reversed selection in this line had an impact upon the results; however, the results showed the same pattern and were significant. Second, we carried out the analysis only in the K line, which was not subject to selection. While the results of this analysis had a qualitatively similar pattern to those shown in Figure 4, the outcome was not statistically significant due to the much larger standard errors caused by the greatly reduced numbers of observations. Third, we analysed a simulated dataset constructed as described in Materials and methods, in which inbreeding depression followed the classical model. As shown in Figure 6, no significant differences were observed between the regression coefficients for Fold and Fnew, with all estimates consistent with the simulated value of depression of 1.3 base phenotypic standard deviations for complete inbreeding.

Figure 6

Regression coefficients and s.e. for inbreeding depression in a simulated trait when dividing the total level of inbreeding into components Fnew (solid line) and Fold (dashed line). The values on the X-axis are equal to the number of generations considered as ‘new’ inbreeding, with the generations of ‘old’ inbreeding equal to 124 minus this value. The dotted line shows the estimated inbreeding depression from regression on the total level of inbreeding in each dataset.

Discussion

Inbreeding and inbreeding depression have been analysed intensively in evolutionary biology, conservation genetics, and animal breeding. Nevertheless, the mechanisms of the phenomena are still not completely clear. In the present paper the inbreeding depression on first litter size in mice was analysed for the number of pups born alive with respect to the timing of the inbreeding, that is, how deep in the pedigree the common ancestors occurred. We found that the ‘new’ inbreeding had a greater impact on the inbreeding depression than the ‘old’ inbreeding. This contrast between ‘old’ and ‘new’ inbreeding is distinct from that between ‘slow’ and ‘fast’ accumulating inbreeding.

The analysis presented in this paper made it possible to compare ‘old’ inbreeding with ‘new’ inbreeding since results were obtained from a selection experiment with lines kept for long periods at near-constant ΔF. This introduces a crucial difference from the study of Holt et al. (2005a). These authors were examining a similar situation to that discussed by Kristensen and Sørensen (2005), where the terms ‘old’ and ‘new’ inbreeding are defined as slowly accumulating and arising from mating closely related animals, respectively. However, the natural interpretation of the terms ‘fast’ and ‘slow’ would apply to contrasts of ΔF among populations, and unfortunately the definitions used as a basis for the studies of Kristensen and Sørensen (2005) and Holt et al. (2005a) confound ‘old’ with ‘slow’ and ‘new’ with ‘fast’. In our current study these two factors were not confounded.

The results obtained are a departure from the classical model. In the classical model (e.g. Falconer and Mackay, 1996), traits showing inbreeding depression are assumed to have multiple alleles, most simply a wild type and a mutant, present in the base population and exhibiting dominance. Inbreeding depression is then observed due to the reduction in the frequency of heterozygotes for these alleles, as the population moves towards homozygosity. It is shown in Appendix 1 that, when defined as in this study, regressions on ‘old’ inbreeding and ‘new’ inbreeding have the same expectations in the classical model. If we had defined Fnew as the actual inbreeding coefficient relative to the ‘new’ base, then the same quantitative model would predict the regression on Fnew to be less than that on Fold.
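The last point can be seen from the decomposition itself. The following is a brief sketch rather than a reproduction of Appendix 1; it assumes the classical expectation that depression is linear in total inbreeding with slope b (a symbol introduced here for illustration) and uses the identity (1−F(0,u)) = (1−F(0,t))(1−F(t,u)) with Fold = F(0,t).

```latex
\begin{aligned}
F(0,u) &= F_{\mathrm{old}} + \left(1 - F_{\mathrm{old}}\right) F(t,u)
        = F_{\mathrm{old}} + F_{\mathrm{new}}, \\
b\,F(0,u) &= b\,F_{\mathrm{old}} + b\,F_{\mathrm{new}}
           = b\,F_{\mathrm{old}} + b\left(1 - F_{\mathrm{old}}\right) F(t,u).
\end{aligned}
```

With Fnew defined as in this study, both components carry the same coefficient b. If the coefficient relative to the ‘new’ base, F(t,u), were used instead, its coefficient would be b(1−Fold), which is smaller in magnitude than b whenever Fold > 0.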

Potential explanations for the observed departure include the emergence of new mutations in the population, or natural selection on the loci that display non-additivity for the trait and are also associated with fitness. Epistasis may also play a role, particularly where artificial selection is involved, since evolving combinations of loci may generate new non-additive variation (e.g. Carlborg et al., 2006). However, natural selection in some form seems necessary to ensure the sustainability of populations if the finding that ‘new’ inbreeding is more potent than ‘old’ is a general one. This can be seen by considering a sequence of ‘new’ base populations over time, from which a repeated depression in fitness would be predicted. In contrast, the classical model discounts the magnitude of depression from this new base by an amount that depends on the extent of inbreeding from the old base (see Appendix 1). In the absence of natural selection this repeated depression in fitness would eventually result in the extinction of the population, but persistence is feasible with natural selection. Although individuals with measured phenotypes at time t may exhibit a particular degree of depression relative to an arbitrary base, the merit of their genetic contribution as parents and ancestors for t′>t will increase over time as the frequencies of the deleterious alleles they carry are (generally) reduced in the descendants. This process is identical to that described as purging (Templeton and Read, 1984). There has been continuing debate on its effectiveness (Ballou, 1997); however, a cautious conclusion is that purging occurs, but success in removing deleterious recessive alleles is only partial (Wang et al., 1999; Reed et al., 2003) or is made less effective by changing environments (Bijlsma et al., 1999).

Quantitative mechanistic support for the results of this study can be found among the purging models simulated by Wang et al. (1999), in particular where both selection and dominance coefficients for mutants follow exponential distributions. Such distributions have empirical support (Mackay et al., 1992; Keightley, 1994; Caballero and Keightley, 1994). Wang et al. (1999) showed how alleles with different magnitudes of selection coefficients contributed to inbreeding depression over time, and showed how those with larger selection coefficients may lead to both an initial decrease in fitness and a subsequent recovery in fitness. An ad hoc analysis of the graphical data these authors present (see Wang et al., 1999, Figure 2) shows responses that, when interpreted with our models, predict new inbreeding to be more potent than the old, with the interpretation of natural selection as described in the paragraphs above. For a population of stable structure the expected inbreeding depression will be non-linear in relation to F defined by some arbitrary but fixed base, with the slope decreasing in magnitude towards zero. This shape of fitness-associated inbreeding depression was observed for litter size (Weiner et al., 1992) in a sheep population, which was rapidly inbred (predominantly through offspring–parent mating) in the absence of artificial selection, and in the simulations of Wang et al. (1999). With the considerably larger effective population size in our mouse population this curved response is not clearly evident in Figure 3 but, as described in the Results section, when the shape of the inbreeding depression is allowed to vary completely freely (as in Figure 3) the inbreeding depression can be influenced by the nature of the response to artificial selection.

It should be noted that it is difficult in analyses of real data to fully separate artificial selection from purging, which is a form of natural selection, since the standard analysis of model (4) corrects for selection based upon the trait of analysis in a fully additive infinitesimal model. In non-additive models this correction will be biased because of (i) the non-additivity of gene expression, and (ii) the changes in gene frequency and genetic variance over generations that accompany selection. Although this complicates interpretation, we consider the results on the relative importance of ‘old’ and ‘new’ inbreeding to be sufficiently robust. We conclude that the results of this study are not driven by a structural bias in the data because (i) the control line alone has a qualitatively similar pattern with estimates that are consistent but with large standard errors; (ii) quantitatively similar results were obtained when the low line was excluded, a check made because the selection plateau and the reversed selection might have influenced the estimates; and (iii) when we simulated data with a strict classical depression based upon our population pedigree, there was no distinction between ‘new’ and ‘old’ inbreeding.

The perspectives of Meuwissen and Woolliams (1994) are still valid, whereby a sufficiently large effective population size may reduce the potential rate of depression per generation to a point where it may be fully offset by natural selection (unless the artificial selection being imposed strongly counters natural selection). However, the task of setting a critical or acceptable ΔF for managing genetic variation in optimum genetic contribution selection (Meuwissen, 1997; Grundy et al., 1998) becomes more complex in light of our results. The total inbreeding depression will be influenced not only by the rate itself, a finding that has previously been demonstrated in both experimental (Pedersen et al., 2005) and simulated (Wang et al., 1999) populations; the extent of depression expected may also vary over time.

Conclusion

By considering a long-term selection experiment where rates of inbreeding were maintained approximately constant with Ne ≈ 60, it was possible to contrast ‘old’ and ‘new’ inbreeding without confounding it with ‘slow’ and ‘fast’ inbreeding. This was achieved by considering the impact of moving the base generation. In this mouse population the ‘new’ inbreeding was found to cause more inbreeding depression when at least 25 generations were classified as ‘new’ inbreeding. This was a clear departure from the classical model of inbreeding depression, which predicts that no distinction should have been observed. For this pattern to be general, it was concluded that natural selection in the form of purging must play a role in reducing the impact of deleterious alleles over time, thereby influencing the magnitude of the inbreeding depression observed.