Introduction

When related individuals mate, their offspring are generally less viable, less fertile or smaller than the population mean – a phenomenon known as inbreeding depression. Consequently, inbreeding has been the focus of considerable attention in a number of areas of biology including animal and crop production, human medicine, conservation biology and the evolution of mating systems (Thornhill, 1993; Hedrick and Kalinowski, 2000; Keller and Waller, 2002). Inbreeding depression arises because inbreeding increases the probability that an individual will be (a) homozygous for segregating deleterious recessive alleles and (b) homozygous at loci exhibiting overdominance (Falconer, 1989; Lynch and Walsh, 1998). Deleterious recessive alleles are thought to be the major cause of inbreeding depression (Charlesworth and Charlesworth, 1999).

Inbreeding depression can be inferred in a number of ways (Keller and Waller, 2002). The most straightforward approaches utilise the inbreeding coefficient f (Wright, 1922) – defined as the probability that two alleles at a locus are identical by descent (ibd). For example, relatives can be deliberately mated to produce individuals of known f, which are then compared to outbred individuals from the same population. A similar approach is to use pedigree records to calculate f for all of the individuals in the population. Inbreeding depression is then inferred by regressing phenotype (or log-transformed phenotype) on the inbreeding coefficient (Morton et al, 1956; Lynch and Walsh, 1998).

If inbreeding coefficients are unavailable, an alternative approach is to examine the association between marker heterozygosity (typically measured at 5–10 loci) and phenotypic value. This approach, sometimes termed heterozygosity–fitness correlations (HFCs), originated with the advent of soluble allozyme markers (Allendorf and Leary, 1986; Mitton, 1993; David, 1998). Initial investigations with allozymes sought to address whether genetic variation was maintained by drift or selection; in other words, did genotype at individual loci have a direct effect on fitness? However, in recent years, a number of studies have reported significant relationships between multilocus heterozygosity (hereafter MLH) in noncoding DNA and fitness-related traits in wild populations, with inbreeding depression usually regarded as the most likely explanation for the relationship (Coltman et al, 1999; Marshall and Spalton, 2000; Slate et al, 2000; Amos et al, 2001; Acevedo-Whitehouse et al, 2003). This explanation is intuitively appealing as inbred individuals are expected to be relatively homozygous throughout the genome. However, the inbreeding coefficient and MLH do not measure the same quantity. When two alleles at a locus are ibd, the genotype is said to be autozygous, otherwise the genotype is allozygous. Allozygous genotypes may be homozygous (identical by state) or heterozygous, but in the absence of recent mutation an autozygous genotype is always homozygous (Hartl and Clark, 1997).

Unfortunately, there are (at least) three possible explanations for HFCs (David, 1998; Hansson and Westerberg, 2002), only one of which requires inbreeding. The first is that some or all of the marker loci have a direct effect on fitness with heterozygous genotypes conferring the greatest fitness (ie the locus exhibits overdominance). This explanation can generally be excluded for studies using microsatellite loci as they are usually nonfunctional. The second explanation is that the markers are in physical linkage with either overdominant or dominant loci that influence fitness. If a marker and trait locus are in linkage disequilibrium, then individuals that are heterozygous at the marker locus will also tend to be heterozygous at the trait locus. This explanation is often termed a local effect. A third explanation is that the heterozygosity at marker loci reflects heterozygosity at unlinked trait loci. Such an association is only expected to arise in populations that exhibit variance in inbreeding and is sometimes termed a general effect or identity disequilibrium (Weir and Cockerham, 1973). It is this explanation that is invoked when it is claimed that an HFC indicates inbreeding depression. Note that the first two explanations do not require inbreeding. Spurious HFCs can also arise when individuals are sampled from several populations that exhibit between-population variation in both heterozygosity and nongenetic (eg environmental) components of trait variance. In other words, the HFC is an artefact of population structure.

Given that f and MLH are popular metrics for inferring inbreeding depression, it is surprising that the relationship between the two has only rarely been investigated theoretically (Bierne et al, 2000) or empirically (Hedrick et al, 2001; Curik et al, 2003). Further investigation of the relationship would not only help determine when MLH can be used as a reliable alternative to f, but would also help to ascertain the genetic basis of HFCs (Hansson and Westerberg, 2002). Indeed, most of the available HFC theory predicts that MLH and fitness, under the inbreeding hypothesis, should only correlate as a consequence of their common dependence on f. When only MLH and fitness are known, f remains an implicit variable. But having an independent knowledge of f from pedigrees allows one to test properly the central point of the theory. To our knowledge, no study combines estimates of heterozygosity at a large number of loci dispersed across the genome, accurate knowledge of f using detailed pedigrees, and measures of phenotypic variation. Domestic livestock populations offer opportunities to address this problem because stud books are often available, and comprehensive genotype data sets are arising from genome-wide scans for quantitative trait loci (QTL).

The aims of this study were to:

(i) predict the relationship between MLH and individual f under a simple model whereby the expected heterozygosity at a marker locus is a function of f, and heterozygosity is uncorrelated across marker loci in individuals of the same f;

(ii) test this theoretical prediction with an extensive data set (590 individuals of known f, typed at up to 138 microsatellites) from a domestic sheep population;

(iii) provide some predictions on the relationship between f and MLH in several intensively studied populations, in order to determine how effectively microsatellites can be used to infer inbreeding depression in wild populations.

Methods

The model

It is assumed that the inbreeding level of an individual is characterised by a single f value determined by its pedigree. All loci are assumed to be equally affected by inbreeding. Note that estimates of f based on known pedigrees may slightly differ from the true value of f, as founders are assumed to have an f of zero. However, provided the pedigrees are correct and several generations (three or more) deep, they should enable good estimates of true f. The distribution of f has probability density function p(f).

Let hi be the heterozygosity at marker locus i (0 for a homozygote and 1 for a heterozygote). For a given f, the expected heterozygosity E(hi) at locus i is h0,i(1−f), where h0,i is the genetic diversity at locus i, that is, the expected heterozygosity in the absence of inbreeding.

It is assumed that any correlation among hi at different loci across individuals only arises due to within-population variation in f. In other words, no correlation will be expected in individuals that share the same value of f. This is usually true for unlinked loci. Note that 96% of locus pairs are unlinked in the Coopworth sheep data set presented in this study, with only 0.12% of locus pairs located within 10 cM of each other.
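The two assumptions above (expected heterozygosity h0,i(1−f), and independence across loci for a given f) can be illustrated with a small simulation. This is a minimal sketch, not the authors' code: the function name, the sample sizes, the f values and the genetic diversity of 0.7 are all hypothetical choices made for illustration.

```python
import random

def simulate_heterozygosity(f_values, h0, seed=0):
    """Return one row of 0/1 heterozygosity indicators per individual.

    Locus i is heterozygous with probability h0[i] * (1 - f), drawn
    independently across loci given the individual's f (model assumption).
    """
    rng = random.Random(seed)
    rows = []
    for f in f_values:
        rows.append([1 if rng.random() < h * (1.0 - f) else 0 for h in h0])
    return rows

# Hypothetical inputs: 2000 outbred and 2000 inbred (f = 0.25) individuals, 50 loci.
f_values = [0.0] * 2000 + [0.25] * 2000
h0 = [0.7] * 50
rows = simulate_heterozygosity(f_values, h0)

mean_outbred = sum(map(sum, rows[:2000])) / (2000 * 50)
mean_inbred = sum(map(sum, rows[2000:])) / (2000 * 50)
# Model expectations are 0.7 and 0.7 * (1 - 0.25) = 0.525 respectively.
print(round(mean_outbred, 3), round(mean_inbred, 3))
```

The observed group means should fall close to the model expectations, differing only by sampling noise.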

The term H refers to standardised MLH, calculated as the proportion of typed loci at which an individual was heterozygous divided by the population mean heterozygosity of those typed loci. This standardisation was initially used by Coltman et al (1999) in a study of Soay sheep, where individuals had been typed at different subsets of the same loci. The standardisation ensures that all individuals are measured on an identical scale and has subsequently been employed elsewhere (Slate et al, 2000; Amos et al, 2001). Standardised MLH is usually highly correlated with the more traditional nonstandardised version. From the definition above, it can be written as

$$H = \frac{\sum_i h_i}{\sum_i \bar{h}_i}$$

where $\bar{h}_i$ is the mean heterozygosity of all individuals typed at locus i and summation is across all loci typed in the individual.
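As an illustration of the standardisation, the sketch below computes H for individuals typed at different subsets of loci. It is a minimal example under stated assumptions: the function name, the use of `None` for untyped loci and the toy genotype matrix are hypothetical, and no guard is included for loci that nobody was typed at.

```python
def standardised_mlh(genotypes):
    """genotypes: list of rows; entries are 1 (het), 0 (hom) or None (untyped)."""
    n_loci = len(genotypes[0])
    # Population mean heterozygosity at each locus, over typed individuals only.
    locus_mean = []
    for i in range(n_loci):
        typed = [row[i] for row in genotypes if row[i] is not None]
        locus_mean.append(sum(typed) / len(typed))
    result = []
    for row in genotypes:
        typed_loci = [i for i in range(n_loci) if row[i] is not None]
        observed = sum(row[i] for i in typed_loci)          # loci heterozygous
        expected = sum(locus_mean[i] for i in typed_loci)   # population mean at same loci
        result.append(observed / expected)
    return result

# Hypothetical toy data: four individuals, three loci, two missing genotypes.
toy = [
    [1, 1, None],
    [1, 0, 1],
    [0, 1, 0],
    [1, None, 1],
]
H = standardised_mlh(toy)
print([round(h, 3) for h in H])
```

Because each individual is scaled by the mean heterozygosity of exactly the loci it was typed at, individuals with different missing loci remain comparable.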

The predicted distribution of heterozygosity

The inbreeding coefficient f has a distribution with mean E(f) and variance σ2(f). We first derive the moments of the distribution of multilocus heterozygosity as functions of E(f) and σ2(f). We then use these expressions to derive the expected correlations between f and heterozygosity, and between heterozygosity and fitness.

From the relationship above, the expected heterozygosity at locus i is

$$E(h_i) = h_{0,i}\,[1 - E(f)]$$

and, by definition, E(H)=1.

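Given that hi can only take the values 0 or 1, so that hi2=hi, the variance in hi is

$$\sigma^2(h_i) = E(h_i)\,[1 - E(h_i)] = h_{0,i}\,[1 - E(f)]\,\{1 - h_{0,i}\,[1 - E(f)]\}$$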

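The covariance in heterozygosity among loci i and j is

$$\sigma(h_i, h_j) = E(h_i h_j) - E(h_i)\,E(h_j)$$

where

$$E(h_i h_j) = h_{0,i}\,h_{0,j} \int (1-f)^2\, p(f)\, df = h_{0,i}\,h_{0,j}\,\{[1 - E(f)]^2 + \sigma^2(f)\}$$

and

$$E(h_i)\,E(h_j) = h_{0,i}\,h_{0,j}\,[1 - E(f)]^2$$

which yields

$$\sigma(h_i, h_j) = h_{0,i}\,h_{0,j}\,\sigma^2(f)$$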

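The variance in H is

$$\sigma^2(H) = \frac{\sigma^2\!\left(\sum_i h_i\right)}{\left[\sum_i E(h_i)\right]^2}$$

where

$$\sigma^2\!\left(\sum_i h_i\right) = \sum_i \sigma^2(h_i) + \sum_i \sum_{j \neq i} \sigma(h_i, h_j)$$

Then, after some algebra, we obtain

$$\sigma^2(H) = \frac{\sum_i h_{0,i}\,[1 - E(f)]\,\{1 - h_{0,i}\,[1 - E(f)]\} + \sigma^2(f) \sum_i \sum_{j \neq i} h_{0,i}\,h_{0,j}}{[1 - E(f)]^2 \left(\sum_i h_{0,i}\right)^2} \qquad (1)$$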

This expression involves several terms that depend on intrinsic characteristics of the set of loci analysed (number of loci, genetic diversity and how it varies across loci). In order to increase clarity, we can consider a first-order approximation that makes the simplifying assumption that all loci have the same genetic diversity h0. The expression for the variance in H then becomes

$$\sigma^2(H) = \frac{1}{L\,h_0\,[1 - E(f)]} - \frac{1}{L} + \frac{L-1}{L}\,\frac{\sigma^2(f)}{[1 - E(f)]^2} \qquad (2)$$

where L is the number of typed loci. It is apparent that, as the number of typed loci increases, the third term alone will be a good approximation of the variance in H, which will be roughly proportional to the variance in inbreeding.
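The behaviour of the equal-diversity approximation can be checked numerically. The sketch below assumes the three-term form that follows from the model's stated assumptions (a sampling part, 1/(L·h0·(1−E(f))) − 1/L, plus an inbreeding part, ((L−1)/L)·σ2(f)/(1−E(f))2); the function name and the parameter values are hypothetical.

```python
def var_H(n_loci, h0, mean_f, var_f):
    """Variance of standardised MLH when all loci share genetic diversity h0."""
    # Sampling noise from scoring a finite number of loci; shrinks as 1/L.
    sampling = 1.0 / (n_loci * h0 * (1.0 - mean_f)) - 1.0 / n_loci
    # Signal due to variance in inbreeding; approaches var_f / (1 - mean_f)^2.
    inbreeding = (n_loci - 1) / n_loci * var_f / (1.0 - mean_f) ** 2
    return sampling + inbreeding

# Hypothetical parameter values, chosen only for illustration.
h0, mean_f, var_f = 0.7, 0.05, 0.002
limit = var_f / (1.0 - mean_f) ** 2
for n_loci in (10, 100, 1000):
    total = var_H(n_loci, h0, mean_f, var_f)
    print(n_loci, round(total, 5), round(limit / total, 2))
```

The last column is the share of the variance in H attributable to the limiting inbreeding term; it rises towards 1 only when many loci are typed.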

The predicted correlation in heterozygosity between two loci

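The correlation in heterozygosity between two loci is obtained using the above expressions for the covariance and variance in heterozygosity at loci i and j:

$$r(h_i, h_j) = \frac{\sigma(h_i, h_j)}{\sqrt{\sigma^2(h_i)\,\sigma^2(h_j)}} = \frac{h_{0,i}\,h_{0,j}\,\sigma^2(f)}{\sqrt{\sigma^2(h_i)\,\sigma^2(h_j)}} \qquad (3)$$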

The standard error of the estimated r(hi, hj) is (Zar, 1996)

$$SE(r) = \sqrt{\frac{1 - r^2}{n - 2}} \qquad (3a)$$

where n is the number of individuals sampled.
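To make the two-locus prediction concrete, the sketch below evaluates the expected correlation and its standard error. It assumes the covariance h0,i·h0,j·σ2(f) and the Bernoulli variances implied by the model, together with the usual standard error of a correlation coefficient, sqrt((1 − r2)/(n − 2)). The function names and the locus parameters are hypothetical; the values 0.0045 and n = 590 are taken from the Coopworth results reported later in the text.

```python
import math

def expected_locus_correlation(h0_i, h0_j, mean_f, var_f):
    """Expected correlation in heterozygosity between two unlinked loci."""
    e_i = h0_i * (1.0 - mean_f)          # E(h_i)
    e_j = h0_j * (1.0 - mean_f)          # E(h_j)
    cov = h0_i * h0_j * var_f            # covariance due to variance in f
    return cov / math.sqrt(e_i * (1.0 - e_i) * e_j * (1.0 - e_j))

def correlation_se(r, n):
    """Standard error of a correlation coefficient estimated from n individuals."""
    return math.sqrt((1.0 - r * r) / (n - 2))

# Hypothetical loci with genetic diversity 0.7 in a population with
# mean f = 0.05 and variance in f = 0.002.
r = expected_locus_correlation(0.7, 0.7, 0.05, 0.002)
print(round(r, 4))

# Values reported in the text: expected r = 0.0045 with 590 individuals.
print(round(correlation_se(0.0045, 590), 4))  # 0.0412
```

Even with several hundred individuals, the standard error is roughly an order of magnitude larger than the expected pairwise correlation, which is why single pairs of loci are uninformative.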

The predicted correlation between individual heterozygosity and inbreeding coefficient

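The covariance between heterozygosity and f is

$$\sigma(h_i, f) = E(h_i f) - E(h_i)\,E(f)$$

The first term of this expression reduces to

$$E(h_i f) = h_{0,i} \int (1-f)\,f\,p(f)\,df = h_{0,i}\,[E(f) - E(f^2)]$$

so that the covariance between heterozygosity and f can be written as

$$\sigma(h_i, f) = -h_{0,i}\,\sigma^2(f)$$

The covariance between H and f is

$$\sigma(H, f) = -\frac{\sigma^2(f)}{1 - E(f)}$$

and the correlation between H and f is

$$r(H, f) = -\frac{\sigma(f)}{[1 - E(f)]\,\sigma(H)} \qquad (4)$$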
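The predicted correlation can be compared with a simulation of the model. The sketch below is illustrative only: it assumes equal genetic diversity at all loci, uses the closed form r(H, f) = −σ(f)/((1−E(f))·σ(H)) with the equal-diversity variance of H, and all names, sample sizes and f values are hypothetical. The raw count of heterozygous loci is used in place of H because rescaling by a constant does not change a correlation.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def predicted_r(n_loci, h0, mean_f, var_f):
    """Model prediction for r(H, f) under equal genetic diversity h0."""
    var_h = (1.0 / (n_loci * h0 * (1.0 - mean_f)) - 1.0 / n_loci
             + (n_loci - 1) / n_loci * var_f / (1.0 - mean_f) ** 2)
    return -math.sqrt(var_f) / ((1.0 - mean_f) * math.sqrt(var_h))

rng = random.Random(42)
n_loci, h0 = 30, 0.7
# Hypothetical population: mostly outbred, with some moderately and highly inbred animals.
f_values = [0.0] * 2800 + [0.0625] * 800 + [0.25] * 400
# Number of heterozygous loci per individual, drawn under the model.
mlh = [sum(rng.random() < h0 * (1.0 - f) for _ in range(n_loci)) for f in f_values]

mean_f = sum(f_values) / len(f_values)
var_f = sum((f - mean_f) ** 2 for f in f_values) / len(f_values)
r_sim = pearson(mlh, f_values)
r_pred = predicted_r(n_loci, h0, mean_f, var_f)
print(round(r_sim, 2), round(r_pred, 2))
```

The simulated and predicted values should agree to within sampling error, since the simulation satisfies the model's assumptions by construction.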

The predicted correlation between H and fitness traits

We assume the Morton et al (1956) model for the relationship between inbreeding and fitness traits such that the trait (or the logarithm of the trait) declines as a linear function of f. Thus, W=a−bf, where W is the trait, a is a constant and b is the inbreeding load. b can be estimated by linear regression of W on f.

If we assume that all correlations between heterozygosity and the trait arise as a result of inbreeding, then

$$r(W, H) = r(W, f)\,r(H, f) \qquad (5)$$

r(W, f) is estimated by the regression of the trait on f, while r(H, f) is defined above (equation (4)).
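As a sketch of how the inbreeding load might be estimated in practice, the example below fits the linear model W = a − bf by ordinary least squares. The function name and the noise-free toy data are hypothetical, and real analyses would also fit the other explanatory terms described in the Methods.

```python
def fit_inbreeding_load(f_values, trait_values):
    """Least-squares fit of W = a - b*f; returns (a, b)."""
    n = len(f_values)
    mean_f = sum(f_values) / n
    mean_w = sum(trait_values) / n
    sxx = sum((f - mean_f) ** 2 for f in f_values)
    sxy = sum((f - mean_f) * (w - mean_w) for f, w in zip(f_values, trait_values))
    slope = sxy / sxx
    a = mean_w - slope * mean_f   # intercept: trait value of an outbred individual
    b = -slope                    # inbreeding load: decline in W per unit f
    return a, b

# Hypothetical noise-free data generated with a = 10 and b = 4.
f_values = [0.0, 0.0625, 0.125, 0.25]
trait = [10.0 - 4.0 * f for f in f_values]
a, b = fit_inbreeding_load(f_values, trait)
print(round(a, 6), round(b, 6))
```

The sign convention makes b positive when the trait declines with inbreeding.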

Application of the model to a real data set

Coopworth sheep were developed by crossing the Romney and Border Leicester breeds in New Zealand in the 1950s. The breed society was formally recognised in 1968 and Coopworths are now the second most numerous breed in New Zealand. We investigated a population of Coopworth sheep that was founded from six farms in the 1970s and has been the subject of divergent selection for backfat depth since 1981 (Morris et al, 1997). Subsequently, five F1 (fat × lean) sires have been backcrossed to both the fat and lean lines as part of an experiment to map QTL for morphological traits (Campbell et al, 2003). A total of 590 progeny were typed at up to 138 approximately evenly spaced microsatellite loci, spanning all 26 autosomes. Every individual was measured for a number of morphological traits and various potential explanatory variables were also recorded (see below). Inbreeding coefficients were calculated using the routine PROC INBREED, implemented in SAS (SAS Institute, Cary, NC, USA). For every individual, 7–10 generations of ancestors were known, dating back to the foundation of the selection lines, enabling accurate calculation of f. It was assumed that all founder individuals had an f of zero and were unrelated. MLH at all 138 loci was calculated, and converted to standardised MLH (see Coltman et al (1999) and above). Hereafter, MLH refers to the standardised version (H in the above model). Note that progeny were not genotyped at loci for which the sire was homozygous. Thus the genotype data file was only 73% complete – equivalent to an average of 101 genotypes per individual. The mean number of typed loci per half-sib family ranged from 98 (sire 603) to 106 (sires 610 and 616). In subsequent analyses, we report the expected relationship between f and MLH assuming that 101 loci were typed.

These data were then used to address three questions:

1) What is the relationship between f and MLH?

2) Is heterozygosity correlated between loci?

3) Do either f or MLH explain phenotypic variation?

The relationship between the two measures of inbreeding and 10 morphological traits (empty body weight, hot carcass weight, spleen weight, liver weight, heart weight, backfat depth at the 12th rib, tibia length, carcass length, longissimus dorsi weight and testes weight) was investigated. All traits appeared normally distributed (spleen weight was log-transformed), so univariate general linear modelling was employed.

The following terms were initially included in all models: sire, sex, rearing rank (litter size), maternal selection line (fat or lean), slaughter order (the first animal to be slaughtered on a given day is assigned rank 1, the next is assigned rank 2, etc.) and date of birth. All terms were factorial except slaughter order and date of birth, which were fitted as covariates. Initially, models were constructed with all terms fitted as both main effects and first-order interactions. Statistical significance of each term was assessed by F ratios. A minimal model was constructed by dropping all terms that were not significant at P<0.05. The minimal model was then used as a baseline model, to which genetic terms (f or MLH) were added. Both genetic terms were initially fitted as main effects and as interactions with sire. Note that terms containing f and MLH were not fitted in the same model. A significant interaction term would indicate between-sire variation in the number of segregating partially deleterious recessive alleles. The nine traits that were measurable in both sexes were positively correlated with each other (all correlations P<0.001), so multivariate analysis of variance (MANOVA) was also employed. Statistical analyses were implemented in S-plus 6.0 (Insightful, Seattle, WA, USA).

Application of the model to other data sets

In addition to making a comparison between predictions from the model and observations in an extensive QTL mapping data set, we also examined the likely relationship between f and MLH in a number of other wild and domestic populations. This analysis was restricted to populations for which the mean and variance of f had been estimated, and for which descriptions of microsatellite marker variability were available. The analysis may not be exhaustive, but it does include a number of the best-known vertebrate populations for which inbreeding depression has been reported.

Results

Predictions from the model

We first focus on the question of whether MLH is a good predictor of f. From equation (4), it can be seen that the correlation between these two variables is a function of the mean and variance of f, and of the variance in H. The variance in H is itself dependent on the number of loci typed (see equations (1) and (2)). Given the above, an attempt was made to parameterise the model with realistic estimates of mean(f) and σ2(f). A literature review identified a number of studies for which these statistics were reported (in a few cases the authors were contacted to obtain σ2(f)). The review included most of the types of population that are frequently the focus of inbreeding depression studies, including island populations (Soay sheep, red deer, song sparrows, collared flycatchers, Darwin's finches), captive populations (wolves), re-introductions (Arabian oryx) and domestic organisms (Coopworth sheep, Lipizzan horses). Details of each study population and f summary statistics are reported in Table 1. Note that microsatellite markers have been typed in every species. Mean f ranged from a minimum of 0.002 (collared flycatchers) to 0.103 (wolves) – a 50-fold difference. The variance in f ranged from 0.0005 (collared flycatchers) to 0.0192 (wolves) – a 38-fold difference.

Table 1 Predicted relationship between inbreeding coefficient (f) and MLH in 12 vertebrate populations

Using these parameter estimates, the relative importance of mean(f), σ2(f) and the number of typed loci on the correlation coefficient r(H, f) was examined (see Figure 1a–c). Generally, the relationship between f and H was weak, especially for values of mean(f) and σ2(f) that are most commonly observed in wild and domestic vertebrate populations. The relationship was largely insensitive to mean(f) (see Figure 1a), but was very dependent on σ2(f) (see Figure 1b). This result is unsurprising – no relationship is expected in a population where all individuals have the same f, regardless of the actual value of f. Note that when 10 loci are typed the correlation between f and H is always weak (r<0.5). For populations with σ2(f)<0.005 (nine out of 12 populations considered here meet this criterion), a genome scan of 200 highly variable markers would only produce crude estimates of individual f. In summary, unless the study population has very high variance in f, MLH is only weakly correlated to f.
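The sensitivity of r(H, f) to the variance in f and to the number of loci can be explored with a short parameter sweep. This is a hedged sketch rather than a reproduction of Figure 1: it assumes the equal-diversity form of the model with a mean heterozygosity of 0.7 (as in the Figure 1 caption), while the mean f of 0.05 and the grid of values are hypothetical choices.

```python
import math

def corr_H_f(n_loci, h0, mean_f, var_f):
    """Predicted r(H, f) when all loci share genetic diversity h0."""
    var_h = (1.0 / (n_loci * h0 * (1.0 - mean_f)) - 1.0 / n_loci
             + (n_loci - 1) / n_loci * var_f / (1.0 - mean_f) ** 2)
    return -math.sqrt(var_f) / ((1.0 - mean_f) * math.sqrt(var_h))

h0, mean_f = 0.7, 0.05              # mean_f is an assumed, illustrative value
locus_counts = (10, 50, 200)
table = {}
for var_f in (0.0005, 0.002, 0.005):
    table[var_f] = [corr_H_f(L, h0, mean_f, var_f) for L in locus_counts]
    print(var_f, [round(r, 2) for r in table[var_f]])
```

Each row shows the correlation strengthening as more loci are typed, and each column shows it strengthening as the variance in f increases.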

Figure 1. Estimated correlation coefficient between MLH (H) and inbreeding coefficient (f) as a function of (a) mean f, (b) the variance in f and (c) the number of loci typed (L). It is assumed that each locus has a mean heterozygosity of 0.7. Unless otherwise stated, it is assumed that each individual is typed at 10 loci.

Analysis of the Coopworth sheep data set

Individual inbreeding coefficient was positively skewed and possibly bimodally distributed (Figure 2a), while MLH followed an approximately normal distribution (Figure 2b). f had a mean of 0.052 and a variance of 0.008 (Table 1). In comparison with the other 11 populations reported in Table 1, this gave Coopworth sheep the fourth largest mean f and the ninth largest variance in f. Thus, the population is not unusual in terms of observed levels of inbreeding. Standardised MLH was highly correlated with its unstandardised equivalent (r=0.984).

Figure 2. Distribution of (a) individual inbreeding coefficients and (b) individual MLH among 590 Coopworth sheep.

Relationship between heterozygosity and inbreeding coefficient

MLH was significantly and negatively correlated with f (r=−0.177; P<0.0001; Figure 3). However, the correlation coefficient between the two variables was weak, despite MLH being measured at a far larger number of loci than any similar study. In fact, the observed correlation coefficient between f and H was considerably weaker than that predicted under the model (predicted r=−0.39). Using equation (3a), the expected correlation coefficient between f and MLH has a standard deviation of 0.038, such that the observed r(H, f) was 5.6 standard deviations lower than expected (P<0.001).
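The comparison between observed and predicted correlations can be reproduced from the reported figures. The sketch below assumes the standard error of a correlation coefficient, sqrt((1 − r2)/(n − 2)), evaluated at the predicted r with n = 590 individuals; the variable names are hypothetical.

```python
import math

n = 590               # individuals in the Coopworth data set
r_expected = -0.39    # correlation predicted by the model
r_observed = -0.177   # correlation observed between MLH and f

# Standard error of the expected correlation for a sample of n individuals.
se = math.sqrt((1.0 - r_expected ** 2) / (n - 2))
# Distance between observed and expected values, in standard-deviation units.
deviation = (r_observed - r_expected) / se
print(round(se, 3), round(deviation, 1))  # 0.038 5.6
```

The positive sign of the deviation reflects that the observed correlation is closer to zero than the predicted one.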

Figure 3. Relationship between individual inbreeding coefficient and MLH in 590 Coopworth sheep.

Is heterozygosity correlated across the genome?

For marker heterozygosity to provide an estimate of heterozygosity at functionally important loci, it should also be correlated between individual marker loci. We calculated the correlation coefficients between heterozygosity at pairs of individual loci – for example, the correlation between heterozygosity at locus a and at locus b. The process was repeated for every pair of loci and the sign of the correlation noted. A standard sign test cannot be used to determine whether there were significantly more positive than negative correlations than expected by chance, because not every correlation is independent. Instead, we used the randomisation approach described by Slate and Pemberton (2002). There were more negative correlations (4653) than there were positive correlations (4630), and there were not significantly more positive correlations than expected by chance (P=0.331; based on 10 000 permutations). However, feeding the observed values of marker heterozygosity, mean f and σ2(f) into equation (3), it becomes clear that the power of this test is weak. The expected correlation in heterozygosity between two markers is only 0.0045, with a standard deviation almost 10-fold greater (0.0412). In other words, a negative correlation is expected to be observed almost as often as a positive correlation.

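An alternative solution to the problem is to pool together all of the correlations between the 138 loci within one test. The sum of the covariances in heterozygosity between each pair of loci can be calculated as (Lynch and Walsh, 1998)

$$\sum_{i<j} \sigma(h_i, h_j) = \frac{1}{2}\left[\sigma^2\!\left(\sum_{i=1}^{n} h_i\right) - \sum_{i=1}^{n} \sigma^2(h_i)\right]$$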

where hi is individual heterozygosity at locus i and hj is individual heterozygosity at locus j, and each individual is typed at n=138 loci, yielding σ(hi, hj)=0.51. Note that missing values were replaced by the locus population mean, providing a conservative estimate of the test statistic. To determine the statistical significance of the test statistic, individual heterozygosity was permuted across individuals (sampled without replacement) at each locus, and σ(hi, hj) was recalculated. This process was repeated 10 000 times, and the actual test statistic was not significant (P=0.178). In summary, heterozygosity is not correlated across the genome, indicating that in this population marker heterozygosity would not accurately reflect heterozygosity at unlinked functionally important loci.
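The pooled test can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: it computes the summed pairwise covariance as half the difference between the variance of individual totals and the sum of per-locus variances, assumes complete genotype data (no mean imputation), and uses a one-sided permutation P value. All names and the toy data are hypothetical.

```python
import random

def variance(values):
    """Sample variance with an n - 1 denominator."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)

def sum_pairwise_cov(rows):
    """Sum over locus pairs i < j of cov(h_i, h_j)."""
    totals = [sum(r) for r in rows]
    n_loci = len(rows[0])
    per_locus = sum(variance([r[i] for r in rows]) for i in range(n_loci))
    return (variance(totals) - per_locus) / 2.0

def permutation_p_value(rows, n_perm=1000, seed=1):
    """Permute heterozygosity across individuals within each locus."""
    rng = random.Random(seed)
    observed = sum_pairwise_cov(rows)
    columns = [[r[i] for r in rows] for i in range(len(rows[0]))]
    hits = 0
    for _ in range(n_perm):
        for col in columns:
            rng.shuffle(col)          # breaks any association between loci
        permuted = [list(ind) for ind in zip(*columns)]
        if sum_pairwise_cov(permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: two loci whose heterozygosity is perfectly associated.
toy = [[1, 1] for _ in range(10)] + [[0, 0] for _ in range(10)]
observed, p_value = permutation_p_value(toy, n_perm=500)
print(round(observed, 4), p_value < 0.05)
```

Shuffling within loci preserves each locus's heterozygosity while destroying between-locus associations, which is the null hypothesis being tested.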

For the purposes of this paper, loci are regarded as linked if they map within 50 cM of each other (alleles at linked loci separated by greater distances can be regarded as independently segregating). Among linked pairs of loci, heterozygosity was more often positively correlated than negatively correlated (104 positive versus 68 negative correlations; P=0.003; 1000 permutations of the data). Furthermore, the correlation in heterozygosity was a function of the distance between loci in Kosambi centiMorgans; r(hi, hj)=0.066 − 0.0015 × distance, P=0.013, n=172, r2=0.036 (Figure 4). Thus, individual marker heterozygosity does appear to be an indicator of heterozygosity at linked loci.

Figure 4. Relationship between r(hi, hj) and distance (in Kosambi centiMorgans) for linked pairs of microsatellites that map within 50 cM of each other.

Do MLH or f explain variation in morphological traits?

Univariate analyses did not reveal significant associations between any morphological trait and f fitted as a main effect. However, when fitted as an interaction term with sire, f explained significant variation in the following traits: empty body weight (F4,551=4.19, P=0.002); hot carcass weight (F4,552=4.96, P<0.001); spleen weight (F4,556=4.85, P<0.001); liver weight (F4,559=3.73, P=0.005); heart weight (F4,554=3.68, P=0.006); tibia length (F4,554=2.94, P=0.020); carcass length (F4,559=4.63, P<0.002). Detailed models are presented in Table 2. For all traits, the association was largely driven by a strong effect observed in the half-sib progeny of sire 616, with relatively inbred animals being generally smaller. These data suggest that, relative to the other four sires, sire 616 carries more deleterious recessive alleles at loci that influence growth traits. The significant relationship between f and morphometric variation among the progeny of sire 616 is not attributable to outlying data points. The two terms, f and f × Sire, together explained between 0.75% (backfat depth) and 2.80% (spleen weight) of the variation in any trait.

Table 2 Univariate analysis of 10 morphometric traits examined in a population of Coopworth sheep

MLH was not a significant term in any univariate analysis, whether fitted as a main effect or as an interaction term with sire. In summary, f, but not MLH, revealed inbreeding depression for morphometric traits.

The relationship between f and morphometric variation was also revealed by the multivariate analysis. The interaction between sire and f was a significant term when fitted in the MANOVA of the nine morphometric traits (F36,2208=1.451, P=0.041) but f fitted as a main effect was not (F9,549=0.903, P=0.523). MLH was not a significant term either as a main effect (F9,549=1.5, P=0.144) or as an interaction with sire (F36,2208=0.715, P=0.896). Note that f and MLH were not fitted in the same model for either the univariate or the multivariate analyses.

Predictions for other vertebrate populations

In addition to the Coopworth sheep population, summary statistics relating to f and marker heterozygosity were collected for 11 other populations. These data were then used to estimate the correlation coefficient between f and MLH (a) with the markers that have been typed in the study population to date, and (b) if 100 markers of mean heterozygosity 0.7 were typed. Estimates are presented in Table 1. The population for which MLH was the best predictor of f was Scandinavian wolves with an expected r(H, f)=−0.71 if the 29 documented microsatellites were typed and an expected r(H, f)= −0.90 if 100 loci were typed. The population for which MLH was worst at predicting f was the collared flycatchers (Ficedula albicollis) on the Swedish Island of Gotland, with an expected r(H, f)=−0.08 if the three documented microsatellites were typed and an expected r(H, f)=−0.32 if 100 loci were typed. Generally, heterozygosity would not provide robust estimates of f, even when 100 loci are typed. For example, the expected r(H, f) is weaker than –0.5 for five of the 12 populations and weaker than −0.7 for nine of the populations.

In seven of the populations, r(H, f) had actually been estimated, enabling a comparison between expected and observed correlation coefficients (Table 1). In Scandinavian wolves and Large Ground Finches, the observed and expected correlation coefficients were almost identical. In four of the five other populations, r(H, f)observed was weaker than r(H, f)expected, perhaps due to errors in estimation of f (see Discussion).

Discussion

The primary objective of this study was to establish if and when MLH can be used as a robust surrogate for individual f. A theoretical model and empirical data both suggest that the correlation between MLH and f is weak unless the study population exhibits unusually high variance in f. The Coopworth sheep data set used in this study comprised a considerably larger number of genotypes (590 individuals typed at 138 loci) than any similar study, yet MLH was only weakly correlated to individual f. Furthermore, f explained significant variation in a number of morphometric traits (typically 1–2% of the overall trait variance), but heterozygosity did not. From equation (5), it can be seen that the expected correlation between trait value and MLH is the product of the correlation coefficient between f and the trait (hereafter r(W, f)) and r(H, f). Estimates of the proportion of phenotypic trait variation explained by f are scarce, although from the limited available data 2% seems a typical value (see for example Kruuk et al, 2002; this paper, Table 2). Assuming r(W, f)2=0.02, and given the median value of r(H, f)=−0.21 reported in Table 1, a crude estimate of average r(W, H) is 0.03, which is equivalent to MLH explaining <0.1% of trait variance. These findings are consistent with a recent meta-analysis that reported a mean r(W, H) of 0.09 for life history traits and 0.01 for morphometric traits (Coltman and Slate, 2003). In summary, MLH is a poor replacement for f, such that very large sample sizes are required to detect variance in inbreeding in most populations.
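The crude estimate of r(W, H) given above follows directly from the product of the two correlations. A minimal arithmetic check, using only the values quoted in the text (the variable names are hypothetical):

```python
import math

r_W_f_squared = 0.02   # proportion of trait variance explained by f
r_H_f = -0.21          # median predicted correlation between MLH and f

# Both correlations are negative, so their product is positive.
r_W_H = math.sqrt(r_W_f_squared) * abs(r_H_f)
percent_variance = 100.0 * r_W_H ** 2
print(round(r_W_H, 2), round(percent_variance, 2))  # 0.03 0.09
```

The second figure is the percentage of trait variance that MLH would be expected to explain, which is below 0.1%.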

If MLH has limited power to detect variance in inbreeding, what is the most parsimonious explanation for studies reporting HFCs? One possibility is that marker loci are in physical linkage with loci that determine trait variation, such that detected HFCs are attributable to local effects rather than a general effect caused by variance in inbreeding. In fact, general and local effects reflect the same phenomenon, that is, the existence of deleterious recessives and/or overdominant alleles dispersed throughout individual genomes. Heterozygosity at marker loci will always be more correlated to autozygosity at closely linked fitness loci than at unlinked loci, as linkage is known to increase identity disequilibria. However, depending on how many fitness loci are segregating in a population, and on population structure and history, the contribution of the chromosomal vicinity of a given marker to HFC may or may not be negligible compared to that of the rest of the genome. Models of general effects (such as the above model) assume (i) that inbreeding depression is homogeneously distributed throughout the genome, rather than concentrated in a few loci, and (ii) that all loci are unlinked. When one or both of these conditions are seriously violated, general effect models predict less HFC than there actually is, as local effects take more importance. In such conditions, we would also expect that the contributions of different marker loci to HFC become significantly different, and that linked markers tend to behave in the same way with regard to HFC. It should be noted that although 4% of our locus pairs were linked, only 0.11% of locus pairs were separated by a distance of less than 10 cM. Furthermore, the maximum correlation in heterozygosity observed between any pair of loci was very low (r=0.01), indicating that each of the 138 loci can be regarded as an independent sample of heterozygosity.

Although no association was detected between MLH and trait variation in Coopworth sheep, it is notable that heterozygosity was only correlated between linked loci and that the correlation declined as a function of physical distance (Figure 4). Thus, if an association had been detected, it could potentially be attributable to a local effect. Among studies reporting significant HFCs, a recent analysis of Great Reed Warblers (Acrocephalus arundinaceus) provides the best evidence for local effects being the underlying mechanism (Hansson et al, 2001). The Great Reed Warbler experiment maximised the probability of detecting local effects, as it was conducted within pairs from the same brood (hence each member of a pair had the same f, and general effects were excluded). The more heterozygous member of a pair had greater probability of recruiting to the adult population. Elsewhere, local effects and identity disequilibrium were found to explain simultaneously an association between birth weight and MLH at 71 microsatellites typed in red deer (Cervus elaphus). MLH was positively and significantly associated with birth weight, and heterozygosity was correlated across loci (Slate and Pemberton, 2002). However, heterozygosity at two individual loci explained additional variation in birth weight after MLH at the remaining loci was fitted to the model. The two loci were subsequently shown to be physically linked to birth weight QTL on two separate chromosomes (Slate et al, 2002). Thus, local effects have been demonstrated to be a cause of some HFCs. Note that the studies we refer to concern only vertebrates, the main source of pedigreed data sets. The mating systems of vertebrates (obligate biparental reproduction, frequent postnatal dispersal) may leave less opportunity to generate a high variance in inbreeding than exists in other organisms such as molluscs or self-fertile plants, in which HFC has traditionally been observed. It may be that vertebrate populations are especially favourable situations in which to observe local effects.

It is also clear that there is a publication bias in favour of HFCs of greatest magnitude (Coltman and Slate, 2003). Often HFC studies are conducted simply because the marker data are available, for example, after microsatellites were typed to examine population genetic structure or for parentage analysis. There may be a tendency for spurious associations to be reported in the literature and presented as evidence for inbreeding depression. Significant HFCs are only likely to be caused by inbreeding depression if σ²(f) or r(W, f) is large. Alternatively, those studies that do reveal a significant association may represent the low proportion of experiments expected to generate a significant test statistic despite a lack of power. Therefore, it seems reasonable to conclude that any attempt to infer inbreeding depression via variance in MLH is likely to end in failure even when very large numbers of individuals or markers (or both) are typed. Furthermore, those experiments that do reveal significant HFCs usually reveal little information about the underlying mechanism, and in the absence of additional support do not provide evidence of inbreeding depression.
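
The point about power can be made concrete with a rough calculation. The sketch below assumes that a trait and MLH are associated only through f, so that the expected trait-MLH correlation is the product r(W, f) × r(H, f), and it approximates power with the Fisher z-transformation. The function names and the numerical values of the two correlations are hypothetical, chosen for illustration rather than taken from any data set.

```python
from math import atanh, sqrt
from statistics import NormalDist

def expected_hfc(r_wf, r_hf):
    """Expected trait-MLH correlation if the association acts only via f
    (an assumption of this sketch)."""
    return r_wf * r_hf

def power_to_detect(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r with n
    individuals, using the Fisher z approximation (standard error
    1 / sqrt(n - 3))."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(atanh(r)) * sqrt(n - 3)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical magnitudes: modest inbreeding depression and a weak
# correlation between MLH and f.
r_wf = -0.2
r_hf = -0.1
r_wh = expected_hfc(r_wf, r_hf)

for n in (100, 1000, 10000):
    print(n, round(power_to_detect(r_wh, n), 3))
```

Under these assumed values, power stays close to the nominal significance level until sample sizes reach the thousands, so most significant results from smaller studies would be expected to be false positives or overestimates.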

Overall, the model was a good predictor of the observed correlation between f and MLH (observed r(H, f) = 0.11 + 1.06 × expected r(H, f); r² = 0.78, P < 0.01; df = 5). There was a trend for the observed r(H, f) to be weaker than the expected r(H, f) among those populations for which the comparison was possible. The most likely explanation for this trend is that the pedigrees contained some errors, resulting in errors in estimated f. Alternatively, the founder animals in each pedigree may have had nonzero values of f or been related. Under either scenario, individual f inferred from the pedigree may be inaccurate, adding noise to the observed r(H, f). This explanation is supported by the observation that the Scandinavian Wolf and Large Ground Finch populations provided very similar expected and observed correlation coefficients. The wolves were from a captive population that has been closely managed and is small, making pedigree errors unlikely. Furthermore, the eight founder animals came from four geographical locations (two animals per location) so that each founder can safely be assumed to be unrelated to at least six of the seven other founders (Hedrick et al, 2001). The Finch population was recently founded by a small number of immigrant birds, making accurate pedigree construction relatively straightforward. In contrast, the other populations were all large or were not intensively managed, making inaccuracies in estimated f more likely. Given that observed and expected correlations were only available for seven populations, it seems prudent to avoid drawing more solid conclusions at this stage.

One area for future investigation is whether alternative microsatellite-based variables are superior indicators of f. One measure, mean d², has received recent attention, although theoretical (Tsitrone et al, 2001), empirical (Slate and Pemberton, 2002) and meta-analytical (Coltman and Slate, 2003) studies suggest that this metric is even less useful than heterozygosity. A more promising alternative is the use of methods that estimate the relatedness of an individual's parents from the focal individual's genotype (Ritland, 1996; Lynch and Ritland, 1999; Amos et al, 2001). The potential advantage of such an approach is that allele frequency is incorporated into the measure; for example, an individual homozygous for a rare allele is more likely to be inbred than an individual homozygous for a common allele. Unfortunately, the Coopworth sheep microsatellites were scored in such a way as to make such an analysis impossible.
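
Why allele frequency is informative can be seen from a simple single-locus calculation. For an individual with inbreeding coefficient f, a genotype is homozygous for an allele of frequency p with probability fp (autozygous) plus (1 − f)p² (allozygous but identical by state), so the probability that an observed homozygote is autozygous is f / (f + (1 − f)p). The sketch below evaluates this expression; it is a textbook-style illustration with a hypothetical function name and example values, not the estimators of Ritland (1996) or Lynch and Ritland (1999).

```python
def prob_autozygous_given_homozygous(f, p):
    """P(genotype is autozygous | homozygous for an allele of frequency p)
    for an individual with inbreeding coefficient f.

    Derived from P(autozygous and homozygous) = f * p and
    P(allozygous and homozygous) = (1 - f) * p**2.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return f / (f + (1.0 - f) * p)

# Hypothetical example: offspring of full sibs (f = 0.25).
f = 0.25
rare = prob_autozygous_given_homozygous(f, p=0.05)    # rare allele
common = prob_autozygous_given_homozygous(f, p=0.80)  # common allele
print(round(rare, 3), round(common, 3))
```

Homozygosity for the rare allele makes autozygosity far more probable than homozygosity for the common allele, which is the information that MLH discards by treating all homozygous genotypes alike.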

For the purposes of this analysis, the Coopworth sheep data set was chosen for its magnitude (with respect to the number of animals and loci typed), rather than as an attempt to better understand the genetics of production traits in sheep per se. However, the observed inbreeding depression and/or heterosis in this population is neither surprising nor unprecedented. Inbreeding depression for morphological traits in other breeds has been described elsewhere (Falconer, 1989; Wiener et al, 1992). Furthermore, the two lines from which these sheep were derived showed an asymmetric response to selection for backfat depth (Morris et al, 1997), an observation consistent with a trait having a relatively large dominance variance component (Falconer, 1989; Frankham, 1991; Merilä and Sheldon, 1999), and so being susceptible to inbreeding depression.

To conclude, a theoretical model and an extensive data set suggest that MLH is a poor indicator of f even in populations where inbreeding is common. These findings are consistent with previous investigations that have failed to detect significant HFCs in large, randomly mating populations (Houle, 1989; Savolainen and Hedrick, 1995) or in structured populations (Whitlock, 1993). Furthermore, it is apparent that marker heterozygosity does not always provide a robust estimate of genome-wide heterozygosity but may reflect heterozygosity at linked loci. These issues should always be considered when attempting to detect inbreeding depression in populations for which pedigree records are unavailable.