Genetic architecture and lifetime dynamics of inbreeding depression in a wild mammal

Stoffel, M. A.; Johnston, S. E.; Pilkington, J. G.; Pemberton, J. M.

doi:10.1038/s41467-021-23222-9

Download PDF

Article
Open access
Published: 20 May 2021

Genetic architecture and lifetime dynamics of inbreeding depression in a wild mammal

Nature Communications volume 12, Article number: 2972 (2021) Cite this article

7592 Accesses
43 Citations
6 Altmetric
Metrics details

Subjects

Abstract

Inbreeding depression is ubiquitous, but we still know little about its genetic architecture and precise effects in wild populations. Here, we combine long-term life-history data with 417 K imputed SNP genotypes for 5952 wild Soay sheep to explore inbreeding depression on a key fitness component, annual survival. Inbreeding manifests in long runs of homozygosity (ROH), which make up nearly half of the genome in the most inbred individuals. The ROH landscape varies widely across the genome, with islands where up to 87% and deserts where only 4% of individuals have ROH. The fitness consequences of inbreeding are severe; a 10% increase in individual inbreeding F_ROH is associated with a 60% reduction in the odds of survival in lambs, though inbreeding depression decreases with age. Finally, a genome-wide association scan on ROH shows that many loci with small effects and five loci with larger effects contribute to inbreeding depression in survival.

Genetic gains underpinning a little-known strawberry Green Revolution

Article Open access 19 March 2024

Hybrid speciation driven by multilocus introgression of ecological traits

Article Open access 17 April 2024

Complexity of avian evolution revealed by family-level genomes

Article 01 April 2024

Introduction

Inbreeding depression, the reduced fitness of offspring from related parents, has been a core theme in evolutionary and conservation biology since Darwin¹. The detrimental effects of inbreeding on a broad range of traits, individual fitness and population viability have now been recognised across the animal and plant kingdoms^{1,2,3,4,5,6,7,8,9}. With the ongoing decline of animal populations¹⁰ and global habitat fragmentation¹¹, rates of inbreeding are likely to accelerate, and so it is increasingly important to have a detailed understanding of its genetic causes and fitness consequences to inform conservation strategies. However, we still know very little about some of the most fundamental features of inbreeding depression in wild populations^7,12, such as its precise strength, how it varies across life-history stages^13,14,15, and if it is driven by loci with weak and/or strong deleterious effects on fitness. As genomic data proliferates for wild populations, it is increasingly possible to quantify the distribution of effect sizes at loci underpinning inbreeding depression. By determining this genetic architecture, we can improve our understanding of the relationship between inbreeding, purging and genetic rescue^7,16, with important implications for the persistence and conservation of small populations^17,18,19.

Inbreeding decreases fitness because it increases the fraction of the genome which is homozygous and identical-by-descent (IBD). This unmasks the effects of (partially-) recessive deleterious alleles or in rarer cases may decrease fitness at loci with heterozygote advantage^4,20. While the probability of IBD at a genetic locus was traditionally estimated as the expected inbreeding coefficient based on a pedigree^21,22, modern genomic approaches enable us to gain a much more detailed picture. Genome-wide markers or whole-genome sequences are now helping to unravel the genomic mosaic of homo- and heterozygosity and, unlike pedigree-based approaches, capture individual variation in homozygosity due to the stochastic effects of Mendelian segregation and recombination^12,23. This makes it possible to quantify realised rather than expected individual inbreeding, and to measure IBD precisely along the genome^24,25.

An intuitive and powerful way of measuring IBD is through runs of homozygosity (ROH), which are long stretches of homozygous genotypes²⁶. An ROH arises when two IBD haplotypes come together in an individual, which happens more frequently with increasing parental relatedness. The frequency and length of ROH in a population vary along the genome due to factors such as recombination, gene density, genetic drift and linkage disequilibrium^{27,28,29,30,31} and extreme regions can potentially pinpoint loci under natural selection^24,31. Regions with high ROH density, known as ‘ROH islands’³², have low genetic diversity and high homozygosity and have been linked to loci under positive selection in humans³¹. Regions where ROH are rare in the population, known as ‘ROH deserts’, could be due to loci under balancing selection or loci harbouring strongly deleterious mutations under purifying selection^{12,24,33,34,35}. Moreover, the abundance of ROH also varies among ROH length classes, which are shaped by the effective population size (N_e)^30,31,36,37 and can vary in their deleterious allele load^38,39,40,41. Longer ROH are a consequence of closer inbreeding and their relative abundance is indicative of recent N_e. This is because their underlying IBD haplotypes have a most recent common ancestor (MRCA) in the recent past with fewer generations for recombination to break them up. In contrast, shorter ROH are derived from more distant ancestors and their relative abundance in the population reflects N_e further back in time⁴².

Individual inbreeding can be measured as the proportion of the autosomal genome in ROH (F_ROH), which is an estimate of realised individual inbreeding F⁴³. F_ROH has helped to uncover inbreeding depression in a wide range of traits in humans and farm animals^5,29,44 and is often preferable to other SNP-based inbreeding estimators in terms of precision and bias^5,45,46,47. While F_ROH condenses the information about an individual’s IBD into a single number, quantifying the genomic locations of ROH across individuals makes it possible to identify the loci contributing to inbreeding depression and estimate their effect sizes¹². As mapping inbreeding depression in fitness and complex traits requires large samples in addition to dense genomic data³⁷, the genetic architecture of inbreeding depression has mostly been studied in humans and livestock^48,49. However, individual fitness will be different under natural conditions and consequently, there is a need to study inbreeding depression in wild populations to understand its genetic basis in an evolutionary and ecological context. To date, only a handful of studies have estimated inbreeding depression using genomic data in the wild^{13,14,50,51,52,53}. While these studies show that inbreeding depression in wild populations is more prevalent and more severe than previously thought, all of them used genome-wide inbreeding coefficients and did not explore the underlying genetic basis of depression.

The Soay sheep of St. Kilda provides an exceptional opportunity for a detailed genomic study of inbreeding depression. The Soay sheep is a primitive breed that was brought to the Scottish St. Kilda archipelago around 4000 years ago⁵⁴, and has survived on the island of Soay ever since. Although the Soay sheep have been largely unmanaged on Soay, there is written and genomic evidence of an admixture event with the now-extinct Dunface breed ~150 years or 32 generations ago⁵⁵. In 1932, 107 Soay sheep were transferred to the neighbouring island of Hirta where they are unmanaged. On Hirta the population increased and nowadays fluctuates between 600 and 2200 individuals. A part of the population in Village Bay became the subject of a long-term individual-based study, and detailed life history, pedigree and genotype data have been collected for most individuals born since 1985⁵⁴.

Here, we combine annual survival data for 5952 free-living Soay sheep over a 35 year period with 417 K partially imputed genome-wide SNP markers to estimate the precise effects and genetic basis of inbreeding depression. First, we quantify the genomic consequences of inbreeding through patterns of ROH among individuals and across the genome. We then calculate individual genomic inbreeding coefficients F_ROH to model inbreeding depression in annual survival and estimate its strength and dynamics across the lifetime. Finally, we explore the genetic architecture of inbreeding depression using a mixed-model-based genome-wide association scan on ROH to shed light on whether depression is caused by many loci with small effects, few loci with large effects or a mixture of both.

Results

Genotyping and imputation

All study individuals have been genotyped on the Illumina Ovine SNP50 BeadChip assaying 51,135 SNPs. In addition, 189 individuals have been genotyped on the Ovine Infinium High-Density chip containing 606,066 SNPs. To increase the genomic resolution for our analyses, we combined autosomal genotypes from both SNP chips with pedigree information to impute missing SNPs in individuals genotyped at lower marker density using AlphaImpute⁵⁶. Cross-validation showed that imputation was successful, with a median of 99.3% correctly imputed genotypes per individual (Supplementary Table 1). Moreover, the inferred inbreeding coefficients F_ROH were very similar when comparing individuals genotyped on the high-density chip (median F_ROH = 0.239) and individuals with imputed SNPs (median F_ROH = 0.241), indicating no obvious bias in the abundance of inferred ROH based on imputed data (Supplementary Fig. 1). After quality control, the genomic dataset contained 417,373 polymorphic and autosomal SNPs with a mean minor allele frequency (MAF) of 23% (Supplementary Fig. 2) and a mean call rate of 99.5% across individuals.

Patterns of inbreeding in the genome

We first explored how inbreeding and long-term small population size (estimated N_e = 194 ⁵⁷)-shaped patterns of ROH in Soay sheep (Fig. 1). Individuals had a mean of 194 ROH (sd = 11.6) longer than 1.2 Mb, which on average made up 24% of the autosomal genome (i.e. mean F_ROH = 0.24, range = 0.18–0.50, Supplementary Fig. 3). The mean individual inbreeding coefficient F_ROH of sheep born in a given year remained constant over the course of the study period (Supplementary Fig. 4). Among individual variation in ROH length was high: the average ROH in the seven most inbred sheep was more than twice as long as ROH in the seven least inbred sheep (6.83 vs. 2.72 Mb, respectively; Fig. 1a) although the average number of ROH was similar (170 vs. 169, respectively).

**Fig. 1: Runs of homozygosity (ROH) variation among individuals and across the genome.**

The abundance of ROH in Soay sheep also varied considerably among ROH length classes (Fig. 1b). The largest fraction of IBD in the population consisted of ROH between 2.4 and 4.9 Mb originating around 8–16 generations ago, which made up 8.1% of an individual’s genome on average. Long ROH > 19.5 Mb were found in 38.2% of individuals (Fig. 1b). However, long ROH made up on average only 0.6% of the genome of the least inbred individuals with pedigree inbreeding F_ped < 0.1. In contrast, long ROH extended over 7% and 18% of the genome in inbred individuals with F_ped > 0.1 and F_ped > 0.2, respectively (Supplementary Fig. 5).

The frequency of ROH in the population varied widely across the genome (Fig. 1c). We scanned ROH in non-overlapping 500 Kb windows, and classified the 0.5% windows with the highest ROH density as ROH islands and the 0.5% windows with the lowest ROH density as ROH deserts³¹. The top ROH island on chromosome 1 (227–227.5 Mb) contained ROH in 87% of individuals, while only 4.4% of individuals had an ROH in the top ROH desert on chromosome 11 (58.5–59 Mb, see Supplementary Table 2 for a list of the top ROH deserts and islands).

ROH density and recombination rate

The wide variation in ROH density along the genome could be partially explained by recombination because regions with high recombination rate produce shorter ROH, and these are less likely to be detected by ROH calling algorithms³⁰. Notably, recombination by itself does not impact the underlying true proportion of IBD, but only affects the ROH length distribution. Consequently, regions with high recombination could create putative ROH deserts without a change in the true levels of IBD, because short ROH are less likely to be called³⁰. To evaluate how much variation in ROH density along the genome is due to recombination rate variation and how much of it is tracking the underlying levels of IBD, we constructed a linear mixed model with ROH density (proportion of individuals with ROH) measured in 500 Kb windows as response variable, window recombination rate in cM/Mb based on the Soay sheep linkage map⁵⁸ and window SNP heterozygosity as fixed effects as well as a chromosome identifier as a random effect.

Recombination rate and heterozygosity together explained 42% of the variation in ROH density (marginal R² = 0.42, 95% CI [0.40, 0.44], Supplementary Table 3A), with the majority of variation explained by heterozygosity (semi-partial R² = 0.38, 95% CI [0.36, 0.40], Fig. 2b) and only around 4% explained by recombination rate (semi-partial R² = 0.04, 95% CI [0.02, 0.07], Fig. 2a and Supplementary Fig. 6 for a chromosome-wise plot). The pattern is similar when re-running the model only on windows identified as ROH islands and deserts, where ROH density is largely explained by heterozygosity (semi-partial R² = 0.89, 95% CI [0.83, 0.94], Fig. 2d), with only a small proportion of the variation explained by recombination rate variation (semi-partial R² = 0.07, 95% CI [0.01, 0.12], Fig. 2c). Consequently, although recombination rate impacts ROH lengths and hence ROH detection probabilities, this accounts for only a small proportion of the variation in detected ROH density, which mostly reflects the underlying patterns of IBD along the genome.

**Fig. 2: Correlates of ROH density variation across the genome.**

Lastly, we explored how much variation in ROH density was explained by recombination when using different minimum ROH thresholds. We repeated the analysis with a dataset based on a minimum ROH length threshold of 0.4 Mb, and a second dataset with a minimum ROH length of 3 Mb (Supplementary Table 3B, C). Compared to the original dataset with a minimum ROH length of 1.2 Mb, recombination explained less variation in ROH density in the dataset including shorter ROH (semi-partial R² = 0.01, 95% CI [0.00, 0.04]) and more variation in the dataset consisting only of longer ROH (semi-partial R² = 0.08, 95% CI [0.06, 0.11]). Consequently, recombination rate variation has a larger impact on the detected abundance of longer ROH across the genome.

Inbreeding depression in survival

Survival is a key fitness component. In Soay sheep, more than half of all individuals die over their first winter, minimising their chances to reproduce (Supplementary Fig. 7). Sheep survival is assessed through routine mortality checks which are conducted throughout the year. Over 80% of sheep in the study area are found after their death⁵⁰, resulting in a total of 15,889 annual survival observations for 5952 sheep. The distribution of individual inbreeding coefficients F_ROH in different age classes revealed that highly inbred individuals rarely survive their early years of life and never reach old ages (Fig. 3a). However, the strength of inbreeding depression appeared to decline at older ages (Fig. 3b). For example, in sheep older than four years, the proportion of survivors among the most inbred individuals was only marginally lower than among the least inbred individuals (Fig. 3b).

**Fig. 3: Inbreeding depression in annual survival.**

We modelled the strength of inbreeding depression across the lifetime using an animal model with a binomial error distribution and annual survival as a response variable. Overall, the effect of inbreeding on survival was strong: in lambs (age 0), a 10% increase in F_ROH was associated with a 0.4 multiplicative change in the odds of survival (odds ratio, OR [95% credible interval, CI] = 0.40 [0.30, 0.53], Supplementary Table 4), or a 60% reduction (1–0.40) in the odds of survival. This translates into non-linear survival differences on the probability scale. For example, a male non-twin Soay sheep lamb with an F_ROH 10% above the mean had a 23% lower probability of surviving its first winter compared to an average lamb (F_ROH = 0.34 vs F_ROH = 0.24; Fig. 3c). Across the lifetime, the model estimates for the interactions between F_ROH and the different life stages predicted a decrease in the strength of inbreeding depression in later life stages (Fig. 3c) with the largest predicted difference between early (age 1, 2) and late-life (age 5+, OR [95% CI] = 2.03 [1.08, 3.82], Supplementary Table 4).

We next estimated the inbreeding load in Soay sheep as the diploid number of lethal equivalents 2B. Lethal equivalents are a concept rooted in population genetics, where one lethal equivalent is equal to a group of mutations which, if dispersed across individuals, would cause one death on average⁵⁹. We followed suggestions by Nietlisbach et al.⁶⁰ and refitted the survival model with a Poisson distribution and logarithmic link function using a simplified model structure without interactions for better comparability across studies. This gave an estimate of 2B = 4.57 (95% CI 2.61–6.55) lethal equivalents for Soay sheep annual survival.

Genetic architecture of inbreeding depression

To quantify the survival consequences of being IBD at each SNP location, we used a modified genome-wide association study (GWAS). Unlike in traditional GWAS where p-values of additive SNP effects are of interest, we analysed the effects of ROH status for both alleles at every SNP. Specifically, at a diallelic locus, ROH result either from two IBD haplotypes containing allele A or from two IBD haplotypes containing allele B. If strongly deleterious recessive alleles exist in the population, they could be associated with ROH based on allele A haplotypes or ROH based on allele B haplotypes. To test this, we constructed a binomial mixed model of annual survival for each SNP position. In each model, we fitted two ROH status predictors. The first predictor was assigned a 1 if allele A was homozygous and part of an ROH and a 0 otherwise. The second predictor was assigned a 1 if allele B was homozygous and part of an ROH and a 0 otherwise. Model estimates and p-values for these two predictors, therefore, reflect whether ROH are associated with survival consequences at each SNP location and for each allele. In the GWAS model, we also controlled for the additive SNP effect and mean individual inbreeding F_ROH (based on all autosomes except for the focal chromosome), alongside a range of other individual traits and environmental effects (see ‘Methods’ section for details).

A GWAS on allele-wise ROH status can detect deleterious recessive alleles at specific regions when ROH effects reach genome-wide significance. Moreover, the distribution of ROH status effects across the genome can also be informative of the overall number of deleterious recessive alleles contributing to inbreeding depression through ROH. Under the null hypothesis that ROH status does not have an effect on survival at any SNP position, we would expect a 50/50 distribution of negative and positive ROH status estimates due to chance. In contrast, we found many more negative than positive effects of ROH status on survival across the genome than expected by chance (Fig. 4a, b; 465 K neg. vs. 354 K pos.; exact binomial test p = 2.2 * 10⁻¹⁶). Moreover, the proportion of negative ROH effects increases for larger model estimates (Fig. 4a) and smaller p-values (Fig. 4b). We tested this statistically using two binomial generalised linear models (GLMs), with effect direction as a binary response, and model estimate and p-value, respectively, as predictors. ROH effects were more likely to be negative when their model estimate was larger (log-OR [95% CI] = 0.35 [0.344, 0.358]) and when their p-value was smaller (log-OR [95% CI] = −3.82 [−3.84, −3.80]). Consequently, it is likely that a large number of recessive deleterious alleles contribute to inbreeding depression, which manifests in negative ROH effects spread across many loci.

**Fig. 4: GWAS of SNP-wise ROH status effects on annual survival.**

The GWAS revealed genome-wide significant ROH effects in seven regions on chromosomes 3 (two regions), 10, 14, 18, 19 and 23 (Fig. 4c and Supplementary Table 5). In five of these regions, ROH status for one of the alleles was associated with negative effects on survival, likely caused by relatively strongly deleterious recessive alleles. ROH in two further regions on chromosomes 3 and 19 were associated with increased survival probabilities, possibly due to haplotypes with positive effects on survival. To explore the genomic regions with large ROH effects further, we quantified the ROH density and SNP heterozygosity in 2 Mb windows around the top GWAS hits (Supplementary Fig. 8). Strongly deleterious recessive alleles might be expected to occur in regions of elevated heterozygosity where they are rarely expressed in their homozygous state. Heterozygosity was higher than average around the top SNPs on chromosomes 10, 14, 18 and 23, and ROH frequency was lower around the top SNPs on chromosomes 10, 18, 19 and 23, but overall, but we did not observe a convincing pattern of genetic diversity across the five regions harbouring strongly deleterious mutations (Supplementary Fig. 8).

Discussion

The Soay sheep on St. Kilda have existed at a small population size in relative isolation for thousands of years⁵⁴. As a consequence, levels of IBD are high in the population and ROH make up nearly a quarter of the average autosomal Soay sheep genome. Although this is still an underestimate as we only analysed ROH longer than 1.2 Mb, it is three times as high as the average F_ROH estimated across 78 mammal species based on genome-sequence data⁶¹ and only slightly lower than in some extremely inbred and very small populations such as mountain gorillas⁶², Scandinavian grey wolves²⁴ or Isle Royale wolves¹⁷.

The distribution of ROH length classes can provide insights into population history and levels of inbreeding^24,31,36. In Soay sheep, the largest fraction of IBD was comprised of ROH with lengths between 1.2 and 4.9 Mb. These ROH originate from haplotypes around 8–32 generations ago and their relative abundance reflects a smaller N_e of the population in the recent past. While there is considerably higher uncertainty in estimating the time to the MRCA for ROH when these are measured based on physical rather than genetic map lengths¹², this corroborates with historical knowledge: the Soay sheep population was indeed smaller in the early twentieth century when 107 sheep were translocated from the island of Soay to their current location on Hirta, after the last humans left St. Kilda⁵⁴. Current levels of inbreeding were most visible in the variation in long ROH (>19.5 Mb), which made up <1% of the genome of the least inbred individuals on average, but 18% of the genome of highly inbred individuals with pedigree inbreeding coefficients F_ped > 0.2 (Supplementary Fig. 5).

The abundance of ROH in the population also varied substantially across the genome. In the most extreme ROH deserts and islands, only 4% of individuals and up to 87% of individuals had ROH, respectively, compared to 24% on average. These detected ROH islands and deserts could be regions with genuinely low or high levels of IBD due to genetic drift or natural selection, but they might also be a consequence of low or high recombination rates³¹. However, recombination itself cannot change the true abundance of IBD³⁰. Instead, ROH with a given coalescent time will be shorter in regions with high recombination and are less likely to be detected by ROH calling algorithms³⁰. We modelled this and found that only 4% of the variation in ROH density across the genome was explained by variation in recombination rate, but the impact of recombination was greater on the detected densities of longer ROH. Consequently, the association between ROH density and recombination could change with both the minimum ROH threshold and the average inbreeding levels in a population. In line with the low genome-wide effects of recombination on ROH density, many of the ROH islands and deserts also had very similar recombination rates (Fig. 2c). Ruling out recombination as a major driver for ROH islands and deserts opens up the possibility for future studies to compare the extreme ROH density in islands and deserts to expectations under simulated neutral scenarios to test for positive and purifying selection, respectively^24,31.

ROH deserts might for example harbour loci contributing to inbreeding depression, as strongly deleterious alleles are likely to cause ROH to be rare in their genomic vicinity due to purifying selection removing homozygous haplotypes^7,24,31. However, because of the near absence of ROH, genome-wide association analyses are unlikely to pick up deleterious effects due to a lack of statistical power, and indeed none of our top GWAS hits was located in a ROH desert. An alternative option is to test for deficits of homozygous genotypes under expected frequencies, as deployed in farm animals and plants to identify embryonic lethals^33,34,35, though these methods require either an experimental setup or very large sample sizes with tens to hundreds of thousands of individuals. In contrast, ROH islands with very high ROH abundances probably contain very few recessive deleterious alleles, as these are regularly exposed to selection when homozygous and hence likely to be purged from the population. Instead, it is possible that ROH islands have emerged around loci under positive selection^29,31 through hard selective sweeps³⁰.

We have only recently begun to understand the precise consequences of inbreeding for individual fitness in natural populations. In Soay sheep, we found that the odds of survival decreased by 60% with a mere 10% increase in F_ROH, adding to a small yet growing body of genomic studies reporting stronger effects of inbreeding depression in wild populations than assumed in pre-genomics times^{13,14,50,51,53}. Other recent examples include lifetime breeding success in red deer, which is reduced by up to 95% in male offspring from half-sib matings¹³ and lifetime reproductive success in helmeted honeyeaters, which is up to 90% lower with a 9% increase in homozygosity⁵¹. The traditional way to compare inbreeding depression among studies is to estimate the inbreeding load of a population using lethal equivalents⁵⁹, although differences in methodology and inbreeding estimates can make such direct comparisons difficult⁶⁰. We estimated the diploid number of lethal equivalents 2B for Soay sheep annual survival at 4.57 (95% CI 2.61–6.55). While this is a low to moderate inbreeding load compared to the few available estimates from wild mammals obtained from appropriate statistical models⁶⁰, none of these estimates are based on genomic data and they vary in their exact fitness measure as well as the degree to which they control for environmental and life-history variation. As such, average estimates of lethal equivalents might change in magnitude with the increasing use of genomics in individual-based long-term studies.

Inbreeding depression is dynamic across life, and genomic measures are starting to unravel how inbreeding depression affects fitness at different life stages in wild populations^13,14,50. Under the mutation accumulation hypothesis⁶³, the adverse effects of deleterious mutations expressed late in life should become stronger as selection becomes less efficient. Assuming mutation accumulation, inbreeding depression is expected to increase with age too⁶⁴, but empirical evidence is sparse^15,65,66. In contrast, we showed that inbreeding depression in Soay sheep becomes weaker at later life stages. In addition, the sample for each successive age class consists of increasingly outbred individuals (Fig. 3a) due to a higher death rate among inbred individuals earlier in life. This suggests that the effects of intragenerational purging⁶⁷ outweigh mutation accumulation in shaping the dynamics of inbreeding depression across the lifetime.

The effect size distribution of loci underpinning inbreeding depression has to our knowledge not been studied in wild populations using fitness data, although deleterious mutations have been predicted from sequencing data, for example in ibex and Isle Royale wolves^17,18. Theoretical predictions about the relative importance of weakly and strongly deleterious (partially-) recessive alleles will depend on many factors, such as the distribution of dominance and selection coefficients for mutations relative to the effective population size, and the frequency of inbreeding^68,69. However, we could expect that small populations purge largely deleterious recessive mutations more efficiently as these are more frequently exposed to selection in the homozygous state^7,16,19,41, while weakly deleterious mutations can more often drift to higher frequencies. We estimated the effect of ROH status on Soay sheep survival for each of the two alleles at every SNP position within a GWAS framework. The effect size distribution revealed predominantly negative effects of ROH status on survival, particularly towards larger model estimates, showing that many alleles with weakly deleterious effects (or at low frequencies) probably contribute to inbreeding depression in survival.

Associations between ROH and survival reached genome-wide significance in seven regions on six chromosomes. In two of these regions, allele-specific ROH are predicted to increase survival, a fascinating observation we intend to explore in more detail but which is beyond the scope of this manuscript. In five further regions, ROH caused significant depression in survival, presumably due to loci harbouring strongly deleterious recessive alleles. This is unexpected, as Soay sheep have a long-term small population size with an estimated N_e of 197⁵⁷, and strongly deleterious mutations should be rapidly purged. On the one hand, it is possible that genetic drift counteracted the effects of purifying selection and has allowed deleterious mutations to increase in frequency and be detected in a GWAS. On the other hand, a relatively recent admixture event with the Dunface sheep breed around 150 years ago⁵⁵ could have introduced deleterious variants into the population and recent selection has not been efficient enough to purge them from the population yet. Identifying the loci harbouring these strongly deleterious alleles will be challenging as ROH overlapping a given SNP vary in length among individuals, which makes it difficult to pinpoint an exact effect location. Nevertheless, we have shown that it is possible to identify the haplotypes carrying deleterious alleles with large effects. The frequencies of such haplotypes could be monitored in natural populations, and individuals carrying them could be selected against in conservation breeding programs. To sum up, our study shows how genome-wide marker information for a large sample of individuals with known fitness can deepen our understanding of the genetic architecture and lifetime dynamics of inbreeding depression in the wild.

Methods

Study population, pedigree assembly and survival measurements

The Soay sheep (Ovis aries) is a primitive sheep breed descended from Bronze Age domestic sheep and has lived unmanaged on island of Soay in St. Kilda archipelago, Scotland for thousands of years. When the last human inhabitants left St. Kilda in 1932, 107 Soays were transferred to the largest island, Hirta, and have roamed the island freely and unmanaged ever since. The population increased and fluctuates nowadays between 600 and 2200 individuals. A part of the population in the Village Bay area of Hirta (57 49′N, 8 34′W) has been the subject of a long-term individual-based study since 1985⁵⁴. Most individuals born in the study area (95%) are ear-tagged and DNA samples are obtained from ear punches or blood sampling. Routine mortality checks are conducted throughout the year with peak mortality occurring at the end of winter and beginning of spring. Overall, around 80% of deceased animals are found⁵⁰. For the analyses in this paper, survival was defined as dying (0) or surviving (1) from the 1st May of the previous year to the 30th April of that year, with measures available for 5952 individuals from 1979 to 2018. Annual survival data were complete for all individuals in the analysis, as the birth year was known and the death year of an individual was known when it has been found dead during one of the regular mortality checks on the island. We focused on annual measures as this allowed us to incorporate the effects of age and environmental variation.

To assemble the pedigree, we inferred parentage for each individual using 438 unlinked SNP markers from the Ovine SNP50 BeadChip, on which most individuals since 1990 have been genotyped⁷⁰. Based on these 438 markers, we inferred pedigree relationships using the R package Sequoia⁷¹. In the few cases where no SNP genotypes were available, we used either field observations (for mothers) or microsatellites⁷². All animal work was carried out in compliance with all relevant ethical regulations for animal testing and research according to UK Home Office procedures and was licensed under the UK Animals (Scientific Procedures) Act of 1986 (Project License no. PPL70/8818).

Genotyping

We genotyped a total of 7,700 Soay on the Illumina Ovine SNP50 BeadChip containing 51,135 SNP markers. To control for marker quality, we first filtered for SNPs with minor allele frequency (MAF) > 0.001, SNP locus genotyping success >0.99 and individual sheep genotyping success >0.95. We then used the check.marker function in GenABEL version 1.8-0⁷³ with the same thresholds, including identity by state with another individual <0.9. This resulted in a dataset containing 39,368 polymorphic SNPs in 7700 sheep. In addition, we genotyped 189 sheep on the Ovine Infinium HD SNP BeadChip containing 606,066 SNP loci. These sheep were specifically selected to maximise the genetic diversity represented in the full population as described in Johnston et al.⁵⁸. As a quality control, monomorphic SNPs were discarded, and SNPs with SNP locus genotyping success >0.99 and individual sheep with genotyping success >0.95 were retained. This resulted in 430,702 polymorphic SNPs for 188 individuals. All genotype positions were based on the Oar_v3.1 sheep genome assembly (GenBank assembly ID GCA_000298735.1⁷⁴).

Genotype imputation

In order to impute genotypes to high density, we merged the datasets from the 50 K SNP chip and from the HD SNP chip using PLINK v1.90b6.12 with –-bmerge⁷⁵. This resulted in a dataset with 436,117 SNPs including 33,068 SNPs genotyped on both SNP chips. For genotype imputation, we discarded SNPs on the X chromosome and focused on the 419,281 SNPs located on autosomes. The merged dataset contained nearly complete genotype information for 188 individuals who have been genotyped on the HD chip, and genotypes at 38,130 SNPs for 7700 individuals who have been genotyped on the 50 K chip. To impute the missing SNPs, we used AlphaImpute v1.98⁵⁶, which combines information on shared haplotypes and pedigree relationships for phasing and genotype imputation. AlphaImpute works on a per-chromosome basis, and phasing and imputation are controlled using a parameter file (for the exact parameter file, see analysis code). Briefly, we phased individuals using core lengths ranging from 1 to 5% of the SNPs on a given chromosome over 10 iterations, resulting in a haplotype library. Based on the haplotype library, missing alleles were imputed using the heuristic method over five iterations which allowed us to use genotype information imputed in previous iterations. We only retained imputed genotypes for which all phased haplotypes matched and did not allow for errors. We also discarded SNPs with call rates below 95% after imputation. Overall, this resulted in a dataset with 7691 individuals, 417,373 SNPs and a mean genotyping rate per individual of 99.5% (range 94.8–100%).

To evaluate the accuracy of the imputation we used 10-fold leave-one-out cross-validation. In each iteration, we masked the genotypes unique to the high-density chip for one random individual that had been genotyped at high-density (HD) and imputed the masked genotypes. This allowed a direct comparison between the true and imputed genotypes. The imputation accuracy of the HD individuals should reflect of the average imputation accuracy across the whole population because HD individuals were selected to be representative of the genetic variation observed across the pedigree (see Johnston et al.⁵⁸ for details).

ROH calling and individual inbreeding coefficients

The final dataset contained genotypes at 417,373 SNPs autosomal SNPs for 5925 individuals for which annual survival data was available. We called runs of homozygosity (ROH) with a minimum length of 1200Kb and spanning at least 50 SNPs with the --homozyg function in Plink⁷⁵ and the following parameters: --homozyg-window-snp 50 --homozyg-snp 50 --homozyg-kb 1200 --homozyg-gap 300 --homozyg-density 200 --homozyg-window-missing 2 --homozyg-het 2 --homozyg-window-het 2. We chose 1200Kb as the minimum ROH length because between-individual variability in ROH abundance becomes very low for shorter ROH. Moreover, ROH of length 1200Kb extend well above the LD half decay in the population, thus capturing variation in IBD due to more recent inbreeding rather than linkage disequilibrium (Supplementary Fig. 9). The minimum ROH length of 1200Kb also reflects the expected length when the underlying IBD haplotypes had a most recent common ancestor haplotype 32 generations ago, calculated as (100/(2 $*$ g)) cM/1.28 cM/Mb where g is 32 generations and 1.28 is the sex-averaged genome-wide recombination rate in Soay sheep^42,58. To plot the ROH length distribution, we used the same formula to cluster ROH according to their physical length into length classes with expected MRCA ranging from 2 to 32 generations ago (Fig. 1b). Notably, ROH with the same physical length can have different coalescent times in different parts of the genome, causing a higher variance around the expected mean length than ROH measured in terms of genetic map length^12,42. We then calculated individual inbreeding coefficients F_ROH by summing up the total length of ROH for each individual and dividing this by the total autosomal genome length⁴³ (2452 Mb).

ROH landscape and recombination rate variation

To quantify variation in population-wide ROH density and its relationship with recombination rate and SNP heterozygosity across the genome, we used a sliding window approach. For all analyses, we calculated these estimates in 500 Kb non-overlapping sliding windows comparable to similar studies^24,30; each window contained 85 SNPs on average. Specifically, we first calculated the number of ROH overlapping each SNP position in the population using PLINK --homozyg. We then calculated the mean number of ROH overlapping SNPs in 500 Kb non-overlapping sliding windows in the population (Fig. 1c). To estimate the top 0.5% ROH deserts and islands³¹, windows with less than 35 SNPs (the percentile of windows with the lowest SNP density) were discarded. To estimate the impact of recombination rate on ROH frequency across the genome, we then quantified the recombination rate in 500 Kb windows using genetic distances from the Soay sheep linkage map⁷⁶. Window heterozygosity was calculated as the mean SNP heterozygosity of all SNPs in a given window. Next, we constructed a linear mixed model in lme4⁷⁷ with population-wide ROH density (defined as the proportion of individuals with ROH) per window as response, window recombination rate and heterozygosity as fixed effects and chromosome ID as random intercept (Supplementary Table 3). The fixed effects in the model were standardised using z-transformation. We estimated the relative contribution of recombination and heterozygosity to variation in ROH density by decomposing the marginal R² of the model⁷⁸ into the variation explained uniquely by each of the two predictors using semi-partial R² as implemented in the partR2 package⁷⁹, with 95% confidence intervals obtained by parametric bootstrapping.

Modelling inbreeding depression in survival

We modelled the effects of inbreeding depression in annual survival using a Bayesian animal model in INLA⁸⁰. Annual survival data consists of a series of 1 s followed by a 0 in the year of a sheep’s death, or only a 0 if it died as a lamb, and we consequently modelled the data with a binomial error distribution and logit link. We used the following model structure:

$${\rm{Pr }}\left({{\rm{surv}}}_{i}=1\right)=\, {{\rm{logit}}}^{-1}({\beta }_{0}+{{{F}}}_{{{\rm{ROH}}}_{{\rm{i}}}}{\beta }_{1}+{{\rm{earlyLife}}}_{i}{\beta }_{2}+{{\rm{midLife}}}_{i}{\beta }_{3}{+{\rm{lateLife}}}_{i}{\beta }_{4}\\ +{{\rm{sex}}}_{i}{\beta }_{5}+{{\rm{twin}}}_{i}{\beta }_{6}+{{{F}}}_{{{\rm{ROH}}}_{i}}{{\rm{earlyLife}}}_{i}{\beta }_{7}+{{{F}}}_{{{\rm{ROH}}}_{i}}{{\rm{midLife}}}_{i}{\beta }_{8}\\ +{{{F}}}_{{{\rm{ROH}}}_{i}}{{\rm{lateLife}}}_{i}{\beta }_{9}+{\alpha }_{j}^{{\rm{capture}}\; {\rm{year}}}+{\alpha }_{k}^{{\rm{birth}}\; {\rm{year}}}+{\alpha }_{l}^{{\rm{id}}}+{u}_{l}^{{\rm{ped}}})$$

(1)

$${\alpha }_{j}^{{\rm{capture}}\,{\rm{year}}} \sim N(0,\,{\sigma }_{{\rm{year}}}^{2}),\;\qquad\qquad {\mathrm{for}}\,\,j=1,\ldots ,40\\ {\alpha }_{k}^{{\rm{birth}}\,{\rm{year}}} \sim N(0,\,{\sigma }_{{\rm{birth}}\,{\rm{year}}}^{2}), \;\, {\mathrm{for}}\,k=1,\ldots ,40\\ {\alpha }_{l}^{{\rm{id}}} \sim N(0,\,{\sigma }_{{\rm{id}}}^{2}), \; {\mathrm{for}}\,\,l=1,\ldots ,5925\\ {u}_{l}^{{\rm{ped}}} \sim N(0,\,A{\sigma }_{A}^{2}), \; {\mathrm{for}}\,\,l=1,\ldots ,5925$$

Here, ${\rm{Pr }}\left({{\rm{surv}}}_{i}=1\right)$ is the probability of survival for observation i, which depends on the intercept ${\beta }_{0}$, a series of fixed effects ${\beta }_{1}$ to ${\beta }_{9}$, the random effects $\alpha$, which are assumed to be normally distributed with mean 0 and variance ${\sigma }^{2}$ and the breeding values ${u}_{l}^{{\rm{ped}}}$, which have a dependency structure corresponding to the pedigree, with a mean of 0 and a variance of $A{\sigma }_{A}^{2}$, where A is the relationship matrix and ${\sigma }_{A}^{2}$ is the additive genetic variance. We used a pedigree-derived additive relatedness matrix for computational efficiency as has previously been described for INLA models⁸¹. Variance component estimates for additive genetic effects have also previously been shown to be very similar when using a pedigree and an SNP-derived relatedness matrix in Soay sheep⁷⁰. Also, additive genetic variance in Soay sheep fitness components is very low (Supplementary Table 4 and refs. ^72,82). As fixed effects, we fitted the individual inbreeding coefficient F_ROH (continuous), the life stage of the individual (categorical predictor with four levels: lamb (age = 0, reference level), early life (age = 1, 2), mid-life (age = 3, 4), late-life (age = 5+), sex (0 = female, 1 = male)), and a variable indicating whether an individual was born as a twin (0 = singleton, 1 = twin). We also fitted an interaction term of F_ROH with life stage to estimate how inbreeding depression changes across the lifetime. As life stage was fitted as a factor, the model estimates three main effects for the differences in the log-odds of survival between lambs (reference category) and early life, mid-life and late-life respectively. Similarly, the interaction term between F_ROH and life stage estimates the difference in the effect of F_ROH on survival (i.e. inbreeding depression) between lambs and individuals in early life, mid-life and late-life, respectively. As random intercepts, we fitted the birth year of an individual, the observation year to account for survival variation among years and the sheep identity to account for repeated measures. For all random effects, which are estimated in terms of precision rather than the variance in INLA, we used log-gamma priors with both shape and inverse-scale parameter values of 0.5. Supplementary Table 4 gives an overview over the coding and standardisation of the predictors.

Before modelling, we mean-centred^83,84 and transformed F_ROH from its original range 0–1 to 0–10, which allowed us to directly interpret model estimates as resulting from a 10% increase in genome-wide IBD rather than the difference between a completely outbred and a completely inbred individual. Finally, we report model estimates as odds ratios in the main paper, and on the link scale (as log-odds ratios) in the Supplementary Material.

Estimating lethal equivalents

The traditional way of comparing inbreeding depression among studies is to quantify the inbreeding load in terms of lethal equivalents, i.e. a group of genes that would cause on average one death if dispersed in different individuals and made homozygous⁵⁹. However, differences in statistical methodology and inbreeding measures make it difficult to compare the strength of inbreeding depression in terms of lethal equivalents among studies⁶⁰. Here, we used the approach suggested by Nietlisbach et al.⁶⁰ and refitted the survival animal model with a Poisson distribution and a logarithmic link function. We were interested in lethal equivalents for the overall strength of inbreeding depression rather than its lifetime dynamics, so we fitted a slightly simplified animal model with F_ROH, age, age², twin and sex as fixed effects and birth year, capture year, individual id and pedigree relatedness as random effects. The slope of F_ROH estimates the decrease in survival due to a 10% increase in F_ROH, so we calculated the number of diploid lethal equivalents 2B as –(ß_FROH)/0.10 $*$ 2 where ß_FROH is the Poisson model slope for F_ROH.

Mapping inbreeding depression

To map the effects of inbreeding depression in survival across the genome, we used a modification of a genome-wide association study^48,49,85. For each of the ~417 K SNPs, we fitted a binomial mixed model with logit link in lme4⁷⁷ with the following model structure:

$${\rm{Pr }}\left({{\rm{surv}}}_{i}=1\right)=\; {{\rm{logit}}}^{-1}({\beta }_{0}+{{\rm{SNP}}}_{{{\rm{RO}}{{\rm{H}}}_{{\rm{alleleA}}}}_{i}}{\beta }_{1}+{{\rm{SNP}}}_{{{\rm{RO}}{{\rm{H}}}_{{\rm{alleleB}}}}_{i}}{\beta }_{2}+{{\rm{SNP}}}_{{{\rm{ADD}}}_{i}}{\beta }_{3}\\ +{{{F}}}_{{{\rm{RO}}{{\rm{H}}}_{{\rm{mod}}}}_{i}}{\beta }_{4}+{{\rm{age}}}_{i}{\beta }_{5}+{{\rm{age}}}_{i}^{2}{\beta }_{6}+{{\rm{sex}}}_{i}{\beta }_{7}+{{\rm{twin}}}_{i}{\beta }_{8}\\ +{\beta }_{9-16}{{\rm{PC}}}_{1-7}+{\alpha }_{j}^{{\rm{capture}}\; {\rm{year}}}+{\alpha }_{k}^{{\rm{birth}}\; {\rm{year}}}+{\alpha }_{l}^{{\rm{id}}})$$

(2)

$${\alpha }_{j}^{{\rm{capture}}\,{\rm{year}}} \sim N(0,\,{\sigma }_{{\rm{capture}}\,{\rm{year}}}^{2}), \; {\mathrm{for}}\,\,j=1,\ldots ,40\\ {\alpha }_{k}^{{\rm{birth}}\,{\rm{year}}} \sim N(0,\,{\sigma }_{{\rm{birth}}\,{\rm{year}}}^{2}), \; {\mathrm{for}}\,k=1,\ldots ,40\\ {\alpha }_{l}^{{\rm{id}}} \sim N(0,\,{\sigma }_{{\rm{id}}}^{2}), \; {\mathrm{for}}\,\,l=1,\ldots ,5925$$

Where the effects of interest are the two ROH status effects, ${{\rm{SNP}}}_{{{\rm{ROH}}}_{{\rm{alleleA}}}}$ and ${{\rm{SNP}}}_{{{\rm{ROH}}}_{{\rm{alleleB}}}}$. These are binary predictors indicating whether allele A at a given SNP is in an ROH (${{\rm{SNP}}}_{{{\rm{ROH}}}_{{\rm{alleleA}}}}$ = 1) or not (${{\rm{SNP}}}_{{{\rm{ROH}}}_{{\rm{alleleA}}}}$ = 0), and whether allele B at a given SNP is an ROH (${{\rm{SNP}}}_{{{\rm{ROH}}}_{{\rm{alleleB}}}}$= 1) or not (${{\rm{SNP}}}_{{{\rm{ROH}}}_{{\rm{alleleB}}}}$ = 0). These predictors test whether an ROH has a negative effect on survival, and whether the haplotypes containing allele A or the haplotypes containing allele B are associated with the negative effect and therefore carry the putative recessive deleterious mutation. SNP_ADD is the additive effect for the focal SNP (0, 1, 2 for homozygous, heterozygous, homozygous for the alternative allele, respectively), and controls for the possibility that a potential negative effect of ROH status is simply an additive effect. ${{{F}}}_{{{\rm{ROH}}}_{{\rm{mod}}}}$ is the mean inbreeding coefficient of the individual based on all chromosomes except for the chromosome where the focal SNP is located. We fitted F_ROH as we were interested in the effect of ROH status at a certain locus on top of the average individual inbreeding coefficient. Sex and twin are again binary variables representing the sex of the individual and whether it is a twin. Age and age² control for the linear and quadratic effects of age, and are fitted as continuous covariates. Because it was computationally impractical to fit 417 K binomial animal models with our sample size and because the additive genetic variance in survival was very small (posterior mean [95% CI] = 0.29 [0.22, 0.37], see Supplementary Table 4), we did not fit an additive genetic effect. Instead, we used the top 7 principal components of the variance-standardised additive relationship matrix (PC_1–7) as fixed effects⁷⁵. Again, we added birth year, capture year and individual id as random effects. For each fitted model, we extracted the estimated slope of the two SNP_ROH predictors and their p-values, which were calculated based on a Wald-Z test. To determine a threshold for genome-wide significance we used the ‘simpleM’ procedure⁸⁶. The method uses composite linkage disequilibrium to create a SNP correlation matrix and calculates the effective number of independent tests, which was much lower than the number of SNPs (n_eff = 39184) because LD stretches over large distances in Soay sheep (Supplementary Fig. 9). We then doubled this number to 78368, as we conducted two tests per model, and used this value for a Bonferroni correction⁸⁷ of p-values, resulting in a genome-wide significance threshold of p < 6.38 $*$ 10⁻⁷.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All data are available on Zenodo (https://doi.org/10.5281/zenodo.4609701)⁸⁸. Source data are provided with this paper.

Code availability

Code was written in R 3.6.1⁸⁹ and relied heavily on tidyverse 1.3.0⁹⁰, data.table 1.14.0⁹¹ and ggplot2 3.3.3⁹². The full analysis code is available on Zenodo (https://doi.org/10.5281/zenodo.4587676)⁹³ and GitHub (https://github.com/mastoffel/sheep_ID).

References

Darwin, C. The Effect of Cross and Self Fertilization in the Vegetable Kingdom (John Murray, 1876).
Angeloni, F., Ouborg, N. J. & Leimu, R. Meta-analysis on the association of population size and life history with inbreeding depression in plants. Biol. Conserv. 144, 35–43 (2011).
Article Google Scholar
Bozzuto, C., Biebach, I., Muff, S., Ives, A. R. & Keller, L. F. Inbreeding reduces long-term growth of Alpine ibex populations. Nat. Ecol. Evol. 3, 1359–1364 (2019).
Article PubMed Google Scholar
Charlesworth, D. & Willis, J. H. The genetics of inbreeding depression. Nat. Rev. Genet. 10, 783–796 (2009).
Article CAS PubMed Google Scholar
Clark, D. W. et al. Associations of autozygosity with a broad range of human phenotypes. Nat. Commun. 10, 1–17 (2019).
Article ADS CAS Google Scholar
Crnokrak, P. & Roff, D. A. Inbreeding depression in the wild. Heredity 83, 260–270 (1999).
Article PubMed Google Scholar
Hedrick, P. W. & Garcia-Dorado, A. Understanding inbreeding depression, purging, and genetic rescue. Trends Ecol. Evol. 31, 940–952 (2016).
Article PubMed Google Scholar
Keller, L. Inbreeding effects in wild populations. Trends Ecol. Evol. 17, 230–241 (2002).
Article Google Scholar
Ralls, K., Ballou, J. D. & Templeton, A. Estimates of lethal equivalents and the cost of inbreeding in mammals. Conserv. Biol. 2, 185–193 (1988).
Article Google Scholar
Ceballos, G., Ehrlich, P. R. & Dirzo, R. Biological annihilation via the ongoing sixth mass extinction signaled by vertebrate population losses and declines. Proc. Natl Acad. Sci. USA 114, E6089–E6096 (2017).
Article CAS PubMed PubMed Central Google Scholar
Haddad, N. M. et al. Habitat fragmentation and its lasting impact on Earth’s ecosystems. Sci. Adv. 1, e1500052 (2015).
Article ADS PubMed PubMed Central Google Scholar
Kardos, M., Taylor, H. R., Ellegren, H., Luikart, G. & Allendorf, F. W. Genomics advances the study of inbreeding depression in the wild. Evol. Appl. 9, 1205–1218 (2016).
Article PubMed PubMed Central Google Scholar
Huisman, J., Kruuk, L. E., Ellis, P. A., Clutton-Brock, T. & Pemberton, J. M. Inbreeding depression across the lifespan in a wild mammal population. Proc. Natl Acad. Sci. USA 113, 3585–3590 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Chen, N., Cosgrove, E. J., Bowman, R., Fitzpatrick, J. W. & Clark, A. G. Genomic consequences of population decline in the endangered Florida scrub-jay. Curr. Biol. 26, 2974–2979 (2016).
Article CAS PubMed PubMed Central Google Scholar
Keller, L., Reid, J. & Arcese, P. Testing evolutionary models of senescence in a natural population: age and inbreeding effects on fitness components in song sparrows. Proc. R. Soc. B: Biol. Sci. 275, 597–604 (2008).
Article CAS Google Scholar
Robinson, J. A., Brown, C., Kim, B. Y., Lohmueller, K. E. & Wayne, R. K. Purging of strongly deleterious mutations explains long-term persistence and absence of inbreeding depression in island foxes. Curr. Biol. 28, 3487–3494 (2018).
Article CAS PubMed PubMed Central Google Scholar
Robinson, J. A. et al. Genomic signatures of extensive inbreeding in Isle Royale wolves, a population on the threshold of extinction. Sci. Adv. 5, eaau0757 (2019).
Article ADS PubMed PubMed Central Google Scholar
Grossen, C., Guillaume, F., Keller, L. F. & Croll, D. Purging of highly deleterious mutations through severe bottlenecks in Alpine ibex. Nat. Commun. 11, 1–12 (2020).
Article CAS Google Scholar
Kyriazis, C. C., Wayne, R. K. & Lohmueller, K. E. Strongly deleterious mutations are a primary determinant of extinction risk due to inbreeding depression. Evol. Lett. 5, 33–47 (2021).
Hedrick, P. W. What is the evidence for heterozygote advantage selection? Trends Ecol. Evol. 27, 698–704 (2012).
Article PubMed Google Scholar
Pemberton, J. Measuring inbreeding depression in the wild: the old ways are the best. Trends Ecol. Evol. 19, 613–615 (2004).
Article PubMed Google Scholar
Wright, S. Coefficients of inbreeding and relationship. Am. Naturalist 56, 330–338 (1922).
Article Google Scholar
Franklin, I. R. The distribution of the proportion of the genome which is homozygous by descent in inbred individuals. Theor. Popul. Biol. 11, 60–80 (1977).
Article CAS PubMed MATH Google Scholar
Kardos, M. et al. Genomic consequences of intensive inbreeding in an isolated wolf population. Nat. Ecol. Evol. 2, 124–131 (2018).
Article PubMed Google Scholar
Kardos, M., Luikart, G. & Allendorf, F. W. Measuring individual inbreeding in the age of genomics: marker-based measures are better than pedigrees. Heredity 115, 63–72 (2015).
Article CAS PubMed PubMed Central Google Scholar
Broman, K. W. & Weber, J. L. Long homozygous chromosomal segments in reference families from the centre d’Etude du polymorphisme humain. Am. J. Hum. Genet. 65, 1493–1500 (1999).
Article CAS PubMed PubMed Central Google Scholar
Gibson, J., Morton, N. E. & Collins, A. Extended tracts of homozygosity in outbred human populations. Hum. Mol. Genet. 15, 789–795 (2006).
Article CAS PubMed Google Scholar
Wang, H., Lin, C.-H., Chen, Y., Freimer, N. & Sabatti, C. Linkage disequilibrium and haplotype homozygosity in population samples genotyped at a high marker density. Hum. Hered. 62, 175–189 (2006).
Article PubMed Google Scholar
Curik, I., Ferenčaković, M. & Sölkner, J. Inbreeding and runs of homozygosity: A possible solution to an old problem. Livest. Sci. 166, 26–34 (2014).
Article Google Scholar
Kardos, M., Qvarnström, A. & Ellegren, H. Inferring individual inbreeding and demographic history from segments of identity by descent in Ficedula flycatcher genome sequences. Genetics 205, 1319–1334 (2017).
Article PubMed PubMed Central Google Scholar
Pemberton, T. J. et al. Genomic patterns of homozygosity in worldwide human populations. Am. J. Hum. Genet. 91, 275–292 (2012).
Article CAS PubMed PubMed Central Google Scholar
Nothnagel, M., Lu, T. T., Kayser, M. & Krawczak, M. Genomic and geographic distribution of SNP-defined runs of homozygosity in Europeans. Hum. Mol. Genet. 19, 2927–2935 (2010).
Article CAS PubMed Google Scholar
Hedrick, P. W., Hellsten, U. & Grattapaglia, D. Examining the cause of high inbreeding depression: analysis of whole-genome sequence data in 28 selfed progeny of Eucalyptus grandis. N. Phytol. 209, 600–611 (2016).
Article CAS Google Scholar
Jenko, J. et al. Analysis of a large dataset reveals haplotypes carrying putatively recessive lethal and semi-lethal alleles with pleiotropic effects on economically important traits in beef cattle. Genet. Select. Evol. 51, 9 (2019).
Article Google Scholar
VanRaden, P. M., Olson, K. M., Null, D. J. & Hutchison, J. L. Harmful recessive effects on fertility detected by absence of homozygous haplotypes. J. Dairy Sci. 94, 6153–6161 (2011).
Article CAS PubMed Google Scholar
Kirin, M. et al. Genomic runs of homozygosity record population history and consanguinity. PLoS ONE 5, e13996 (2010).
Ceballos, F. C., Joshi, P. K., Clark, D. W., Ramsay, M. & Wilson, J. F. Runs of homozygosity: windows into population history and trait architecture. Nat. Rev. Genet. 19, 220–234 (2018).
Article CAS PubMed Google Scholar
Szpiech, Z. A. et al. Long runs of homozygosity are enriched for deleterious variation. Am. J. Hum. Genet. 93, 90–102 (2013).
Article CAS PubMed PubMed Central Google Scholar
Szpiech, Z. A. et al. Ancestry-dependent enrichment of deleterious homozygotes in runs of homozygosity. Amer. J. Hum. Genet. 105, 747–762 (2019).
Zhang, Q., Guldbrandtsen, B., Bosse, M., Lund, M. S. & Sahana, G. Runs of homozygosity and distribution of functional variants in the cattle genome. BMC Genomics 16, 542 (2015).
Article PubMed PubMed Central CAS Google Scholar
Stoffel, M. A., Johnston, S. E., Pilkington, J. G. & Pemberton, J. M. Mutation load decreases with haplotype age in wild Soay sheep. Preprint at bioRxiv https://doi.org/10.1101/2021.03.04.433700 (2021).
Thompson, E. A. Identity by descent: variation in meiosis, across genomes, and in populations. Genetics 194, 301–326 (2013).
Article CAS PubMed PubMed Central Google Scholar
McQuillan, R. et al. Runs of homozygosity in European populations. Am. J. Hum. Genet. 83, 359–372 (2008).
Article CAS PubMed PubMed Central Google Scholar
Leroy, G. Inbreeding depression in livestock species: review and meta-analysis. Anim. Genet. 45, 618–628 (2014).
Article CAS PubMed Google Scholar
Keller, M. C., Visscher, P. M. & Goddard, M. E. Quantification of inbreeding due to distant ancestors and its detection using dense single nucleotide polymorphism data. Genetics 189, 237–249 (2011).
Article PubMed PubMed Central Google Scholar
Kardos, M., Nietlisbach, P. & Hedrick, P. W. How should we compare different genomic estimates of the strength of inbreeding depression? PNAS 115, E2492–E2493 (2018).
Article CAS PubMed PubMed Central Google Scholar
Gazal, S. et al. Inbreeding coefficient estimation with dense SNP data: comparison of strategies and application to HapMap III. Hum. Hered. 77, 49–62 (2014).
Article PubMed Google Scholar
Ferenčaković, M., Sölkner, J., Kapš, M. & Curik, I. Genome-wide mapping and estimation of inbreeding depression of semen quality traits in a cattle population. J. Dairy Sci. 100, 4721–4730 (2017).
Article PubMed CAS Google Scholar
Pryce, J. E., Haile-Mariam, M., Goddard, M. E. & Hayes, B. J. Identification of genomic regions associated with inbreeding depression in Holstein and Jersey dairy cattle. Genet. Select. Evol. 46, 71 (2014).
Bérénos, C., Ellis, P. A., Pilkington, J. G. & Pemberton, J. M. Genomic analysis reveals depression due to both individual and maternal inbreeding in a free-living mammal population. Mol. Ecol. 25, 3152–3168 (2016).
Article PubMed PubMed Central Google Scholar
Harrisson, K. A. et al. Lifetime fitness costs of inbreeding and being inbred in a critically endangered bird. Curr. Biol. 29, 2711–2717 (2019).
Article CAS PubMed Google Scholar
Hoffman, J. I. et al. High-throughput sequencing reveals inbreeding depression in a natural population. Proc. Natl Acad. Sci. USA 111, 3775–3780 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Niskanen, A. K. et al. Consistent scaling of inbreeding depression in space and time in a house sparrow metapopulation. PNAS 117, 14584–14592 (2020).
Article CAS PubMed PubMed Central Google Scholar
Clutton-Brock, T. H. & Pemberton, J. M. Soay Sheep: Dynamics and Selection in an Island Population (Cambridge University Press, 2004).
Feulner, P. G. D. et al. Introgression and the fate of domesticated genes in a wild mammal population. Mol. Ecol. 22, 4210–4221 (2013).
Article CAS PubMed Google Scholar
Hickey, J. M., Kinghorn, B. P., Tier, B., van der Werf, J. H. & Cleveland, M. A. A phasing and imputation method for pedigreed populations that results in a single-stage genomic evaluation. Genet. Select. Evol. 44, 9 (2012).
Article Google Scholar
Kijas, J. W. et al. Genome-wide analysis of the world’s sheep breeds reveals high levels of historic mixture and strong recent selection. PLoS Biol. 10, e1001258 (2012).
Article CAS PubMed PubMed Central Google Scholar
Johnston, S. E., Bérénos, C., Slate, J. & Pemberton, J. M. Conserved genetic architecture underlying individual recombination rate variation in a wild population of Soay Sheep (Ovis aries). Genetics 203, 583–598 (2016).
Article CAS PubMed PubMed Central Google Scholar
Morton, N. E., Crow, J. F. & Muller, H. J. An estimate of the mutational damage in man from data on consanguineous marriages. Proc. Natl Acad. Sci. USA 42, 855 (1956).
Article ADS CAS PubMed PubMed Central Google Scholar
Nietlisbach, P., Muff, S., Reid, J. M., Whitlock, M. C. & Keller, L. F. Nonequivalent lethal equivalents: models and inbreeding metrics for unbiased estimation of inbreeding load. Evol. Appl. 12, 266–279 (2019).
Article PubMed Google Scholar
Brüniche-Olsen, A., Kellner, K. F., Anderson, C. J. & DeWoody, J. A. Runs of homozygosity have utility in mammalian conservation and evolutionary studies. Conserv. Genet. 19, 1295–1307 (2018).
Article CAS Google Scholar
Xue, Y. et al. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding. Science 348, 242–245 (2015).
Article ADS CAS PubMed PubMed Central Google Scholar
Medawar, P. B. An Unsolved Problem of Biology (H.K. Lewis, 1952).
Charlesworth, B. & Hughes, K. A. Age-specific inbreeding depression and components of genetic variance in relation to the evolution of senescence. Proc. Natl Acad. Sci. USA 93, 6140–6145 (1996).
Article ADS CAS PubMed PubMed Central Google Scholar
Husband, B. C. & Schemske, D. W. Evolution of the magnitude and timing of inbreeding depression in plants. Evolution 50, 54–70 (1996).
Article PubMed Google Scholar
Wilson, A. J. et al. Evidence for a genetic basis of aging in two wild vertebrate populations. Curr. Biol. 17, 2136–2142 (2007).
Article CAS PubMed Google Scholar
Enders, L. S. & Nunney, L. Reduction in the cumulative effect of stress-induced inbreeding depression due to intragenerational purging in Drosophila melanogaster. Heredity 116, 304 (2016).
Article CAS PubMed Google Scholar
Bataillon, T. & Kirkpatrick, M. Inbreeding depression due to mildly deleterious mutations in finite populations: size does matter. Genet. Res. 75, 75–81 (2000).
Article CAS PubMed Google Scholar
Glémin, S. How are deleterious mutations purged? Drift versus nonrandom mating. Evolution 57, 2678–2687 (2003).
PubMed Google Scholar
Bérénos, C., Ellis, P. A., Pilkington, J. G. & Pemberton, J. M. Estimating quantitative genetic parameters in wild populations: a comparison of pedigree and genomic approaches. Mol. Ecol. 23, 3434–3451 (2014).
Article PubMed PubMed Central Google Scholar
Huisman, J. Pedigree reconstruction from SNP data: parentage assignment, sibship clustering and beyond. Mol. Ecol. Resour. 17, 1009–1024 (2017).
Article CAS PubMed PubMed Central Google Scholar
Morrissey, M. B. et al. The prediction of adaptive evolution: empirical application of the secondary theorem of selection and comparison to the breeder’s equation. Evol. Int. J. Org. Evol. 66, 2399–2410 (2012).
Article Google Scholar
Aulchenko, Y. S., Ripke, S., Isaacs, A. & Van Duijn, C. M. GenABEL: an R library for genome-wide association analysis. Bioinformatics 23, 1294–1296 (2007).
Article CAS PubMed Google Scholar
Jiang, Y. et al. The sheep genome illuminates biology of the rumen and lipid metabolism. Science 344, 1168–1173 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Article CAS PubMed PubMed Central Google Scholar
Johnston, S. E., Stoffel, M. A. & Pemberton, J. M. Variants at RNF212 and RNF212B are associated with recombination rate variation in Soay sheep (Ovis aries). Preprint at bioRxiv https://doi.org/10.1101/2020.07.26.217802 (2020).
Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
Nakagawa, S. & Schielzeth, H. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods Ecol. Evol. 4, 133–142 (2013).
Article Google Scholar
Stoffel, M. A., Nakagawa, S. & Schielzeth, H. partR2: partitioning R2 in generalized linear mixed models. Preprint at bioRxiv https://doi.org/10.1101/2020.07.26.221168 (2020).
Rue, H., Martino, S. & Chopin, N. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc. 71, 319–392 (2009).
Article MathSciNet MATH Google Scholar
Holand, A. M., Steinsland, I., Martino, S. & Jensen, H. Animal models and integrated nested Laplace approximations. G3 3, 1241–1251 (2013).
Article PubMed PubMed Central Google Scholar
Johnston, S. E. et al. Life history trade-offs at a single locus maintain sexually selected genetic variation. Nature 502, 93 (2013).
Article ADS MathSciNet CAS PubMed Google Scholar
Gelman, A. & Hill, J. Data Analysis Using Regression and Multilevel/Hierarchical Models. (Cambridge University Press, 2006).
Schielzeth, H. Simple means to improve the interpretability of regression coefficients. Methods Ecol. Evol. 1, 103–113 (2010).
Keller, M. C. et al. Runs of homozygosity implicate autozygosity as a schizophrenia risk factor. PLoS Genet. 8, e1002656 (2012).
Gao, X., Starmer, J. & Martin, E. R. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet. Epidemiol. 32, 361–369 (2008).
Article PubMed Google Scholar
Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979).
Stoffel, M. A., Johnston, S. E., Pilkington, J. G. & Pemberton, J. M. Data for ‘Genetic architecture and lifetime dynamics of inbreeding depression in a wild mammal’. Zenodo https://doi.org/10.5281/zenodo.4609701 (2021).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
Wickham, H. et al. Welcome to the tidyverse. J. Open Source Softw. 4, 1686 (2019).
Article ADS Google Scholar
Dowle, M. & Srinivasan, A. data.table: Extension of ‘data.frame’ (2020).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).
Stoffel, M. A., Johnston, S. E., Pilkington, J. G. & Pemberton, J. M. Code for ‘Genetic architecture and lifetime dynamics of inbreeding depression in a wild mammal’. Zenodo https://doi.org/10.5281/zenodo.4587676 (2021).

Download references

Acknowledgements

We thank the National Trust for Scotland for permission to work on St. Kilda and QinetiQ, Eurest and Kilda Cruises for logistics and support. We thank Ian Stevenson and many volunteers who have helped with data collection and management and all those who have contributed to keeping the project going. SNP genotyping was conducted at the Wellcome Trust Clinical Research Facility Genetic Core. This work has made extensive use of the Edinburgh Compute and Data Facility (http://www.ecdf.ed.ac.uk/). We thank John Hickey, Steve Thorn, Andrew Whalen and Martin Johnsson for help with the imputation. We are grateful for discussions with David Clark, Holger Schielzeth, Matthias Galipaud, Jon Slate, Jisca Huisman, Jarrod Hadfield, Peter Visscher, Anna Hewett, Deborah Charlesworth, Brian Charlesworth and the Wild Evolution Group. We also thank Emily Humble for comments on an earlier draft of the manuscript. The project was funded through an outgoing Postdoc fellowship from the German Science Foundation (DFG) awarded to M.A.S. and a Leverhulme Grant (RPG-2019-072) awarded to J.M.P. and S.E.J. Field data collection has been supported by NERC over many years, and most of the SNP genotyping was supported by an ERC Advanced Grant to JMP.

Author information

Authors and Affiliations

Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
M. A. Stoffel, S. E. Johnston, J. G. Pilkington & J. M. Pemberton

Authors

M. A. Stoffel
View author publications
You can also search for this author in PubMed Google Scholar
S. E. Johnston
View author publications
You can also search for this author in PubMed Google Scholar
J. G. Pilkington
View author publications
You can also search for this author in PubMed Google Scholar
J. M. Pemberton
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

J.M.P. and M.A.S. designed the study. J.G.P. is the main Soay sheep project fieldworker and collected samples and life-history data. J.M.P. has run the long-term Soay sheep study and organised the SNP genotyping. S.E.J. built the core genomic database, including genotyping, quality control and linkage mapping. M.A.S. conducted data analyses and drafted the manuscript. M.A.S., J.M.P. and S.E.J. jointly contributed to concepts, ideas and revisions of the manuscript.

Corresponding author

Correspondence to M. A. Stoffel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Marty Kardos and the other, anonymous, reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Stoffel, M.A., Johnston, S.E., Pilkington, J.G. et al. Genetic architecture and lifetime dynamics of inbreeding depression in a wild mammal. Nat Commun 12, 2972 (2021). https://doi.org/10.1038/s41467-021-23222-9

Download citation

Received: 03 June 2020
Accepted: 29 March 2021
Published: 20 May 2021
DOI: https://doi.org/10.1038/s41467-021-23222-9

This article is cited by

The genetic status and rescue measure for a geographically isolated population of Amur tigers
- Yao Ning
- Dongqi Liu
- Guangshun Jiang
Scientific Reports (2024)
Translating genomic advances into biodiversity conservation
- Carolyn J. Hogg
Nature Reviews Genetics (2024)
Investigating pedigree- and SNP-associated components of heritability in a wild population of Soay sheep
- Caelinn James
- Josephine M. Pemberton
- Sara Knott
Heredity (2024)
Inbreeding depression in Sable Island feral horses is mediated by intrinsic and extrinsic variables
- Julie Colpitts
- Philip Dunstan McLoughlin
- Jocelyn Poissant
Conservation Genetics (2024)
Selection, recombination and population history effects on runs of homozygosity (ROH) in wild red deer (Cervus elaphus)
- Anna M. Hewett
- Martin A. Stoffel
- Josephine M. Pemberton
Heredity (2023)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.