In Robinson et al. (2022), we employed genomic and simulation analyses to demonstrate that the critically endangered vaquita porpoise has a reduced potential for future inbreeding depression (inbreeding load), and is therefore not doomed to extinction by deleterious genetic factors. Garcia-Dorado and Hedrick (2022) (hereafter, GD&H) critique our analysis for not sufficiently demonstrating that the vaquita has a low inbreeding load and a good chance of recovery in the absence of continued incidental mortality in fishing gillnets.

Our conclusion that the vaquita likely has a very low inbreeding load is supported, first and foremost, by the finding that, owing to its small historical population size, the vaquita carries very few segregating deleterious mutations that can contribute to inbreeding depression. There is widespread agreement that inbreeding depression is overwhelmingly due to the exposure of recessive deleterious mutations that are concealed as heterozygotes (Charlesworth and Willis 2009; Hedrick and Garcia-Dorado 2016). In the relative absence of such mutations, there is simply very little fuel for inbreeding depression. For example, our genomic analysis shows that vaquitas have ~17 heterozygous loss-of-function (LOF) mutations per individual, whereas the blue whale genome has ~248 such mutations. Loss-of-function mutations can, in many cases, have severe effects on fitness because they disrupt the function of protein-coding genes (though see MacArthur and Tyler-Smith 2010). Under the assumption that such LOF mutations are (partially) recessive, a low count of heterozygous LOF mutations as observed in the vaquita implies a low inbreeding load, whereas a high count as observed in the blue whale implies a high inbreeding load.
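The link between heterozygous mutation counts and the inbreeding load can be made explicit with a simplified derivation. This is only a sketch under assumptions that go beyond the text: loci contribute independently, all LOF mutations share a single hypothetical selection coefficient s and dominance coefficient h (with h < 1/2), both species share those same values, and the heterozygous count in one genome approximates its population expectation H. The symbols p_i, q_i, and H are introduced here for illustration.

```latex
% Genotype fitnesses at locus i: AA = 1, Aa = 1 - h_i s_i, aa = 1 - s_i,
% with deleterious allele frequency q_i and p_i = 1 - q_i.
% Haploid inbreeding load (slope of -ln fitness on the inbreeding coefficient F):
B \approx \sum_i s_i \,(1 - 2 h_i)\, p_i q_i

% Expected number of heterozygous deleterious sites per individual:
H = \sum_i 2\, p_i q_i

% With a common s and h across loci (an assumption, not an estimate):
B \approx \frac{H}{2}\, s\,(1 - 2h)

% If s and h are also shared between species, they cancel in the ratio:
\frac{B_{\mathrm{vaquita}}}{B_{\mathrm{blue\ whale}}}
  \approx \frac{H_{\mathrm{vaquita}}}{H_{\mathrm{blue\ whale}}}
  = \frac{17}{248} \approx 0.07
```

Because s and h are not known precisely, this ratio conveys only the qualitative direction of the difference, not an estimate of B in either species.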

GD&H are correct that we do not precisely know the selection (s) or dominance (h) coefficients for putatively deleterious mutations, such that our identification of few segregating deleterious mutations in vaquitas offers only qualitative insight into inbreeding load. This limitation motivated our complementary simulation analysis, where we employ a distribution of s for new mutations that was inferred from our genomic dataset. Although we agree that our model, like any model, makes assumptions that should be critically evaluated, our model is informed by the best-available information on selection and demographic parameters in the vaquita. Below, we review some of the key components of the vaquita and our model that inform our conclusions.

What should be considered a “small” historical population size?

GD&H assert that inbreeding load in the vaquita cannot be low since the species has a large historical population size. Although the historical population size of the vaquita may seem large relative to the very small recent effective population sizes (Ne) that are observed in endangered species, it is not large when considered in the broader context of long-term Ne in mammals or other taxa. Long-term Ne in mammals (defined here as Ne = π/(4μ) where π is heterozygosity and μ is the mutation rate) is typically on the order of tens to hundreds of thousands, and rarely below 5000. For example, using published estimates of genome-wide heterozygosity in 42 species of mammals, we recently estimated a median long-term Ne of 21,875 (Kyriazis et al. 2022). Based on our estimate of the mutation rate in vaquitas of μ = 5.8e−9 and estimate of π = 9.04e−5 (Robinson et al. 2022), vaquitas are estimated to have a long-term Ne of 3896, the second lowest of the 42 species in this dataset (Kyriazis et al. 2022). This finding is also supported by fitting more complex non-equilibrium demographic models to genomic datasets for the vaquita, which similarly estimate Ne < 5000 going back tens or hundreds of thousands of years (Morin et al. 2021; Robinson et al. 2022). The species with the smallest long-term Ne in this dataset, the Channel Island fox, has been previously shown to exhibit no phenotypic signs of inbreeding depression despite having experienced severe recent bottlenecks (Robinson et al. 2018). By contrast, North American gray wolves have a large long-term Ne of ~92,000 (assuming μ = 4.5e−9 (Koch et al. 2019) and π = 1.65e−3 (Robinson et al. 2019)), which may help explain numerous observed instances of severe inbreeding depression in the species (Fredrickson et al. 2007; Räikkönen et al. 2009; Robinson et al. 2019).
Thus, although the historical vaquita population size may seem large relative to the current size of the population, it is still vastly smaller than the population sizes observed in most other species of mammals.
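The long-term Ne values quoted above follow directly from Ne = π/(4μ). The short sketch below reproduces the arithmetic for the vaquita and the gray wolf using the π and μ values given in the text; the function name `long_term_ne` is simply an illustrative label.

```python
def long_term_ne(pi: float, mu: float) -> float:
    """Long-term effective population size from Ne = pi / (4 * mu).

    pi: genome-wide heterozygosity (per site)
    mu: mutation rate (per site per generation)
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


# Values quoted in the text above.
vaquita_ne = long_term_ne(pi=9.04e-5, mu=5.8e-9)
wolf_ne = long_term_ne(pi=1.65e-3, mu=4.5e-9)

print(f"vaquita long-term Ne:   {vaquita_ne:,.1f}")
print(f"gray wolf long-term Ne: {wolf_ne:,.1f}")
print(f"wolf / vaquita ratio:   {wolf_ne / vaquita_ne:.1f}")
```

Under this definition, the gray wolf estimate is more than twenty times the vaquita estimate, which is the contrast the paragraph above relies on.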

Although these long-term effective population sizes may not be representative of the current size of many threatened or endangered populations, they are essential for modelling the inbreeding load in a species. This is because long-term demographic processes have a major impact on patterns of segregating (recessive) deleterious variation, the key determinant of the inbreeding load. Though recent declines can influence patterns of segregating variation and inbreeding load, these effects often take tens or hundreds of generations to manifest. For example, as we show in our analysis, the recent and dramatic decline in the vaquita over the past ~35 years or ~3 generations does not appear to have impacted genetic diversity or inbreeding load in the species (Robinson et al. 2022). This is likely the case for many other large mammals that have experienced declines over the past century, given the long generation times typical of these species. Thus, low inbreeding loads in species with long generation times are likely a product of small long-term historical population sizes, rather than recent human-mediated declines. However, the extent to which this is true for a given species will depend on the duration and severity of decline, something that can readily be assessed using simulations.

What should be considered a “typical” inbreeding load?

A useful approach for determining whether computational models of inbreeding depression are reasonable is to compare the predicted inbreeding load (B) from such models to those that have been estimated from natural populations (note that we report values in terms of the diploid inbreeding load [2B] in Robinson et al. (2022) but report the haploid inbreeding load [B] here to be consistent with GD&H). GD&H cite an average estimate from O’Grady et al. (2006) of B = 6, derived from an analysis of 16 existing inbreeding load estimates, as being a typical inbreeding load for wild populations. Based on this result, they then claim that our model-based prediction of B = 0.48 for the vaquita is unreasonably low. However, it has previously been shown that the inbreeding load estimate from O’Grady et al. (2006) is unreliable and upwardly biased, in part due to methodological issues associated with using generalized linear models with a logit link to estimate the inbreeding load (Nietlisbach et al. 2019). Specifically, Nietlisbach et al. (2019) found that the three highest inbreeding load values reported by O’Grady et al. (2006) are overestimates due to biased statistical models or issues with the original datasets (note that many of these same concerns apply to estimates reported in Hedrick and Garcia-Dorado (2016)). Moreover, the average estimate from O’Grady et al. (2006) is also likely to be upwardly biased due to their approach of summing together inbreeding load estimates from different life stages, an approach that ignores widespread pleiotropy for mutations underlying fitness (Pickrell et al. 2016; Boyle et al. 2017). This approach can contribute to upward bias by potentially double or triple counting the effects of recessive deleterious mutations that contribute to inbreeding depression at different life stages (see Kyriazis et al. (2022) for further discussion).

Based on 22 estimates that are deemed to be reliable and unbiased, Nietlisbach et al. (2019) instead report a median inbreeding load for survival to sexual maturity in vertebrates of B = 2.25. Although our predicted inbreeding load of B = 0.48 is somewhat lower than this median, this is expected given the small historical population size and low levels of segregating (recessive) deleterious variation in the vaquita. Importantly, we note that this estimate of B = 2.25 is based on only 13 species, nearly all of which are birds, and should therefore be interpreted cautiously. Thus, obtaining additional high-quality estimates of the inbreeding load from wild populations represents a key area of future research that will enable better assessment of simulation models (see Kyriazis et al. 2022, for further discussion). Nevertheless, based on available evidence, our model predicts an inbreeding load that is consistent with reliable estimates from natural populations.
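To give a sense of what these values of B mean in practice, the sketch below uses the standard lethal-equivalents model, in which an individual with inbreeding coefficient F has expected fitness exp(−B·F) relative to a non-inbred individual. Applying this model here is an assumption made for illustration, and the three estimates refer to different fitness components, so the comparison is qualitative only. The choice of F = 0.25 (offspring of a full-sibling mating) is likewise illustrative.

```python
import math


def relative_survival(b_haploid: float, f: float) -> float:
    """Expected survival at inbreeding coefficient f, relative to a
    non-inbred individual, under the exp(-B * F) model."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("inbreeding coefficient must lie in [0, 1]")
    return math.exp(-b_haploid * f)


F_FULL_SIB = 0.25  # illustrative: offspring of a full-sibling mating

# Haploid inbreeding loads (B) quoted in the text above.
loads = {
    "vaquita model (Robinson et al. 2022)": 0.48,
    "vertebrate median (Nietlisbach et al. 2019)": 2.25,
    "average cited by GD&H (O'Grady et al. 2006)": 6.0,
}

for label, b in loads.items():
    surv = relative_survival(b, F_FULL_SIB)
    # 2B is the diploid inbreeding load, as reported in Robinson et al. (2022).
    print(f"{label}: B = {b}, 2B = {2 * b}, relative survival = {surv:.2f}")
```

The same helper can be evaluated at other values of F to see how quickly the three loads diverge as inbreeding accumulates.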

How should we estimate selection and mutation parameters?

GD&H claim that our selection and dominance parameters are incorrect because they differ from estimates of selection and dominance parameters derived from experiments in Drosophila (e.g., Simmons and Crow 1977; Pérez-Pereira et al. 2021, 2022). However, such experimental estimates are well known to be biased towards strongly deleterious variation, as mutations with more mild effects cannot be observed in an experimental setting (Davies et al. 1999; Eyre-Walker and Keightley 2007). These experimental approaches are also limited due to issues of identifiability of selection and mutation parameters (Lynch et al. 1999; Eyre-Walker and Keightley 2007; Halligan and Keightley 2009). In other words, the distributions of observed fitness in the experimental populations can be explained by a high mutation rate and a low strength of selection (s) or vice versa, making interpretation of these experimental results challenging. Finally, experimental approaches are also only possible for laboratory organisms such as Drosophila or yeast, thus their relevance for understanding deleterious mutation parameters in natural populations of mammals such as the vaquita is unclear.
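The identifiability problem can be illustrated with a deliberately simplified calculation. In a mutation-accumulation setting, the expected per-generation decline in mean fitness is approximately the genomic deleterious mutation rate multiplied by the mean effect of a mutation, so very different parameter pairs can produce the same decline. The parameter values below are hypothetical and are not taken from any of the cited studies; the sketch is not the inference procedure those studies used.

```python
import math


def mean_fitness_decline(u: float, s_mean: float) -> float:
    """Approximate per-generation decline in mean fitness when new
    deleterious mutations accumulate: mutation rate times mean effect."""
    if u < 0 or s_mean < 0:
        raise ValueError("u and s_mean must be non-negative")
    return u * s_mean


# Hypothetical scenario 1: many mutations of mild effect.
many_mild = mean_fitness_decline(u=1.0, s_mean=0.002)

# Hypothetical scenario 2: few mutations of severe effect.
few_severe = mean_fitness_decline(u=0.02, s_mean=0.1)

print(f"many mild mutations:  decline = {many_mild:.4f} per generation")
print(f"few severe mutations: decline = {few_severe:.4f} per generation")
print("indistinguishable by mean decline alone:",
      math.isclose(many_mild, few_severe))
```

Additional observations, such as the between-line variance in fitness, constrain the parameters further but do not fully resolve the ambiguity when mutational effects vary.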

These limitations have motivated the increasing use of sequence-based estimates of the distribution of s, which leverage genetic variation datasets to estimate selection parameters while controlling for demography (Eyre-Walker and Keightley 2007). Such approaches are widely employed in population genetics (Eyre-Walker et al. 2006; Boyko et al. 2008; Ma et al. 2013; Chen et al. 2017; Huber et al. 2017, 2018; Kim et al. 2017; Tataru et al. 2017; Huang et al. 2021) and give a much more complete picture of the full spectrum of deleterious mutations, from strongly to mildly deleterious. Nevertheless, these approaches do have limitations in that they are not well suited for estimating the proportion of lethal mutations (Wade et al. 2022) and often assume additivity during inference (though see Huber et al. (2018)). Addressing these limitations is an area of ongoing research.

Given these limitations of sequence-based approaches, we explored a variety of dominance models in our analysis, as well as models with an additional proportion of lethal mutations (see Fig. S21 and Table S6 in Robinson et al. 2022). We also explored results when assuming a selection and dominance model proposed by Kardos et al. (2021) that is similar to that of Pérez-Pereira et al. (2022) (see Kyriazis et al. 2022, for a detailed comparison of these models). In all cases, we found that vaquita recovery was still the likely outcome when assuming a 90% reduction in bycatch mortality rates (Robinson et al. 2022). In fact, we observed much lower predicted extinction rates under the model proposed by Kardos et al. (2021), perhaps due to more efficient purging in models with a high fraction of lethal mutations (Robinson et al. 2022). In sum, models with a higher lethal mutation rate, like those favored by GD&H, also support our main conclusion that recovery is possible.

Finally, we note that we did not consider the impact of non-coding deleterious mutations, as these mutations are generally inferred to be weakly deleterious (s on the order of 1e−3; Torgerson et al. 2009; Murphy et al. 2021; Dukler et al. 2022) and therefore not highly relevant to modelling inbreeding depression. Indeed, the analysis of Pérez-Pereira et al. (2021) supports this assumption by suggesting that such deleterious mutations, though relevant over evolutionary timescales, are not particularly relevant in a conservation context. Future work should aim to refine estimates of the strength of selection against non-coding deleterious mutations and explore their potential impact on genetic load and extinction risk.

How can we validate simulation models?

Although the above verbal arguments serve as justification for the simulation analysis we present in Robinson et al. (2022), several approaches can be employed to more quantitatively validate and compare selection and dominance models informed by experimental versus sequence-based studies. These include comparing proposed models in terms of (1) how well they agree with patterns of genetic variation in sequencing datasets and (2) how well they agree with reliable empirical estimates of the inbreeding load.

In Kyriazis et al. (2022), we undertook this task using humans as a focal species. Humans are useful for this exercise because there are extensive genetic variation datasets (Auton et al. 2015), published demographic models (e.g., Gutenkunst et al. 2009; Gravel et al. 2011; Li and Durbin 2011; Tennessen et al. 2012), estimates of mutation rates and coding sequence length (Keightley 2012), estimates of the distribution of s (Eyre-Walker et al. 2006; Boyko et al. 2008; Kim et al. 2017), and estimates of the inbreeding load (Bittles and Neel 1994) and number of segregating recessive lethals per individual (Gao et al. 2015). Moreover, humans are much more closely related to the vaquita than Drosophila and have a long-term Ne that is typical for mammals (Ne = ~17,000). In Kyriazis et al. (2022), we leveraged these existing estimates of human demography, mutation rates, and coding sequence length to compare proposed selection and dominance models from Pérez-Pereira et al. (2022) and Kardos et al. (2021) to a model we previously presented in Kyriazis et al. (2021), which is similar to that of Robinson et al. (2022). In Kyriazis et al. (2022), we also propose a new model that better incorporates the impacts of recessive lethal mutations (Wade et al. 2022).

When comparing these various selection and dominance models, we found that sequence-based models, such as the model used in Robinson et al. (2022), make predictions that are consistent with empirical estimates of the inbreeding load in humans, whereas models based on experimental approaches do not (Kyriazis et al. 2022). For example, our model proposed in Kyriazis et al. (2022) predicts an inbreeding load of B = 3.2 and ~0.9 recessive lethal mutations per diploid. These predictions are compatible with existing evidence in humans (note that Bittles and Neel (1994) estimate B = 0.7 for humans, though this is likely to be an underestimate as it is based only on juvenile mortality). By contrast, the model proposed by Pérez-Pereira et al. (2022) predicts an inbreeding load of B = 14, vastly exceeding available estimates in humans. The Pérez-Pereira et al. (2022) model also predicts ~12 recessive lethal mutations per individual, whereas available estimates are on the order of 0.6 mutations per diploid (Gao et al. 2015).

Comparing predicted patterns of genetic variation from these models to those observed in humans also provides support for sequence-based models. Specifically, the Pérez-Pereira et al. (2022) model predicts a large overabundance of rare mutations, with 72.8% of nonsynonymous mutations predicted to be singletons (variants with frequency 1/2n in a sample). However, only 56.8% of such mutations are observed to be singletons in the European sample from the 1000 Genomes dataset (Auton et al. 2015; Kyriazis et al. 2022). This large excess of rare mutations predicted by Pérez-Pereira et al. (2022) is a consequence of the extreme strength of negative selection in the model, which results in deleterious mutations being kept at low frequency. Importantly, the model we propose in Kyriazis et al. (2022) predicts 57.3% of mutations to be singletons, in good agreement with observed patterns of genetic variation in humans (Auton et al. 2015).
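The singleton proportion used in this comparison is a simple summary of the site frequency spectrum (SFS). The sketch below shows how it is computed from an unfolded SFS and, as a reference point, evaluates it for the standard neutral constant-size expectation, in which the number of sites with i derived copies is proportional to 1/i. The function names are illustrative, and the neutral constant-size baseline is an assumption introduced here; the model comparisons described above instead account for human demography and selection.

```python
from typing import List, Sequence


def singleton_fraction(sfs: Sequence[float]) -> float:
    """Fraction of segregating sites that are singletons.

    sfs[k] is the number of sites at which the derived variant is
    present in k + 1 copies in the sample (unfolded SFS).
    """
    total = sum(sfs)
    if total <= 0:
        raise ValueError("SFS must contain at least one segregating site")
    return sfs[0] / total


def neutral_sfs(num_chromosomes: int) -> List[float]:
    """Relative expected SFS under the standard neutral model:
    proportional to 1/i for i = 1 .. num_chromosomes - 1."""
    if num_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    return [1.0 / i for i in range(1, num_chromosomes)]


# Neutral baseline for a small illustrative sample of 10 chromosomes.
baseline = singleton_fraction(neutral_sfs(10))
print(f"neutral singleton fraction (10 chromosomes): {baseline:.3f}")
```

Stronger purifying selection holds deleterious variants at lower frequencies and so raises the singleton fraction, which is why an overly strong selection model predicts too many singletons.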

Overall, these findings are consistent with previous research demonstrating that selection parameters derived from experimental studies in Drosophila and other taxa are biased towards strongly deleterious mutations (Davies et al. 1999; Eyre-Walker and Keightley 2007; Kyriazis et al. 2022). Thus, this analysis helps validate the use of sequence-based estimates of selection parameters, such as those employed in Robinson et al. (2022), given that they are consistent both with patterns of genetic variation and empirical estimates of the inbreeding load.

Other concerns

GD&H critique our simulations for ignoring stochastic environmental and demographic factors and not modelling a loss of adaptive potential in the vaquita. However, our analysis does incorporate demographic stochasticity, as we model survival and reproduction probabilistically. Although we do not model environmental stochasticity or loss of adaptive potential, the threat that these factors pose to the vaquita, if any, is entirely unknown. Moreover, we emphasize that the aim of our analysis was to test the assumption that the vaquita is doomed to extinction by inbreeding depression. We do not interpret our model projections beyond demonstrating the qualitative result that inbreeding depression alone does not impede recovery in the species, as we agree with the general view that population viability models should be interpreted cautiously in terms of their ability to accurately predict future population sizes (Beissinger and Westphal 1998). Though incorporating factors such as environmental stochasticity may influence our model predictions, they would not change our central conclusion that recovery remains possible.
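To illustrate what modelling survival and reproduction probabilistically means, the toy sketch below draws each individual's survival and each survivor's reproduction as independent random events, so that replicate runs with identical parameters follow different trajectories. It is not the model used in Robinson et al. (2022): it has no genetics, sex, or age structure, and every parameter value is hypothetical.

```python
import random
from typing import List


def simulate_population(n0: int, survival_prob: float, birth_prob: float,
                        years: int, seed: int) -> List[int]:
    """Toy population trajectory with demographic stochasticity only.

    Each year, every individual survives with probability survival_prob,
    and every survivor produces one offspring with probability birth_prob.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n = n0
    sizes = [n]
    for _ in range(years):
        survivors = sum(rng.random() < survival_prob for _ in range(n))
        births = sum(rng.random() < birth_prob for _ in range(survivors))
        n = survivors + births
        sizes.append(n)
    return sizes


# Hypothetical parameters: 10 founders, 20 years, three replicate runs.
for seed in (1, 2, 3):
    trajectory = simulate_population(n0=10, survival_prob=0.9,
                                     birth_prob=0.15, years=20, seed=seed)
    print(f"seed {seed}: final size = {trajectory[-1]}")
```

Because chance events matter most when few individuals remain, replicate trajectories of a very small population can diverge widely even without any environmental variation.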

Conclusions

In conclusion, we agree with GD&H that predictive models should be critically evaluated. Indeed, the critical evaluation of proposed mutation and selection models that we present in Kyriazis et al. (2022) serves as justification for the type of sequence-based models we employed in Robinson et al. (2022). Moreover, this analysis demonstrates that models based on experimental results, such as those proposed by Pérez-Pereira et al. (2022) and Kardos et al. (2021), are biased towards strongly deleterious variation and are not consistent with patterns of genetic variation or empirical estimates of the inbreeding load in humans. Nevertheless, our analysis in Robinson et al. (2022) found that recovery was still the likely outcome when assuming models with a much higher fraction of strongly deleterious variation. However, we emphasize that all predictive models should be interpreted cautiously, given that there is often a fair amount of uncertainty in parameter estimates that can be challenging to validate with orthogonal sources of information (Beissinger and Westphal 1998). For instance, field-based estimates of the inbreeding load for the vaquita would represent a valuable source of additional information for our study; however, such data do not currently exist and may never exist given the perilous circumstances of the species. Future work should aim to validate the sorts of models we employ in Robinson et al. (2022) for wild species where field-based estimates of the inbreeding load exist. Such work could greatly strengthen conclusions drawn from such predictive models.

All modelling considerations aside, our conclusion that recovery is possible is also supported by all field surveys since the late 1990s, including those in 2019 and 2021, where active healthy vaquitas with calves have been sighted (Rojas-Bracho et al. 2022). Despite an almost certain increase in gillnetting within the small range where vaquitas remain, vaquitas continue to survive at higher numbers than expected (Rojas-Bracho et al. 2022). The possibility of recovery is also supported by numerous examples of recovery for species that have dwindled below 20 individuals (Wiedenfeld et al. 2021), many of which were once thought to be doomed to extinction. Although inbreeding depression may in many contexts contribute to population decline, the naive assumption that it will inevitably doom small populations is a dangerous view that is harmful to species conservation.