Polyandry blocks gene drive in a wild house mouse population

Gene drives are genetic elements that manipulate Mendelian inheritance ratios in their favour. Understanding the forces that explain drive frequency in natural populations is a long-standing focus of evolutionary research. Recently, the possibility of creating artificial drive constructs to modify pest populations has heightened the need to understand how drive spreads in natural populations. Here, we study the impact of polyandry on a well-known gene drive, the t haplotype, in an intensively monitored population of wild house mice. First, we show that house mice are highly polyandrous: 47% of 682 litters were sired by more than one male. Second, we find that drive-carrying males are particularly compromised in sperm competition, resulting in reduced reproductive success. As a result, drive frequency decreased during the 4.5-year observation period. Overall, we provide the first direct evidence that the spread of a gene drive is hampered by reproductive behaviour in a natural population.


Supplementary Note 1 Estimating Heritability of Polyandry
In analysis 1), we modelled the frequency of polyandry in the population using a generalised linear mixed effects model (GLMM) that uses a pairwise relatedness matrix as a random effect (also known as an animal model). In this supplementary section, we provide additional detail on the implementation of the model.

Random Effect Structure
Two random effect variables were included in the model: the random additive genetic effect ($V_A$) and maternal identity ($V_{PE}$).
1. To estimate the amount of phenotypic trait variation $V_P$ that can be attributed to heritable genetic variation among females ($V_A$), we used the parentage analysis (see Methods of main text) to construct a pairwise relatedness matrix of all 225 females in the data set. From the entire pedigree, we removed all non-informative animals (animals that did not reproduce and/or were not responsible for a link between two informative animals) using the prunePed function in the R package MCMCglmm [4]. The remaining pedigree contained a total of 451 individuals, of which 41 individuals (9%) were treated as founders due to unknown maternal and paternal links. The average depth of the pedigree was 8.44; the maximum depth equaled 17. About 50% of the individuals had an inbreeding coefficient greater than 0, and the average degree of inbreeding was 0.065. The mean pairwise relatedness was 0.068, and about 20% of dyads had values greater than 0.125. Pedigree summary statistics were produced with the help of the Pedantics package in R [11].
2. Several females in the data set reproduced more than once. We accounted for such repeated female reproduction by fitting female identity as an additional random effect variable. This allowed us to test for potential systematic differences in polyandry rates among females. In quantitative genetics studies, this variance component is usually termed $V_{PE}$ (for permanent environment).
Fixed Effect Structure
In addition to the random effect variables, we fitted a number of predictors as fixed explanatory variables. In a full model, we investigated the effect of female t genotype (+/+ and +/t), adult population size (Supplementary Figure 3), average monthly temperature at the time when the litter was born, and litter size (without interactions). Note that the inclusion of litter size as an explanatory variable is of crucial importance here, as we expect a higher probability of detecting genetic polyandry in larger litters based on chance alone (because of the reduced sampling error in larger litters).

Implementation Details
Bayesian analyses require the specification of prior probability distributions for all random and fixed predictors used in the model. We used relatively uninformative priors for both fixed effects (normally distributed with a mean of 0 and a variance of $10^8$) and random effects (inverse Wishart distributed, with variances set to 1 and degree of belief of 1). Because the estimated variance components $V_A$ and $V_{PE}$ were close to zero, we used parameter expansion to ensure proper mixing of the posterior chains. Model outcomes were robust with regard to the (reasonable) choice of prior distribution. Note that the residual variance $V_R$ cannot be estimated in binary models [3, 12, 13] and was therefore set to a fixed, arbitrary value of 1. It is important to mention that, while the arbitrary choice of $V_R$ does affect absolute estimates of the other variance components $V_A$ and $V_{PE}$, it has only minor effects on the relative magnitude of the three variance components. Hence, the choice of $V_R$ should not affect our heritability estimates. We ran all models for $10^6$ iterations, with a burn-in of 5,000 and a thinning interval of 3,000 to avoid autocorrelation among the samples from the posterior distribution.
After running a full model including all fixed and random variables, we removed non-significant fixed factors in a stepwise fashion.
Calculating Heritability
Heritability $h^2$ is defined as the proportion of phenotypic variance $V_P$ that is accounted for by additive genetic variance $V_A$. We were interested in the heritability of a female's propensity for genetic polyandry. In the model here, the propensity is not estimated on the scale at which the trait was measured (data scale), but on the underlying logit scale (latent scale).
Accordingly, we used the following expression to calculate the latent-scale heritability of genetic polyandry:

$$h^2 = \frac{V_A}{V_A + V_{PE} + V_R + \pi^2/3},$$

where the logistic variance is proportional to $\pi^2/3$ [12] (here $\pi$ is the mathematical constant, not the polyandry frequency used in Supplementary Note 3).
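The latent-scale calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard logit-link form in which the logistic variance $\pi^2/3$ is added to the three variance components; the function name and the example values are hypothetical and are not estimates from this study.

```python
import math


def latent_scale_heritability(v_a: float, v_pe: float, v_r: float = 1.0) -> float:
    """Latent-scale heritability for a binary trait modelled with a logit link.

    v_a  : additive genetic variance (V_A)
    v_pe : permanent environment variance (V_PE)
    v_r  : residual variance (V_R), fixed at 1 in the text
    """
    # Variance of the standard logistic distribution (pi is the constant 3.14159...)
    logistic_variance = math.pi ** 2 / 3
    return v_a / (v_a + v_pe + v_r + logistic_variance)


if __name__ == "__main__":
    # Hypothetical variance components, for illustration only
    print(latent_scale_heritability(v_a=0.05, v_pe=0.10))
```

In a Bayesian setting the same function would typically be applied to each posterior sample of the variance components, so that the result is a posterior distribution of $h^2$ rather than a single number.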
Limitations
There are several potential explanations for the low heritability measured here. One possibility is that there is considerable additive genetic variation $V_A$, but that this variation is negligible compared with the far greater other sources of variation (i.e. $V_R$, $V_{PE}$). [...] Thanks to a high-quality dataset and a state-of-the-art statistical method, we were nevertheless able to derive heritability estimates with relatively narrow confidence bands. It hence seems unlikely that the low heritability estimate can be attributed exclusively to insufficient statistical power. Finally, it is possible that there is in fact only little heritable and individual variation for polyandry in this system. As discussed in the main text, ours is not the first study to find that polyandry is largely determined by non-genetic factors such as population density.

[...] polyandry rates on reproduction, hence using a predictor variable that is not corrected for litter size. In this case, we find a statistically relevant, positive relationship with reproductive output. However, this analysis has the reverse problem. Hence, it is difficult to assess how much of this positive relationship is biologically relevant, and how much can be attributed to the lower detection probability of genetic polyandry in smaller litters. This is a common problem of studies that analyse the fitness consequences of mating partner number (e.g. Gerlach et al. [2]) and, to our knowledge, there is no straightforward solution. Note that, irrespective of these methodological difficulties, we expected to be able to detect differences in selection gradients on polyandry between females of different t genotype, because the difficulties should affect both genotypes in a similar fashion. We did not detect such differences (see main text).

Supplementary Note 3 A Model of Polyandry and Gene Drive
The basic theoretical argument as to how sperm competition will affect t haplotype frequencies in a population has been developed previously. Haig [...]

We consider a population of infinite size with non-overlapping generations. For simplicity, we assume that, apart from t/t lethality, there are no survival differences between genotypes (see Manser et al. [9] for an examination of this case). Let $y$ be the frequency of +/t heterozygous adults in the current generation $g$. Note that, since t/t individuals are not viable, the frequency of +/+ homozygotes is simply $1 - y$. To calculate the change in the frequency of +/t individuals, $\Delta y$, from the current generation ($g$) to the next, non-overlapping generation ($g + 1$), we assume that individuals in the selection lines undergo the following life cycle.

Mating and Fertilisation
The key quantity in the model is the probability $s$ that a given female is fertilised by a t sperm, which is a composite of the probability that a female mates with a given male and the probability of t fertilisation given mating. For simplicity, we assume that females mate at random either once (with probability $1 - \pi$) or twice (with probability $\pi$). The parameter $\pi$ hence measures the polyandry frequency in the population.
The probability of t fertilisation when the female mates with a single male ($s_1$) is simply the product of the probability of mating with a +/t male ($y$) and the probability of t fertilisation given such a mating, i.e. the drive strength $d$, so that $s_1 = y d$.

In the case where a female mates with two males, there are two ways in which she can be fertilised by a t sperm ($s_2$). First, she may mate with two +/t males (with probability $y^2$), in which case the probability of t fertilisation is $y^2 d$. Alternatively, she may mate with both a +/t and a +/+ male (with probability $2y(1 - y)$). Crucially, we here assume that +/t males have a sperm competitiveness $r$ relative to +/+ males (whose competitiveness equals unity). A +/t male thus fertilises only a fraction $\frac{r}{r+1}$ of the eggs when competing against a wildtype male, of which a fraction $d$ will be t. Overall, we have

$$s_2 = y^2 d + 2y(1 - y)\, d\, \frac{r}{r+1}.$$

If we take the sum over both cases, $s = (1 - \pi) s_1 + \pi s_2$, and simplify, we get

$$s = y d \left[ 1 - \pi (1 - y) \frac{1 - r}{1 + r} \right]. \qquad (2)$$

As expected, sperm competition only causes deviations from the monandry case ($s = y d$) if females mate with multiple males ($\pi > 0$), drive males differ from wildtype males in their sperm competitive ability ($r \neq 1$), and wildtype males are present in the population ($1 - y > 0$). The last condition arises because the sperm competition disadvantage of +/t males is only relevant in matings that involve wildtype males, which become more common as the wildtype frequency increases.
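The step from the two mating cases to a single expression for $s$ can be checked numerically. The Python sketch below is an illustration only: the function names and the parameter values are assumptions chosen for the example, and $d$ is taken to be the fraction of t sperm transmitted by a +/t male. The simplified form used here, $s = yd[1 - \pi(1-y)(1-r)/(1+r)]$, follows from expanding $(1-\pi)s_1 + \pi s_2$.

```python
def t_fertilisation_prob(y: float, d: float, r: float, pi: float) -> float:
    """Probability of t fertilisation, written as the sum over both mating cases."""
    s1 = y * d                                          # female mates once
    s2 = y**2 * d + 2 * y * (1 - y) * d * r / (r + 1)   # female mates twice
    return (1 - pi) * s1 + pi * s2


def t_fertilisation_prob_simplified(y: float, d: float, r: float, pi: float) -> float:
    """The same probability after algebraic simplification."""
    return y * d * (1 - pi * (1 - y) * (1 - r) / (1 + r))


if __name__ == "__main__":
    # Illustrative parameter values (assumptions, not fitted estimates)
    y, d, r, pi = 0.3, 0.9, 0.126, 0.5
    print(t_fertilisation_prob(y, d, r, pi))
    print(t_fertilisation_prob_simplified(y, d, r, pi))
```

Both functions return the monandry value $yd$ when $\pi = 0$, when $r = 1$, or when $y = 1$, which mirrors the three conditions stated in the text.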
Offspring Production and Embryo Lethality
Next, we calculate the proportion of +/t individuals in the next, non-overlapping generation, $y(g + 1)$ (after t/t embryo mortality). Because segregation ratios are Mendelian in females, the fraction of t eggs in the population is simply

$$e = \frac{y}{2}. \qquad (3)$$

Heterozygous +/t zygotes form if a t egg is fertilised by a + sperm (at frequency $e(1 - s)$) or if a + egg is fertilised by a t sperm (at frequency $(1 - e)s$). $y(g + 1)$ is then the fraction of heterozygotes divided by the total fraction of individuals that survive into adulthood. Because t/t homozygotes, which are formed at frequency $e s$, perish in utero, the fraction of live offspring is $1 - e s$. We have the following recurrence relation:

$$y(g + 1) = \frac{e(1 - s) + (1 - e)s}{1 - e s}. \qquad (4)$$
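The life cycle described above (heterozygotes formed at frequency $e(1-s) + (1-e)s$, t/t zygotes at frequency $es$ lost before birth) can be written as a one-generation update. The following Python sketch is an illustration under those stated definitions; the function name and the example inputs are hypothetical.

```python
def next_heterozygote_freq(y: float, s: float) -> float:
    """One generation of the recurrence for the +/t frequency.

    y : current frequency of +/t adults
    s : probability that an egg is fertilised by a t sperm
    """
    e = y / 2                               # fraction of t eggs (Mendelian females)
    heterozygotes = e * (1 - s) + (1 - e) * s
    survivors = 1 - e * s                   # t/t embryos perish in utero
    return heterozygotes / survivors


if __name__ == "__main__":
    # Illustrative monandry case with Mendelian transmission (d = 0.5), so s = d * y
    y = 0.4
    for generation in range(5):
        y = next_heterozygote_freq(y, s=0.5 * y)
        print(generation + 1, y)
```

With Mendelian transmission the t haplotype behaves as an ordinary recessive lethal, so the printed frequencies decline from one generation to the next.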

A Monandry
The recurrence equation 4 allows us to derive the equilibrium frequency $\hat{y}$ at which the change in t frequency vanishes, $\Delta y = y(g + 1) - y(g) = 0$. We first calculate the equilibria under monandry, where $\pi = 0$ and thus $s = d y$. If we substitute into equation 4, we recover the three equilibria identified in the classic model by Bruck [1].
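As a hedged numerical check of the monandry case, the sketch below solves the recurrence for its fixed points when $s = dy$ and $e = y/2$. Setting $y(g+1) = y(g)$ gives $y = 0$ or $d y^2 - 2 d y + 2d - 1 = 0$, i.e. $\hat{y} = 1 \pm \sqrt{(1-d)/d}$. These roots are derived here directly from the recurrence as described in the text, so their form may differ from the authors' own presentation; the function names and the value of $d$ are assumptions for illustration.

```python
import math


def monandry_step(y: float, d: float) -> float:
    """One generation of the recurrence under monandry (s = d * y)."""
    s = d * y
    e = y / 2
    return (e * (1 - s) + (1 - e) * s) / (1 - e * s)


def monandry_fixed_points(d: float) -> tuple[float, float, float]:
    """Algebraic fixed points of the monandry recurrence for drive strength 0 < d <= 1."""
    root = math.sqrt((1 - d) / d)
    return (0.0, 1 - root, 1 + root)


if __name__ == "__main__":
    d = 0.9  # illustrative drive strength (assumption)
    for y_hat in monandry_fixed_points(d):
        # Each fixed point should map onto itself under one generation
        print(y_hat, monandry_step(y_hat, d))
```

Only roots between 0 and 1 correspond to actual genotype frequencies; the root $1 + \sqrt{(1-d)/d}$ exceeds 1 for $d < 1$ and is an algebraic solution only.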

B Polyandry
Secondly, we examine a scenario where polyandry and sperm competition occur ($\pi > 0$). Substituting 2 and 3 into our recurrence equation 4, we get [...]

Fixed Points
Again, we calculate equilibrium points by setting $\Delta y = y(g + 1) - y(g) = 0$. For Eq. [...]
Invasion Analysis
Examining the stability of the internal equilibria is difficult. Instead, let us focus on the invasion criteria of the two alleles. The stability of $\hat{y}_1 = 0$ determines whether the t haplotype can invade a population that is fixed for the wildtype allele. Calculating the first-order derivative of Eq. 6 at $\hat{y}_1 = 0$, $\lambda_1 \equiv \left.\frac{dy(g+1)}{dy(g)}\right|_{y = \hat{y}_1}$, gives

$$\lambda_1 = \frac{1}{2} + d\left[1 - \pi\frac{1 - r}{1 + r}\right].$$

This expression is quite intuitive to interpret. It is positive for all parameter values ($0 < d < 1$ and $0 < r < \infty$), and the t haplotype can invade if $\lambda_1 > 1$. Drive can thus spread in a population whenever

$$d\left[1 - \pi\frac{1 - r}{1 + r}\right] > \frac{1}{2} \quad\Longleftrightarrow\quad \pi < \frac{(2d - 1)(1 + r)}{2d(1 - r)} \quad (r < 1). \qquad (9)$$

Once again (as in Eq. 2), the term $\pi\frac{1 - r}{1 + r}$ measures the effect of polyandry on the ability of drive to invade. In the absence of polyandry ($\pi = 0$), the driver can invade as soon as we have gene drive ($d > 0.5$), a result that has been reported previously [1]. With polyandry ($\pi > 0$), it is easy to see how low drive male sperm competitiveness ($r < 1$) and high polyandry rates ($\pi$) make the invasion condition more restrictive. For example, if all females in the population mate multiply ($\pi = 1$) and drive is complete ($d = 1$), 9 reduces to $r > \frac{1}{3}$; thus t male sperm competitiveness has to be greater than 1/3 for the t haplotype to invade. If drive males have the same competitiveness as wildtype males ($r = 1$), then $\frac{1 - r}{1 + r} = 0$, and we again recover the monandry model condition. The right inequality in 9 is rearranged such that we have the polyandry threshold above which the t haplotype can no longer establish, which may be useful for practical purposes. Again, the regime of polyandry rates that impede drive spread gets larger if drive is ineffective and the sperm competition disadvantage of drive males is large (low $r$).
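The invasion criterion can be explored numerically. The sketch below assumes the linearisation $\lambda_1 = \tfrac{1}{2} + d\,[1 - \pi(1-r)/(1+r)]$, obtained by expanding the recurrence around $y = 0$ with $e = y/2$ and $s \approx yd[1 - \pi(1-r)/(1+r)]$; it reproduces the worked example in the text ($\pi = 1$, $d = 1$ requires $r > 1/3$). Function names and parameter values are assumptions for illustration.

```python
def invasion_eigenvalue(d: float, r: float, pi: float) -> float:
    """Growth factor of a rare t haplotype (linearised recurrence at y = 0)."""
    return 0.5 + d * (1 - pi * (1 - r) / (1 + r))


def t_can_invade(d: float, r: float, pi: float) -> bool:
    """The t haplotype spreads from rarity if the growth factor exceeds 1."""
    return invasion_eigenvalue(d, r, pi) > 1


def polyandry_threshold(d: float, r: float) -> float:
    """Polyandry rate above which t cannot establish (meaningful for r < 1)."""
    if r >= 1:
        raise ValueError("threshold is only defined for r < 1")
    return (2 * d - 1) * (1 + r) / (2 * d * (1 - r))


if __name__ == "__main__":
    # Illustrative values: strong drive, weak sperm competitiveness (assumptions)
    d, r = 0.9, 0.126
    print("threshold polyandry rate:", polyandry_threshold(d, r))
    for pi in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(pi, t_can_invade(d, r, pi))
```

A threshold above 1 means that no polyandry rate can prevent invasion for that combination of $d$ and $r$; a threshold at or below 0 means that drive cannot invade even under monandry.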
We can ask the reverse question and analyse the circumstances under which the wildtype allele can invade a population in which all individuals are heterozygotes (at $\hat{y}_2 = 1$). In this case, we have

$$\lambda_2 = \frac{2d^2 - 3d + 2 + d\pi\frac{1 - r}{1 + r}}{(2 - d)^2},$$

which, again, is positive for all biologically relevant values. In this case, we find that the wildtype allele can invade a population if

$$r < -\frac{d^2 + d\pi + d - 2}{d^2 - d\pi + d - 2}.$$

Supplementary Note 4 Estimating Model Parameters
Analyses performed in this study and in Sutter and Lindholm [14] allowed us to obtain reliable estimates of all model parameters for our study population (Supplementary Table 2). [...] This figure is based on the paternity share of the +/t male, $P_t$, which differs slightly from our sperm competitiveness measure here. We have $P_t = \frac{r}{r+1}$ or, equivalently, $r = \frac{P_t}{1 - P_t}$, which gives us $r = 0.126$ [0.066, 0.244]. Note that the magnitude of the sperm competitive disadvantage suggests that t haplotypes cause more damage to +/t ejaculates than expected based on the numeric reduction of sperm due to + sperm reduction alone. In the latter scenario, a +/t male's sperm competitiveness would be inversely proportional to drive strength $d$ as $r$ [...]

Polyandry Frequency $\pi$
In this study, we have estimated (genetic) polyandry rates $\pi_g$ based on the paternity information. This is likely to underestimate the actual (behavioural) polyandry rate $\pi$ in the population, as not all males are necessarily successful in fertilisation ($\pi \geq \pi_g$). However, we can use the polyandry model above to get an approximate idea of the discrepancy between genetic and behavioural polyandry, $\Delta\pi = \pi - \pi_g$. Importantly, in addition to the three model parameters ($\pi$, $r$, $d$) and the drive genotype frequency ($y$), the probability of misassigning a litter, and hence $\Delta\pi$, depends on the sample size, i.e. the litter size $L$. For example, in a species where litter size is only one, paternity information is completely uninformative with respect to detecting polyandry.
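The conversion between paternity share and relative sperm competitiveness stated above, $P_t = r/(r+1)$ and $r = P_t/(1 - P_t)$, is simple enough to express directly. The Python sketch below is illustrative; the function names are hypothetical, and the paternity share printed in the example is back-calculated from the reported $r = 0.126$ rather than quoted from the source.

```python
def competitiveness_from_paternity_share(p_t: float) -> float:
    """Relative sperm competitiveness r of a +/t male from his paternity share P_t."""
    if not 0 <= p_t < 1:
        raise ValueError("paternity share must lie in [0, 1)")
    return p_t / (1 - p_t)


def paternity_share_from_competitiveness(r: float) -> float:
    """Paternity share P_t of a +/t male competing against one wildtype male."""
    if r < 0:
        raise ValueError("competitiveness must be non-negative")
    return r / (r + 1)


if __name__ == "__main__":
    # Reported point estimate and interval bounds for r, converted back to P_t
    for r in (0.066, 0.126, 0.244):
        print(r, paternity_share_from_competitiveness(r))
```

An equal paternity share ($P_t = 0.5$) corresponds to $r = 1$, i.e. no sperm competition disadvantage.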
Inferring Behavioural Polyandry Rates
To calculate $\Delta\pi$, we need to determine the frequency of all [...] The probability of an ordered mating between a female of genotype $o$ and two males of genotype $m$ and $n$, respectively, under random mating will be

$$y^k (1 - y)^{3 - k},$$

where $k = m + n + o$ counts the number of +/t genotypes in the mating and $y$ denotes the frequency of +/t genotypes (also see Supplementary Table 3).
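The frequencies of ordered mating combinations can be enumerated explicitly. The sketch below assumes that genotypes are coded as indicators (1 for +/t, 0 for +/+) and that, under random mating, each of the three individuals is +/t independently with probability $y$, so that a combination with $k$ heterozygotes has probability $y^k(1-y)^{3-k}$. The function name and the value of $y$ are hypothetical.

```python
from itertools import product


def ordered_mating_probabilities(y: float) -> dict[tuple[int, int, int], float]:
    """Probability of each ordered (female, male 1, male 2) genotype combination.

    Genotypes are coded 1 for +/t and 0 for +/+ (a coding assumed for this sketch).
    """
    probabilities = {}
    for o, m, n in product((0, 1), repeat=3):
        k = o + m + n  # number of +/t individuals in the mating
        probabilities[(o, m, n)] = y**k * (1 - y) ** (3 - k)
    return probabilities


if __name__ == "__main__":
    # Illustrative +/t frequency (assumption)
    for combination, probability in ordered_mating_probabilities(0.3).items():
        print(combination, probability)
```

The eight ordered combinations sum to one. Unordered combinations are obtained by adding the two orderings of any mating in which the males differ in genotype.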
The probability of paternity $P_{m,n}$ of a male of genotype $m$ when competing against a male of genotype $n$ at fertilisation will then be [...] where $r$ again measures +/t male sperm competitiveness relative to +/+ wildtypes. However, things are slightly more complicated because we are not measuring paternity probabilities at fertilisation but at birth. We thus have to account for t/t embryo lethality: every time an embryo of a +/t female is fertilised by a +/t male, a fraction $\frac{d}{2}$ will perish due to t/t lethal effects. Hence, at birth, we have the following paternity probabilities (again summarised in Supplementary Table 3) given that a polyandrous mating occurs. [...] We miss a multiple mating whenever all $L$ offspring of a [...] The overall difference between behavioural and genetic polyandry will be $\Delta\pi = \pi - \pi_g = \pi \Pr(\Delta\pi \mid \pi)$. Solving for $\pi$, we have

$$\pi = \frac{\pi_g}{1 - \Pr(\Delta\pi \mid \pi)}.$$
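The final rearrangement, $\pi = \pi_g / (1 - \Pr(\Delta\pi \mid \pi))$, can be applied once the probability of missing a polyandrous mating is known. The sketch below takes that probability as an input rather than computing it from the full model; the function name and the miss probability used in the example are hypothetical, while $\pi_g = 0.47$ is the proportion of multiply sired litters reported in the abstract.

```python
def behavioural_polyandry(pi_g: float, p_missed: float) -> float:
    """Behavioural polyandry rate implied by the genetic rate pi_g.

    p_missed : probability that a polyandrous mating goes undetected, i.e. that
               all offspring in the litter are sired by the same male.
    """
    if not 0 <= p_missed < 1:
        raise ValueError("p_missed must lie in [0, 1)")
    return pi_g / (1 - p_missed)


if __name__ == "__main__":
    pi_g = 0.47      # genetic polyandry rate from the abstract
    p_missed = 0.2   # hypothetical miss probability, for illustration only
    print(behavioural_polyandry(pi_g, p_missed))
```

The function does not cap its result at 1; a value above 1 would indicate that the assumed miss probability is incompatible with the observed genetic polyandry rate.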
We can now substitute our best estimates for our model parameters ($r$, $d$, $\pi$, see Supplementary [...]

Supplementary Table 3. Different polyandrous unordered mating combinations, their frequency $f$, and the expected paternity shares of the two males at birth (after t/t mortality).