Adaptation is maintained by the parliament of genes

Fields such as behavioural and evolutionary ecology are built on the assumption that natural selection leads to organisms that behave as if they are trying to maximise their fitness. However, there is considerable evidence for selfish genetic elements that change the behaviour of individuals to increase their own transmission. How can we reconcile this contradiction? Here we show that: (1) when selfish genetic elements have a greater impact at the individual level, they are more likely to be suppressed, and suppression spreads more quickly; (2) selection on selfish genetic elements leads them towards a greater impact at the individual level, making them more likely to be suppressed; (3) the majority interest within the genome generally prevails over ‘cabals’ of a few genes, irrespective of genome size, mutation rate and the sophistication of trait distorters. Overall, our results suggest that even when there is the potential for considerable genetic conflict, this will often have negligible impact at the individual level.


(2) Simple Selfish Genetic Elements vs Trait Distorters
Relation to Eshel (1984) and Eshel (1985)
(3) Relation to Cosmides & Tooby (1981)

The trait distorter (D 1 ) is also associated with a transmission bias at meiosis (t), which is varied along the y axis. We consider trait distorters that induce suppressor spread (c sup <c trait ) and ask whether such trait distorters can cause appreciable trait distortion before they are ultimately suppressed and purged from the population. The red line plots the formula t=c trait /(1-c trait ); above this line, trait distorters can spread from rarity. We plot the number of generations (on a log 10 scale) until equilibrium is reached (trials that did not equilibrate by 20,000,000 generations were capped). We see that less costly trait distorters (c trait only slightly greater than c sup ) can invade even with a relatively low transmission bias (t), and are purged at a very slow rate, causing extended non-equilibrium trait distortion. More costly trait distorters (c trait large compared to c sup ) require a high transmission bias (t) to invade, and if they can invade, they are purged relatively quickly, causing shorter non-equilibrium trait distortion. Therefore, non-equilibrium trait distortion is either not-so-costly and extended, or costly and ephemeral, and so has limited impact on individual fitness maximisation in either case.
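The invasion threshold plotted as the red line can be checked with a short sketch. The function name is hypothetical, and the equivalent form noted in the docstring is simple algebra on the stated formula rather than a statement of how the threshold was derived.

```python
def distorter_invades(t: float, c_trait: float) -> bool:
    """Return True when a rare trait distorter lies above the red line.

    t       : transmission bias at meiosis
    c_trait : individual-level cost of trait distortion (0 <= c_trait < 1)

    The threshold t > c_trait / (1 - c_trait) rearranges to
    (1 - c_trait) * (1 + t) > 1 whenever c_trait < 1.
    """
    if not 0.0 <= c_trait < 1.0:
        raise ValueError("c_trait must lie in [0, 1)")
    return t > c_trait / (1.0 - c_trait)


if __name__ == "__main__":
    # Illustrative parameter pairs only; they are not values from the paper.
    for t, c in [(0.20, 0.10), (0.10, 0.10), (0.60, 0.40)]:
        print(t, c, distorter_invades(t, c))
```

A costly distorter (c trait =0.4) needs a transmission bias above 2/3 to invade, which illustrates why costly distorters require strong drive.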

[Figure panels: (a) Population genetics; (b) Agent based simulation. Panel title: Efficacy of the selective forces promoting k target (T1).]

Supplementary Note 3 Sex Ratio Distortion
We examine sex ratio evolution in a diploid species, in a large outbreeding (panmictic) population, with non-overlapping generations, and where males and females are equally costly to produce. Fisher 1 and many others have shown that, in this scenario, individuals would be selected to invest equally in male and female offspring 2,3 . We assume genetic sex determination, with males as XY and females as XX 4 .
We consider a selfish genetic element residing on an X chromosome, that may gain a propagation advantage by distorting the offspring sex ratio towards a greater production of females. The genes that do not gain a propagation advantage from female sex ratio bias reside on both the autosomes and the Y chromosome 5 . We focus on suppressors in the autosomes, for simplicity, and because this is the larger group of genes, constituting the majority within the parliament of genes 6 .
Consequently, we focus our analyses on when an X driver and an autosomal suppressor can spread.
Our overall aim is to assess, given the potential for suppression, the extent to which an X chromosome driver can distort the sex ratio away from the individual optimum. The individual optimum is taken to be the evolutionarily stable strategy (ESS) adopted by individuals in the absence of selfish trait distortion, which is an equal investment in offspring of both sexes. We build our model in a step-wise manner, as described in the "Equilibrium Models" section of the main text. Aspects of questions 1-3 have been analysed before with respect to sex ratio, but we go over them here for the specific case of our model, and to elucidate the underlying selective forces. Data are available on the fitness consequences of sex ratio distortion and suppression, and so, in this case, we aim for a biologically realistic model that can be parameterised.

(1) Spread of a Trait distorter
We considered the spread, in the absence of suppression, of a selfish sex ratio distorter that skews offspring sex ratio towards females. In the literature, selfish X drivers are often denoted by SR (for sex-ratio distortion), with non-distorting rival alleles denoted by ST (for standard) 7 . However, we denote the trait distorting and non-distorting alleles respectively by D 1 and D 0 for consistency across our models.
We assume that normal (D 0 /Y) males produce X and Y sperm equally. The trait distorter (D 1 ) causes D 1 /Y males to kill Y-bearing sperm, leading to a female-biased sex ratio 8-13 . In males with an unsuppressed trait distorter, the proportion of X-bearing sperm, and correspondingly the proportion of offspring that are female, is given by 0.5(1+k), where k denotes the proportion of Y-bearing sperm that are killed (0<k≤1).
We assume that males with an unsuppressed trait distorter (D 1 ) suffer a fertility cost as a result of sperm death, and have a reduced ejaculate size of 1-k/2, relative to 1 in all other males 14-19 . We assume that, each generation, each female copulates with λ random males, and that each sperm cell is equally competitive in the female's internal store. The likelihood of a male's sperm fertilising an egg (fertility, F) is given by his ejaculate size relative to the total amount of ejaculate that female has received. Letting l be the proportion of males in the present generation with an unsuppressed trait distorter, the fertility of those males with an unsuppressed trait distorter (F drive ), and the fertility of those without (F normal ), is given by:

$$F_{\mathrm{drive}} = \frac{1-k/2}{(1-k/2) + (\lambda-1)\left[l(1-k/2) + (1-l)\right]}, \qquad F_{\mathrm{normal}} = \frac{1}{1 + (\lambda-1)\left[l(1-k/2) + (1-l)\right]}.$$

There is no sperm competition, and therefore no fertility cost of sex ratio distortion, when females are singly mated (F normal =F drive when λ=1). There is increased sperm competition at higher female mating rates, meaning the relative fertility cost of sex ratio distortion (F normal /F drive ) increases and plateaus for high λ at F normal /F drive = 1/(1-k/2).
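The fertility definition in this paragraph can be sketched in a few lines. This is a minimal illustration, assuming that each of a female's other λ-1 mates is replaced by the expected ejaculate size of a random male; the function and argument names are hypothetical and are not taken from the authors' code.

```python
def ejaculate_size(k: float, drives: bool) -> float:
    """Ejaculate size: 1 - k/2 with an unsuppressed distorter, otherwise 1."""
    return 1.0 - k / 2.0 if drives else 1.0


def fertility(k: float, lam: int, l: float, drives: bool) -> float:
    """Share of a female's sperm store contributed by a focal male.

    k      : proportion of Y-bearing sperm killed (0 < k <= 1)
    lam    : number of males each female mates with per generation
    l      : proportion of males with an unsuppressed trait distorter
    drives : whether the focal male has an unsuppressed trait distorter
    Assumption: rival mates are represented by their expected ejaculate size.
    """
    own = ejaculate_size(k, drives)
    mean_rival = l * ejaculate_size(k, True) + (1.0 - l) * ejaculate_size(k, False)
    return own / (own + (lam - 1) * mean_rival)


if __name__ == "__main__":
    k, l = 0.8, 0.3  # illustrative values only
    for lam in (1, 2, 5, 1000):
        ratio = fertility(k, lam, l, False) / fertility(k, lam, l, True)
        print(lam, round(ratio, 4))
```

With single mating the two fertilities coincide, and as λ grows the ratio F normal /F drive approaches 1/(1-k/2), matching the two limits stated in the text.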
The trait distorter has no fitness consequences for females, and so the condition for the spread of the trait distorter (D 1 ) allele is that D 1 /Y males sire more female offspring than D 0 /Y males. In the absence of suppression, D 1 /Y males have F drive (1+k)/2 female offspring, and D 0 /Y males have F normal /2 female offspring, meaning the trait distorter (D 1 ) is selected when:

$$\frac{F_{\mathrm{normal}}}{F_{\mathrm{drive}}} < 1 + k. \qquad (3)$$

The left-hand side of Supplementary Equation 3 gives the between-individual relative fertility cost of trait distortion and the right-hand side gives the within-individual

(2) Spread of an autosomal suppressor
We assume that the sex ratio distorter can be suppressed by an autosomal allele (suppressor), as has been found in many Drosophila species 28-31 . We base our model upon the biology of Nmy, which suppresses an X chromosome trait distorter. For consistency across models, we denote the autosomal suppressor allele as S 1 , and the wild type non-suppressor allele as S 0 . We assume that the suppressor (S 1 ) is dominant, meaning individuals bearing at least one suppressor (S 1 ) allele suffer no sperm death and consequently no fertility loss or sex ratio distortion. We assume that the suppressor (S 1 ) is only expressed in the presence of an active trait distorter (in D 1 /Y males). When the suppressor is expressed, it incurs a cost that reduces the probability (V) that an individual survives from zygote to adult 35-38 , from V normal =1 in individuals without an active suppressor, to V suppression =1-c sup in individuals with one. The cost of suppression is a fixed cost (c sup ) of activating an RNAi pathway.
Assuming alternatively that the suppression cost affects fertility rather than viability does not qualitatively change our results (Scott, unpublished).
We ask when an autosomal suppressor (S 1 ) will spread from rarity, given that an X chromosome trait distorter (D 1 ) is at fixation. Given that the suppressor only has phenotypic effects in D 1 /Y males, it will spread from rarity if D 1 /Y males bearing a suppressor (S 1 ) have more mated offspring than D 1 /Y males lacking a suppressor (S 0 /S 0 ). Assuming that the trait distorter and non-suppressor alleles are at fixation, and random mating, each offspring's mating success is weighted by the population frequency of the opposite sex. D 1 /Y males with a suppressor will therefore have V suppression *F normal *(½)*((1-k)/2) mated female offspring, and V suppression *F normal *(½)*((1+k)/2) mated male offspring, leading to a total of V suppression *F normal *(½) mated offspring. D 1 /Y males lacking a suppressor will have a total of 2*V normal *F drive *((1-k)/2)*((1+k)/2) mated offspring. Suppressed D 1 /Y males will therefore have more offspring, and the suppressor allele (S 1 ) will spread from rarity, when the following condition is satisfied:

$$\frac{F_{\mathrm{normal}}}{F_{\mathrm{drive}}} \cdot \frac{1}{1-k^{2}} > \frac{V_{\mathrm{normal}}}{V_{\mathrm{suppression}}}.$$

The overall cost of letting the trait distorter (D 1 ) go unsuppressed is a product of the costs to fertility (F normal /F drive ) and offspring mating success (1/(1-k²)). For a suppressor to spread, this must be greater than the viability cost of suppression (V normal /V suppression ). Consequently, analogous to previous results, the suppressor (S 1 ) will only spread when the trait distorter (D 1 ) leads to appreciable trait distortion 36,39-42 .
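The two spread conditions in this note (for the trait distorter and for its autosomal suppressor) can be evaluated with a small sketch. It assumes the same mean-field fertility function as the text describes (ejaculate size divided by the expected total ejaculate a female receives), and all names are hypothetical. The suppressor condition is written in multiplied-out form so that k=1 does not cause a division by zero.

```python
def fertility(k: float, lam: int, l: float, drives: bool) -> float:
    """Focal male's share of a female's sperm store (mean-field assumption)."""
    own = 1.0 - k / 2.0 if drives else 1.0
    mean_rival = 1.0 - l * k / 2.0  # l*(1 - k/2) + (1 - l)
    return own / (own + (lam - 1) * mean_rival)


def distorter_spreads(k: float, lam: int, l: float = 0.0) -> bool:
    """D1 is selected, without suppression, when F_normal / F_drive < 1 + k."""
    f_normal = fertility(k, lam, l, drives=False)
    f_drive = fertility(k, lam, l, drives=True)
    return f_normal / f_drive < 1.0 + k


def suppressor_spreads(k: float, lam: int, c_sup: float) -> bool:
    """S1 invades a population fixed for D1 and S0 (so l = 1) when
    (F_normal / F_drive) * 1 / (1 - k^2) > V_normal / V_suppression,
    with V_normal = 1 and V_suppression = 1 - c_sup."""
    f_normal = fertility(k, lam, 1.0, drives=False)
    f_drive = fertility(k, lam, 1.0, drives=True)
    return (1.0 - c_sup) * f_normal > f_drive * (1.0 - k * k)


if __name__ == "__main__":
    # Illustrative values only.
    print(distorter_spreads(0.5, 3))
    print(suppressor_spreads(0.9, 2, 0.03))
    print(suppressor_spreads(0.1, 1, 0.03))
```

With single mating (λ=1) the fertility terms cancel and the suppressor condition reduces to k² > c sup , so only sufficiently strong distorters select for suppression.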
A previous model asked whether female-biased sex ratio distortion can select for compensatory evolution on autosomes, such that the autosomes evolve to encode a male-biased sex ratio in the absence of the trait distorter 43 . It found that compensatory evolution does not evolve when the female-biased sex ratio distorter is transmitted into female offspring with 100% certainty, as is the case for X drivers acting in males. This is why we did not allow compensatory strategies to evolve on autosomes in our model, and only allowed autosomes to suppress the trait distorter.

(3) Consequences for organism trait values
We turn to the question of how trait distorter-suppressor dynamics affect sex ratio.
When both the trait distorter (D 1 ) and suppressor (S 1 ) are in a population, the genotypic background in which each allele occurs matters (epistasis), and so we explicitly track the frequencies of all 15 possible genotypes with 15 recursions, one for the generational change in each genotype. We let p fi and q mi be the proportion of the ith female genotype and the ith male genotype, respectively, in the current generation (Supplementary Table 1). We let p fi ' and q mi ' be the frequencies of female and male genotypes in the next generation. The population sex ratio is given by the population proportion of females, ∑p f . The equations are listed in Supplementary Table 2. We note that, in the absence of the trait distorter (D 1 ), population sex ratio evolves to 0.5, and after this, genotype frequencies remain constant over time (Hardy-Weinberg equilibrium).
We iterated these recursions to find the trait distorter (D 1 ) and suppressor (S 1 ) frequencies, and the population sex ratio (∑p f ), at equilibrium (Supplementary Figure 3). When we introduced both the trait distorter and suppressor at low frequencies, we confirmed our above results that the trait distorter (D 1 ) initially spreads to fixation, and that the suppressor allele (S 1 ) only invades and reaches high frequencies if it is suppressing a strong trait distorter (high k).
We used our recursions to examine whether the spread of the suppressor led to the subsequent loss of the trait distorter. As the suppressor increases in frequency, the population sex ratio becomes less biased, and the fitness benefit of further trait distorter suppression is reduced (negative frequency dependence). This means that, [...]

[...] fixed at c sup =0.03. We consider trait distorters that are purged from the population, after being suppressed, at equilibrium (λ>1). We plot the number of generations (on a log 10 scale) until equilibrium is reached (trials that did not equilibrate by 50,000 generations were capped). We see that stronger trait distorters (high k) are purged at a faster rate, reducing the potential for non-equilibrium sex ratio distortion. Increased female mating (high λ) increases the fertility cost of distortion, meaning trait distorters are purged at a faster rate.

[...]
bearing the selfish genetic element (D 1 ) arises solely because an individual level trait (sex ratio) is suboptimal (not ½). Sex ratio distortion is often negligible even in this special case (λ=1), indicating that the parliament of genes can act for the sole purpose of trait (sex ratio) restoration, without the additional incentive of fertility recovery.
Additionally, we considered the effects of model parameters on sex ratio. We found that increasing the rate of female mating (λ) and decreasing the cost of suppression (c sup ) both led to a reduced tolerance of drive, and a correspondingly reduced level of sex ratio distortion (Supplementary Figure 4).
Finally, we considered how far sex ratio can be distorted in the time period after the trait distorter initially invades and before the trait distorter is suppressed and purged from the population. We iterated our recursions and timed how many generations it took to reach equilibrium. We found that stronger trait distorters (higher k) are suppressed and purged from the population more quickly, especially at higher female mating rates (λ) where the fertility cost of sex ratio distortion is greater (Supplementary Figure 5).

[Figure axes: time for trait distorter to be purged (generations); number of mates; trait distortion (k).]

(4) Evolution of trait distortion
In the above analyses, we assumed that the strength of the trait distorter (D 1 ) was a fixed parameter (k). We now consider the consequence of allowing the level of trait distortion to evolve 20 . We first consider the scenario in which there is no suppressor.
We take a game theoretical approach to find the evolutionarily stable strength of X chromosome sex ratio distortion (k*) in the absence of suppression. We assume a population where all males have an X chromosome with the same strength of distortion, denoted by a capital K. We then assume that a mutation arises in the X chromosome of one male in the population that causes it to assume a new strength of distortion, denoted by k. We wish to find the strength of distortion that, when adopted by every X chromosome in the population, cannot be invaded by the mutant X chromosome adopting a different strength of distortion. This strength of distortion (k*) represents the evolutionarily stable strategy (ESS) 44 .
Trait distorters have no effect in females, so the fitness of the mutant trait distorter depends only on its action in males. The male bearing the mutant trait distorter has fertility given by its proportional sperm contribution to a female mate's sperm store:

$$F(k,K) = \frac{1-k/2}{(1-k/2) + (\lambda-1)(1-K/2)}.$$

The mutant trait distorter is passed into (1+k)/2 offspring, and so the fitness of the mutant X chromosome is proportional to:

$$w(k,K) = F(k,K)\,\frac{1+k}{2}.$$

The ESS strength of X chromosome distortion is the value of k* that satisfies ∂w/∂k = 0 and ∂²w/∂k² < 0 when k = K = k*, and is given by:

$$k^{*} = \min\left(1,\ \frac{\lambda+1}{2\lambda-1}\right).$$

When females mate singly (λ=1) or doubly (λ=2), maximal trait distorter strength is favoured (k*=1), resulting in population collapse due to lack of males. As female mating frequency increases to λ≥3, the increased fertility cost of distortion means that the equilibrium strength of distortion (k*) decreases 20 .

We now consider what sex ratio will evolve in the presence of a suppressor (S 1 ). We [...] the same suppressor allele (S 1 ) 45-48 . In Supplementary Table 4, we display 27 recursions to describe the generational changes in genotype frequencies when the alleles D 1 , D 0 , D 2 , Y, S 0 and S 1 are segregating in a population (notation defined in Supplementary Tables 1 & 3). We note that, in the absence of the trait distorters (D 1 and D 2 ), population sex ratio evolves to 0.5, and after this, genotype frequencies remain constant over time (Hardy-Weinberg equilibrium). These equations reduce to those in Supplementary Table 2 when genotypes bearing D 2 are set to zero.
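As a hedged cross-check of the game-theoretic argument in this section, the sketch below assumes that the mutant male's fertility is (1-k/2)/((1-k/2)+(λ-1)(1-K/2)) and that the fitness of his X chromosome is proportional to that fertility times (1+k)/2. Under those assumptions the first-order condition at k=K gives (λ+1)/(2λ-1), capped at 1. The function names, grid resolution, and best-response iteration are illustrative choices, not the authors' procedure.

```python
def mutant_fitness(k: float, K: float, lam: int) -> float:
    """Fitness (up to a constant) of a mutant X with strength k among residents K."""
    own = 1.0 - k / 2.0
    fert = own / (own + (lam - 1) * (1.0 - K / 2.0))
    return fert * (1.0 + k) / 2.0


def ess_closed_form(lam: int) -> float:
    """Closed form implied by the assumed fitness function, capped at k = 1."""
    return min(1.0, (lam + 1.0) / (2.0 * lam - 1.0))


def ess_numeric(lam: int, steps: int = 2000, iters: int = 200) -> float:
    """Iterate best responses on a grid until the resident is its own best reply."""
    grid = [i / steps for i in range(steps + 1)]
    K = 0.5
    for _ in range(iters):
        best = max(grid, key=lambda k: mutant_fitness(k, K, lam))
        if abs(best - K) < 1e-12:
            break
        K = best
    return K


if __name__ == "__main__":
    for lam in (1, 2, 3, 5, 10):
        print(lam, ess_closed_form(lam), ess_numeric(lam))
```

The numerical best reply agrees with the closed form to within the grid spacing, including the pattern that maximal distortion is favoured for λ=1 and λ=2 and that distortion declines for λ≥3.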
We work out the evolved level of sex ratio distortion, under the assumption that trait distorter strength is initially low, and the additional assumption of weak selection. We assume the mutant trait distorter (D 2 ) is only slightly stronger than the trait distorter from which it is derived (D 1 ), so that its strength exceeds that of D 1 by a positive and very small amount (weak selection 49 ). We test whether a mutant trait distorter can spread by iterating our recursions in Supplementary Table 4. We find that, in the presence of the suppressor allele (S 1 ), weakly distorting X chromosomes (low k) can evade suppression and successfully distort sex ratio.
These weak trait distorters will be displaced by slightly more distorting mutants (D 2 ).
If the cost of suppression (c sup ) is sufficiently low, this displacement causes the frequency of the suppressor allele (S 1 ) to increase in response. This trend means that, given sequential mutations on the X chromosome to increase sex ratio distortion, suppression will ultimately evolve, completely (λ>1) or partially (λ=1) restoring an equal sex ratio.

Supplementary Table 3: Further selection coefficients, drive values, and genotype frequency notation (columns: Females, Males). For each male and female genotype, its proportion in the population at generation t, and its probability of maturing from a zygote to an adult (viability, V) is given. For each male genotype, the proportion of X chromosomes in its sperm store (drive), and its probability of successfully fertilising the female's egg cell after copulation (fertility, F), is given. Male fertility (F) depends on the number of mates each female has per generation (λ), and on l, n, and 1-l-n, which are, respectively, the proportions of males in the population with: an unsuppressed D 1 ; an unsuppressed D 2 ; neither of these (all other males). k and k̂ respectively give the proportion of a male's Y-bearing sperm that are killed by an unsuppressed D 1 and D 2 trait distorter, and c sup gives the viability cost of trait distorter suppression. T is the sum of the right sides of the system of equations such that ∑p=1. It normalises the recursions to ensure that gene frequency changes reflect proportions.

Agent-based simulation
[...] The sex ratio at this equilibrium level of X chromosome distortion was recorded. The cost of suppression required for sex ratio to be appreciably distorted (>60% females produced) is plotted (solid circles). Sex ratio is significantly distorted for the region above this curve.

We construct an agent-based simulation to ask what level of sex ratio distortion evolves under strong selection, and when continuous variation is permitted at trait distorter and suppressor loci. We model a population of N=10,000 individuals and track evolution at an X chromosome trait distorter locus and an autosomal suppressor locus. Individuals either have two alleles at the X chromosome locus, with strengths denoted by k a and k b (females), or one allele at the X chromosome locus, with strength denoted by k a (males), and one Y chromosome.
Each allele at the trait distorter locus can take any continuous value between zero and one. Individuals have two alleles at the suppressor locus, with strengths denoted by m a and m b (diploid). At the suppressor locus, we consider both the case of: (i) discrete variation, in which suppressor strengths are either zero or one, and (ii) continuous variation, in which suppressor strengths can take any continuous value between zero and one. We assume that the strongest (highest value)

Supplementary Note 4 Genomic Imprinting and Altruism
Genomic imprinting occurs at a minority of genes in mammals and flowering plants.
An imprinted allele has different epigenetic marks, and corresponding expression levels, when maternally and paternally inherited 50 .
We consider an autosomal, maternally expressed selfish genetic element that may gain a propagation advantage by upregulating individual altruistic investment 55,57-61 .
The genes that do not gain a propagation advantage from altruism upregulation comprise both paternally expressed and unimprinted genes. The conflict between maternally and paternally expressed genes, which can result in arms races and a 'tug of war' over organism phenotype, has been considered in previous theoretical work [62][63][64][65] . However, we focus on unimprinted suppressors, for simplicity, and because unimprinted genes comprise the larger group of genes, constituting the majority within the parliament of genes 50,66,67 . We focus our analyses on when a maternally expressed trait distorter and an unimprinted suppressor can spread. We first describe our modelling assumptions, then successively analyse the cases of unimprinted, and imprinted, altruism. The purpose of this model is to illustrate how selection will act on selfish imprinted genes and their suppressors.

Modelling Assumptions
We track a large population of diploid individuals. We consider a gene that induces an altruistic investment of some amount (k>0), at a fitness cost to the individual (c(k)), which is a monotonically increasing function of altruistic investment [...] pairs are random with respect to identity at the paternally (sperm-) inherited allele at all loci (R m =1; R p =0). Individuals may then invest in altruism directed towards their partner, before producing gametes in proportion to their fitness (fertility), and dying (non-overlapping generations).
In nature, relatedness asymmetries within a generation may be generated by sex biased migration patterns 58 , or as a consequence of greater variance in reproductive success in males 52,69 . They may alternatively be generated if kin recognition alleles are imprinted, which has been implicated in humans 70 and mice [71][72][73] .

Unimprinted Altruism
We consider an unimprinted altruism gene, denoted by y A , that, when homozygous, [...] where the mean fitness of individuals is given by: [...] This implies that, when ½b(k A )<c(k A ), the optimal altruism investment for unimprinted genes is zero, and increased altruistic investment is increasingly suboptimal.

Trait Distorter Spread
We consider an imprinted altruism gene that is only expressed when maternally inherited, denoted by D 1 , and induces an altruistic investment of k (k>0). If we take p and p' as the population frequency of the altruism gene in two consecutive generations, then the population frequency of the altruism gene in the latter generation (p') is given by Supplementary Equation 7, which is normalised by the mean fitness of individuals. We ask when a rare imprinted altruism gene (D 1 ) can invade a population fixed for the non-trait distorter (D 0 ). We take Supplementary Equation 7, set p'=p=p*, and solve to find two possible equilibria: p*=0 (non-trait distorter fixation) and p*=1 (imprinted gene fixation). The imprinted gene (D 1 ) can invade from rarity when the p*=0 equilibrium is unstable, which occurs when the derivative of p' with respect to p, at p*=0, is greater than one. The imprinted altruism gene invasion criterion is therefore b(k)>c(k).
We now ask what frequency the imprinted altruism gene (D 1 ) will reach after invasion. The gene (D 1 ) can spread to fixation if the p*=1 equilibrium is stable, which requires that the derivative of p' with respect to p, at p*=1, is less than one. This requirement always holds true, demonstrating that there is no negative frequency dependence on the imprinted gene, and that it will always spread to fixation after its initial invasion.
Given that genetic relatedness is R m =1, our condition for the spread of the imprinted altruism gene (b(k)>c(k)) corresponds to Hamilton's Rule 51,74,75 . Combining with the result of the "Unimprinted altruism" model, altruistic investment (of k i =k A =k) is simultaneously favoured at maternally expressed genes and disfavoured at unimprinted genes, rendering the imprinted altruism gene a selfish trait distorter, when ½b(k i )<c(k i )<b(k i ).
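The invasion and fixation results for the imprinted altruism gene can be illustrated with a simple recursion. This is a sketch under an assumed additive fitness scheme (individuals whose maternally inherited copy is D 1 pay c and, because partners share the maternally inherited allele, receive b; all others have fitness 1), with the paternally inherited copy drawn at random. It is not a reproduction of Supplementary Equation 7, and all names are hypothetical.

```python
def next_frequency(p: float, b: float, c: float) -> float:
    """One generation of change in the frequency p of the imprinted gene D1."""
    w_carrier = 1.0 + b - c  # maternal copy is D1: pays c, receives b from partner
    w_other = 1.0            # maternal copy is D0: no altruism given or received
    mean_w = p * w_carrier + (1.0 - p) * w_other
    # Probability that a gamete carries D1: (1 + p)/2 if the maternal copy is D1
    # (the paternal copy is D1 with probability p), and p/2 otherwise.
    d1_gametes = p * w_carrier * (1.0 + p) / 2.0 + (1.0 - p) * w_other * p / 2.0
    return d1_gametes / mean_w


def iterate(p0: float, b: float, c: float, generations: int) -> float:
    p = p0
    for _ in range(generations):
        p = next_frequency(p, b, c)
    return p


def classify(b: float, c: float) -> str:
    """Locate (b, c) relative to the thresholds c = b/2 and c = b."""
    if c < 0.5 * b:
        return "favoured at unimprinted and maternally expressed loci"
    if c < b:
        return "selfish trait distorter"
    return "disfavoured at both"


if __name__ == "__main__":
    print(classify(0.3, 0.2), iterate(0.01, 0.3, 0.2, 2000))
    print(classify(0.3, 0.4), iterate(0.01, 0.3, 0.4, 2000))
```

Near p=0 the recursion grows by a factor of 1+(b-c)/2 per generation, so the rare gene increases exactly when b>c, and it continues to fixation without negative frequency dependence.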

Spread of an autosomal suppressor
We ask when an unimprinted suppressor (S 1 ), competing against a non-suppressor (S 0 ), will invade from rarity. We can write recursions detailing the generational frequency changes in the four possible gametes, D 0 /S 0 , D 0 /S 1 , D 1 /S 0 , D 1 /S 1 , with current generation frequencies denoted respectively by x 00 , x 01 , x 10 , x 11 (Supplementary Equations 8-11). We derive the Jacobian stability matrix for the equilibrium in which the trait distorter (D 1 ) and non-suppressor (S 0 ) are at fixation (x 00 *=0, x 01 *=0, x 10 *=1, x 11 *=0). The suppressor can invade when the equilibrium is unstable, which occurs when the leading eigenvalue is greater than one. This gives the suppressor invasion criterion:

$$c_{\mathrm{sup}}\left(1+\frac{b(k)}{2}\right) < c(k)-\frac{b(k)}{2}.$$

Therefore, the suppressor invades from rarity above a threshold level of distortion, when, from the perspective of an unimprinted locus, the number of relatives that die as a result of trait distortion (c(k)-b(k)/2) exceeds the number of relatives that die as a result of trait distorter suppression (c sup (1+b(k)/2)).
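The suppressor invasion criterion can be written as a one-line test. The second function below is an assumption rather than the authors' eigenvalue: it takes a rare suppressor's mean fitness to be (1-c sup )(1+b/2) and the resident fitness to be 1+b-c, a ratio that exceeds one exactly when the stated criterion holds. All names are hypothetical.

```python
def suppressor_invades(b: float, c: float, c_sup: float) -> bool:
    """Stated criterion: c_sup * (1 + b/2) < c - b/2."""
    return c_sup * (1.0 + b / 2.0) < c - b / 2.0


def growth_ratio_sketch(b: float, c: float, c_sup: float) -> float:
    """Assumed fitness ratio of a rare suppressor to the resident.

    A suppressed individual pays c_sup, does not pay c, and receives b only
    when its partner is unsuppressed (half the time on average, because the
    partner shares the maternally inherited suppressor copy but not the
    paternally inherited one). Residents have fitness 1 + b - c.
    """
    return (1.0 - c_sup) * (1.0 + b / 2.0) / (1.0 + b - c)


if __name__ == "__main__":
    # Illustrative values only.
    for b, c, c_sup in [(0.3, 0.2, 0.02), (0.3, 0.2, 0.10)]:
        print(b, c, c_sup, suppressor_invades(b, c, c_sup),
              round(growth_ratio_sketch(b, c, c_sup), 4))
```

Raising the cost of suppression from 0.02 to 0.10 in this example flips the outcome, illustrating that only sufficiently costly distortion (relative to c sup ) selects for suppression.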

Consequences of suppressor spread for organism phenotype
We ask what frequency the trait distorter (D 1 ) and suppressor (S 1 ) will reach after initial suppressor (S 1 ) invasion. We assume that the suppressor is introduced from rarity when the trait distorter has reached the population frequency given by f (x 00 →f, x 10 →1-f, {x 01 ,x 11 }→0). We numerically iterate Supplementary Equations 8-11, over successive generations, until equilibrium has been reached. At equilibrium, for all parameter combinations (f,t,c sup ,c trait ), the suppressor reaches an internal equilibrium and the trait distorter is lost from the population (x 00 *+x 01 *=1, x 10 *=0, x 11 *=0). This equilibrium arises because trait distorter-presence gives the suppressor (S 1 ) a selective advantage, leading to high suppressor frequency, which in turn reverses the selective advantage of the trait distorter (D 1 ), leading to trait distorter loss and suppressor equilibration (Figure 3b).

Invasion of a mutant trait distorter
We ask when a mutant trait distorter (D 2 ) of strength k̂ will invade against a resident trait distorter (D 1 ) of strength k that is unsuppressed and at fixation (k̂≠k). We write recursions detailing the generational frequency changes in the six possible gametes, D 0 /S 0 , D 0 /S 1 , D 1 /S 0 , D 1 /S 1 , D 2 /S 0 , D 2 /S 1 , with current generation frequencies denoted respectively by x 00 , x 01 , x 10 , x 11 , x 20 , x 21 , and next generation frequencies denoted with an appended dash ('). The average fitness of individuals in the current generation equals the sum of the right-hand sides of the system of equations. The mutant trait distorter can invade when the equilibrium given by x 00 *=0, x 01 *=0, x 10 *=1, x 11 *=0, x 20 *=0, x 21 *=0 is unstable, which occurs when the leading eigenvalue of the Jacobian stability matrix for this equilibrium is greater than one. Testing for stability in this way, we find that the mutant trait distorter invades from rarity when Δb>Δc, where Δb=b(k̂)-b(k) and Δc=c(k̂)-c(k).
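The implication is that mutant trait distorters will invade if they approach a 'target' strength (k target ), corresponding to the level of trait distortion that would maximise the fitness of the gene 53 , at which the marginal benefit of distortion equals its marginal cost:

$$\left.\frac{\partial b(k)}{\partial k}\right|_{k=k_{\mathrm{target}}} = \left.\frac{\partial c(k)}{\partial k}\right|_{k=k_{\mathrm{target}}}.$$

In the absence of suppression, this target is the equilibrium level of distortion (k*=k target ).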

Equilibrium trait distorter and suppressor frequencies (long term evolution)
We ask what equilibrium state will arise after the invasion of a mutant trait distorter.
We assume that the mutant trait distorter (D 2 ) is introduced from rarity when the resident trait distorter (D 1 ) has reached the population frequency given by q. We [...]

Given that mutant trait distorters will invade if they approach a 'target' strength (k target ), the equilibrium level of distortion will be k*=0 if the individual level cost associated with this target level of distortion (c(k target )) is sufficiently high relative to the cost of suppression (c sup ) that the following condition is satisfied: c sup (1+b(k target )/2)<c(k target )-b(k target )/2. If this condition is not satisfied, the equilibrium level of distortion will be k*=k target (Figure 3e).
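The long-term outcome described here can be sketched as a two-step calculation: find the target strength that maximises b(k)-c(k), then apply the suppression condition at that target. The benefit and cost functions, the bounded search interval, and all names below are hypothetical choices made only for illustration.

```python
def b_example(k: float) -> float:
    """Hypothetical decelerating benefit function."""
    return 1.2 * k ** 0.8


def c_example(k: float) -> float:
    """Hypothetical linear cost function."""
    return k


def k_target(b, c, k_max: float = 1.0, steps: int = 10000) -> float:
    """Grid search for the strength maximising b(k) - c(k) on [0, k_max]."""
    grid = [k_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda k: b(k) - c(k))


def equilibrium_distortion(b, c, c_sup: float, k_max: float = 1.0) -> float:
    """k* = 0 if the suppressor condition holds at k_target, else k* = k_target."""
    kt = k_target(b, c, k_max)
    suppressed = c_sup * (1.0 + b(kt) / 2.0) < c(kt) - b(kt) / 2.0
    return 0.0 if suppressed else kt


if __name__ == "__main__":
    print(k_target(b_example, c_example))
    print(equilibrium_distortion(b_example, c_example, c_sup=0.05))
    print(equilibrium_distortion(b_example, c_example, c_sup=0.50))
```

For these example functions the marginal benefit equals the marginal cost at k=0.96^5 (about 0.815); a cheap suppressor (c sup =0.05) then removes distortion entirely, whereas an expensive one (c sup =0.5) leaves distortion at the target.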

Discussion
Although there have been no direct tests, our predictions are consistent with data on imprinted genes. There is no evidence that traits influenced by imprinted genes deviate significantly from individual level optima under normal development 52 .
Significant deviation is only observed when imprinted genes are deleted, implying that imprinted trait distorters are either suppressed, or counterbalanced by oppositely imprinted genes pulling the trait in the opposite direction 63,76 . Furthermore, although many different parties (coreplicons) have vested interests in genomic imprinting, our analysis suggests why the unimprinted majority could win control 77 . This could help explain both why imprinting appears to be relatively rare within the genome 50,54,66 , and why imprints are removed and re-added every generation in mice, handing control of genomic patterns of imprinting to unimprinted genes 54,77,78 .

Supplementary Note 5 Horizontal Gene Transfer and Public Goods
Bacteria produce and excrete many extracellular factors that provide a benefit to the local population of cells and so can be thought of as public goods 79 . We modelled the evolution of investment in a public good in a large, clonally reproducing population. We assume a public good that costs c to produce, and provides a benefit b to the group. We assume a well-mixed population, meaning genetic relatedness at vertically inherited genes is zero (R vertical =0), and so indirect fitness benefits cannot favour public good production at the individual level (R vertical b=0<c) 51,74,75,80,81 . There are also direct fitness benefits of public good production, which arise because producers of public goods receive a fraction of the benefit (b) they confer on the group, but we assume that the population is sufficiently large and well mixed that direct fitness benefits cannot favour public good production at the individual level.
This means that public good production is disfavoured at the individual level.
We consider a selfish genetic element that resides on a mobile locus (horizontal & vertical transmission) and may gain a propagation advantage by upregulating individual public goods investment [82][83][84][85][86] . The genes that do not gain a propagation advantage from increased public goods production comprise the non-mobile loci (vertical transmission). Non-mobile loci comprise most of the genome, and so constitute the majority within the parliament of genes. We focus our analyses on when a mobile trait distorter and a non-mobile suppressor can spread. The purpose of this model is to illustrate how selection will act on selfish mobile genes and their suppressors.

Model assumptions
We consider a public goods gene (D 1 ) that competes against a non-trait distorter (D 0 ) at a mobile locus. The trait distorter (D 1 ) increases public goods investment by some amount (k), at a fitness cost to the individual (c(k)) and benefit shared within the group (b(k)>c(k)) that are both monotonically increasing functions of investment. [...]

We assume the following lifecycle. Individuals in a large, effectively infinite, population randomly aggregate into smaller social groups (patches). Individuals then randomly pair up within their patch, and horizontal gene transfer occurs, with certainty, within pairs that are genetically dissimilar at the mobile locus 87,88 .
Alternative assumptions about the probability of horizontal gene transfer do not change our qualitative results (Scott, unpublished). Only one allele at the mobile locus is transferrable in each patch, and each allele at the mobile locus is transferrable in an equal proportion of patches. We denote those patches in which the non-trait distorter (D 0 ) is transferred as "type 1" patches, and those patches in which the trait distorter (D 1 ) is transferred as "type 2" patches. Individuals may then produce public goods, which are shared within patches, before the population remerges, and individuals reproduce in proportion to their fitness before dying (nonoverlapping generations), with progeny inheriting all alleles from their parent (perfect inheritance).
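The horizontal transfer step of this lifecycle can be sketched as follows, assuming random pairing within a patch and certain transfer in every dissimilar pair, as stated above. The function name and the string labels for the transferable allele are hypothetical.

```python
def after_transfer(p: float, transferred: str) -> float:
    """Frequency of D1 in a patch after pairing and horizontal gene transfer.

    p           : frequency of D1 in the patch before transfer
    transferred : "D0" for type 1 patches, "D1" for type 2 patches
    Under random pairing, a fraction p * (1 - p) of individuals carry the
    non-transferable allele and are paired with a carrier of the transferable
    one; all of them are converted.
    """
    converted = p * (1.0 - p)
    if transferred == "D1":
        return p + converted
    if transferred == "D0":
        return p - converted
    raise ValueError("transferred must be 'D0' or 'D1'")


if __name__ == "__main__":
    p = 0.3  # illustrative starting frequency
    type1 = after_transfer(p, "D0")
    type2 = after_transfer(p, "D1")
    # Each allele is transferable in an equal proportion of patches.
    print(type1, type2, (type1 + type2) / 2.0)
```

Because the two patch types are equally common, transfer alone leaves the population-wide frequency unchanged; any net change must come from the fitness effects of public goods production.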

Trait Distorter Spread
We respectively take and ′′ as the population frequency of the trait distorter (D 1 ) at the start of two consecutive generations, and O l as the average frequency of the trait distorter (D 1 ) in patches of type j after horizontal gene transfer (j∈{1,2}), with OG3 l = + (1-) and OGV l = -(1-). The population frequency of the trait distorter in the latter generation ( ′′) is: where the denominator denotes average individual fitness. Stable equilibria occur for = ll = * and

Spread of a suppressor and consequences for the organism
We consider a suppressor allele (S 1 ) that competes against a non-suppressor (S 0 ) at a non-mobile locus. Suppressors of mobile elements are widespread and may silence elements before they are translated, through gene methylation and RNAi 92 .
We numerically iterated these recursions, for a range of parameter values (b,c,c sup ), and for different initial frequencies of the trait distorter (D 1 ) to find the trait distorter (D 1 ) and suppressor (S 1 ) frequencies at equilibrium, and the resulting average trait distortion (x 10 ). We found that, when distortion is weak (low ), suppressors are not favoured, but the trait distorter has relatively little impact at the individual level. For example, when the cost of suppression is c sup =0.05, and the cost and benefit of public goods production are c HGT = (linear cost) and b HGT =8 0.9 (relatively large, decelerating benefit), unsuppressed trait distorters cannot upregulate public goods by more than =c HGT =0.08 (Figure 3c).
We found that the suppressor invades from rarity, in response to a trait distorter at equilibrium 3:
We assume that trait distorter strength is initially low, and introduce successive mutant trait distorters (D 2 ), each deviating only slightly from the trait distorter from which it is derived, until one fails to displace the resident trait distorter. The strength of the non-invadable allele gives the equilibrium level of distortion under weak selection 49 . We find that, if the rate of decrease in marginal cooperative benefits (the negative of the second derivative of b with respect to investment) is high relative to the rate of increase in marginal cooperative costs (the second derivative of c with respect to investment), equilibrium distortion evolves to be low, and the suppressor (S 1 ) may not invade.
Otherwise, stronger trait distorters (D 2 ) successively invade, bringing trait distorter strength above the threshold level at which the suppressor (S 1 ) spreads, with the end result that trait distorters are suppressed and lost from the population, with no trait distortion at equilibrium (Figure 3f).

Discussion
We lack empirical data that would allow us to test our model of mobile public goods genes. Genes associated with extracellular traits, which could represent cooperative public goods, appear to be overrepresented on mobile elements 91 . However, this may have nothing to do with cooperation per se: genes involved in adaptation to new environments might be more likely to be horizontally acquired, and extracellular traits might be especially important in adaptation to new environments [84][85][86][87]94 .

Supplementary Note 6 Suppressor Conditionality
We assumed in our Equilibrium and Dynamics models (Main Text) that suppressors are only expressed in the presence of their target trait distorters (facultative). We generalise our Equilibrium models (Main Text) by defining the parameter y as the "conditionality" of the suppressor (0≤y≤1). For full conditionality (y=1), the suppressor is facultative. For zero conditionality (y=0), the suppressor is obligate, meaning it is fully expressed when the trait distorter is absent. For intermediate conditionality (0<y<1), the suppressor is partially expressed when the trait distorter is absent. As a result, the suppressor incurs a cost of c sup on the individual when the trait distorter is present, and a cost of (1-y)*c sup when the trait distorter is absent.
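The cost rule described above can be written as a short function. This is a minimal sketch: the function and argument names are hypothetical, and the example value c sup =0.05 is only an illustrative choice.

```python
def suppressor_cost(c_sup, y, distorter_present):
    """Cost paid by an individual carrying the suppressor.

    c_sup             : cost of full suppressor expression
    y                 : conditionality of the suppressor (0 <= y <= 1)
    distorter_present : True if the target trait distorter is present
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("conditionality y must lie between 0 and 1")
    if distorter_present:
        # The suppressor is fully expressed against its target.
        return c_sup
    # Without a target, expression (and so cost) is scaled by (1 - y).
    return (1.0 - y) * c_sup


if __name__ == "__main__":
    for y in (0.0, 0.5, 1.0):
        print(y, suppressor_cost(0.05, y, distorter_present=False))
```

A facultative suppressor (y=1) pays nothing when the trait distorter is absent, whereas an obligate suppressor (y=0) pays the full cost regardless.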
In the facultative suppressor case (y=1), considered in the main text, the fitness of D 0 /S 0 D 0 /S 1 and D 0 /S 1 D 0 /S 1 individuals, which have a suppressor but not a trait distorter, is 1. Now, in the generalised scenario, the fitness of these individuals is 1-(1-y)c sup , because the suppressor incurs a cost of (1-y)c sup when the trait distorter is absent.
Individual trait distortion is plotted for three different parameter regimes. The first parameter regime is plotted as a reference, and represented by the red line (γ=10^6 ; ρ D1 =10^-11 , ρ S1 =10^-11 ). The second parameter regime has a half-sized genome relative to the reference, with an unchanged baseline mutation rate (γ=5*10^5 ; ρ D1 =10^-11 , ρ S1 =10^-11 ). The third parameter regime has a half-sized baseline mutation rate relative to the reference, with an unchanged genome size (γ=10^6 ; ρ D1 =5*10^-10 , ρ S1 =5*10^-10 ). Proportional changes in genome size (γ) have identical effects to proportional changes in baseline mutation rate (ρ), and therefore, the second and third parameter regimes lead to the same outcome, which is represented by the purple line.

(b) High-sophistication trait distorters
(ρ S1 /ρ D1L /ρ D1H ) are not exceedingly high. As a result, the increased trait distortion achieved by high-sophistication trait distorters as a result of productive interaction whilst co-segregating is roughly offset by the increased trait distortion achieved by low-sophistication trait distorters as a result of higher mutational accessibility. This is why trait distorter sophistication has a relatively small effect on average trait distortion, as can be seen by comparing (a) and (b).
and their dedicated suppressors (S 1 ) are continuously introduced. Trait distorters and suppressors vary continuously in strength, and are free to evolve. Each block is the average of 30 simulation runs, each over T end =30,000 generations. Average trait distortion increases with cabal size (θ). Low-sophistication trait distorters interact counter-productively whilst co-segregating, and so average trait distortion decreases with genome size (γ). High-sophistication trait distorters interact productively whilst co-segregating, and so average trait distortion increases with genome size (c sup =0.01, t=k, c trait =Dist/2, ρ S1 =4*10^-9 , ρ D1L =4*10^-9 , ρ D1H =2*10^-9 ).

Cytoplasmic element & X chromosome cabal in humans
• In humans, the only cytoplasmic elements that carry transcribed genes are the mitochondria. Human mitochondria bear 37 genes 107 .
• The number of genes on the human X chromosome (protein coding genes plus non-coding RNA genes) is 1515 (Ensembl release 97, July 2019) 108 .
• The total number of genes in the human genome is 42611 109 .

Plasmid cabal in Escherichia coli
• Different E. coli individuals will carry different numbers and types of plasmids. We therefore draw a random sample of 139 E. coli strains from the 875 E. coli strains for which complete genome sequences are publicly available (GenBank RefSeq; ftp://ftp.ncbi.nih.gov/genomes).
• For each strain in our sample, we calculate proportional cabal size by counting the number of genes that are on plasmids, and dividing this by the total number of genes in the individual.
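The proportional cabal size calculation can be sketched as follows. This is a minimal illustration, assuming that the human cabal comprises the mitochondrial and X chromosome genes listed above and that the total gene count includes both. The function name is hypothetical, and the per-strain gene counts are placeholders rather than data from the sampled E. coli genomes.

```python
def proportional_cabal_size(cabal_genes, total_genes):
    """Fraction of an individual's genes that belong to the cabal."""
    if total_genes <= 0 or not 0 <= cabal_genes <= total_genes:
        raise ValueError("need 0 <= cabal_genes <= total_genes and total_genes > 0")
    return cabal_genes / total_genes


# Human gene counts quoted in the text.
HUMAN_MITOCHONDRIAL_GENES = 37
HUMAN_X_GENES = 1515
HUMAN_TOTAL_GENES = 42611

human_cabal = proportional_cabal_size(
    HUMAN_MITOCHONDRIAL_GENES + HUMAN_X_GENES, HUMAN_TOTAL_GENES
)

# Placeholder (plasmid genes, total genes) pairs, one per hypothetical strain.
example_strains = [(120, 4800), (0, 4500), (310, 5200)]
strain_cabals = [proportional_cabal_size(p, t) for p, t in example_strains]

if __name__ == "__main__":
    print(f"human cabal: {human_cabal:.4f}")
    print("strain cabals:", [round(s, 4) for s in strain_cabals])
```

Under these assumptions the human cabal amounts to roughly 3.6% of genes, well below the one-half ceiling for cabal size.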

Relation to Eshel (1984) and Eshel (1985)
Eshel 101 highlighted the conflict between individual fitness maximisation and selfish genetic elements. Eshel 98 also pointed out that suppressors of simple meiotic drivers will spread as long as (i) the driver is below fixation, and (ii) the suppressor is unlinked. In doing so, he pointed out that fair meiosis can be stabilised if there is free recombination between genes. There are a few key differences between our models and the models of meiotic drive suppression developed by Eshel 98 .
In contrast, if the individual costs stem directly from the expression of the driver, rather than any unlinked genes, recoverable costs of drive are appropriate 97 . In this case, suppression of the driver also removes the individual level cost. This scenario applies to cases where the meiotic driver is not just a meiotic driver per se, but rather, a "trait distorter" that gains the ability to drive at meiosis as a consequence of distorting an organism trait. Our models demonstrate that trait distorters, unlike the simple meiotic drivers considered by Eshel 98 , can promote suppressor spread even after they have gone to fixation. As a result, costly trait distorters (c trait >c sup ) will be suppressed in the evolutionary long term, even if they can reach fixation in the evolutionary short term.
Our models also expand on Eshel 98 in addressing how likely it is that a suppressor will spread. Eshel 98 demonstrated that a cost-free unlinked suppressor can spread in response to a costly meiotic driver. Our models account for a cost of suppression, to show that the likelihood that a trait distorter is suppressed correlates with the costliness of the driver to the individual, which serves to limit deviation from individual fitness maximisation.

Relation to Cosmides & Tooby (1981): Coreplicons, Cabal & Commonwealth
An anonymous referee suggested that, were we to extend our models to permit trait distorter introduction at any locus in the genome, rather than at a subsection of loci that are chosen a priori (cabal), the resulting trait distortion may be greater. In this section, we explicitly clarify why cabals are defined a priori by showing how they follow from the 'coreplicon' concept introduced by Cosmides & Tooby (1981) 6 . We then undertake this suggested modelling extension, showing that the scenario it depicts: (i) leads to the same results as our models, but (ii) is biologically implausible.

The coreplicon concept
Cosmides & Tooby (1981) 6 pointed out that we can divide a genome up into 'coreplicons'. A coreplicon comprises a collection of loci within the genome that are inherited in the same way, and so share the same maximand. Autosomal loci and X chromosome loci do not form part of the same coreplicon, because the former are transmitted equally through males and females and the latter are transmitted predominantly through females. Coreplicons are assigned, a priori, based on inheritance patterns -not on the basis of trait-affecting alleles that have been observed empirically or within the context of a theoretical model. The coreplicon concept has been employed regularly in the study of intragenomic conflict and evolutionary adaptation 52,110,111 .
Coreplicons have the potential to be in conflict over organism traits. If, for a given trait, loci within coreplicon X are propagated best when the organism trait value is x, but loci within coreplicon Y are propagated best when the organism trait value is y, then the coreplicons have the potential to be in conflict if the current organism trait value is between x and y. This evolutionary battleground ('potential conflict') is derived a priori based on a purely theoretical, first principles optimisation approach, as detailed in Gardner & Úbeda (2017) 53 . The evolutionary battleground for conflict is independent of whether any conflicting, trait-affecting alleles actually exist at any of the loci ('actual conflict') 112 .
Sometimes, different coreplicons may form alliances, because they both benefit from a particular kind of trait distortion. For instance, if coreplicon Z is propagated best when the organism trait is z, where z lies in between x and y but is closer to x, coreplicon Z may ally with coreplicon X if the current organism trait value lies at y.
Though the coreplicons may ally here, they may disagree over the form of other traits. This is where the concepts of 'cabal' and 'commonwealth' are useful. For example, in humans, cytoplasmic elements are inherited exclusively through females, and X chromosomes are inherited predominantly though not exclusively through females, meaning they represent different coreplicons. However, the coreplicons form a cabal with respect to sex ratio, favouring a female bias.
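The logic of potential conflict and alliance between coreplicons can be made concrete with a small sketch. The function names and the numerical trait values are hypothetical and chosen purely for illustration: each coreplicon is reduced to the trait value at which its loci are propagated best, and it is assumed to favour moving the current trait value towards that optimum.

```python
def favoured_direction(optimum, current):
    """Direction in which a coreplicon favours moving the trait: +1, -1 or 0."""
    if optimum > current:
        return 1
    if optimum < current:
        return -1
    return 0


def potential_conflict(optimum_a, optimum_b, current):
    """True if two coreplicons favour opposite changes to the current trait."""
    return favoured_direction(optimum_a, current) * favoured_direction(optimum_b, current) < 0


def potential_allies(optimum_a, optimum_b, current):
    """True if two coreplicons favour the same change to the current trait."""
    return favoured_direction(optimum_a, current) * favoured_direction(optimum_b, current) > 0


# Illustrative optima: z lies between x and y but closer to x.
x, z, y = 0.2, 0.4, 0.8

if __name__ == "__main__":
    print(potential_conflict(x, y, current=0.5))  # trait between x and y
    print(potential_allies(x, z, current=y))      # trait currently at y
```

With the trait value lying between x and y, coreplicons X and Y pull in opposite directions. With the trait at y, coreplicons X and Z both favour a decrease, so they can ally even though their optima differ.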

The cabal / commonwealth concept
The cabal comprises all coreplicons that favour the distortion of a particular trait, along a particular axis, in a particular direction, away from individual fitness maximisation. The commonwealth comprises the remaining coreplicons. Cabals and commonwealths are therefore trait-specific. It is useful, when analysing a specific trait, to partition the genome along these lines, because it is the resolution of this conflict, between the cabal and commonwealth, that gives the evolved deviation of a trait from individual fitness maximisation. Cabals and commonwealths are defined a priori, by partitioning and summing up the coreplicons that respectively favour and disfavour the trait distortion under study.
Our models address whether selfish genetic elements can distort organism traits away from individual fitness maximisation, where the 'individual' here really means the majority interest within the parliament of genes 111 . This is why we only considered cabal sizes of up to a half. If the cabal was greater than half of the genome, it would reflect the majority interest within the parliament, so would cease to be a cabal. Our models therefore consider the full range of scenarios depicting potential distortion of organism traits from individual fitness maximisation.

Modelling extension
Having justified our approach, which defines the cabal and commonwealth a priori, we now undertake the theoretical exercise suggested by the anonymous reviewer, and allow trait distorters to arise at any locus in the genome, and not just at an a priori subsection (cabal).
to be a cabal), it must logically be the case that (at least two) different types of trait distortion are favoured across the genome.
We now assume that the rate of trait distorter introduction, per generation, per locus, in some genome within the population, is ρ D1 . We take the number of loci within a genome to be γ, which means that a new trait distorter is introduced into the population, on average, every 1/(ρ D1 γ) generations. This is a shorter waiting time than previously considered in our Dynamics models, where it depended on proportional cabal size (1/(θρ D1 γ)). As was the case in our Dynamics models, the suppressor of a given trait distorter will be expected to arise after a lag of 1/((1-θ)ρ S1 γ) generations, where ρ S1 is the rate of suppressor introduction, per generation, per locus, for any locus situated outside of the target trait distorter's cabal.
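These waiting times can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, γ=10^6 and ρ=4*10^-9 are borrowed from parameter values quoted in the figure legends of this supplement, and the cabal size θ=0.1 is an arbitrary choice.

```python
def distorter_waiting_time(rho_d1, gamma, theta=1.0):
    """Expected generations between trait distorter introductions.

    theta=1 corresponds to introduction at any locus in the genome;
    theta<1 restricts introduction to a cabal of proportional size theta.
    """
    return 1.0 / (theta * rho_d1 * gamma)


def suppressor_lag(rho_s1, gamma, theta):
    """Expected generations until a suppressor arises outside the cabal."""
    return 1.0 / ((1.0 - theta) * rho_s1 * gamma)


GAMMA = 1e6   # number of loci in the genome
RHO = 4e-9    # introduction rate per generation, per locus
THETA = 0.1   # proportional cabal size (arbitrary illustrative value)

any_locus = distorter_waiting_time(RHO, GAMMA)          # about 250 generations
cabal_only = distorter_waiting_time(RHO, GAMMA, THETA)  # about 2500 generations
lag = suppressor_lag(RHO, GAMMA, THETA)                 # about 278 generations

if __name__ == "__main__":
    print(any_locus, cabal_only, lag)
```

Allowing introduction at any locus shortens the overall waiting time by a factor of 1/θ, while the suppressor lag for a given trait distorter is unchanged.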
So in this new theoretical scenario, compared to our previous Dynamics models, trait distorters arise at a faster rate, but they are suppressed at the same rate as before. This would apparently suggest that average trait distortion should be more appreciable in this new scenario. However, this is not the case, because the new scenario implicitly considers the distortion of multiple traits simultaneously. Trait distorters that distort a given trait are still introduced, on average, once every 1/(θρ D1 γ) generations, as in our Dynamics models.
The distortion of any given trait from individual fitness maximisation in this new theoretical scenario is still accurately given by our Dynamics models. Specifically, in this new theoretical scenario, if trait distorters belonging to the same cabal arise at new loci in the genome very slowly compared to the rate at which gene frequencies equilibrate after trait distorter / suppressor introduction (separation of timescales), the trait that the cabal is attempting to distort assumes an average value, in individuals over evolutionary time, given by Equation 6 in the main text. If trait distorters belonging to the same cabal arise more quickly than this, such that they may co-segregate, the trait that the cabal is attempting to distort assumes an average value that is given by the simulation results of our Dynamics models. This holds regardless of the overall rate of trait distorter introduction across the whole genome.
Therefore, the scenario in which trait distorters may arise at any locus in the genome implicitly refers to a scenario where multiple traits are being distorted and restored simultaneously, in the context of a single model. However, there is no reason why the evolution of distortion and suppression at one trait should be affected by the evolution of distortion and suppression at any other trait. Consequently, the results of the new theoretical scenario converge on our Dynamics models once we consider a single type of trait distortion in isolation. Our models cover the full range of scenarios depicting potential distortion of an organism trait from individual fitness maximisation.
The modelling extension, as well as being biologically implausible, provides no additional insight.