Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck

In human populations, changes in genetic variation are driven not only by genetic processes, but can also arise from cultural or social changes. An abrupt population bottleneck specific to human males has been inferred across several Old World (Africa, Europe, Asia) populations 5000–7000 BP. Here, bringing together anthropological theory, recent population genomic studies and mathematical models, we propose a sociocultural hypothesis, involving the formation of patrilineal kin groups and intergroup competition among these groups. Our analysis shows that this sociocultural hypothesis can explain the inference of a population bottleneck. We also show that our hypothesis is consistent with current findings from the archaeogenetics of Old World Eurasia, and is important for conceptions of cultural and social evolution in prehistory.


Supplementary Note 1: The Modified Lotka-Volterra Model and Proof of Claim in Results
In Results of the main text we introduced the modified Lotka-Volterra competition model, which considers n 2 populations of males competing with each other and reproducing with a common population of females. Let Y i denote the population size of male population i (i = 1, . . . , n), and let X denote the population size of females. We assume the rate of growth of each male population is constrained by the size of all competing male populations: when an individual in population i meets an individual from population j (with j = i), death of an individual in population i occurs at an instantaneous rate c ij and death of an individual in population j occurs at rate c ji . We assume population growth to be self-restraining, so that c ii is the instantaneous rate of death of an individual of population i meeting another individual from population i. Further, each male population grows in proportion to the number of females X, with male population i having a rate constant of r i . Finally, we assume that X grows logistically with carrying capacity and growth rate parameters K and a respectively. This modified Lotka-Volterra competition model is described by the following system of n + 1 differential equations, with the following assumptions: 1. The parameters {r i } n i=1 , {c ij } 1 i,j n , a and K are positive numbers.
2. Competition between male populations is more intense than self-restraint: c ij > c kk for all i, j, k ∈ [n] with i = j.
We claim in the Results section of the main text that when n = 2, one population of males almost always takes over the other. The proof is straightforward, and we include it here for completeness.
Proof of Claim. First, we verify that eq. 1 of the main text admits no limit cycles. Note that X is an isolated autonomous system with unique solution given by X(t) = KX(0) X(0) + (K − X(0))exp (−at) .
Hence, if X(0) < K, then X(t) approaches the plane X = K. If a limit cycle exists, then it has to involve variables Y 1 and Y 2 with X = K. However, this is impossible as two-dimensional Lotka-Volterra systems admit no limit cycles (Hofbauer and Sigmund, 1998).
To determine the local stability of each of the equilibria, we compute the Jacobian of eq. 1 of the main text, J(y 1 , y 2 , x) =   −2c 11 y 1 − c 12 y 2 + r 1 x −c 12 y 1 r 1 y 1 −c 21 y 2 −2c 22 y 2 − c 21 y 1 + r 2 x r 2 y 2 0 0 a − 2ax The boundary equilibria (0, r 2 K c 22 , K) and ( r 1 K c 11 , 0, K) produce three negative eigenvalues, and hence are locally stable. Equilibrium (0, 0, 0) has two zero eigenvalues and one positive eigenvalue (along the y axis); it is non-hyperbolic. Equilibrium (0, 0, K) has two positive eigenvalues and one negative eigenvalue (along the y axis) and hence is an unstable saddle. The last equilibrium, ( r 1 c 22 −r 2 c 12 c 11 c 22 −c 12 c 21 K, r 2 c 11 −r 1 c 21 c 11 c 22 −c 12 c 21 K, K), is an unstable saddle with one positive eigenvalue and two negative eigenvalues.
Thus, eq. 1 of the main text has four hyperbolic equilibria and one non-hyperbolic equilibrium, with exactly two stable equilibria, (0, r 2 K c 22 , K) and ( r 1 K c 11 , 0, K). Because each unstable saddle gives rise to a locally stable submanifold, positive flows originating in (R + >0 ) 3 either (a) approach (0, r 2 K c 22 , K), or (b) approach ( r 1 K c 11 , 0, K), or (c) approach (0, 0, K) if they originated along the stable submanifold of dimension 1, or (d) approach ( r 1 c 22 −r 2 c 12 c 11 c 22 −c 12 c 21 K, r 2 c 11 −r 1 c 21 c 11 c 22 −c 12 c 21 K, K) if it originated along the stable submanifold of dimension 2. The non-hyperbolicity of (0, 0, 0) does not affect the conclusion, since all flows originating in (R + >0 ) 3 satisfy lim t→+∞ X(t) = K. Because the Lebesgue measure of each of the stable submanifolds is 0, we conclude that flows obeying eq. 1 of the main text that originate in (R + >0 ) 3 converge to either (0, r 2 K c 22 , K) or ( r 1 K c 11 , 0, K) almost everywhere. If (S2) does not hold, then the interior equilibrium does not lie in (R + >0 ) 3 . If r 1 /r 2 c 11 /c 21 , then only (0, r 2 K c 22 , K) is a stable sink while the other three equilibria are unstable saddles; in other words, flows originating in (R + >0 ) 3 lead to extinction of male population 1 almost everywhere. On the other hand, if r 1 /r 2 c 12 /c 22 , then only ( r 1 K c 11 , 0, K) is a stable sink while the other three equilibria are unstable saddles; in other words, flows originating in (R + >0 ) 3 lead to extinction of male population 2 almost everywhere. This completes the proof. 3

Supplementary Note 2: Complete Specification of Computational Grid Model
The pseudocode in Supplementary Fig. 1 below completely describes the Computational Grid Model that we introduced in the Results section of the main text. The full code can be accessed online via the following link: (to be provided). 2. For c ∈ C, h ∈ H, t T , N c,h (t) denotes the number of individuals belonging to cultural group c and haplogroup h at generation t.
-If each cultural group is initially completely non-patrilineal, then N c,h (0) = N total /|C| 2 for 1 c, h |C| and N c,h (0) = 0 otherwise. If each cultural group initially represents a patrilineal kinship group, then N c,h (0) = N total /|C| > 0 for c = h and c |C| and N c,h (0) otherwise.
3. For c ∈ C, h ∈ H, t ∈ T , the mean number of individuals killed in class (c, h) at generation t follows a Poisson distribution with mean here, ρ is a time-homogeneous scaling parameter that adjusts the mean proportion of a population killed due to competition.
4. µ is the rate of mutation per individual per generation of a majority haplogroup within each cultural group to any other haplogroup.
-In our simulations, we assume µ is time-homogeneous. We set µ = 2/1000. This is akin to the scenario in which, for every population of 1000 male individuals, every new generation produces about two new haplogroup-defining mutants.
(i) Let hc be the haplogroup maximizing x c,h (t).

(Scaling or Regeneration
-If h∈H N c,h (t + 1) = 0 (i.e., the cultural group is empty), do: (i) Find the cultural group c max that maximizes the sum h∈H N c,h (t + 1).
(ii) For each h ∈ H, move 1 2 N c max ,h (t + 1) from grid cell (c max , h) to grid cell (c, h).

End of Algorithm.
Supplementary Figure 1: Algorithmic specification of Computational Grid Model.
We elaborate on two features of our model.
1. The mean number of a group c killed per generation is proportional to the product of the total number of individuals not belonging to c and the square root of the number of individuals belonging to c (see eq. S3 of Supplementary Fig. 1). This setup differs from that of standard competition dynamics (e.g., in the Lotka-Volterra model), where the number of deaths is proportional to the number of collisions that can occur between an individual from the group and an individual from its set of competitors. We choose the square root because it results in larger cultural groups incurring fewer deaths on average, as a proportion of their sizerecall that as n increases, √ n/n = 1/ √ n decreases and approaches zero.
2. The haplogroup mutation rate measures the rate of "formation" of new haplogroups, which themselves arise from mutations into new haplotypes and subsequent classification. Because haplogroup classification is post-hoc and defined by groups of Y chromosomal haplotypes that share common sequence motifs at selected regions within the Y chromosome, determining the haplogroup mutation rate proves challenging.
We suggest two approaches to approximating µ that rely on Y chromosomal data and the Y phylogeny. The first approach -the one we took to obtain the approximation µ = 0.002starts with Y chromosome mutation rates. Taking the point mutation rate per year to be ≈ 7-9 × 10 −10 (Helgason et al., 2015) and the number of years constituting one generation to be 25 (see footnote about number of generations on the next page), the mutation rate per nucleotide per generation, m, is 1.75-2.25×10 −8 . Assuming each nucleotide along the complete Y chromosome mutates at roughly the same rate, the expected number of mutations along the Y chromosome per generation is m× length of Y chromosome ‡ = 0.9975-1.425, which is centered slightly above 1. Therefore, on average about 1-2 new haplotypes will emerge from point mutation in one generation, and in a population of 1000 males their offspring are expected to generate 1000 new haplotypes. However, the number of these new haplotypes that constitute haplogroups is significantly (at least of the order of 10 6 ) smaller, as observed by counting the number of haplogroups on the Y tree, which spans more than 200, 000 years and grows in a population whose size is of the order of 10 6 . Our choice of µ reflects this observation, and we tested mutation rates µ that are of smaller orders, e.g., 10 −4 . The results of our tests yielded qualitatively similar graphs to the ones presented here. Our choice µ = 0.002 also happens to agree with estimates of STR mutation rates on Y chromosomes obtained from father-son genetic studies (see Table 2 of Balanovsky, 2017).
The second approach uses the Y tree, which is available at the International Society of Genetic Genealogy (ISOGG) or YFull (www.yfull.com). One may approximate the haplogroup mutation rate by estimating the average number of new haplogroups appearing per generation in a population of 1000 males over the entire length of the Y tree. The length of the Y tree is ≈ 200, 000 years, or ≈ 6, 500-8, 000 generations with one generation spanning 25-30 years, and coalesces in an effective population size of the order of 10 6 . Denoting by N e (t) the effective population size at time t (before present) and nh(t) the number of new haplogroups appearing at time t, one evaluates the integral which computes the average number of new haplogroups arising per generation (= 25 years in formula above) in a population of 1000 males. ‡ approximately 5.7 × 10 7 nucleotides

Supplementary Note 3: Details of Grid Model Simulation and Results
We present results obtained from 18 distinct simulations, each sharing the following parametrization.
• |C| = 100 cultural groups (C = {1, 2, . . . , 100}) • |H| = 500 haplogroups (H = {1, 2, . . . , 500}) • Number of generations T = 60 • µ = 0.002 We take the length of one generation, which can be interpreted as either generation time or lifespan, to be between 25 and 30 years. ‡ By setting T = 60 for our simulations, we are thus modeling the process over a time period of 1500-1800 years, a time length that matches the duration of the bottleneck inferred by Karmin et al. (2015). We set |H| = 500 = 5 × |C| to reflect that there are many more new haplogroups that can arise from mutation than there are cultural groups. The 18 simulations are divided into two sets of 9 simulations, the first set simulating conditions of patrilineality and the second simulating complete non-patrilineality. To simulate conditions of patrilineality, we assume that each cultural group c ∈ C consists of only one haplogroup at generation T = 0: specifically, for c ∈ C, set N c,c (0) = N total /100 and N c,h (0) = 0 for all h = c. To simulate conditions of complete non-patrilineality, we assume that each cultural group c ∈ C consists of equal numbers of each haplogroup up to haplogroup |C| = 100 at generation T = 0: For both the patrilineal and non-patrilineal cases, the 9 simulations specify the values of the remaining two parameters, N total and ρ. They take the following values: 1. (Total Population Size) N total ∈ {10000, 20000, 30000} 2. (Death Rate) ρ, scaled to force either a 15%, 25% or 50% average proportion of population decline in cultural group 1 in the beginning (i.e., t = 0).
As an example of how we choose the value of ρ, assume that N total = 10000 and the desired proportion of population decline in the beginning is 15%. In the beginning, the total number of deaths in cultural group 1 is D, where (see eq. S3 in Supplementary Fig. 1) To obtain a 15% proportion of population decline on average, we must have E[D] = (99×10 3 )ρ = 15. Therefore ρ = 15/99 × 10 −3 ≈ 15 × 10 −5 . The values of ρ for the rest of the choices that we used for our simulations are specified in Table 1. ‡ This is the range of generation times, or generation lengths, typically reported for human male populations, and it appears to be relatively stable over thousands of years. For example, pg. 229 of Barceló and Del Castillo (2016) states that the generation time for modern humans is 25 years, and a recent study relying on ancient genomes (Moorjani et al., 2016)  To summarize, our model features six parameters, four of which are shared throughout all 18 simulations (|C|, |H|, T and µ), and the remaining two range within a list of values to allow comparison of dynamics across different cultural group sizes (the values of N total itself) as well as different competition rates (the values of ρ, set as described above).
Supplementary Table 1 lists the 18 parametrizations that correspond to Supplementary Figs 2-19. Each figure consists of (i) a panel of graphs showing the average dynamics of the number of distinct haplogroups, haplogroup diversity, the number of distinct cultural groups, and cultural group diversity, over 100 runs sharing the specified parametrization; and (ii) graphs of the average dynamics of each of the 500 haplogroups over 100 runs, as well as the dynamics of each of the 500 haplogroups for one run. Note that because each run possibly produces a different haplogroup that dominates the distribution of existing haplogroups, we rank the haplogroups in terms of their levels of abundancy in the population after 60 generations for every run, before computing the average haplogroup dynamics across the 100 runs.
To measure diversity, we used the normalized Shannon entropy index H norm , which is defined as follows. For a population consisting of I distinct types, with type i present at frequency p i ∈ (0, 1) for i = 1, . . . , I, Note that H norm ranges in [0, 1] and increases as the vector of frequencies (p i ) I i=1 becomes increasingly homogeneous, with H norm = 1 if and only if p 1 = . . . = p I = 1/I and H norm = 0 if and only if, for any i = 1, . . . , I, p i = 1 and p j = 0 for each j = i. Table 1: List of pairs of parameters (N total , ρ) corresponding to choices of total population size and ρ parameter. The ρ parameter is calculated based on the average proportion of deaths (15%, 25% or 50%) in the first cultural group in the first generation, and is fixed after the first generation. The initial configuration (patrilineal or non-patrilineal) gives a total of 18 distinct simulations, whose figure numbers are listed in the table.

Supplementary Note 4: Sensitivity to Model Modifications
Our computational model encompasses our hypothesis about the sociocultural process that resulted in the bottleneck. Treating it as a "null model," we describe how our model can be modified to reflect different anthropological and sociological conditions. These include (1) changing the ratio of the number of cultural groups to the number of haplogroups (|C| : |H| ratio), (2) incorporating selection among cultural groups, and (3) including fusion of cultural groups. In each modification parameters are changed. We refer to these model modification parameters as "hyperparameters" to distinguish them from the parameters described in Supplementary Note 3. We elaborate on these hyperparameter changes below.
1. (Changing the |C| : |H| Ratio) In our null hypothesis, we fixed |C| : |H| = 1 : 5. This ratio can be modified; as an example we present simulations in which |C| = 100 but |H| = 2500, shifting the ratio downward, which more closely mirrors an infinite alleles model.

(Incorporating Selection)
Our null hypothesis assumes no fitness differences among cultural groups. It is very plausible that in addition to competition in which size alone confers survival advantage -so-called selection for large groups (see Henrich, 2004) -differences in cultural innovations or toolkits act as a secondary mechanism of selection among these groups. To incorporate such selection effects, we modify the method by which the numbers killed are generated, i.e., eq. (3) of Supplementary Fig. 1. Instead of where {α c : c ∈ C} are parameters describing the fitness differentials among cultural groups.
As an example, we assume that the cultural groups are linearly ordered in terms of their relative fitnesses, with the first cultural group being the fittest and subsequent groups' fitnesses declining. Because the fittest cultural group suffers from the fewest number of deaths, α c increases in c. We consider α c = 1.00 + (c − 1) × 0.01; in other words normalizing the fitness differential so that the fittest cultural group (namely, cultural group c = 1) suffers the same mean proportion of deaths as it does in our null model and subsequent cultural groups suffer a greater mean proportion of deaths. Our choice of a linear function for α c is for simplicity.

(Including Group Fusion)
Step 4 of the Computational Grid Model (Supplementary Fig. 1) simply removes inviably small cultural groups. In reality, however, small cultural groups may be assimilated into larger ones. We can modify Step 4 to incorporate such group fusion mechanisms; as an example we use a stepping-stone approach, moving individuals in each small cultural group (i.e., whose size is < 20) to either the cultural group above it or below it in our grid. If this occurs to the last cultural group, we allow movement to the first cultural group; if this occurs to the first cultural group, we allow movement to the last cultural group. Note that in moving these individuals, their haplogroups do not change -in other words, we are effectively shifting a row of the grid up or down.
To demonstrate the effects of changing the three hyperparameters, we perform simulations incorporating (1) exactly one of the three hyperparameter changes; (2) two of the three hyperparameter changes; and (3) all three hyperparameter changes. We do so using the parameter setting (N total , ρ) = (20000, 18 × 10 −5 ), which is the case of 100 cultural groups each having size 200 and with mean proportion of deaths in the first generation approximately 25%. Note that Supplementary Figs 10 and 11 show the dynamics under this parametrization for our null hypothesis. Supplementary Table 2 lists figures corresponding to simulations performed with various combinations of hyperparameter changes. As in Supplementary Figs 2-19, we produce six graphs summarizing the dynamics of cultural group and haplogroup diversity and richness, as well as average and single run dynamics of the individual haplogroups in H. Additionally, for each combination of modifications, we present together results of simulations for both nonpatrilineal (NPT) and patrilineal (PT) configurations. We note the following with regard to changing hyperparameters. First, by changing the |C| : |H| ratios ( Supplementary Fig. 20), we observe no changes in the qualitative outcome of the simulations: patrilineal configurations lead to the death of most lineages and the dominance of fewtypically less than four -major lineages, while non-patrilineality leads to the survival of multiple haplogroup lineages, with lineage sizes all of the same order of magnitude. Second, by incorporating selection, via a culture-related fitness differential ( Supplementary Fig. 21), we find that there is a quicker spreading out of haplogroup lineages, and these lineages eventually vanish or persist in the same qualitative manner as they do in our original simulations. Moreover, in the patrilineal initial configuration case, not only is there less variation in haplogroup richness and haplogroup diversity, but also fewer haplogroup lineages persist in large numbers, either on average or in a single run. Third, including group fusion through a stepping-stone mechanism ( Supplementary Fig. 22) produces no qualitative changes in simulation outcomes for both patrilineal and non-patrilineal initial configurations.

Supplementary
We observe little change despite applying multiple modifications at once, as exemplified by the dynamics shown in Supplementary Figs 23-26. The qualitative outcomes of simulations starting from either patrilineal or non-patrilineal configurations appear robust to the modifications described above.
These changes in the model sharpen the questions we can ask about the processes that drove the persistence of the bottleneck. For instance, the strength of selection pressure on cultural groups -captured by the gradient of the fitness differential α c and its choice (not necessarily a linear function, as assumed in our simulations) -can be studied as a covariate influencing the speed of divergence of haplogroup sizes. This helps lay the foundation for the development of a statistical framework to infer the degree of selection (or fusion) present in populations that historically experienced intergroup competition and whose present-day genome samples are available.

Supplementary Note 5: Stochastic Modifications to the Modified Lotka-Volterra Model
The modified Lotka-Volterra model (Supplementary Note 1) is arguably a poor representation of human population dynamics, since external factors not captured by the deterministic process may add a layer of uncertainty to the outcomes, even if the growth rate and competition parameters are known. We may modify eq. S1 to include effects of randomness by appending Brownian motions to each population in the system, so that the dynamics of each population are driven by an additional "random noise" proportional to the square root of its size. Mathematically, where {γ i } 1 i n and ν are positive constants and {B i (t) : t 0} 1 i n is a collection of mutually independent standard Wiener processes. We refer to eq. S4 as the modified stochastic Lotka-Volterra competition (MSLVC) process; it extends the stochastic Lotka-Volterra competition process, a model in which there is no female population X(t) and the rates r i X(t) are replaced by positive growth rates r i . The stochastic Lotka-Volterra competition process itself arises from a family of stochastic differential equations known as generalized Feller diffusion processes, which have received attention in population biology studies in recent years (Bansaye and Méléard, 2015;Cattiaux et al., 2009;Cattiaux and Méléard, 2010).
By mirroring the stochastic dominance argument (Theorem 1.1, Chapter 6 of Ikeda and Watanabe, 1989) in the analysis of the two-dimensional SLVC, it can be shown that each of the Y i in eq. S4 is dominated by a Feller diffusion with logistic growth. Hence, the MSLVC (Y 1 , . . . , Y n , X) is dominated by the process ( ∼ Y 1 , . . . , ∼ Y n , X), where each ∼ Y i is a Feller diffusion with logistic growth in which the logistic growth rate is driven by X. By applying a result by Lambert et al. (2005) about the almost sure (a.s.) convergence to 0 in finite time of Feller diffusions with logistic growth, we see that the MSLVC must itself hit (0, . . . , 0) ∈ R n+1 in finite time (a.s.).
To analyze eq. S4 meaningfully, one may wish to investigate its behavior conditioned on the event {X(t) > 0}. There are various ways this might be done, but one numerical approach is to compare the expected diversity of the male populations over time against numerical simulations of the female population dynamics (which do not depend on the males). Similar to the theoretical analysis of n = 2 male populations for the modified Lotka-Volterra model in Supplementary Note 1, we consider the process which can be expressed in more streamlined notation. Let Z(t) = (Y 1 (t), Y 2 (t), X(t)) . Then, eq. S5 can be written as dZ(t) = µ(Z(t))dt + σ(Z(t))dB(t), where B(t) = (B 1 (t), B 2 (t), B 3 (t)) is standard 3D Brownian motion satisfying and µ(z 1 , z 2 , z 3 ) = r 1 z 1 z 3 − c 11 z 2 1 − c 12 z 1 z 2 , r 2 z 2 z 3 − c 21 z 1 z 2 − c 22 z 2 2 , az 3 − , then f (Z(t)) computes the (unnormalized) Shannon entropy index of the male population in eq. S6.
Applying the Itô calculus to compute E z [f (Z(t))], we must solve a partial differential equation (PDE) whose solution is equal to the expected value. This PDE is given in the following proposition.
Suppose that u is a solution of the PDE (eq. S7) that satisfies all the other analytical conditions in Proposition S1. Because u is of class C 1,2 (i.e., differentiable w.r.t. t and twice differentiable w.r.t. z), an application of Itô's formula (Øksendal, 2003) produces νZ 3 (s) ∂u(t − s, Z(s)) ∂z 3 dB 3 (s), ‡ That is, a martingale with respect to the Pz measure; see Øksendal (2003) for details.
where b ij (x) = [σ(x)σ(x) ] ij . Note that the second equality above follows from the identity ∂u ∂t = L u in eq. S7 and from simplifying the expression in the stochastic integral. For the sum of the stochastic integrals in the last expression, which is a local martingale, to be a true meanzero martingale, we must apply the boundedness condition. Because u is bounded, the local martingale has finite quadratic variation; hence it is a mean-zero martingale. This implies that E z [u(t−s, Z(s))] = u(t, z) for all s ∈ [0, t]. Setting s = t, we obtain u * (t, z) = E z [f (Z(t))] = u(t, z).
Using the PDE given in Proposition S1, we can compare the dynamics of male diversity up to simulated times at which the female population X hits 0. Unfortunately, this approach is hindered by issues of convergence of numerical simulations for the female population dynamics {X(t) : t 0}. Standard methods such as Euler-Maruyama lead to numerical solutions that diverge from the true solution for stochastic differential equations that fail to satisfy conditions on their drift and volatility coefficients. The Feller diffusion with logistic growth is one such equation (Hutzenthaler et al., 2011). In general, while stochastic differential equations provide a rich collection of biological models that capture effects of randomness not present in deterministic dynamical systems and also enable statistical inference (Fuchs, 2013), they can be mathematically and computationally intractable. A potential theoretical project would investigate ways of extending the modified Lotka-Volterra model (eq. S1) to incorporate effects of randomness, for instance via the stochastic differential equation route described above, and to show that incorporating randomness does not change the conclusion about the long-term behavior of the process.

Supplementary Note 6: Other Computational Models
The Computational Grid Model ( Supplementary Fig. 1), despite including both cultural group and haplogroup dynamics, is still limited in several aspects. The constant population size (N total ) assumption is unrealistic, and moreover the female population dynamics are not considered. We also do not consider deaths not arising from intergroup competition (e.g., epidemics or natural disasters), which would allow direct comparison of different scenarios within a single framework that could produce the evolutionary signal of the bottleneck and star-like expansion of lineages inferred from coalescent models. Below, we discuss two approaches that potentially generalize our treatment.  Figure 27: A schema for modeling evolving strings. There are |P| = n 1 + n 2 strings representing individuals who are partitioned into two cultural groups of size n 1 and size n 2 .
(The cultural group membership feature is encoded in the last column labeled C.) In the genetic component of these strings, there are S polymorphic sites and an outgroup individual is assumed to exist so that each position within an individual string is either '1' (derived allele) or '0' (ancestral / same as outgroup allele). The patterns of strings evolve over time (not shown above) and their evolution is determined by rules of competition, genetic mutation, and cultural group migration.
Because the population bottleneck and subsequent star-like expansion of lineages inferred from coalescent models describe features of a population of lineages evolving under a Wright-Fisher framework, one possible way of verifying the hypothesis we proposed in the main text is to consider a huge population of sequences P = {P i } |P| i=1 partitioned into many different cultural groups, and to model the effects of intergroup competition and movement between cultural groups on P over time (see Supplementary Fig. 27). Each sequence P i is a string of symbols representing one individual, where the symbols could encode either a nucleotide or a cultural feature (e.g., a symbol at a string position representing membership of a particular cultural group). This model can in principle track the effects of intergroup competition on molecular genetic variation -much like standard models of molecular evolution (Nei and Kumar, 2000) that track effects of mutation and recombination on molecular genetic variation in populations. To proceed, one may describe rules according to which sequences are removed from P (e.g., individuals who possess a sequence encoding a particular cultural or genetic motif are killed), or undergo genetic mutations, or move from one cultural group to another. At the end of the dynamical process, a (discrete-time) phylogenetic tree can be constructed from the remaining sequences in P using sub-sequence data corresponding to the genetic component. Haplogroups can also be retroactively defined based on shared sequence motifs, resolving the issue of defining "haplogroup mutation rate" in our current model. This tree can be compared to the phylogenetic tree obtained by Poznik et al. (2016). Such an approach is pursued by Fogarty et al. (2017) in studying of the evolution of cultural complexity, where cultural traits (the cultural analogue of a gene, see Cavalli-Sforza and Feldman, 1981) are modeled as bits in a string.

Second Approach: Incorporate Population Growth and Deaths.
It is also possible to modify our current grid model to include both population growth and deaths of individuals not based on competition. To include population growth, one may allow N total to depend on the generation number t, with rules on the dynamic behavior of N total (t) (for example, N total (t) could be piecewise constant, an exponential function of t, etc.). A less controlled option is to directly model the reproduction of males, say, by assuming that at each generation an individual from cultural group-haplogroup (c, h) produces a certain number of offspring. To include deaths not based on competition, one may consider imposing an additional tuning parameter that scales down the size of each cultural group-haplogroup (c, h) after the killing and mutation steps in the model.
In either of the approaches suggested above, explicitly including the female population and modeling the interaction between the male and female populations are possible next steps. Apart from these two approaches, there may be other approaches, including agent-based models (Wilensky and Rand, 2015) and compartmental models (Godfrey, 1983) both of which are used widely in the social sciences and biology.