## Introduction

Species that are reproductively incompatible with, but otherwise identical to, wild counterparts can be engineered via the insertion of reproductive barriers. Such barriers could be used for ecosystem engineering or pest and vector control1. Previous attempts have engineered synthetic species barriers via genetic recoding in bacteria2 and yeast3, though this is likely infeasible in multicellular organisms. A reproductively isolated strain of D. melanogaster was generated with preexisting transgenes and recessive mutations4, though elevated fitness costs prevented widespread utility of this approach. CRISPR-based genome editing and transcriptional transactivation (CRISPRa) strategies exist1,5, though most fail to achieve complete reproductive isolation due to either escape mutants or incomplete lethality.

Here, we describe the development of an extreme underdominance-based approach by engineering synthetic reproductive barriers in D. melanogaster using a variation of a recently described engineered genetic incompatibility (EGI)6 approach, which we term Synthetic Postzygotic barriers Exploiting CRISPR-based Incompatibilities for Engineering Species (SPECIES). To engineer SPECIES, we use a nuclease-deficient deactivated Cas9 (dCas9) protein fused to a transactivation domain, which recruits transcriptional machinery to the sites of single-guide RNA (sgRNA) binding. These sgRNAs target the promoter regions of endogenous genes essential for development/viability to induce lethality caused by dCas9-mediated overexpression. This lethality is rescued in engineered, but not WT, individuals via mutations of the sgRNA target sites, which prevent dCas9 binding and the subsequent lethal overexpression without interfering with target gene function (Fig. 1a). Using this approach, we engineer eight SPECIES, incompatible with each other and with WT, and demonstrate that they can function as confinable gene drives and replace populations in a threshold-dependent manner. As supported by mathematical models, SPECIES could be adapted to pest and disease vectors in the future to provide a safe platform for modification of wild populations.

## Results

### Engineering SPECIES

We engineered flies expressing a dCas9 activator domain fusion (dCas9-VPR) and evaluated whether these transgenes could drive lethal target overexpression using CRISPRa sgRNA lines each targeting the promoter region of one of four important developmental genes (eve, hid, hh, and wg)5,7,8 (Supplementary Fig. 1). Zygotic dCas9-VPR expression did not cause noticeable toxicity on its own and achieved 100% lethality in individuals also expressing sgRNAs targeting one of the target genes (Fig. 1b–d). Interestingly, this lethality could only be rescued when homozygous dCas9-VPR-expressing fathers were crossed to heterozygous Cas9; sgRNA mothers (Fig. 1d). With this cross, mothers provided indel mutations in the promoter region of the target genes, while simultaneously depositing sgRNA/Cas9 into all embryos, which mutated the inherited paternal copy of the target sites.

Importantly, the inherited sgRNA/dCas9-VPR transgenes forced a bottleneck that selected for protective indels which blocked CRISPRa-induced lethality and allowed for endogenous expression levels of the target gene, providing embryonic rescue and survival (Fig. 1c, d and Supplementary Fig. 2). However, when homozygous dCas9-VPR/dCas9-VPR; sgRNA/sgRNA individuals harboring the protective indel mutations were outcrossed to WT, they produced some viable progeny, indicating that homozygous “rescued” flies were not 100% reproductively isolated from their WT counterparts (Supplementary Fig. 2), a feature also previously demonstrated5.

To overcome the incomplete isolation from single-gene overexpression, we tested multiplexed overexpression by engineering flies that simultaneously expressed sgRNAs targeting two or more genes (eve + hid; eve + hid + hh; eve + hid + wg; and hh + wg; Supplementary Figs. 13). Crossing these to the dCas9-VPR flies resulted in complete progeny lethality (Fig. 1c), suggesting that heterozygosity for a WT allele and an allele with a selected indel is lethal.

With the selective bottleneck genetic crossing scheme, crosses with multiplexed sgRNA/Cas9-expressing mothers rescued heterozygous dCas9; sgRNA animals through the introduction of indel mutations (Supplementary Fig. 4). Some fitness costs can be seen, at least as inferred from fertility and survivorship, in many of the SPECIES strains (Fig. 2c). Moreover, in contrast to the single-target sgRNA lines, heterozygote progeny from homozygous dCas9-VPR; multiplexed sgRNA individuals crossed to WT were all lethal, suggesting that inheriting one WT copy of each target site from the WT parent ensured 100% lethality (Supplementary Fig. 2).

To generate reproductively isolated SPECIES, multiple generations (>5) of dCas9-VPR; sgRNA “rescued” individuals were intercrossed, resulting in homozygous stocks representing eight isolated SPECIES (A1-D2). Each SPECIES was reproductively incompatible with WT (Supplementary Figs. 2 and 3) and harbored the expected indels at the target sites (Supplementary Fig. 4). Bidirectional outcrosses of all eight SPECIES to WT, or to a different SPECIES with varying target genes, demonstrated 100% reproductive isolation, indicating the creation of several independent barriers to sexual reproduction (Fig. 2, Supplementary Fig. 5 and Supplementary Video 1). Additional crosses between SPECIES and genetically diverse stocks from five different continents also demonstrated 100% reproductive isolation (Supplementary Table 8).
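The incompatibility logic described above can be caricatured as a simple rule. The following Python sketch is a toy model under stated assumptions, not the analysis used in this study: it assumes that every parent is homozygous, that inheriting one copy of the dCas9-VPR and sgRNA transgenes is enough for activation, and that a targeted promoter is lethal unless both inherited alleles carry a protective indel (which matches the multiplexed lines but not the single-target lines). The `Strain` class, the strain names `X` and `Y`, and their target sets are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    """A homozygous stock; names and target sets here are illustrative only."""
    name: str
    sgrna_targets: frozenset = frozenset()  # genes targeted by the stock's sgRNAs
    protected: frozenset = frozenset()      # genes whose promoters carry protective indels
    has_dcas9_vpr: bool = False


def progeny_viable(mother: Strain, father: Strain) -> bool:
    """Toy rule: progeny die if any inherited sgRNA can find an unprotected allele."""
    has_activator = mother.has_dcas9_vpr or father.has_dcas9_vpr
    if not has_activator:
        return True
    targeted = mother.sgrna_targets | father.sgrna_targets
    # A targeted gene is safe only if BOTH inherited alleles carry a protective indel.
    return all(g in mother.protected and g in father.protected for g in targeted)


wt = Strain("WT")
x = Strain("X", frozenset({"eve", "hid"}), frozenset({"eve", "hid"}), True)
y = Strain("Y", frozenset({"hh", "wg"}), frozenset({"hh", "wg"}), True)

for mother, father in [(x, x), (x, wt), (wt, x), (x, y), (y, x)]:
    outcome = "viable" if progeny_viable(mother, father) else "lethal"
    print(f"{mother.name} female x {father.name} male: {outcome}")
```

Under this rule, only self-crosses give viable progeny, which is the pattern of bidirectional isolation reported for the multiplexed lines.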

We next determined the extent of target gene overexpression when outcrossed to WT by visualizing overexpression in embryos via antibody stain, and we evaluated the effect of misexpression on development using cuticle preps of late embryos and young larvae. We observed target gene overexpression at embryonic stages and segment polarity defects in larvae when the SPECIES lines were mated to WT but not when self-crossed (Fig. 3a–c). To quantify the extent of target gene overexpression and to measure possible global gene misexpression, we performed transcriptome-wide expression profiling (Supplementary Table 1). We quantified RNA-expression profiles for all samples (Supplementary Table 2), including genes that were expressed from our constructs (Fig. 3d). From this analysis, we found significant target gene overexpression (up to 48-fold) in the progeny generated from SPECIES and WT crosses but not in the progeny from SPECIES intercrosses (Supplementary Figs. 6 and 7 and Supplementary Tables 1 and 7).

### Population replacement via SPECIES-mediated extreme underdominance

To assess whether the SPECIES were capable of reversible WT population replacement via gene drive, we conducted population studies at various release thresholds employing one representative SPECIES, A1 (Fig. 4). Releases of A1 individuals at a population frequency of 70% resulted in this SPECIES replacing the WT population in two of six replicates (Fig. 4 and Supplementary Table 4). Population replacement also occurred in three of four replicates of A1 releases at a frequency of 80% and in one of one replicate at a 90% release frequency. However, a release frequency of 50% resulted in elimination of the A1 strain in three of three replicates (Fig. 4 and Supplementary Table 4).

To characterize the population dynamics observed in the population studies, we fitted a mathematical model to the observed data, incorporating a fitness cost for reproductively isolated individuals relative to WT individuals. The A1 strain was estimated to have a strong relative fitness cost of 34.84% (95% credible interval [CrI]: 34.82–34.87%), producing a threshold frequency of ~61%, which corresponds to what was observed in the population studies. Of the seven other SPECIES characterized, two consistently led to population replacement at a release frequency of 80% (A2 and D1), and three led to population replacement at a release frequency of 90% (A2, D1, and D2) (Fig. 4), with increased threshold frequencies corresponding to increased fitness costs for all SPECIES (Supplementary Table 6). This suggests that population replacement via gene drive would theoretically occur when the release of SPECIES individuals exceeded a critical threshold frequency in the population9,10, the value of which depends on the fitness of the synthesized strain relative to the WT strain.

## Discussion

Altogether, we demonstrate that the SPECIES approach can be exploited to build stable reproductive barriers that could drive genes through a population in a reversible manner. The SPECIES approach is advantageous over other developed technologies such as Medea systems or X-chromosome shredders11,12,13,14, as dCas9-mediated overexpression does not rely on organism-specific modes of incompatibility, such as well-characterized maternal effect genes or separate sex chromosomes; instead, it is programmable to virtually any suitable target gene and thus should be amenable to most sexually reproducing organisms. Additionally, the “stacked” genetic barriers using multiple sgRNAs (which is a significant difference between SPECIES and the previously described EGI system) may reduce the chances of evolved resistance via target site mutation. Once a basic engineering toolkit is constructed, it can be used to build multiple SPECIES that are reproductively isolated. As an underdominant gene drive system, the SPECIES approach is threshold dependent and allows for geographically localized population modification9,15. Although it may not be as applicable to larger-scale population manipulation as other types of drives, such as homing-based drives, it is inherently confinable and reversible, which may be advantageous from a regulatory perspective11,12,13,14. Furthermore, confinable SPECIES underdominant systems are preferable to other underdominant systems, such as engineered translocations16, for local population replacement since they can tolerate higher fitness costs, spread more quickly, lead to less contamination of neighboring populations, and are more resilient to elimination due to immigration of wild types (Figs. 5 and 6).

While this article was under review, a related approach termed EGI was demonstrated in flies6. Our study complements that work in several ways. Firstly, we and others5 demonstrate that single-gene targeting is not ideal for generating stable reproductively isolated SPECIES, and therefore, unlike Maselko et al.6, we exploit multiplexing of the target genes (2–3 targets per SPECIES), which ensures complete lethality and increases the evolutionary stability of the system. Additionally, unlike the EGI approach, which generated protective indel mutations first and then used either a second round of transgenesis or a complex genetic crossing scheme to generate incompatible individuals, we exploit a simplified genetic crossing scheme that forces a bottleneck, which streamlines the genetics and enables the organism to direct the evolution of non-deleterious indels. This crossing scheme enables us to multiplex our targets and may also prove instrumental for generating stable SPECIES in polyploids, e.g., fish and plants (although the postzygotic nature of the induced lethality would be an ethical consideration). Finally, our study also demonstrates the majority-wins, threshold-dependent gene drive capabilities of SPECIES and provides detailed modeling to support this application, an important proof of principle that has not been shown previously.

Bringing the SPECIES approach to other organisms will require integrating a cassette containing dCas9-VPR and sgRNAs into the genome of interest, which is quickly becoming more tractable with CRISPR-based tools. This approach will also require genome characterization to provide candidate sgRNA sites in promoter sequences near genes of interest as well as species-specific testing for overexpression levels and dCas9-VPR toxicity, which may contribute to observed fitness costs17. Regardless, it should be possible to implement SPECIES in organisms of medical or agricultural interest, such as mosquitoes. This becomes even more interesting considering that the gene drive function of SPECIES provides greater control and confinement via threshold dependence11,18,19 as well as reversibility via WT release16 and protection from the evolution of resistance via multiplexing, features not found in all gene drives11,14. While release requirements for threshold-dependent systems are large due to fitness costs and may present logistical challenges, they are an order of magnitude less than those routinely carried out for insect suppression programs20. Although models without life history and spatial structure tend to underestimate release requirements for threshold-dependent systems, spatially explicit models suggest that local population replacement could be achieved with a series of 1:1 releases carried out over several weeks21. Overall, SPECIES demonstrates a platform for the possible safe control or modification of pest and disease vector populations that impose significant burdens on humanity.

## Methods

### Plasmid design and assembly

To assemble plasmid OA-986A, the base vector used for generating dCas9-expressing plasmids, several components were cloned into the piggyBac plasmid pBac[3xP3-DsRed]22 using Gibson assembly/EA cloning23. pBac[3xP3-DsRed] was digested with BstBI and NotI, and the following components were cloned in with EA cloning: an attP sequence amplified from plasmid M{3xP3-RFP attP}24 with primers 986.C1 and 986.C2, a p10 3′UTR fragment amplified from Addgene plasmid #100580 (ref. 22) with primers 986.C3 and 986.C4, an opie2 promoter fragment amplified from translocation plasmid B16 using primers 986.C5 and 986.C6, and an eCFP marker amplified from Addgene plasmid #47917 using primers 986.C7 and 986.C8. The resulting plasmid was then digested with PacI, and the following components were cloned in to generate the final dCas9-expressing vectors: the Ubiquitin-63E promoter fragment amplified with primers 986.C9 and 986.C10 from D. melanogaster genomic DNA and a dCas9-VPR fragment amplified from Addgene plasmid #78898 (ref. 25) with primers 986.C11 and 986.C12 to generate plasmid OA-986B (Addgene #124999); the bottleneck promoter fragment amplified with primers 986.C13 and 986.C14 from D. melanogaster genomic DNA and a dCas9-VPR fragment amplified from Addgene plasmid #78898 (ref. 25) with primers 986.C15 and 986.C12 to generate plasmid OA-986C (Addgene #125000); the Ubiquitin-63E promoter fragment amplified with primers 986.C9 and 986.C16 from D. melanogaster genomic DNA and a dCas9-VP64 fragment amplified from Addgene plasmid #78897 (ref. 25) with primers 986.C17 and 986.C18 to generate plasmid OA-986D (Addgene #125001); and the bottleneck promoter fragment amplified with primers 986.C13 and 986.C19 from D. melanogaster genomic DNA and a dCas9-VP64 fragment amplified from Addgene plasmid #78897 (ref. 25) with primers 986.C20 and 986.C18 to generate plasmid OA-986E (Addgene #125002).

To assemble plasmids OA-1045A–E, the multiple-sgRNA-containing vectors, several components were cloned into the multiple cloning site of a plasmid24 containing the white gene as a marker and an attB-docking site using Gibson assembly/EA cloning23. First, the plasmid was digested with AsiSI and KpnI, and the following components were cloned in with EA cloning to generate base plasmid OA-1045: a D. melanogaster U6:3 promoter fragment sequence amplified from Addgene plasmid #49411 (ref. 26) with primers 1045.C1 and 1045.C2, and an sgRNA scaffold fragment amplified from Addgene plasmid #49411 (ref. 26) with primers 1045.C3 and 1045.C4. The resulting base plasmid was then used to clone final sgRNA plasmids OA-1045A through OA-1045E. To generate plasmid OA-1045A (Addgene #125003), plasmid OA-1045 was digested with AvrII; then, a fragment containing an 18-base-pair (bp) even skipped (eve) sgRNA target site5, an sgRNA scaffold, a D. melanogaster U6:1 promoter fragment, and an 18-bp head involution defective (hid) sgRNA target site5 was amplified from a custom gBlocks® Gene Fragment (Integrated DNA Technologies, Coralville, Iowa) with primers 1045.C5 and 1045.C6 and was cloned into the digested backbone using EA cloning. To generate plasmid OA-1045B (Addgene #125004), plasmid OA-1045A was digested with XbaI, and a fragment containing a Gypsy insulator, a D. melanogaster U6:1 promoter fragment driving expression of a first hedgehog (hh)-targeting sgRNA, and a D. melanogaster U6:3 promoter fragment driving expression of a second hh-targeting sgRNA amplified from plasmid pCFD4-hh8 with primers 1045.C7 and 1045.C8 was cloned in using EA cloning. To generate plasmid OA-1045C (Addgene #125005), plasmid OA-1045A was digested with XbaI, and a fragment containing a Gypsy insulator, a D. melanogaster U6:1 promoter fragment driving expression of a first wingless (wg)-targeting sgRNA, and a D. melanogaster U6:3 promoter fragment driving expression of a second wg-targeting sgRNA amplified from plasmid pCFD4-wg8 with primers 1045.C7 and 1045.C8 was cloned in using EA cloning. To generate plasmid OA-1045D (Addgene #125006), plasmid OA-1045 was digested with AscI and XbaI, and two fragments were cloned in using EA cloning: a first fragment containing a D. melanogaster U6:1 promoter fragment driving expression of a first wg-targeting sgRNA and a D. melanogaster U6:3 promoter fragment driving expression of a second wg-targeting sgRNA amplified from plasmid pCFD4-wg8 with primers 1045.C9 and 1045.C10, and a second fragment containing a Gypsy insulator, a D. melanogaster U6:1 promoter fragment driving expression of a first hh-targeting sgRNA, and a D. melanogaster U6:3 promoter fragment driving expression of a second hh-targeting sgRNA amplified from plasmid pCFD4-hh8 with primers 1045.C11 and 1045.C12. Finally, to generate plasmid OA-1045E (Addgene #125007), plasmid OA-1045 was digested with AvrII and NotI, and a fragment comprising D. melanogaster Gly tRNA-flanked sgRNAs27 targeting, from 5′ to 3′, eve, hid, and hh followed by a D. melanogaster U6:3 UTR that was amplified with primers 1045.C13 and 1045.C14 from a gene synthesized vector (GenScript, Piscataway, NJ) was cloned using EA cloning. All primers used for cloning are listed in Table S15.

### Fly culture and strains

Fly husbandry and crosses were performed under standard conditions at 25 °C. Rainbow Transgenics (Camarillo, CA) carried out all of the fly injections. The fly strains used to generate dCas9-expressing lines were attP lines attP40w (Rainbow Transgenic Flies line; yw P{nos-phiC31\int.NLS}X;P{CaryP}attP40) and 8621 (BSC #8621; y[1] w[67c23]; P{y[+t7.7] = CaryP}attP1). The fly strains used to generate sgRNA-expressing lines were 86Fa (BSC #24486: y[1] M{vas-int.Dm}ZH-2A w[*]; M{3xP3-RFP.attP’}ZH-86Fa), 9732 (BSC #9732: y[1] w[1118]; PBac{y[+]-attP-9A}VK00013), and 8622 (BSC #8622: y[1]w[67c23]; P{y[+t7.7] = CaryP}attP2). For balancing chromosomes, fly stock BSC#39631 (w[*]; wg[Sp-1]/CyO; P{ry[+t7.2] = neoFRT}82B lsn[SS6]/TM6C, Sb[1]) was used. All lines were homozygous viable.

### Generation and genetics of speciated stocks

Single sgRNA lines targeting eve, hid, hh, and wg were utilized5,8. dCas9-VPR- and dCas9-VP64-expressing lines were generated via microinjection as described above; we were unable to generate a transgenic line that expressed dCas9-VPR ubiquitously, despite numerous attempts, suggesting that such expression was toxic. To test for the ability of all sgRNA lines to induce lethal overexpression (“killing”), five sgRNA males and five virgin females were separately crossed to five dCas9 line individuals of the opposite sex in single vials and were allowed to mate for 7 days. After 7 days, the parents were removed, and the vials were monitored for 7 additional days to assess whether viable larvae were present. No killing was observed in crosses of dCas9-VP64 expressing lines to any of the sgRNA-expressing lines (Fig. 1), consistent with previous observations8. Complete killing was presumed when no larvae were present after 14 days. All experiments were done in biological triplicate.

To generate protective indel mutations, a Ubiquitin-Cas9 line (BSC #79005)28 was used. Briefly, ten Ubiquitin-Cas9 virgin females were crossed to ten sgRNA males, and virgin female and male progeny with both transgenes were selected and crossed to each other for at least three generations. Cas9/sgRNA transheterozygous virgins were then outcrossed in groups of 3–5 to homozygous attP40w bnk-dCas9-VPR males, and progeny containing both a sgRNA (identified by the presence of the w+ marker) and bnk-dCas9-VPR (identified by the presence of the opie2-eCFP marker) were isolated as “rescue” individuals that presumably carried protective indel mutations in the target promoter regions that prevented dCas9-induced overexpression (Fig. 1). To confirm the generation of indels, these flies were Sanger sequenced and crossed to each other, again in groups of 3–5, to establish “rescue” stocks.

These “rescue” crosses were also set up in the reverse direction, utilizing 3–5 homozygous attP40w bnk-dCas9-VPR females crossed to Cas9/sgRNA transheterozygous males, to determine whether maternal deposition of Cas9/sgRNAs is required for generating sufficient protective indel mutations to provide rescue of lethality. In particular, it was assumed that, if both copies of the targeted promoter needed to contain protective indel mutations to provide rescue, lack of maternally deposited Cas9/sgRNA (due to Cas9/sgRNA fathers being used) would lead to lack of “rescue” individuals, as all individuals inheriting the sgRNA and bnk-dCas9-VPR transgenes would still have one wildtype copy of the target promoter inherited from the mother available for targeting and would perish.

To further validate whether both copies of the targeted promoter needed to contain protective indel mutations to provide rescue from lethality, “rescue” individuals were also bidirectionally outcrossed in groups of 3–5 and in triplicate to homozygous attP40w bnk-dCas9-VPR individuals, and the resulting progeny were scored for the “rescue” phenotype. Here, it was presumed that the lack of transheterozygous sgRNA/bnk-dCas9-VPR progeny indicated that both copies of the targeted promoter needed to contain protective indel mutations to provide rescue and that the lack of such mutations in the promoter allele inherited from the homozygous attP40w bnk-dCas9-VPR parent led to lethality in the transheterozygous sgRNA/bnk-dCas9-VPR progeny. Here, too, such lethality was observed for crosses with multiple sgRNA transgenes but not for crosses with single sgRNA transgenes, suggesting that, in the case of the latter, one wildtype copy of the targeted promoter was not sufficient to lead to sgRNA/bnk-dCas9-VPR-induced lethality.

Double homozygous speciated stocks were generated for all sgRNA combinations by crossing dCas9/sgRNA heterozygotes that lacked the Ubiquitin-Cas9 transgene (as evidenced by lack of red fluorescence) and identifying homozygous progeny by eye color (orange to dark red eyes for homozygotes vs. yellow to light red eyes for heterozygotes, depending on sgRNA insertion site) and opie2-eCFP intensity. Putative double homozygous individuals were then outcrossed to w[1118] individuals of the opposite sex in groups of three per vial to test for reproductive isolation. Flies were allowed to mate and lay eggs for 7 days, and vials were checked daily for hatched embryos. Flies that failed to fruitfully mate with w[1118] were presumed to be reproductively isolated double homozygotes and were then crossed to putative double homozygotes of the opposite sex to generate a double homozygous, reproductively isolated stock for each sgRNA line.

### Assessment of reproductive isolation from various strains

To determine whether double homozygous SPECIES lines were reproductively isolated from genetically diverse stocks, SPECIES individuals were outcrossed to various Global Diversity Lines (GDL) that were isolated from five different continents29 and used in previous work examining gene drive function in different genetic contexts30,31. Briefly, five double homozygous individuals from each SPECIES stock were outcrossed to five individuals of the opposite sex from each of five GDL (from Beijing, China; Ithaca, NY; the Netherlands; Tasmania, Australia; and Zimbabwe, Africa). All crosses were done bidirectionally with respect to sex and in triplicate. Flies were allowed to mate and lay eggs for 7 days, and vials were checked daily for hatched embryos for the following 7 days. Lack of embryo hatching was presumed to indicate reproductive isolation (Supplementary Table 8).
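To make the scale of this crossing design concrete, the vials can be enumerated as in the Python sketch below. It is only an illustration of the design as described above (eight SPECIES, five GDL, both sex directions, triplicate); the SPECIES labels in `species` are an assumed expansion of "A1–D2", and the dictionary layout is hypothetical.

```python
from itertools import product

# Assumed labels for the eight SPECIES (the text gives the range A1-D2).
species = ["A1", "A2", "B1", "B2", "C1", "C2", "D1", "D2"]
gdl = ["Beijing", "Ithaca", "Netherlands", "Tasmania", "Zimbabwe"]
directions = ["SPECIES female x GDL male", "GDL female x SPECIES male"]
replicates = (1, 2, 3)

# One record per vial: every SPECIES x GDL pair, in both directions, in triplicate.
vials = [
    {"species": s, "gdl": g, "direction": d, "replicate": r}
    for s, g, d, r in product(species, gdl, directions, replicates)
]

print(len(vials))  # 8 * 5 * 2 * 3 = 240 vials
```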

To assess reproductive isolation between double homozygous SPECIES lines, inter-SPECIES crosses were performed by crossing two SPECIES virgin females with two SPECIES males from each strain. Flies were allowed to mate for 12–16 h under standard conditions; following this period, the adult flies were removed and the embryos were counted (Supplementary Fig. 5). The vials were aged at 26 °C for 24 h and subsequently scored for the number of hatched embryos (if reproductive isolation did not occur). The vials were then kept at 26 °C for 7–10 days to ensure no pupae/adults emerged in instances of reproductive isolation or to count emerged adults in instances of incomplete reproductive isolation (Supplementary Fig. 5).

### Population experiments

All genetic experiments were conducted in a high-security Arthropod Containment Level 2 (ACL2) barrier facility in accordance with protocols approved by the Institutional Biosafety Committee of the University of California San Diego. Population experiments were carried out at 26 °C under a 12 h/12 h day/night cycle with ambient humidity in 250 ml bottles containing Lewis medium supplemented with live, dry yeast. Starting populations for drive experiments included equal numbers of virgins and males of similar ages for each genotype. Speciated double homozygotes (dCas9/dCas9; +/+) were introduced at a population frequency of 80% for above-threshold drive experiments, and 50% for below-threshold drive experiments. Oregon R virgin females and males (+/+; +/+) of a similar age as the reproductively isolated individuals made up the remainder of the population. The total number of flies for each starting population was 100. All 50, 70, and 80% population experiments were conducted in at least triplicate, with the exception of SPECIES C1 seeded at 80%, for which one of the replicates did not produce any progeny due to bacterial/mold contamination. Moreover, only one replicate was conducted for releases seeded at 90% for all SPECIES except B2, for which this replicate likewise did not produce any progeny due to bacterial/mold contamination. In total, we conducted 44 population cage experiments, which were tracked for up to 11 generations. Adult flies were removed 7 days after being placed together. After another 7 days, progeny were collected and the fraction of speciated double homozygous individuals was determined (Fig. 4 and Supplementary Table 4). The progeny were then placed into a new bottle to initiate the next generation. No significant evidence of crowding in the 250 ml bottles was observed.

### Mathematical modeling

We modeled SPECIES population dynamics under laboratory conditions assuming random mating and discrete generations. We considered a SPECIES allele, “T”, and a corresponding wildtype allele, “t”. Since heterozygotes for the SPECIES system are unviable, there are only two viable genotypes—TT and tt. We denote the proportion of organisms having the genotype TT at generation k by $${p}_{k}$$, and the proportion having the wildtype genotype at generation k by $$(1-{p}_{k})$$. By considering all possible mating pairs, and assuming a fitness cost for TT individuals relative to wildtype individuals, s, the frequency of TT individuals in the next generation is given by:

$$p_{k+1}=\frac{p_{k}^{2}(1-s)}{p_{k}^{2}(1-s)+(1-p_{k})^{2}}. \tag{1}$$
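As a minimal sketch of how Eq. (1) behaves, the recursion can be iterated deterministically in Python. The function names `next_frequency` and `trajectory` are illustrative, the fitness cost is the A1 estimate quoted in the Results, and the 11-generation horizon mirrors the longest population experiment; none of this is the authors' code.

```python
def next_frequency(p: float, s: float) -> float:
    """One generation of Eq. (1): expected TT frequency in the next generation."""
    species_term = p * p * (1.0 - s)   # TT x TT matings, discounted by fitness cost s
    wildtype_term = (1.0 - p) ** 2     # tt x tt matings; TT x tt progeny are unviable
    return species_term / (species_term + wildtype_term)


def trajectory(p0: float, s: float, generations: int) -> list:
    """Iterate Eq. (1) from a release frequency p0 for a number of generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


s_a1 = 0.3484  # relative fitness cost estimated for SPECIES A1 in the text
for p0 in (0.5, 0.7, 0.8, 0.9):
    final = trajectory(p0, s_a1, 11)[-1]
    print(f"release frequency {p0:.0%}: TT frequency after 11 generations = {final:.4f}")
```

In this deterministic version, releases at 50% decay toward zero while releases at 70% or more approach fixation; the replicate-to-replicate variation seen in the cages requires the stochastic sampling described later in this section.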

The threshold frequency is an unstable equilibrium that satisfies the condition:

$$p_{k+1}=p_{k}. \tag{2}$$

Substituting Eq. (2) into Eq. (1) and solving for $${p}_{k}$$, we find two stable equilibria ($${p}_{k}=0$$ and $${p}_{k}=1$$) and one unstable equilibrium ($${p}_{k}=1/(2-s)$$). The latter represents the critical threshold frequency, above which the SPECIES system is more likely to spread to fixation than not, and below which it is more likely to be eliminated than not.
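For readers who want the intermediate algebra, one way to obtain these equilibria is sketched here. Writing $$p$$ for the common value of $$p_{k+1}$$ and $$p_{k}$$, multiplying Eq. (1) through by its denominator, and collecting terms gives:

$$\begin{aligned}
p\left[p^{2}(1-s)+(1-p)^{2}\right]-p^{2}(1-s)&=0\\
p\left[(2-s)p^{2}-(3-s)p+1\right]&=0\\
p\,(p-1)\left[(2-s)p-1\right]&=0
\end{aligned}$$

The three factors give $$p=0$$, $$p=1$$, and $$p=1/(2-s)$$. As a check against the Results, the estimated A1 fitness cost of $$s=0.3484$$ gives $$1/(2-0.3484)\approx 0.605$$, in line with the reported threshold of ~61%.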

The likelihood of the population data for each SPECIES system was calculated by assuming a binomial distribution of wildtype (CFP−) and SPECIES (CFP+) individuals, and by using the model in Eq. (1) to generate expected proportions for each fitness parameter value, s, i.e., by calculating the log-likelihood:

$$\log L(s)=\sum_{i=1}^{j}\sum_{k=1}^{n_{i}}\left[\log\binom{{\rm{TT}}_{i,k}+{\rm{tt}}_{i,k}}{{\rm{TT}}_{i,k}}+{\rm{TT}}_{i,k}\log\left(p_{k}(s)\right)+{\rm{tt}}_{i,k}\log\left(1-p_{k}(s)\right)\right]. \tag{3}$$

Here, (1) $${\rm{TT}}_{i,k}$$ and $${\rm{tt}}_{i,k}$$ are the number of SPECIES (CFP+) and wildtype (CFP−) individuals at generation k in experiment i, respectively, (2) there are a total of j experiments for this SPECIES system, (3) the ith experiment is run for $$n_{i}$$ generations, and (4) the expected genotype frequencies are dependent on the fitness parameter, s. The initial condition for each experiment is specified by the data. Fitness parameters, including 95% credible intervals, were estimated using a Markov chain Monte Carlo sampling procedure.
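The fitting procedure can be sketched in Python as follows. This is a hedged illustration rather than the code used for the analysis: the sampler is a plain random-walk Metropolis algorithm with a flat prior on 0 < s < 1 (the text does not specify the sampler or prior), the expected frequencies are obtained by iterating Eq. (1) deterministically from each experiment's starting frequency, and the counts in `example` are hypothetical.

```python
import math
import random


def next_frequency(p, s):
    """Eq. (1): expected TT frequency in the next generation."""
    a = p * p * (1.0 - s)
    return a / (a + (1.0 - p) ** 2)


def xlogy(count, prob):
    """count * log(prob), using the convention 0 * log(0) = 0."""
    if count == 0:
        return 0.0
    return count * math.log(prob) if prob > 0.0 else -math.inf


def log_binom(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_likelihood(s, experiments):
    """Eq. (3). Each experiment is (p0, [(TT_1, tt_1), (TT_2, tt_2), ...])."""
    total = 0.0
    for p0, counts in experiments:
        p = p0
        for n_species, n_wildtype in counts:
            p = next_frequency(p, s)  # expected frequency for this generation
            total += log_binom(n_species + n_wildtype, n_species)
            total += xlogy(n_species, p) + xlogy(n_wildtype, 1.0 - p)
    return total


def metropolis(experiments, n_samples=5000, step=0.05, seed=0):
    """Random-walk Metropolis sampler for s with a flat prior on (0, 1)."""
    rng = random.Random(seed)
    s = 0.5
    current = log_likelihood(s, experiments)
    samples = []
    for _ in range(n_samples):
        proposal = s + rng.gauss(0.0, step)
        if 0.0 < proposal < 1.0:  # proposals outside the prior support are rejected
            candidate = log_likelihood(proposal, experiments)
            if rng.random() < math.exp(min(0.0, candidate - current)):
                s, current = proposal, candidate
        samples.append(s)
    return samples


# Hypothetical data: one cage seeded at 70% and scored for four generations.
example = [(0.7, [(79, 21), (91, 9), (99, 1), (100, 0)])]

draws = sorted(metropolis(example)[1000:])  # discard the first 1000 draws as burn-in
lower, upper = draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))]
print(f"posterior mean s = {sum(draws) / len(draws):.3f}, 95% CrI = ({lower:.3f}, {upper:.3f})")
```

The credible interval is read off as the 2.5th and 97.5th percentiles of the retained draws; with real data, several cages per SPECIES would be passed in the `experiments` list.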

The stochastic simulations in Fig. 4 were implemented by calculating expected genotype frequencies in the next generation according to Eq. (1), and taking a binomial sample from a total of 50 individuals.
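A stochastic simulation of this kind can be sketched in Python as below. The function names, the seed, and the parameter values in the usage lines are illustrative assumptions; only the structure (expected frequency from Eq. (1), followed by a binomial sample of 50 individuals) follows the description above.

```python
import random


def next_frequency(p, s):
    """Eq. (1): expected TT frequency in the next generation."""
    a = p * p * (1.0 - s)
    return a / (a + (1.0 - p) ** 2)


def simulate(p0, s, generations, sample_size=50, seed=None):
    """Simulate TT frequency with binomial sampling of `sample_size` individuals."""
    rng = random.Random(seed)
    p = p0
    freqs = [p]
    for _ in range(generations):
        expected = next_frequency(p, s)
        # Binomial draw: each sampled individual is TT with probability `expected`.
        n_species = sum(rng.random() < expected for _ in range(sample_size))
        p = n_species / sample_size
        freqs.append(p)
    return freqs


# Three illustrative replicates of an 80% release with the A1 fitness cost estimate.
for replicate in range(3):
    run = simulate(0.8, 0.3484, generations=11, seed=replicate)
    print(f"replicate {replicate + 1}: " + " ".join(f"{f:.2f}" for f in run))
```

Because each generation is resampled, replicates started near the threshold can end in either fixation or loss, whereas frequencies of exactly 0 or 1 are absorbing.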

Comparative modeling of other underdominant systems is described in Marshall and Hay19. That paper uses the mathematical modeling framework described here in addition to two approaches for modeling migration: (1) a “two-population model”, in which reciprocal movement occurs between the two connected populations; and (2) a “source model”, in which the system is initially fixed in the source population, absent from the sink population, and one-way migration occurs from the source to sink population. In Marshall and Hay19, population replacement and confinement dynamics are shown for: (1) extreme underdominance (the SPECIES system modeled here), (2) reciprocal chromosomal translocations16, (3) single-locus and two-locus engineered underdominance32, (4) Semele33, (5) inverse Medea33, and (6) Merea (Medea with a recessive antidote). A range of parameter values are compared for each gene drive system, including fitness cost (s, varied between 0 and 30%) and migration rate (m, varied between 0 and 10% per individual per generation for both the source and two-population models)16.

Results from that analysis suggest that SPECIES-like extreme underdominant systems fare well against other underdominance-based gene drive systems in terms of both confinement and persistence. The most direct comparison can be made to translocations16, which also have a 50% release threshold in a single population and in the absence of a fitness cost. Considering a 5% fitness cost for both systems, they still have very similar release thresholds (51.3% for SPECIES-based underdominance cf. 52.8% for translocations); however, for a two-population model with a migration rate of 1% per individual per generation, the SPECIES-based underdominant system spreads to only ~0.01% in the neighboring population, while the translocations spread to a much higher ~4.2% in the neighboring population19. The migration rate at which the introduced system is lost due to inward migration of wild types is also much higher for the SPECIES-based underdominant system (17.6% per individual per generation cf. 5.8% for translocations, s = 0.05)19. This suggests that SPECIES-like extreme underdominant systems are preferable to translocations for local population replacement since they lead to less contamination of neighboring populations and are less vulnerable to elimination due to inward migration.

Finally, were SPECIES-based underdominant systems to be implemented for local population replacement, the strains used would likely have much smaller fitness costs than those observed here (~30%). Even so, results from Marshall and Hay19 suggest that the population dynamics of the SPECIES system are resilient to fitness costs of this magnitude. A SPECIES system with a fitness cost of 30% has a release threshold of 58.8%, which could be exceeded through weekly releases over several weeks. Furthermore, in a two-population model, the migration rate at which the SPECIES system would be lost due to inward migration of wild types is 13.3% per individual per generation16, which is greater than the movement rate observed between populations of Anopheles gambiae34, the main mosquito vector of malaria in Sub-Saharan Africa, and Aedes aegypti35, the main mosquito vector of dengue, Zika and Chikungunya viruses.
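The release thresholds quoted above can be illustrated with a minimal single-population sketch. It assumes discrete generations, random mating, complete inviability of SPECIES × wild-type hybrids, and a fitness cost s applied to SPECIES individuals; these simplifications and the function names are illustrative assumptions rather than the full modeling framework. Under them, the SPECIES frequency p increases only when p > 1/(2 − s), which gives 51.3% for s = 0.05 and 58.8% for s = 0.30.

```python
def release_threshold(s: float) -> float:
    """Unstable equilibrium frequency under the assumed model: p* = 1 / (2 - s)."""
    return 1.0 / (2.0 - s)


def next_frequency(p: float, s: float) -> float:
    """Advance the SPECIES frequency by one generation of random mating.

    All SPECIES x wild-type hybrids are assumed inviable, so only
    within-type matings contribute offspring.
    """
    species = (1.0 - s) * p * p      # SPECIES x SPECIES offspring, with cost s
    wild_type = (1.0 - p) ** 2       # wild-type x wild-type offspring
    return species / (species + wild_type)


def simulate(p0: float, s: float, generations: int = 100) -> float:
    """Iterate the recursion from a starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s)
    return p


if __name__ == "__main__":
    for s in (0.0, 0.05, 0.30):
        print(f"s = {s:.2f}: release threshold = {release_threshold(s):.1%}")
    # With s = 0.30, a release just above the threshold fixes; just below, it is lost.
    print(simulate(0.60, 0.30), simulate(0.55, 0.30))
```

In this sketch, a starting frequency of 60% with s = 0.30 proceeds to fixation, whereas 55% is eliminated, consistent with threshold-dependent replacement.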

### RNA sequencing for transcriptional activation analysis

Embryos were collected from multiple speciated lines to assess transactivation in the embryo. Male speciated flies were crossed to Oregon R virgin females in glass vials supplemented with Drosophila medium and yeast paste and were incubated at 26 °C for 72 h. Following this period, the adult flies were transferred to collection chambers containing grape juice agar plates. The flies were allowed to lay for 4–5 h, after which the embryos were aged for 1 h and collected using a paintbrush. Afterwards, 30–50 embryos that were 5–6 h old were washed with ddH2O and transferred to individual Eppendorf tubes. The samples were flash frozen in liquid nitrogen and stored at −80 °C. Intra-crosses for Oregon R, Cas9, dCas9, sgRNA, and speciated lines were also performed and collected as controls. Each sample was homogenized and processed using the Quick-Start Protocol of the miRNeasy Mini Kit (Qiagen, Hilden, DEU), followed by DNase treatment using the DNA-free™ Kit and protocol (Thermo Fisher Scientific, Waltham, MA, USA).

### RNA-seq library construction and sequencing

RNA integrity was assessed using the RNA 6000 Pico Kit for Bioanalyzer (Agilent Technologies #5067-1513), and mRNA was isolated from ~1 μg of total RNA using the NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB #E7490). RNA-seq libraries were constructed using the NEBNext Ultra II RNA Library Prep Kit for Illumina (NEB #E7770) following the manufacturer’s instructions. Briefly, mRNA was fragmented to an average size of 200 nt by incubating at 94 °C for 15 min in the first strand buffer. cDNA was then synthesized using random primers and ProtoScript II Reverse Transcriptase, followed by second strand synthesis using NEB Second Strand Synthesis Enzyme Mix. The resulting DNA fragments were end-repaired, dA-tailed, and ligated to NEBNext hairpin adaptors (NEB #E7335). After ligation, adaptors were converted to the “Y” shape by treating with USER enzyme, and DNA fragments were size selected using Agencourt AMPure XP beads (Beckman Coulter #A63880) to generate fragment sizes between 250 and 350 bp. Adaptor-ligated DNA was PCR amplified, followed by AMPure XP bead clean-up. Libraries were quantified using a Qubit dsDNA HS Kit (Thermo Fisher Scientific #Q32854), and the size distribution was confirmed using a High Sensitivity DNA Kit for Bioanalyzer (Agilent Technologies #5067-4626). Libraries were sequenced on an Illumina HiSeq2500 in single-read mode with a read length of 50 nt and a sequencing depth of 20 million reads per library following the manufacturer’s instructions. Base calls were performed with RTA 1.18.64, followed by conversion to FASTQ with bcl2fastq 1.8.4.

### Quantification and differential expression analysis

Reads were mapped to the Drosophila melanogaster genome (BDGP release 6, GenBank accession GCA_000001215.4) using the STAR aligner36 with default parameters plus the “--outFilterType BySJout” filtering option and the “--sjdbGTFfile Drosophila_melanogaster.BDGP6.22.97.gtf” GTF file downloaded from ENSEMBL. Expression levels were determined with featureCounts37 using the “-t exon -g gene_id -M -O --fraction” options. Differential expression analyses between homozygous speciation stocks and the corresponding heterozygotes outcrossed to wildtype females were performed with DESeq238 using the two-factor design formula “design= ~ line + genotype”. Two independent lines were used for each target set (genotype). MA plots [log2(FoldChange) vs. log10(baseMean)] were generated with ggplot239. All sequencing data can be accessed at NCBI SRA (study accession ID PRJNA578541, reviewer link: https://dataview.ncbi.nlm.nih.gov/object/PRJNA578541?reviewer=mnn67aeait5u7v231c1b8nh2vh).
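As a rough illustration of how the mapping and counting steps described above might be invoked, the following Python sketch assembles the STAR and featureCounts command lines from the quoted options. The file names, the index directory, and the helper function names are hypothetical placeholders, and the sketch only builds and prints the commands rather than running them; it is not the exact script used for the analysis.

```python
import shlex
from typing import List

# GTF file name as quoted in the text.
GTF = "Drosophila_melanogaster.BDGP6.22.97.gtf"


def star_command(fastq: str, genome_dir: str, prefix: str, gtf: str = GTF) -> List[str]:
    """STAR with default parameters plus the two options named in the text."""
    return [
        "STAR",
        "--genomeDir", genome_dir,        # hypothetical path to a prebuilt STAR index
        "--readFilesIn", fastq,           # single-end reads, one FASTQ per library
        "--outFilterType", "BySJout",
        "--sjdbGTFfile", gtf,
        "--outFileNamePrefix", prefix,    # placeholder prefix for output files
    ]


def featurecounts_command(alignments: List[str], out_table: str, gtf: str = GTF) -> List[str]:
    """featureCounts with the '-t exon -g gene_id -M -O --fraction' options."""
    return [
        "featureCounts",
        "-t", "exon", "-g", "gene_id", "-M", "-O", "--fraction",
        "-a", gtf,                        # annotation file
        "-o", out_table,                  # output count table
        *alignments,
    ]


if __name__ == "__main__":
    # Hypothetical sample name and paths, for illustration only.
    star = star_command("sample1.fastq", "star_index", "sample1_")
    counts = featurecounts_command(["sample1_Aligned.out.sam"], "counts.txt")
    print(shlex.join(star))
    print(shlex.join(counts))
```

Each list could be passed to `subprocess.run(cmd, check=True)` to execute the step; the resulting count table would then be loaded into DESeq2 in R for the differential expression analysis.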

### Immunohistochemistry

For antibody staining, embryos were collected overnight and then fixed and dechorionated using standard protocols40. We used guinea pig anti-Runt polyclonal antibody (kindly provided by David Kosman and John Reinitz) at a concentration of 1:200 and mouse anti-Eve monoclonal 3C10 (developed by C. Goodman and available from the Developmental Studies Hybridoma Bank) at 1:20. Nuclei were counterstained with DAPI. Embryos were stained using standard protocols41.

### Cuticle preparation

Embryos were collected and aged at 27 °C until they were 16–22 h old. Embryos were pipetted onto a slide and excess fluid was removed. Glacial acetic acid mixed 1:1 with Hoyer’s solution was added, covered with a coverslip, and allowed to dry for several days in an oven at 65 °C for clearing. After 24 h, the coverslips were weighted to flatten the preps. Cuticles were imaged on an upright Zeiss Axio Imager microscope with bright field illumination, and grayscale images were later inverted and oversaturated for increased contrast using Adobe Photoshop.

### Molecular characterization of protective indel mutations

To examine the molecular changes that conferred protection from dCas9-mediated overexpression and associated lethality, four genomic loci that include target sites for four functional sgRNAs (Supplementary Fig. 4) were amplified and sequenced. Single-fly genomic DNA preps were prepared using the solid tissue protocol of the Quick-DNA™ Miniprep Plus Kit (Zymo Research). In total, 2–3 µl of genomic DNA was used as a template in a 50-µl PCR reaction with Q5® High-Fidelity 2X Master Mix (NEB, Ipswich, MA, USA). The following primers (Supplementary Table 5) were used to amplify the loci with the corresponding sgRNA targets: 1001.S1 and 1001.S4 for hh; 1045.S1 and 1045.S4 for wg; 1045.S5 and 1045.S8 for eve; and 1045.S9 and 1045.S12 for hid. PCR products were loaded and run on an agarose gel, excised, and purified using a Zymoclean™ Gel DNA Recovery Kit (Zymo Research, Irvine, CA, USA) and were sequenced in both directions using Sanger sequencing at Retrogen Inc (San Diego, CA, USA). To characterize molecular changes at the targeted sites, AB1 sequence files were aligned against the corresponding reference sequences (downloaded from FlyBase release FB2019_3)42 in SnapGene® 4 and/or Benchling™.

### Gene drive safety measures

All crosses using gene drive genetics were performed in accordance with an Institutional Biosafety Committee-approved protocol from UCSD, in which full gene drive experiments are performed in a high-security ACL2 barrier facility and split-drive experiments are performed in an ACL1 insectary in plastic vials that are autoclaved prior to being discarded, in accordance with currently suggested guidelines for the laboratory confinement of gene drive systems13,43.

### Ethical conduct of research

We have complied with all relevant ethical regulations for animal testing and research and conformed to the UCSD institutionally approved biological use authorization protocol (BUA #R2401).

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.