Engineered reproductively isolated species drive reversible population replacement

Engineered reproductive species barriers are useful for impeding gene flow and driving desirable genes into wild populations in a reversible threshold-dependent manner. However, methods to generate synthetic barriers are lacking in advanced eukaryotes. Here, to overcome this challenge, we engineer SPECIES (Synthetic Postzygotic barriers Exploiting CRISPR-based Incompatibilities for Engineering Species), an engineered genetic incompatibility approach, to generate postzygotic reproductive barriers. Using this approach, we create multiple reproductively isolated SPECIES and demonstrate their reproductive isolation and threshold-dependent gene drive capabilities in D. melanogaster. Given the near-universal functionality of CRISPR tools, this approach should be portable to many species, including insect disease vectors in which confinable gene drives could be of great practical utility.

S pecies that are reproductively incompatible with, but otherwise identical to, wild counterparts can be engineered via the insertion of reproductive barriers. Such barriers could be used for ecosystem engineering or pest and vector control 1 . Previous attempts have engineered synthetic species barriers via genetic recoding in bacteria 2 and yeast 3 , though this is likely infeasible in multicellular organisms. A reproductively isolated strain of D. melanogaster was generated with preexisting transgenes and recessive mutations 4 , though elevated fitness costs prevented widespread utility of this approach. CRISPR-based genome editing and transcriptional transactivation (CRISPRa) strategies exist 1,5 , though most fail to achieve complete reproductive isolation due to either escape mutants or incomplete lethality.
Here, we describe the development of an extreme underdominance-based approach by engineering synthetic reproductive barriers in D. melanogaster using a variation of a recently described engineered genetic incompatibility (EGI) 6 approach we term Synthetic Postzygotic barriers Exploiting CRISPR-based Incompatibilities for Engineering Species (SPE-CIES). To engineer SPECIES, we use a nuclease-deficient deactivated Cas9 (dCas9) protein fused to a transactivation domain, which recruits transcriptional machinery to the sites of singleguide RNA (sgRNA) binding. These sgRNAs target the promoter regions of endogenous genes essential for development/viability to induce lethality caused by dCas9-mediated overexpression. This lethality is rescued in engineered, but not WT, individuals via mutations of the sgRNA target sites, preventing dCas9 binding, and subsequent lethal overexpression, without interfering with target gene function (Fig. 1a). Using this approach we engineer eight SPECIES, incompatible with each other and wildtype, and demonstrate that they can function as confineable gene drives and replace populations in a threshold-dependent manner. Supported by mathematical models, SPECIES could be adapted to pest and disease vectors in the future to provide a safe platform for modification of wild populations.

Results
Engineering SPECIES. We engineered flies expressing a dCas9 activator domain fusion (dCas9-VPR) and evaluated whether these transgenes could drive lethal target overexpression using CRISPRa sgRNA lines each targeting the promoter region of one of four important developmental genes (eve, hid, hh, and wg) 5,7,8 ( Supplementary Fig. 1). Zygotic dCas9-VPR expression did not cause noticeable toxicity on its own and achieved 100% lethality in individuals also expressing sgRNAs targeting one of the target genes ( Fig. 1b-d). Interestingly, this lethality could only be rescued when homozygous dCas9-VPR-expressing fathers were crossed to heterozygous Cas9; sgRNA mothers ( Fig. 1d). With this cross, mothers provided indel mutations in the promoter region of the target genes, while simultaneously depositing sgRNA/Cas9 into all embryos, which mutated the inherited paternal copy of the target sites.
Importantly, the inherited sgRNA/dCas9-VPR transgenes forced a bottleneck that selected for protective indels which blocked CRISPRa-induced lethality and allowed for endogenous expression levels of the target gene, providing embryonic rescue and survival (Fig. 1c, d and Supplementary Fig. 2). However, when homozygous dCas9-VPR/dCas9-VPR; sgRNA/sgRNA individuals harboring the protective indel mutations were outcrossed to WT, they produced some viable progeny, indicating that homozygous "rescued" flies were not 100% reproductively isolated from their WT counterparts ( Supplementary Fig. 2), a feature also previously demonstrated 5 .
To overcome the incomplete isolation from single-gene overexpression, we tested multiplexed overexpression by engineering flies that simultaneously expressed sgRNAs targeting two or more genes (eve + hid; eve + hid + hh; eve + hid + wg; and hh + wg; Supplementary Figs. [1][2][3]. Crossing these to the dCas9-VPR flies resulted in complete progeny lethality (Fig. 1c), suggesting that heterozygosity for a WT allele and an allele with a selected indel is lethal.
With the selective bottleneck genetic crossing scheme, crosses with multiplexed sgRNA/Cas9-expressing mothers rescued heterozygous dCas9; sgRNA animals through the introduction of indel mutations (Supplementary Fig. 4). Some fitness costs can be seen, at least as inferred from fertility and survivorship, in many of the SPECIES strains (Fig. 2c). Moreover, in contrast to the single-target sgRNA lines, heterozygote progeny from homozygous dCas9-VPR; multiplexed sgRNA individuals crossed to WT were all lethal, suggesting that inheriting one WT copy of each target site from the WT parent ensured 100% lethality ( Supplementary Fig. 2).
We next determined the extent of target gene overexpression when outcrossed to WT by visualizing overexpression in embryos via antibody stain, and we evaluated the effect of misexpression on development using cuticle preps of late embryos and young larvae. We observed target gene overexpression at embryonic stages and segment polarity defects in larvae when the SPECIES lines were mated to WT but not when self-crossed ( Fig. 3a-c). To quantify the extent of target gene overexpression and to measure possible global gene misexpression, we performed transcriptomewide expression profiling (Supplementary Table 1). We quantified RNA-expression profiles for all samples (Supplementary Table 2), including genes that were expressed from our constructs (Fig. 3d). From this analysis, we found significant target gene overexpression (up to 48-fold) in the progeny generated from SPECIES and WT crosses but not in the progeny from SPECIES intercrosses (Supplementary Figs. 6 and 7 and Supplementary  Tables 1 and 7).
Population replacement via SPECIES mediated extreme underdominance. To assess whether the SPECIES were capable of reversible WT population replacement via gene drive, we conducted population studies at various release thresholds employing one representative SPECIES, A1 (Fig. 4). Releases of A1 individuals at a population frequency of 70% resulted in this SPECIES replacing the WT population in two of six replicates ( Fig. 4 and Supplementary Table 4). Population replacement also occurred in three of four replicates of A1 releases at a frequency of 80% and in one of one replicate at a 90% release frequency. However, a release frequency of 50% resulted in elimination of the A1 strain in three of three replicates ( Fig. 4 and Supplementary Table 4).
To characterize the population dynamics observed in the population studies, we fitted a mathematical model to the observed data, incorporating a fitness cost for reproductively isolated individuals relative to WT individuals. The A1 strain was estimated to have a strong relative fitness cost of 34.84% (95% credible interval [CrI]: 34.82-34.87%), producing a threshold frequency of~61%, which corresponds to what was observed in the population studies. Of the seven other SPECIES characterized, two consistently led to population replacement at a release frequency of 80% (A2 and D1), and three led to population replacement at a release frequency of 90% (A2, D1, and D2) (Fig. 4), with increased threshold frequencies corresponding to increased fitness costs for all SPECIES (Supplementary Table 6). This suggests that population replacement via gene drive would theoretically occur when the release of SPECIES individuals exceeded a critical threshold frequency in the population 9,10 , the value of which depends on the fitness of the synthesized strain relative to the WT strain.

Discussion
Altogether, we demonstrate that the SPECIES approach can be exploited to build stable reproductive barriers that could drive genes through a population in a reversible manner. The SPECIES approach is advantageous over other developed technologies such as Medea systems or X-chromosome shredders [11][12][13][14] , as dCas9mediated overexpression does not rely on organism-specific mode of incompatibilites such as having well-characterized maternal NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-23531-z ARTICLE effect genes or separate sex chromosomes; instead, it is programmable to virtually any suitable target gene and thus should be amenable to most sexually reproducing organisms. Additionally, the "stacked" genetic barriers using multiple sgRNA's (which is a significant difference between SPECIES and the previously described EGI system) may reduce the chances of evolved resistance via target site mutation. Once a basic engineering toolkit is constructed, it can be used to build multiple SPECIES that are reproductively isolated. As an underdominant gene drive system, the SPECIES approach is threshold dependent and allows for geographically localized population modification 9,15 . Although it may not be as applicable to larger-scale population manipulation  (Fig. S3). b Schematic detailing the methods of determining embryo and adult survival compared to WT. c % embryo and adult survival was calculated and plotted. The number below each x-axis group indicates cross number (#1-4). N = 3 biologically independent replicates for each cross number. ✝ indicates that embryos did not survive past L1/L2 stages. Unpaired two-tailed t-tests were performed for each SPECIES compared to WT (*p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001). Middle lines indicate mean, while error bars represent standard deviation. The color of the dots represent each species as indicated in figure key. d Embryo (left graph) and adult (right graph) Cohen's d effect sizes compared to WT. N = 6 observations per cross number per SPECIES (effect size based on mean comparison between three replicates of each experimental cross (#1-3) to control cross (#4)). Error bars represent 95% confidence interval. The color of the dots represent each species as indicated in figure key. Source data are provided as a Source Data file. as other types of drives, such as homing based drives, it is inherently confinable and reversible which may be advantageous from a regulatory perspective [11][12][13][14] . Furthermore, confinable SPECIES underdominant systems are preferable to other underdominant systems, such as engineered translocations 16 , for local population replacement since they can tolerate higher fitness costs, spread more quickly, lead to less contamination of neighboring populations, and are more resilient to elimination due to immigration of wild types (Figs. 5 and 6).
While this article was under review, a related approach termed EGI was demonstrated in flies 6 . Our study complements that work in several ways. Firstly, we and others 5 demonstrate that single-gene targeting is not ideal for generating stable reproductively isolated SPECIES, and therefore, unlike Maselko et al. 6 , we exploit multiplexing of the target genes (2-3 targets per SPE-CIES), which ensures complete lethality and increases the evolutionary stability of the system. Additionally, unlike the EGI approach, which generated protective indel mutations first and then used either a second round of transgenesis, or a complex genetic crossing scheme to generate incompatible individuals, we exploit a simplified genetic crossing scheme that forces a bottleneck that streamlines the genetics and enables the organism to direct the evolution of non-deleterious indels. This crossing scheme enables us to multiplex our targets and may also prove instrumental for generating stable SPECIES in polyploids, e.g. fish and plants (although the postzygotic nature of the induced lethality would be an ethical consideration). Finally, our study also demonstrates the majority wins threshold-dependent gene drive capabilities of SPECIES and provides detailed modeling to support this application, an important proof of principle demonstration not demonstrated previously.
Bringing the SPECIES approach to other organisms will require integrating a cassette containing dCas9-VPR and sgRNAs into the genome of interest, which is quickly becoming more tractable with CRISPR-based tools. This approach will also require genome characterization to provide candidate sgRNA sites in promoter sequences near genes of interest as well as species-specific testing for overexpression levels and dCas9-VPR toxicity, which may contribute to observed fitness costs 17 . Regardless, it should be possible to implement SPECIES in organisms of medical or agricultural interest, such as mosquitoes. This becomes even more interesting considering the gene drive function of SPECIES provides greater control and confinement via threshold dependence 11,18,19 as well as a reversibility via WT  16 and protection from the evolution of resistance via multiplexing, features not found in all gene drives 11,14 . While release requirements for threshold-dependent systems are large due to fitness costs and may present logistical challenges, they are an order of magnitude less than those routinely carried out for insect suppression programs 20 . Although models without life history and spatial structure tend to underestimate release requirements for threshold-dependent systems, spatially explicit models suggest that local population replacement could be achieved with a series of 1:1 releases carried out over several weeks 21 . Overall, SPECIES demonstrates a platform for the possible safe control or modification of pest and disease vector populations that impose significant burdens on humanity.

Methods
Plasmid design and assembly. To assemble plasmid OA-986A, the base vector used for generating dCas9-expressing plasmids, several components were cloned into the piggyBac plasmid pBac[3xP3-DsRed] 22 using Gibson assembly/EA cloning 23 . pBac[3xP3-DsRed] was digested with BstBI and NotI, and the following components were cloned in with EA cloning: an attP sequence amplified from plasmid M{3xP3-RFP attP} 24  To assemble plasmids OA-1045A-E, the multiple sgRNA containing vectors, several components were cloned into the multiple cloning site of a plasmid 24 containing the white gene as a marker and an attB-docking site using Gibson assembly/EA cloning 23 . First, the plasmid was digested with AsiSI and KpnI, and the following components were cloned in with EA cloning to generate base plasmid OA-1045: a D. melanogaster U6:3 promoter fragment sequence amplified from Addgene plasmid #49411 26 with primers 1045.C1 and 1045.C2, and an sgRNA scaffold fragment amplified from Addgene plasmid #49411 26 with primers 1045.C3 and 1045.C4. The resulting base plasmid was then used to clone final sgRNA plasmids OA-1045A through OA-1045E. To generate plasmid OA-1045A (Addgene #125003), plasmid OA-1045 was digested with AvrII; then, a fragment containing an 18-base-pair (bp) even skipped (eve) sgRNA target site 5 , an sgRNA scaffold, a D. melanogaster U6:1 promoter fragment, and an 18-bp head involution defective (hid) sgRNA target site 5 was amplified from a custom gBlocks ® Gene Fragment (Integrated DNA Technologies, Coralville, Iowa) with primers 1045.C5 and 1045.C6 and was cloned into the digested backbone using EA cloning. To generate plasmid OA-1045B (Addgene #125004), plasmid OA-1045A was digested with XbaI, and a fragment containing a Gypsy insulator, a D. melanogaster U6:1 promoter fragment driving expression of a first hedgehog (hh)targeting sgRNA, and a D. melanogaster U6:3 promoter fragment driving expression of a second hh-targeting sgRNA amplified from plasmid pCFD4-hh 8 with primers 1045.C7 and 1045.C8 was cloned in using EA cloning. To generate plasmid OA-1045C (Addgene #125005), plasmid OA-1045A was digested with XbaI, and a fragment containing a Gypsy insulator, a D. melanogaster U6:1 promoter fragment driving expression of a first wingless (wg)-targeting sgRNA, and a D. melanogaster U6:3 promoter fragment driving expression of a second wg-targeting sgRNA amplified from plasmid pCFD4-wg 8 with primers 1045.C7 and 1045.C8 was cloned in using EA cloning. To generate plasmid OA-1045D (Addgene #125006), plasmid OA-1045 was digested with AscI and XbaI, and two fragments were cloned in using EA cloning: a first fragment containing a D. melanogaster U6:1 promoter fragment driving expression of a first wg-targeting sgRNA and a D. melanogaster U6:3 promoter fragment driving expression of a second wg-targeting sgRNA amplified from plasmid pCFD4-wg 8 with primers 1045.C9 and 1045.C10, and a second fragment containing a Gypsy insulator, a D. melanogaster U6:1 promoter fragment driving expression of a first hh-targeting sgRNA, and a D. melanogaster U6:3 promoter fragment driving expression of a second hh-targeting sgRNA amplified from plasmid pCFD4-hh 8 with primers 1045.C11 and 1045.C12. Finally, to generate plasmid OA-1045E (Addgene #125007), plasmid OA-1045 was digested with AvrII and NotI, and a fragment comprising D. melanogaster Gly tRNA-flanked sgRNAs 27 targeting, from 5′ to 3′, eve, hid, and hh followed by a D. melanogaster U6:3 UTR that was amplified with primers 1045.C13 and 1045.C14 from a gene synthesized vector (GenScript, Piscataway, NJ) was cloned using EA cloning. All primers used for cloning are listed in Table S15.  Table 6). Ten stochastic model predictions are shown for each release frequency, assuming a population size of 50. Proportion of individuals CFP+ represents the percentage of SPECIES individuals at each generation. Source data are provided as a Source Data file.  Generation and genetics of speciated stocks. Single sgRNA lines targeting eve, hid, hh, and wg were utilized 5,8 . dCas9-VPR-and dCas9-VP64-expressing lines were generated via microinjection as described above; we were unable to generate a transgenic line that expressed dCas9-VPR ubiquitously, despite numerous attempts, suggesting that such expression was toxic. To test for the ability of all sgRNA lines to induce lethal overexpression ("killing"), five sgRNA males and five virgin females were separately crossed to five dCas9 line individuals of the opposite sex in single vials and were allowed to mate for 7 days. After 7 days, the parents were removed, and the vials were monitored for 7 additional days to assess whether viable larvae were present. No killing was observed in crosses of dCas9-VP64 expressing lines to any of the sgRNA-expressing lines (Fig. 1), consistent with The release threshold increases with fitness cost, from 50% for no fitness cost, to 83.3% for an 80% fitness cost. c Extreme underdominant systems spread quickly, reaching a population frequency of 99% (including heterozygotes and homozygotes) within 3-4 generations of a release at a frequency of 65% for fitness costs between 0 and 20%. d Discrete generation model of reciprocal chromosomal translocations with a fitness cost of 10% released at population frequencies of 50-65%. A release threshold is apparent between 55 and 60% (simulations confirm a threshold of 56.1%). e The release threshold increases with fitness cost, from 50% for no fitness cost, to 62.4% for a 20% fitness cost. Translocations cannot spread for fitness costs greater than 66.3%. f Translocations spread less quickly, reaching a population frequency of 99% (including heterozygotes and homozygotes) within 7-13 generations of a release at a frequency of 65% for fitness costs between 0 and 20%. g Discrete generation model of two-locus engineered underdominance with a fitness cost of 10% released at population frequencies of 25-40%. A release threshold is apparent between 30 and 35% (simulations confirm a threshold of 31.2%). h The release threshold increases with fitness cost, from 26.9% for no fitness cost, to 35.9% for a 20% fitness cost. Two-locus engineered underdominance cannot spread for fitness costs greater than 72.7%. i Two-locus engineered underdominant systems spread slightly less quickly, reaching a population frequency of 99% (including heterozygotes and homozygotes) within 4-5 generations of a release at a frequency of 65% for fitness costs between 0 and 20%. To generate protective indel mutations, a Ubiquitin-Cas9 line (BSC #79005) 28 was used. Briefly, ten Ubiquitin-Cas9 virgin females were crossed to ten sgRNA males, and virgin female and male progeny with both transgenes were selected and crossed to each other for at least three generations. Cas9/sgRNA transheterozygous virgins were then outcrossed in groups of 3-5 to homozygous attP40w bnk-dCas9-VPR males, and progeny containing both a sgRNA (identified by the presence of the w+ marker) and bnk-dCas9-VPR (identified by the presence of the opie2-eCFP marker) were isolated as "rescue" individuals that presumably carried protective indel mutations in the target promoter regions that prevented dCas9-induced overexpression (Fig. 1). To confirm the generation of indels, these flies were Sanger sequenced and crossed to each other, again in groups of 3-5, to establish "rescue" stocks.
These "rescue" crosses were also set up in the reverse direction, utilizing 3-5 homozygous attP40w bnk-dCas9-VPR females crossed to Cas9/sgRNA transheterozygous males, to determine whether maternal deposition of Cas9/ sgRNAs is required for generating sufficient protective indel mutations to provide Fig. 6 Population dynamics of underdominant systems in two populations. a Discrete generation model of a SPECIES-like extreme underdominant system released at 60% in population A and initially absent from population B. Population A exchanges migrants with population B at a rate of 1% per individual per generation. For a fitness cost of 10%, the system reaches near-fixation in population A within seven generations but only spreads to 0.01% in population B. b As the migration rate increases, the SPECIES system reaches a higher frequency in population B, exceeding 4%; however, for migration rates above 16.6% per individual per generation, it is eliminated from both populations through dilution of population A with wild types from population B. c For the two-population model, there is a migration threshold below which the construct fixes in population A and persists at a low level in population B and above which it is lost in both populations. For the source model, extreme underdominance displays threshold behavior with respect to migration rate. d Discrete generation model of reciprocal chromosomal translocations released at 60% in population A and initially absent from population B. Population A exchanges migrants with population B at a rate of 1% per individual per generation. For a fitness cost of 10%, the system reaches near-fixation in population A within 22 generations and spreads to 3.6% in population B. e As the migration rate increases, the translocations reach a higher frequency in population B, exceeding 15%; however, for migration rates above 5.0% per individual per generation, they are eliminated from both populations through dilution of population A with wild types from population B. f For the two-population model, there is a migration threshold below which translocations fix in population A and persist at a low level in population B, and above which they are lost in both populations. For the source model, translocations display threshold behavior with respect to migration rate. g Discrete generation model of two-locus engineered underdominance released at 60% in population A and initially absent from population B. Population A exchanges migrants with population B at a rate of 1% per individual per generation. For a fitness cost of 10%, the system reaches near-fixation in population A within eight generations and spreads to 3.2% in population B. h As the migration rate increases, the system reaches a higher frequency in population B, exceeding 21.2%; however, for migration rates above 4.2% per individual per generation, the system becomes fixed in both populations. i Two-locus engineered underdominance displays threshold behavior with respect to migration rate. rescue of lethality. In particular, it was assumed that, if both copies of the targeted promoter needed to contain protective indel mutations to provide rescue, lack of maternally deposited Cas9/sgRNA (due to Cas9/sgRNA fathers being used) would lead to lack of "rescue" individuals, as all individuals inheriting the sgRNA and bnk-dCas9-VPR transgenes would still have one wildtype copy of the target promoter inherited from the mother available for targeting and would perish.
To further validate whether both copies of the targeted promoter needed to contain protective indel mutations to provide rescue from lethality, "rescue" individuals were also bidirectionally outcrossed in groups of 3-5 and in triplicate to homozygous attP40w bnk-dCas9-VPR individuals, and the resulting progeny were scored for the "rescue" phenotype. Here, it was presumed that the lack of transheterozygous sgRNA/bnk-dCas9-VPR progeny indicated that both copies of the targeted promoter needed to contain protective indel mutations to provide rescue and that the lack of such mutations in the promoter allele inherited from the homozygous attP40w bnk-dCas9-VPR parent led to lethality in the transheterozygous sgRNA/bnk-dCas9-VPR progeny. Here, too, such lethality was observed for crosses with multiple sgRNA transgenes but not for crosses with single sgRNA transgenes, suggesting that, in the case of the latter, one wildtype copy of the targeted promoter was not sufficient to lead to sgRNA/bnk-dCas9-VPRinduced lethality.
Double homozygous speciated stocks were generated for all sgRNA combinations by crossing dCas9/sgRNA heterozygotes that lacked the Ubiquitin-Cas9 transgene (as evidenced by lack of red fluorescence) and identifying homozygous progeny by eye color (orange to dark red eyes for homozygotes vs. yellow to light red eyes for heterozygotes, depending on sgRNA insertion site) and opie2-eCFP intensity. Putative double homozygous individuals were then outcrossed to w[1118] individuals of the opposite sex in groups of three per vial to test for reproductive isolation. Flies were allowed to mate and lay eggs for 7 days, and vials were checked daily for hatched embryos. Flies that failed to fruitfully mate with w[1118] were presumed to be reproductively isolated double homozygotes and were then crossed to putative double homozygotes of the opposite sex to generate a double homozygous, reproductively isolated stock for each sgRNA line.
Assessment of reproductive isolation from various strains. To determine whether double homozygous SPECIES lines were reproductively isolated from stocks that were genetically diverse, SPECIES individuals were outcrossed to various Global Diversity Lines (GDL) isolated from five different continents 29 and were used in previous work examining gene drive function in different genetic contexts 30,31 . Briefly, five double homozygous individuals from each SPECIES stock were outcrossed to five individuals of the opposite sex from each of five GDL (from Beijing, China; Ithaca, NY; the Netherlands; Tasmania, Australia; and Zimbabwe, Africa). All crosses were done bidirectionally with respect to sex and in triplicate. Flies were allowed to mate and lay eggs for 7 days, and vials were checked daily for hatched embryos for the following 7 days. Lack of embryo hatching was presumed to indicate reproductive isolation (Supplementary Table 8).
To assess reproductive isolation between double homozygous SPECIES lines, inter-SPECIES crosses were performed by crossing two SPECIES virgin females with two SPECIES males from each strain. Flies were allowed to mate for 12-16 h under standard conditions; following this period, the adult flies were removed and the embryos were counted ( Supplementary Fig. 5). The vials were aged at 26°C for 24 h and subsequently scored for the number of hatched embryos (if reproductive isolation did not occur). The vials were then kept at 26°C for 7-10 days to ensure no pupae/adults emerged in instances of reproductive isolation or to count emerged adults in instances of incomplete reproductive isolation ( Supplementary  Fig. 5).
Embryo and adult viability determination. For embryo viability counts (Fig. 2), 3-4 day old adult virgin females were mated with males of the relevant genotypes for 2-3 days in glass vials supplemented with Drosophila medium and yeast paste. Following this period, the adults were transferred to an egg collection chamber containing a grape juice agar plate. Females were allowed to lay at 26°C for 12 h, after which the adults were removed and the total number of embryos were scored. These embryos were kept on the agar surface at 26°C for 24 h. The % survival was then determined by counting the number of unhatched embryos. One group of 100-300 embryos per cross was scored in each experiment, and each experiment was carried out in biological triplicate (total number of offspring scored is presented in Table S3). The results presented are averages from these three experiments. Embryo survival was normalized with respect to the % survival observed in parallel experiments carried out with the Oregon R wildtype strain, which was 91.66%. For adult fly counts (Fig. 2), the agar plates were transferred to 250 ml plastic bottles with Drosophila medium and kept at 26°C for 7-10 days. Following this period, the number of adults that emerged was scored. The percentages of adult survival presented are averages from each cross normalized with respect to the % survival observed in Oregon R, which was 58.04% (total number of offspring scored is presented in Source Data file); all crosses were set in triplicate. Unpaired two-tailed t-test statistical analyses using GraphPad Prism version 8.2.1 for Windows (GraphPad Software, San Diego CA, www.graphpad.com) were carried out for both embryo and adult fly counts to compare expected and observed values. For species crossed to themselves, significant differences were found in embryo survival for species A2 and C1 compared to WT crossed to itself (p values = 0.0322 and 0.0325, respectively) but not for any of the remaining six species. For species outcrossed to WT, significant differences were found for all bidirectional crosses of each species to WT when compared to WT crossed to itself. This significance was seen in experiments for both embryo and adult fly survival (p < 0.05). Exact p values reported in Source Data file. Cohen's d effect sizes were calculated using Stata (StataCorp. 2019. Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC). Figure icons were created with Biorender.com Population experiments. All genetic experiments were conducted in a highsecurity Arthropod Containment Level 2 (ACL2) barrier facility in accordance with protocols approved by the Institutional Biosafety Committee from University of California San Diego. Population experiments were carried out at 26°C, 12 h/12 h day/night cycle, with ambient humidity in 250 ml bottles containing Lewis medium supplemented with live, dry yeast. Starting populations for drive experiments included equal numbers of virgins and males of similar ages for each genotype. Speciated double homozygotes (dCas9/dCas9; +/+) were introduced at a population frequency of 80% for above-threshold drive experiments, and 50% for belowthreshold drive experiments. Oregon R virgin females and males (+/+; +/+) of a similar age as the reproductively isolated individuals made up the remainder of the population. The total number of flies for each starting population was 100. All 50, 70, and 80% population experiments were conducted in at least triplicate, with exception of SPECIES C1 seeded at 80% in which one of the replicates did not produce any progeny due to bacteria/mold contamination. Moreover, only one replicate was conducted for releases seeded at 90% for all species except B2 as again this replicate did not produce any progeny due to bacteria/mold contamination. In total, we conducted 44 population cage experiments which were tracked for up to 11 generations. After being placed together, adult flies were removed after 7 days. After another 7 days, progeny were collected and the fraction of speciated double homozygous individuals was determined ( Fig. 4 and Supplementary Table 4). The progeny were then placed into a new bottle to initiate the next generation. No significant evidence of crowding in the 250 ml bottles was observed.
Mathematical modeling. We modeled SPECIES population dynamics under laboratory conditions assuming random mating and discrete generations. We considered a SPECIES allele, "T", and a corresponding wildtype allele, "t". Since heterozygotes for the SPECIES system are unviable, there are only two viable genotypes-TT and tt. We denote the proportion of organisms having the genotype TT at generation k by p k , and the proportion having the wildtype genotype at generation k by ð1 À p k Þ. By considering all possible mating pairs, and assuming a fitness cost for TT individuals relative to wildtype individuals, s, the frequency of TT individuals in the next generation is given by: The threshold frequency is an unstable equilibrium that satisfies the condition: Substituting Eq. (2) into Eq. (1) and solving for p k , we find two stable equilibria (p k ¼ 0 and p k ¼ 1) and one unstable equilibrium (p k ¼ 1=ð2 À sÞ). The latter represents the critical threshold frequency, above which the SPECIES system is more likely to spread to fixation than not, and below which it is more likely to be eliminated than not.
The likelihood of the population data for each SPECIES system was calculated by assuming a binomial distribution of wildtype (CFP−) and SPECIES (CFP+) individuals, and by using the model in Eq. (1) to generate expected proportions for each fitness parameter value, s, i.e., by calculating the log-likelihood: Here, (1) TT i,k and tt i,k are the number of SPECIES (CFP+) and wildtype (CFP−) individuals at generation k in experiment i, respectively, (2) there are a total of j experiments for this SPECIES system, (3) the ith experiment is run for n i generations, and (4) the expected genotype frequencies are dependent on the fitness parameter, s. The initial condition for each experiment is specified by the data. Fitness parameters, including 95% credible intervals, were estimated using a Markov chain Monte Carlo sampling procedure.
The stochastic simulations in Fig. 4 were implemented by calculating expected genotype frequencies in the next generation according to Eq. (1), and taking a binomial sample from a total of 50 individuals.
Comparative modeling of other underdominant systems is described in Marshall and Hay 19 . That paper uses the mathematical modeling framework described here in addition to two approaches for modeling migration: (1) a "two-population model", in which reciprocal movement occurs between the two connected populations; and (2) a "source model", in which the system is initially fixed in the source population, absent from the sink population, and one-way migration occurs from the source to sink population. In Marshall and Hay 19 , population replacement and confinement dynamics are shown for: (1) extreme underdominance (the SPECIES system modeled here), (2) reciprocal chromosomal translocations 16 , (3) single-locus and two-locus engineered underdominance 32 , (4) Semele 33 , (5) inverse Medea 33 , and (6) Merea (Medea with a recessive antidote). A range of parameter values are compared for each gene drive system, including fitness cost (s, varied NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-23531-z ARTICLE between 0 and 30%) and migration rate (m, varied between 0 and 10% per individual per generation for both the source and two-population models) 16 .
Results from that analysis suggest that SPECIES-like extreme underdominant systems fare well against other underdominance-based gene drive systems in terms of both confinement and persistence. The most direct comparison can be made to translocations 16 , which also have a 50% release threshold in a single population and in the absence of a fitness cost. Considering a 5% fitness cost for both systems, they still have very similar release thresholds (51.3% for SPECIES-based underdominance cf. 52.8% for translocations); however, for a two-population model with a migration rate of 1% per individual per generation, the SPECIESbased underdominant system spreads to only~0.01% in the neighboring population, while the translocations spread to a much higher~4.2% in the neighboring population 19 . The migration rate at which the introduced system is lost due to inward migration of wild types is also much higher for the SPECIESbased underdominant system (17.6% per individual per generation cf. 5.8% for translocations, s = 0.05) 19 . This suggests that SPECIES-like extreme underdominant systems are preferable to translocations for local population replacement since they lead to less contamination of neighboring populations and are less vulnerable to elimination due to inward migration.
Finally, were SPECIES-based underdominant systems to be implemented for local population replacement, strains would likely be used that would have much smaller fitness costs than those observed here (~30%). Despite that, results from Marshall and Hay 16 suggest the population dynamics of the SPECIES system are resilient in the face of these fitness costs. A SPECIES system with a fitness cost of 30% has a release threshold of 58.8%, which could be exceeded through weekly releases over several weeks. Furthermore, in a two-population model, the migration rate at which the SPECIES system would be lost due to inward migration of wild types is 13.3% per individual per generation 16 , which is greater than the movement rate observed between populations of Anopheles gambiae 34 , the main mosquito vector of malaria in Sub-Saharan Africa, and Aedes aegypti 35 , the main mosquito vector of dengue, Zika and Chikungunya viruses.
RNA sequencing for transcriptional activation analysis. Embryos were collected from the multiple speciated lines to assess transactivation in the embryo. Male speciated flies were crossed to Oregon R virgin females in glass vials supplemented with Drosophila medium and yeast paste and were incubated at 26°C for 72 h. Following this period, the adult flies were transferred to collection chambers containing grape juice agar plates. The flies were allowed to lay for 4-5 h, after which the embryos were aged for 1 h and collected using a paintbrush. Afterwards, 30-50 embryos that were 5-6-h old were collected, washed with ddH 2 O, and transferred to individual eppendorf tubes. The samples were flash frozen with liquid nitrogen and stored at −80°C. Intra-crosses for Oregon R, Cas9, dCas9, sgRNA, and speciated lines were also performed and collected as controls. Each sample was homogenized and processed using the Quick-Start Protocol of the miRNeasy Mini Kit (Qiagen, Hilden, DEU), followed by DNase treatment using the DNA-free™ Kit and protocol (Thermo Fisher Scientific, Waltham, MA, USA).
RNA-seq library construction and sequencing. RNA integrity was assessed using the RNA 6000 Pico Kit for Bioanalyzer (Agilent Technologies #5067-1513), and mRNA was isolated from~1 μg of total RNA using NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB #E7490). RNA-seq libraries were constructed using the NEBNext Ultra II RNA Library Prep Kit for Illumina (NEB #E7770) following the manufacturer's instructions. Briefly, mRNA was fragmented to an average size of 200 nt by incubating at 94°C for 15 min in the first strand buffer. cDNA was then synthesized using random primers and ProtoScript II Reverse Transcriptase followed by second strand synthesis using NEB Second Strand Synthesis Enzyme Mix. Resulting DNA fragments were end-repaired, dA tailed, and ligated to NEBNext hairpin adaptors (NEB #E7335). After ligation, adaptors were converted to the "Y" shape by treating with USER enzyme, and DNA fragments were size selected using Agencourt AMPure XP beads (Beckman Coulter #A63880) to generate fragment sizes between 250 and 350 bp. Adaptor-ligated DNA was PCR amplified followed by AMPure XP bead clean up. Libraries were quantified using a Qubit dsDNA HS Kit (Thermo Fisher Scientific #Q32854), and the size distribution was confirmed using a High Sensitivity DNA Kit for Bioanalyzer (Agilent Technologies #5067-4626). Libraries were sequenced on an Illumina HiSeq2500 in single read mode with the read length of 50 nt and sequencing depth of 20 million reads per library following the manufacturer's instructions. Base calls were performed with RTA 1.18.64 followed by conversion to FASTQ with bcl2fastq 1.8.4.
Immunohistochemistry. For antibody staining, embryos were collected overnight and then fixed and dechorionated using standard protocols 40 . We used guinea pig anti-Runt polyclonal antibody (kindly provided by David Kosman and John Reinitz) at a concentration of 1:200 and mouse anti-Eve monoclonal 3C10 (developed by C. Goodman and available from the Developmental Studies Hybridoma Bank) at 1:20. Nuclei were counterstained with DAPI. Embryos were stained using standard protocols 41 .
Cuticle preparation. Embryos were collected and aged at 27°C until they were 16-22 h old. Embryos were pipetted onto a slide and excess fluid was removed. Glacial acetic acid mixed 1:1 with Hoyer's solution was added, covered with a coverslip, and allowed to dry for several days in an oven at 65°C for clearing. After 24 h, the coverslips were weighted to flatten the preps. Cuticles were imaged on an upright Zeiss Axio Imager microscope with bright field illumination, and grayscale images were later inverted and oversaturated for increased contrast using Adobe Photoshop.
Molecular characterization of protective indel mutations. To examine the molecular changes that conferred protection from dCas9-mediated overexpression and associated lethality, four genomic loci that include target sites for four functional sgRNAs ( Supplementary Fig. 4) were amplified and sequenced. Single-fly genomic DNA preps were prepared using the solid tissue protocol of the Quick-DNA™ Miniprep Plus Kit (Zymo Research). In total, 2-3 µl of genomic DNA was used as a template in a 50-µl PCR reaction with Q5 ® High-Fidelity 2X Master Mix (NEB, Ipswich, MA, USA). The following primers (Supplementary Table 5) were used to amplify the loci with the corresponding sgRNA targets: 1001.S1 and 1001. S4 for hh; 1045.S1 and 1045.S4 for wg; 1045.S5 and 1045.S8 for eve; and 1045.S9 and 1045.S12 for hid. PCR products were loaded and run on an agarose gel, excised, and purified using a Zymoclean™ Gel DNA Recovery Kit (Zymo Research, Irvine, CA, USA) and were sequenced in both directions using Sanger sequencing at Retrogen Inc (San Diego, CA, USA). To characterize molecular changes at the targeted sites, AB1 sequence files were aligned against the corresponding reference sequences (downloaded from FlyBase release FB2019_3) 42 in SnapGene ® 4 and/or Benchling™.
Gene drive safety measures. All crosses using gene drives genetics were performed in accordance to an Institutional Biosafety Committee approved protocol from UCSD, in which full gene drive experiments are performed in a high-security ACL2 barrier facility and split-drive experiments are performed in an ACL1 insectary in plastic vials that are autoclaved prior to being discarded in accordance with currently suggested guidelines for the laboratory confinement of gene drive systems 13,43 .
Ethical conduct of research. We have complied with all relevant ethical regulations for animal testing and research and conformed to the UCSD institutionally approved biological use authorization protocol (BUA #R2401).
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The RNA sequencing data generated in this study is available at NCBI SRA under accession number PRJNA578541. All plasmids and annotated DNA sequence maps are available at www.addgene.com under accession numbers; 112686; 124999; 125000; 125001; 125002; 125003; 125004; 125005; 125006; 125007. The fly strains engineered in this study and used to generate SPECIES A1 will be available at the Bloomington fly stock center with the stock numbers 79005, 91792, 91791. SPECIES fly lines will be made available from the corresponding author upon reasonable request upon agreement to suggested guidelines for the laboratory confinement of gene drive systems 13,43 . Source data are provided with this paper. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/ licenses/by/4.0/.