Introduction

In genetics, underdominance occurs when a heterozygous genotype (Aa) is less fit than either homozygous genotype (AA and aa). In extreme underdominance, the heterozygote is inviable while each homozygote has equal fitness1. Extreme underdominance is an attractive and versatile tool for population control. First, it could be leveraged to create threshold-dependent, spatially contained gene drives2 capable of replacing local populations. Such gene drives may be more socially acceptable than threshold-independent gene drives since their spread can be more tightly controlled. Alternatively, only males could be released for a genetic biocontrol approach that mimics sterile insect technique. Several strategies for engineering underdominance have been described, including one- or two-locus toxin-antitoxin systems3,4, chromosomal translocations5, and RNAi-based negative genetic interactions6. Despite its theoretical utility in population control, engineering extreme underdominance has been difficult1.

Extreme underdominance essentially constitutes a speciation event, as it prevents successful reproduction and therefore genetic exchange between two populations. In nature, prezygotic and postzygotic incompatibilities maintain species barriers. Prezygotic incompatibilities prevent fertilization from taking place. These can include geographic separation or behavioral/anatomical differences between individuals in two populations that prevent sperm and egg from meeting. Postzygotic incompatibilities occur when genetic or cellular differences between the maternal and paternal gametes render the offspring inviable or infertile. The Dobzhansky–Muller Incompatibility (DMI) model7 asserts that postzygotic incompatibilities can arise via mutations that create a two-locus underdominance effect8. DMIs are considered a major driving force underlying natural speciation events. Understanding the molecular mechanisms resulting in hybrid incompatibilities between species is a central question for evolutionary biology and ecology.

We have recently described a versatile and effective method for engineering DMIs in the lab to direct synthetic speciation events. We name this method engineered genetic incompatibility (EGI). An EGI strain is homozygous for a lethal effector gene and the corresponding resistance allele(s). What separates EGI from described toxin/antitoxin systems is that the lethal effector allele is haplosufficient, while the resistance allele is haploinsufficient. Any outcrossing of the EGI strain with wild-type generates inviable hybrids, as the resulting heterozygotes contain the lethal effector gene but only one copy of the haploinsufficient resistance allele (Fig. 1a). Unlike single locus, bi-allelic toxin-antitoxin systems3, the EGI genotype in principle incurs no fitness penalty, as 100% of the offspring between similarly engineered EGI parents remain viable. Our approach leverages sequence-programmable transcription activators (PTAs) to drive lethal over or ectopic expression of endogenous genes (Fig. 1b, c)9.

Fig. 1: Design of Engineered Genetic Incompatibility (EGI).
figure 1

a Schematic diagram of required genotypes for EGI. L, dominant lethal gene; l, wild-type allele (null); S, dominant susceptible allele; s, recessive resistant allele. b X-ray crystal structure of S. pyogenes Cas9 (PDB ID: 6o0z, left) and diagram of dominant lethal gene product, dCas9-VPR. c Interaction of dCas9-VPR with resistant (top) or susceptible (bottom) alleles. Blue square represents a mutation that abrogates dCas9 binding. RNAP, RNA polymerase.

Here we apply EGI to engineer extreme underdominance in the model insect, Drosophila melanogaster. We show that the strength and timing of hybrid lethality can be tuned based on genetic design. Further, we show that multiple mutually incompatible synthetic species can be created for the same target organism. This has important ramifications for the design of genetic biocontrol strategies that are robust in the face of genetic resistance.

Results

Lethal overexpression of endogenous genes

To drive lethal overexpression of endogenous genes, we use the dCas9-VPR, composed of a catalytically inactive Cas9 fused to three transcriptional activation domains (VP64, p65, and Rta)10 (Fig. 1b). This has been used previously to cause lethal gene activation in D. melanogaster11; however, we had to mitigate apparent off-target toxicity associated with strong dCas9-VPR expression in the absence of sgRNA12. Replacing the promoter driving dCas9-VPR with promoters from various developmental genes (Pwg*, Pfoxo, Pbam) or a truncated tubulin promoter (Ptub)13 allowed us to constrain dCas9-VPR expression sufficiently to allow generation of homozygous fly strains. We also created viable homozygous flies expressing the evolved dXCas9-VPR transactivator from the truncated tubulin promoter14.

Strains homozygous for dCas9-VPR constructs were mated to strains homozygous for sgRNAs targeting several genes important for development (hh, hid, pyr, upd1, upd2, upd3, wg, vn) (Supplementary Table 1). The parental flies were removed from mating vials after five days and the number of offspring surviving to pupal and adult life-stages were counted after 15 days (Fig. 2, Source Data File). Several crosses produced no surviving adult offspring in replicate experiments. Interestingly, we observed unique hybrid incompatibility phenotypes that depended on the combination of PTA and sgRNA used to drive over or ectopic expression. Six crosses (red shading, Fig. 2) yielded no pupae, indicating embryonic or larval lethality. The strongest early lethality was seen when Ptub:dCas9-VPR or Pwg*:dCas9-VPR drove expression of the developmental genes pyramus, wingless, and unpaired-1. Thirteen crosses (yellow shading, Fig. 2) produced a pupal-lethal phenotype. These include genotypes which are predominantly larval lethal with a small number of offspring surviving to form pupae (e.g. Pfoxo:dCas9-VPR with wg-sgRNA) as well as genotypes that give normal numbers of pupae, but no adults (e.g. Ptub:dCas9-VPR with upd3-sgRNA). One of the crosses, Pwg*:dCas9-VPR X upd3-sgRNA (blue shading, Fig. 2) produced a small number of surviving adults that were visibly deformed and died within three days of emerging from pupae. Interestingly, we observed two crosses with the Ptub:dXCas9-VPR parent (green shading, Fig. 2) that showed strong sex-ratio biasing, with predominantly (95%, upd1) or exclusively (100%, upd2) male adult survivors. The same PTA crossed with vn exhibits a slight sex-ratio bias of 1.7:1 males:females (Source Data File). We used these data to select a sub-set of putative target genes for constructing EGI flies, focusing on pyr, wg, and hh moving forward.

Fig. 2: Empirical determination of targets for lethal over or ectopic expression.
figure 2

Results showing the number of progeny surviving to pupal or adult life-stages (purple, orange circles, respectively) for crosses between a paternal fly homozygous for a dCas9-VPR expression cassette (rows) and a maternal fly homozygous for sgRNA expression cassette (columns). Individual experiments are colored according to phenotype categories according to the key below. n = 2 biologically independent replicates.

Constructing EGI strains

Mutations to the sgRNA-binding sequences of target promoters are necessary to prevent lethal over or ectopic expression in the EGI strain. These constitute the haploinsufficient resistance alleles in the EGI design (Fig. 1a, c). To generate viable promoter mutations, flies expressing germline active Cas9 nuclease15 were crossed to homozygous sgRNA-expressing strains16 or were directly microinjected with sgRNA expression constructs. Offspring were crossed to balancers and F2 flies were screened for the presence of mutations via Sanger sequencing. For each target promoter, we isolated mutant flies that were viable as homozygotes and without any apparent phenotype, suggesting that the mutations are benign and do not substantially interfere with required expression from these loci (Fig. 3a, Supplementary Fig. 1). It is noteworthy that we commonly recovered mutated promoters that had independent NHEJ events at each sgRNA target site, despite their close proximity. This is contrary to the belief that targeting proximal sequences is likely to result in complete excision of the intervening sequence in the event of NHEJ17. The sgRNAs are expressed from different U6 promoters and perhaps there is some variation in expression patterns that allows for each site to be cut sequentially.

Fig. 3: Genotype and hybrid incompatibility of select EGI strains.
figure 3

a Proximity of sgRNA-binding sites to transcription start site (TSS) for EGI strains. Sequences of both sgRNA-binding sites are shown below promoter illustration, with protospacers in red and protospacer adjacent motifs in blue. Sequences of the mutated promoters at the sgRNA-binding loci are shown below with differences highlighted in gray shadow. b Chromosomal locations of genome alterations. Violet text represents target genes, orange text represents PTA-constructs, blue text represents sgRNA constructs, and green text represents joint PTA-sgRNA constructs. c Hybrid incompatibility data showing number of progeny surviving to adulthood. Genotype of parental strains for each cross are given on the x-axis. n = 3 biologically independent experiments. d Immunohistochemical staining of wild-type (left) or hybrid (right) larva showing over or ectopic expression of targeted signaling pathways. Antibody binding targets are labeled in the bottom left corner of each image. For each panel, the wg-, pyr-, and hh-targeting EGI genotypes are shown from top to bottom. 200 µM scale bar. Images are representative of at least six independent biological samples for each strain.

We combined each of the required components to create a full EGI genotype via one of two approaches. In each, we needed to avoid passing through intermediate genotypes that contained an active PTA and a wild-type promoter sequence, as this would be lethal. The first method involved a total of 19 crosses between flies containing Cas9, PTA, or sgRNA expression constructs that had already been characterized in Fig. 2 (Fig. 3b top, Supplementary Figs. 2, 3). The second method involved re-injecting embryos from homozygous promoter mutant strains with a single plasmid containing expression constructs for both the dCas9-VPR and the sgRNA (Fig. 3b bottom). The latter approach was more direct, requiring approximately half the number of crosses, but resulted in a different chromosomal location for PTA expression constructs compared to what was previously characterized (Supplementary Figs. 46). Using these two methods, we produced a total of 12 unique EGI genotypes (Supplementary Fig. 7). We use a short-hand naming convention that describes the target gene (wg, pyr, hh), the promoter driving dCas9-VPR (Pwg*, Pfoxo, Ptub, Pbam), and the method used in strain construction (crossing, injection): for example pyr.Pfoxo.injection.

Assessing hybrid incompatibility

Candidate EGI strains were crossed to wild-type (Oregon R and w1118) to assess mating compatibility. While w1118 was the wild-type starting point for our EGI engineering efforts, male w1118 flies have a previously reported courtship phenotype18. Oregon R males lack this mating phenotype and reproduce more efficiently. We performed intra-specific matings (male and female from the same EGI genotype) and EGI× wild-type matings by combining three virgin females of one genotype with two males of another genotype. The number of pupae and adult progeny were counted after 15 days just as for the hybrid lethality screen described above. EGI strains that drove over or ectopic expression of wingless or pyramus both showed full incompatibility, with no hybrids surviving to adulthood (Fig. 3c). These represent engineered extreme underdominance. The EGI lines were healthy and fecund, with EGI × EGI crosses yielding numbers of offspring on par with wild-type × wild-type crosses. A third EGI strain targeting the hedgehog promoter showed a marked underdominant phenotype, but not as strong as the extremely underdominant wg- or pyr-EGI strains. Approximately 10–13% of hybrid offspring from hh-EGI crosses survived to adulthood. Of these surviving offspring, the females were all sterile, but the males were not, supporting a role of proper hedgehog expression in oogenesis19,20. That the initial hh-EGI strain was not as robust as wg- or pyr-EGI strains is not surprising. Activation of hedgehog produced a later-acting lethal phenotype compared to activation of pyramus and wingless in the PTA × sgRNA crosses (yielding pupal lethality instead of larval lethality). We believe that the weaker phenotype for hh-targeting guides in the EGI × wild-type hybrids (i.e. Fig. 3) versus the PTA × sgRNA crosses (i.e. Fig. 2) is the result of having only one sensitive (wild-type) promoter from which to drive lethal expression in the EGI × wild-type hybrids.

In order to confirm the mechanism of hybrid lethality, we performed immunohistochemistry on hybrid larva. We stained for target gene overexpression (Wingless) or activation of known downstream components in the relevant signaling pathways (p-ERK1/2 and Patched (Ptc) for EGI targets pyr and hh, respectively). For our wg-targeting EGI line we observed overexpression, but no ectopic expression, in the wing imaginal disc as expected from our Pwg*::dCas9-VPR expression design, in which the PTA is itself driven by the mutated wg promoter (Fig. 3d, top panel). Interestingly, we did observe unique staining patterns in the brain, but are not sure if this is due to ectopic expression or just accumulation of the overproduced ligand (Supplementary Fig. 8). When we drive expression of pyr or hh with a foxo or short tubulin promoter, respectively, we observe clear evidence of ectopic expression in hybrid larva (Fig. 3d, Supplementary Fig. 8). For the pyr-targeting EGI line, we observed ectopic activation of pERK1/2 in clusters of cells throughout the wing imaginal disc, whereas pERK1/2 is normally activated in a speckled like pattern. For the hh-targeting EGI line, we observe Patched ectopic production only in the anterior compartment, which phenocopies previous experiments of hh overexpression in imaginal discs21.

Incompatibility between EGI strains with distinct genotypes

We predicted that our method of generating species-like barriers to sexual reproduction would allow us to engineer not just one, but many EGI genotypes that are all incompatible with wild-type and with each other. To test this, we performed a large all-by-all cross-compatibility experiment that included 12 EGI and 2 wild-type genotypes. Each cross was performed bi-directionally (female of strain A to male of strain B and vice versa). The orthogonality plot (Fig. 4) shows the number of surviving adults from each cross. Crosses that are expected to produce viable offspring are present on the diagonal, with multiple compatibility groups defined by target-promoter mutations. Nine EGI strains were 100% incompatible with one or both wild-type lines. These include strains designed towards each of the developmental morphogen targets (hh, pyr, and wg). The high degree of symmetry across the diagonal shows that EGI produces bi-directional incompatibility, with the number of surviving offspring being similar if the EGI constructs were inherited maternally or paternally. While this is true when assessing the number of surviving adult progeny, we observed differences in timing of lethality for maternally versus paternally inherited EGI constructs (Source Data File, Supplementary Movie File 1). Three strains were apparently still compatible with wild-type. The pyramus-targeting construct in which the PTA was driven by Pbam was apparently not sufficiently strong to induce a lethal phenotype. Two other strains, pyr.Pwg*.injectionB and wg.Pwg*.injectionB were later found to have low levels of balancer chromosomes floating in the population. Data for these latter two strains are shown in Fig. 4 despite this mistake, as the matings represent blinded negative controls that were expected by the experimenters to be incompatible with wild-type, while the presence of the balancer chromosome still allowed for some offspring to survive. None of the experimental crosses were truly performed in a blinded manner. The relative fecundity of wild-type and EGI lines is shown as a bar graph in Supplementary Fig. 9.

Fig. 4: Engineering multiple orthogonal EGI strains.
figure 4

Mating compatibility between wild-type and 12 EGI genotypes, reported as the number of adult offspring 15 days after mating. Female (maternal) genotype is listed on the left axis with the naming convention [target.PTApromoter.construction-method], and male (paternal) genotypes are presented in the same order along the top axis. Predicted compatible strains are indicated with black-outline boxes across the diagonal. Gray boxes indicate crosses that were not measured for lack of virgin females for hh.Pfoxo.injection and pyr.Ptub.injection strains. Superscript B denotes that the strain was later found to have floating Balancer chromosomes. Smaller grid at right highlights four mutually compatible strains. Unless otherwise noted in Source Data File, values represent mean of three independent replicates.

Nuanced differences in genetic design are important for the performance of EGI strains. For example, wg.Pfoxo.cross and wg.Pfoxo.injection have the same genetic components and are expected to work via identical mechanisms. The only difference is the chromosomal location of the PTA components. Despite their similarity, the two strains show slightly different behavior in the hybrid compatibility experiment. This difference in performance is likely due to variable expression levels of the PTA construct, although this was not directly tested. Overall, these results show that we can engineer multiple EGI strains for a target organism. This has important implications in overcoming resistance to genetic population control, which is discussed in detail below.

Discussion

Here we demonstrate the ability to rationally engineer species-like barriers to sexual reproduction in a multicellular organism. We employed the EGI approach that was recently described in yeast9. Our successful implementation in flies confirms that this is a broadly applicable strategy for engineering reproductive barriers. Engineered speciation has been previously described in D. melanogaster by Moreno, wherein a non-essential transcription factor, glass, was knocked out and glass-dependent lethal gene construct was introduced22. This approach also uses a similar topology to EGI; however, the resulting flies were blind in the absence of glass and this approach could not be scaled to make multiple incompatible strains. Our use of PTAs to drive lethal over or ectopic expression allows us to generate multiple EGI strains with no noticeable phenotypes aside from their hybrid incompatibility. Using a similar approach, Windbichler et al. developed PTAs capable of driving lethal overexpression of developmental morphogens in D. melanogaster, but were unable to generate complete EGI strains due to target selection and transgene toxicity13. We found that the ability to create viable EGI lines requires empirical testing of genetic designs to first generate strains that express dCas9-VPR in a sufficiently limited manner to mitigate its toxicity and then screening target promoters to affect genetic incompatibility. We allowed NHEJ to determine the sequence of promoter mutations and selected for viable homozygous mutants, however, it may be more efficient in some cases to replace the targeted portion of a promoter with a sequence from a closely related species.

The ability to rationally design reproductive barriers opens up diverse opportunities for pest management and biocontrol of invasive species as well as genetic containment of novel proprietary genetically modified organisms23. Underdominance-based gene drives are threshold-dependent and allow for localized population replacement24. Several strategies for engineered underdominance exist4,5,6, but the EGI system is the first to produce 100% lethality of F1 heterozygotes. Compared to homing endonuclease gene drives, underdominance drives are more easily reversible and less likely to spread beyond the local target population25. Extreme underdominance gene drives are unique in their ability to spread genes/traits through a population that are unlinked to the drive allele. Since no hybrids between the biocontrol agent and wild-type organisms are viable, there is no opportunity for recombination events to break the linkage between the drive allele and other genes in the genome. Thus, EGI could be used to replace multi-locus traits in a target population. While this article was going to press, a preprint was published in which Buchman et al. show that release of male and female EGI flies function as expected in a caged population replacement experiment26.

Alternatively, EGI could be used as an alternative to Sterile Insect Technique27 by releasing only one sex. For example, released males would compete with wild males to mate with wild females. Any egg fertilized by an EGI male would fail to develop to adulthood. In applications such as this, our ability to tune the life-stage of hybrid lethality could have dramatic impact on the success of a biocontrol program. Late-acting pupal lethality would still allow for hybrid larva to compete for resources with wild-type larva which could make EGI more effective than conventional SIT. This is preferred for insects with overcompensating density-dependence at larval stages28,29,30, where decreasing larval numbers increases the larval survival probability to the point where the total population actually grows instead of shrinks. On the contrary, embryonic-lethality could be preferred for agricultural pests whose larva cause extensive crop damage31. Combining EGI with a female lethal genetic circuit may also enable simultaneous sex-sorting and production of sterile males.

Applying EGI to other insect species, especially those closely related to D. melanogaster is probably possible within about a year and 2–3 years in a rodent. However, it is still not a trivial endeavor, requiring multiple genetic modifications and careful avoidance of combining the PTA and WT target sequence in the same organism. It may be possible in some instances to avoid introducing a promoter mutation by exploiting the existence of regional polymorphisms in promoter sequence as a pre-existing mutation.

There are three primary molecular mechanisms by which the incompatibility provided by EGI could break: (i) transgene silencing of the dCas9-based PTA, (ii) early promoter conversion of the target locus in hybrid organisms, and (iii) underlying sequence diversity at target loci in wild populations that prevent PTA recognition. We have previously published an engineering solution to (i) that involves creating a positive selection for the PTA using endogenous essential genes9. Promoter conversion is unlikely to effect the success of biocontrol programs due to an inherent fitness defect in such escape mutants32. To address the underlying sequence diversity at target loci, population genetics studies should precede strain engineering to identify highly conserved targetable regions, which we have found to be present in populations of interest. However, all of these resistance mechanisms can be mitigated using mutually incompatible EGI strains (e.g. Fig. 4). With just two orthogonal EGI strains (A and B), an iterative release of A-B-A-B-A-B… for biocontrol is expected to result in negatively correlated cross-resistance33 (Supplementary Fig. 10). Any surviving hybrids from mating events between EGI-A and wild-type would automatically inherit a susceptibility to EGI-B (because EGI-A and EGI-B are mutually incompatible). Thus these surviving escapees would be sensitive to the next release of EGI-B, and this renewed sensitization would continue with each sequential release. However, there may also be other ecological and evolutionary considerations that may pose challenges to EGI such as inbreeding depression due to genetic bottlenecks encountered in the creation of EGI strains, wild populations developing mating preferences to avoid EGI strains, or subtle fitness defects due to expression of the PTA which are difficult to overcome.

In summary, we demonstrate the ability to engineer species-like genetic incompatibilities in a multicellular organism. Our approach uses genetic tools that have been proven effective in many organisms, and our design is applicable to any sexually reproducing species. We show that the EGI approach is robust to specific design implementations, with extreme underdominance possible with at least three distinct developmental morphogen targets. Further, we show that multiple synthetic species can be engineered from a given target organism.

Methods

Drosophila stocks

Experimental crosses were performed at 25 °C and 12 h days. Existing Cas9 and sgRNA strains were obtained from the Bloomington Drosophila Stock Center. All transgenic flies were generated via ΦC31 mediated integration targeted to attP landing sites. Embryo microinjections were performed by BestGene Inc (Chino Hills, Ca). See Supplementary Table 2 for descriptions of fly strains.

Plasmids

Plasmids expressing dCas9-VPR were constructed by Isothermal assembly34 combining NotI linearized pMBO2744 attP vector backbone with dCas9-VPR PCR amplified from pAct:dCas9-VPR (Addgene #78898)35 and SV40 terminator from pH-Stinger (BDSC, #1018) to generate pMM7-6-1. Isothermal assembly was used to clone 5’UTR and ~1.5 kb of promoter sequence into NotI linearized pMM7-6-1 (pMM7-6-2: Foxo promoter. pMM7-6-3: Tubulin promoter. pMM7-6-4: wingless promoter. pMM7-6-5: Bam promoter). Plasmids expressing dXCas9-VPR were constructed by introducing mutations into the dCas9 region predicted to improve activity36 to generate pMM7-9-3 which also has a NotI linearization site used for cloning promoter and 5′UTR sequences.

Plasmids expressing sgRNAs were generated by cloning annealed oligos into p{CFD4-3xP3::DsRed} (Addgene #86864).

Plasmids expressing both sgRNAs and dCas9-VPR were constructed by Isothermal assembly combining KpnI linearized dCas9-VPR plasmids (pMM7-6-2 through pMM7-6-5) with sgRNAs amplified from genomic DNA from Drosophila melanogaster stocks that are available from BDSC (pyr sgRNA: 67537. Hh sgRNA: 67560. Upd1: 67555. Wg: 67545). See Supplementary Table 3 for plasmid descriptions and Supplementary Table 4 for primers used in this study.

Drosophila rearing conditions

All drosophila strains were grown on Bloomington Formulation Nutri-Fly media containing 4% v/v 1 M propionic acid (pH 4.3). Additional dry yeast crumbs were added to vials during EGI strain generation matings. No additional yeast was used in any mating compatibility tests. The flies were housed at 25 °C with 12 h day/night cycles. Drosophila strains used in the all-by-all cross were moved to 18 °C overnight to aid in virgin female collection the following day.

Mating compatibility tests

Genetic compatibility was assayed between parental stock homozygous for the PTA or sgRNA expression cassette (i.e. PTA-sgRNA testing) as well as between final EGI genotypes and wild-type (i.e. EGI testing). Test crosses were performed by crossing sexually mature adult males to sexually mature virgin females homozygous for their respective genotype at a ratio of 3:3 (PTA-sgRNA testing) or 2:3 (EGI testing). The adults were removed from the vials after 5 days and the offspring were counted after 15 days. Filled and empty pupal cases were counted towards the pupae total and adult males and females were counted towards the adult count. Independent mating compatibility tests were performed in duplicate (PTA-sgRNA testing) or triplicate (EGI testing). In all-by-all EGI compatibility test (Fig. 4), the data for the pyr.Pfoxo.injection self-cross were performed independently from the other crosses in that dataset.

Immunohistochemistry

Late 3rd instar larvae were dissected in cold PBS, and fixed with 4% formaldehyde (Electron Microscopy Science, RT-15714) overnight at 4 °C. Tissues were washed and permeabilized with PBS-TritonX-100 (0.1%) before staining with appropriate antibodies. Tissues for fluorescence microscopy were mounted with 80% Glycerol in PBS (0.1% TritonX-100). Images were captured using the Zeiss LSM710. Confocal Z-stacks were processed in FIJI (ImageJ).

Antibodies and staining reagents

Drosophila-Patched, apa1 (Developmental Studies Hybridoma Bank (DSHB)) (1:50); Drosophila-Wingless, 4D4 (DSHB) (1:50); Drosophila-Armadillo, N2-7A1 (DSHB) (1:50); Phospho-MAPK (ERK1/2), #4370 (Cell Signaling Technologies) (1:100). AlexaFluor 568 and 647 (Invitrogen) conjugated secondary antibodies were used as necessary at (1:500) dilution. Tissues were counterstained with DAPI (Millipore Sigma, #D9542) (1 µg/ml).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.