Introduction

Behaviors can act as important barriers to reproduction (Coyne and Orr 1997, 2004), preventing gene flow between many taxa (Jiggins et al. 2001; Ligon et al. 2018; Mendelson and Shaw 2002; Pröhl et al. 2006). Although behaviors are important in this context, we still have relatively few examples of the genetic underpinnings of species’ differences in behavior. As a result, we are unable to draw broad conclusions regarding the proximate and ultimate causes of behavioral evolution. However, the limited results to date have shown that single loci can have large effects on behavioral divergence (Andersson et al. 2012; Arbuthnott 2009; Auer et al. 2020; Cande et al. 2013; Ding et al. 2016; Fanara et al. 2002; Konopka and Benzer 1971; Leary et al. 2012; McGrath et al. 2009, 2011; Prince et al. 2017; Yalcin et al. 2004). Further, many examples highlight the potential of sensory receptor genes to explain these effects (Auer et al. 2020; Cande et al. 2013; Fanara et al. 2002; Leary et al. 2012; McGrath et al. 2011). Such findings imply that substantial changes to animal behavior can arise via the evolution of a single, large-effect locus, which in turn means that behavioral isolation could evolve more frequently (Hayashi et al. 2007) and quickly (Gavrilets et al. 2007; Gavrilets and Vose 2007) when populations experience some gene flow. Because large-effect genes are easier to detect, however, they are also likely overrepresented in the literature relative to their occurrence in nature (Rockman 2012). Thus, we need more studies of a variety of comparable behavioral traits to better test these hypotheses.

Drosophila courtship behavior provides an excellent genetic model to begin filling in these blanks (Sokolowski 2001). In Drosophila, males perform a complex courtship ritual that can involve tactile, chemosensory, auditory, and visual signals (Greenspan and Ferveur 2000). Males do not court indiscriminately, however. Instead they use signals expressed by females to identify the most suitable mates (Arbuthnott et al. 2017). Thus, courtship behavior in Drosophila represents coordinated communication among signalers and receivers using a variety of sensory stimuli and behavioral responses. Differences in courtship behaviors and signals compose important barriers to interspecific hybridization. For example, in the Drosophila melanogaster subgroup, female gustatory pheromones differ between species (Jallon 1984). The biggest difference in this group of flies is the expression of the cuticular hydrocarbon (CHC) gustatory pheromone, 7,11-heptacosadiene (7,11-HD). While Drosophila express a blend of CHCs, Drosophila melanogaster and D. sechellia females primarily express 7,11-HD. In their sister species, D. simulans, females primarily express a different pheromone, 7-tricosene. Differences in male response to divergent female CHC expression constitute significant barriers to reproduction among these species (Billeter et al. 2009; Shahandeh et al. 2018). D. melanogaster and D. sechellia males preferentially court females that express 7,11-HD, while D. simulans males are deterred from courting by the presence of 7,11-HD (Billeter et al. 2009; Shahandeh et al. 2018). Recent work has begun to illuminate the physiological changes in the nervous system underlying divergent male response to 7,11-HD expression (Clowney et al. 2015; Seeholzer et al. 2018). However, little is known about the genetic loci that underlie this phenotypic evolution.

Here we report the results of a large quantitative trait locus (QTL) study seeking to identify loci underlying natural differences in male mate choice behavior between D. simulans and D. sechellia. Our study identifies three large-effect QTL explaining a significant amount of the phenotypic difference. When we manipulate female CHC expression, we find that different regions of the genome control separate aspects of male mate choice. In contrast to the simplicity of our initial QTL results, a fine-scale dissection of the largest effect QTL reveals the presence of multiple, nonadditively interacting loci. Ultimately, our findings underscore that the evolution of behaviors may be more complexly regulated (Mackay 2004; MacKay 2009; Zwarts et al. 2011) than found in some case studies thus far (Andersson et al. 2012; Ding et al. 2016; Konopka and Benzer 1971; McGrath et al. 2009; Okhovat et al. 2015; Yalcin et al. 2004). Such a complex genetic architecture has important implications for our understanding of the evolution of animal communication, reproductive isolation, and the speciation process (Kondrashov and Kondrashov 1999; Shaw and Parsons 2002).

Materials and methods

Fly strains and maintenance

We maintained all fly strains as described in Shahandeh et al. (2018). For QTL mapping, we used a single D. simulans parent strain, simC167.4 (Stock #: 14021-0251.199), and a single D. sechellia parent strain, synA (Stock #: 14021-0248.28). We refer to these strains as “D. simulans” and “D. sechellia,” respectively. To create introgression strains to fine-map a single QTL, we used two additional strains: sim2133 and sim933 (Stern et al. 2016). We selected these strains because they harbor transgenically inserted fluorescent proteins under the control of the eyeless promoter (see below). We collected all male and female flies used in the experiments described below from these strains, or crosses of these strains, as virgins within 6 h of eclosion on the 11th day following oviposition under light CO2 anesthesia.

Hybrid crosses and backcrosses

To create reciprocal hybrid males, we combined 10 virgin females, collected just 2–3 h following eclosion, with 15 virgin males that had been aged for 7–10 days. D. sechellia females crossed with D. simulans males yield sechXF1 hybrid males (Fig. 1a), and D. simulans females crossed with D. sechellia males yield simXF1 hybrid males (Fig. 1b). These hybrids have an identical background with the exception of their sex chromosomes and maternal inheritance. To generate backcross males for QTL mapping (Fig. 1c), we crossed virgin D. simulans-D. sechellia hybrid females to D. sechellia (BCsech) or D. simulans males (BCsim) using the same crossing method.

Fig. 1: Courtship behavior of parent strains and hybrids.

a The crosses used to create hybrids with D. sechellia chrX (sechXF1) and b hybrids with D. simulans chrX (simXF1). c The crosses used to create backcross progeny. Autosome and sex chromosome genotypes are represented by colored squares (A autosome, X&Y sex chromosomes, orange D. sechellia chromosome, blue D. simulans chromosome). d CF for each male type when paired with D. simulans females (N = 76, N = 30, N = 70, N = 57, N = 952, and N = 77, for blue bars from left to right) or with D. sechellia females (N = 96, N = 30, N = 59, N = 55, N = 952, and N = 79, orange bars from left to right). e CF for each male type when paired with sham-perfumed D. simulans (N = 19, N = 23, N = 16, and N = 20, for blue bars from left to right) or 7,11-HD-perfumed D. simulans females (N = 20, N = 25, N = 25, and N = 20, orange bars from left to right). Whiskers represent 95% confidence intervals from a binomial test.

Generating recombinant introgression males for fine-mapping

We chose to fine-map a single QTL on chr3 with the largest effect on male courtship behavior toward both species females (see “Results”). To do so, we employed a marker-assisted introgression of chr3 from D. simulans into a D. sechellia background. We first crossed sim2133 and sim933 to create a D. simulans strain harboring a red fluorescent protein marker on chr3 and a yellow fluorescent protein marker on chr2 (strain 3R2Y). We then crossed this strain to D. sechellia. We took the resulting hybrid females (hybrid males are sterile) and crossed them back to the D. sechellia parent strain. We then selected male progeny that had lost the yellow fluorescent protein marker (i.e. chr2 of D. simulans), but retained the red fluorescent protein marker (i.e. chr3 of D. simulans). We crossed these males, a small fraction of which had regained fertility, to D. sechellia females for at least 5 more generations to purify the genetic background, while always selecting for males with red fluorescent protein expression to maintain D. simulans chr3. We always collected and crossed red fluorescent protein positive male offspring because recombination only occurs during female gametogenesis in Drosophila. By only crossing males, the introgressed D. simulans chr3 region remains fully intact. Once we created an initial introgression strain (3R.1), we used females from this strain in a cross to D. sechellia males to harness recombination to create recombinant introgression strains, each with a unique recombination event reducing the size of the original introgression. We identified unique recombinant progeny from this cross first roughly, by polymerase chain reaction using two species-specific alleles flanking the QTL region (You et al. 2008, Table S1), and then precisely, by whole genome sequencing. We maintained these lines without recombination, as above. For both the initial introgression and recombinant introgression strains, the D. simulans genomic segment is heterozygous. Thus, each cross produces two types of males: introgression males (marked by red fluorescent protein) and control males, which resemble the D. sechellia parent strain. Introgression and control males are reared in the same vial, controlling for environment, so any difference we observe between the two is directly attributable to the presence or absence of the D. simulans chr3 fragment.

Two-day courtship assays

To measure courtship behavior of D. simulans, D. sechellia, simXF1, and sechXF1 males, we followed the methods first described by Shahandeh et al. (2018). In brief, we observed individual male genotypes with two different female genotypes on consecutive days in a full factorial design, collecting minute-by-minute courtship data. As previously reported, for the parent strains, we observed no effect of assay day or female order (Shahandeh et al. 2018). For the hybrids, we also detected no effect of female order on courtship frequency (CF) for either sechXF1 or simXF1 males (all p = 1 after Bonferroni correction). Thus, we analyzed courtship data collected for the same males on each day independently. For males that received the same species female on both days, we only included data from the first day to avoid pseudoreplication. For backcross and chr3 introgression males, we observed all males with D. sechellia females on day 1, and D. simulans females on day 2 because we needed to observe each male with both female types, and there was no detectable effect of female order in the parent or hybrid male strains. Still, for backcross males, there is a small chance of an interaction between genotype and female order for courtship with D. simulans females, although our functional follow-up (observed in single-day assays) suggests that this is unlikely (see “Results”).

Perfuming assays

To measure male response to the presence of 7,11-HD on the female cuticle, we perfumed D. simulans females, which lack 7,11-HD expression, with synthetic 7,11-HD (Cayman Chemical CAS #10012567) using a protocol modified from Thistle et al. (2012). First, we added 400 μg of synthetic 7,11-HD dissolved in 40 μl 200 proof ethanol to a 20 mm plastic Drosophila rearing vial. We swirled the liquid around the vial, and let the ethanol evaporate for at least 1 h. After the ethanol had evaporated, we added twenty 4-day-old virgin D. simulans females to each vial. We vortexed each vial for three 20-s intervals separated by 20 s of rest. We allowed females to recover in these vials at 25 °C and 50% humidity for 30 min before assay. For controls, we created a sham treatment in which only 40 μl 200 proof ethanol was added to the vial and the process was repeated. We conducted these assays on a single day because we were concerned that synthetic 7,11-HD might transfer to the courtship assay vials, confounding the data from a sequential observation.

Courtship data analysis

We collected minute-by-minute courtship data over a 30-min period, counting each minute where courtship was observed. From the courtship data we extracted three parameters of male courtship, one of which we will focus on in the main text: CF. CF is a measure of what proportion of a male genotype courts a female type. We calculated CF as the number of males of a single genotype that we observed courting a female genotype divided by the total number of potentially courting pairs. For CF, we only counted males that displayed courtship behavior for more than 10% of the assay time (i.e., males that were scored for courtship in more than 3 min of the 30-min assay) to account for a potential low rate of observer error. We compared CFs between male and female pairings using Fisher’s exact tests with post hoc correction for multiple comparisons (Holm 1979). We estimated 95% confidence intervals (CIs) using a binomial test. We tested for correlations between CF for D. simulans and D. sechellia females among males from our mapping population using Pearson’s correlation test. Analyses for the second and third parameters we collected, courtship effort and courtship latency, are described at length in Supplementary document 1.
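As an illustration only, the CF calculation and the pairwise comparisons described above can be sketched in Python. This is not the analysis code used in the study: the function names, the example data, and the choice to implement Fisher’s exact test and the Holm correction by hand are all assumptions made for this sketch, and the binomial confidence intervals are not covered.

```python
from itertools import accumulate
from math import comb


def courtship_frequency(minutes_courting, min_minutes=3):
    """Return (courting males, total males, CF).

    A male counts as courting only if courtship was scored in MORE than
    `min_minutes` of the 30-min assay (i.e., more than 10% of the assay).
    """
    courting = sum(1 for m in minutes_courting if m > min_minutes)
    total = len(minutes_courting)
    return courting, total, courting / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are male/female pairings; columns are courted / did not court.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x courting males in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def holm_adjust(p_values):
    """Holm (1979) step-down adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    scaled = [(m - rank) * p_values[i] for rank, i in enumerate(order)]
    adjusted = [0.0] * m
    # A running maximum keeps the adjusted values monotone in rank order.
    for i, value in zip(order, accumulate(scaled, max)):
        adjusted[i] = min(1.0, value)
    return adjusted


# Hypothetical example: minutes of courtship scored for five males.
courting, total, cf = courtship_frequency([0, 3, 4, 30, 10])
```

In this hypothetical example, the male scored for exactly 3 min is excluded, because the criterion is courtship in more than 3 min of the assay.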

Genotyping and linkage map construction

We collected whole genome sequence from five sources: each parent strain, 382 BCsech males, 89 males resulting from a separate third generation backcross, and a set of lines where segments of chr3 from D. simulans were introgressed into a D. sechellia background. We isolated genomic DNA from individual flies using the Qiagen DNeasy blood and tissue kit (catalog #69504). We prepared libraries for Illumina sequencing using a modified Nextera library preparation protocol (Baym et al. 2015) and sequenced using the Illumina HiSeq3000. We sequenced the parent strains to deep coverage (~40× average), and the backcross male progeny to lesser coverage (~4× average). We aligned reads to a D. simulans reference genome (Hu et al. 2013) using BWA v0.6.2 (Li and Durbin 2009). We called high-confidence single nucleotide variants between our parent strains in the variant call format (Danecek et al. 2011) using SAMtools v1.1 (Li et al. 2009) and BCFtools v1.1 (Li 2011). We then scanned the aligned reads of the 382 backcross males to estimate recombination breakpoints using custom sliding window scripts (Fig. S1).
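The custom scripts themselves are not reproduced here. Purely as a sketch of the general idea, and under stated assumptions, a window-based ancestry call for an autosome of a BCsech male could look like the following in Python. The input format, window size, heterozygosity threshold, and function names are all hypothetical, and the windows are non-overlapping for brevity (a sliding window would advance by less than the window width).

```python
def call_windows(snv_counts, window_size=100, het_min_fraction=0.2):
    """Call ancestry in consecutive windows of diagnostic SNVs.

    snv_counts: list of (position, sim_reads, sech_reads) tuples for one
    chromosome arm, sorted by position. In a backcross to D. sechellia,
    an autosomal window is either homozygous D. sechellia ("sech") or
    heterozygous ("het"), so any sizeable fraction of D. simulans reads
    is taken as evidence of heterozygosity. The threshold is an assumption.
    """
    calls = []
    for i in range(0, len(snv_counts), window_size):
        window = snv_counts[i:i + window_size]
        sim = sum(s for _, s, _ in window)
        sech = sum(c for _, _, c in window)
        total = sim + sech
        if total == 0:
            call = "missing"  # no reads covered any SNV in this window
        elif sim / total >= het_min_fraction:
            call = "het"
        else:
            call = "sech"
        calls.append((window[0][0], window[-1][0], call))
    return calls


def breakpoints(calls):
    """Estimate breakpoints as midpoints between adjacent windows whose calls differ."""
    points = []
    for (_, end1, call1), (start2, _, call2) in zip(calls, calls[1:]):
        if call1 != call2 and "missing" not in (call1, call2):
            points.append((end1 + start2) // 2)
    return points


# Hypothetical low-coverage read counts at four diagnostic SNVs.
example = [(100, 0, 4), (200, 0, 3), (300, 2, 2), (400, 3, 1)]
example_calls = call_windows(example, window_size=2)
example_breaks = breakpoints(example_calls)
```

Pooling reads across many SNVs per window is what makes a call feasible at roughly 4× coverage, where any single site is too sparsely sampled to genotype reliably.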

QTL analysis

Our sliding window analysis produced ~1200 genotype markers across the genome (roughly 200 per autosome arm) for 382 individuals. We imported the list of individual marker genotypes and phenotypes into R/qtl (Broman et al. 2003). We performed single QTL scans using the “scanone()” function with a binomial model for CF (court/did not court). This method identifies the single most likely QTL per chromosome. We additionally used the “scantwo()” function to formally test for epistasis between QTL. Because the single QTL scan for CF produced additional lesser peaks on chr3 (Fig. 2a), we also performed multiple QTL mapping to test for the presence of potential additional QTL at these loci. We first used the “stepwiseqtl()” function to identify the most parsimonious multiple QTL model (Zhou 2010). We then used “fitqtl()” and “refineqtl()” to identify markers associated with QTL peaks. We estimated a genome-wide log of odds (LOD) significance threshold for single and multiple QTL models using 10,000 and 1000 permutations, respectively (multiple QTL permutation is computationally demanding). We calculated CIs as the region surrounding a QTL peak encompassed by a drop of 1.5 in LOD score. For single QTL scans, we calculated QTL effect sizes as the proportion of phenotypic difference displayed by heterozygous and homozygous genotypes relative to the total difference observed among the parent strains (e.g., (CF_AB − CF_AA)/(CF_Dsim − CF_Dsech)) using the individuals from our mapping population (Broman and Sen 2010). To calculate combined effect sizes, we compared individuals heterozygous at both QTL to individuals homozygous at both QTL. Effect sizes are often overestimated when using the same data to identify QTL and estimate effects (Beavis 1998). To independently measure QTL effects, we phenotyped and genotyped 89 progeny from a hybrid backcrossed to D. sechellia for three generations. These 89 males did not court D. sechellia, but did court D. simulans females. We calculated effect sizes for CF with D. simulans females using the above method with these 89 individuals.
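To make the effect-size ratio and the 1.5-LOD interval concrete, a minimal Python sketch is given below. It is illustrative only: the function names and all numbers are hypothetical, and the interval routine is a simple approximation that may treat flanking markers differently from the R/qtl functions used in the actual analysis.

```python
def effect_size(cf_het, cf_hom, cf_sim_parent, cf_sech_parent):
    """Proportion of the parental CF difference explained by a QTL genotype contrast.

    cf_het / cf_hom: CF of backcross males heterozygous (D. simulans /
    D. sechellia) or homozygous (D. sechellia) at the QTL.
    """
    return (cf_het - cf_hom) / (cf_sim_parent - cf_sech_parent)


def lod_drop_interval(positions, lod, drop=1.5):
    """Return (left, peak, right) marker positions of a LOD-drop interval.

    Starting at the highest LOD score, extend outward in each direction
    while the neighboring marker's LOD stays within `drop` of the peak.
    """
    peak = max(range(len(lod)), key=lambda i: lod[i])
    cutoff = lod[peak] - drop
    left = peak
    while left > 0 and lod[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lod) - 1 and lod[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]


# Hypothetical values, not data from this study.
example_effect = effect_size(cf_het=0.6, cf_hom=0.2, cf_sim_parent=0.9, cf_sech_parent=0.1)
example_interval = lod_drop_interval([10, 20, 30, 40, 50], [1.0, 3.0, 4.2, 3.1, 0.5])
```

With these hypothetical values, the genotype contrast accounts for half of the parental difference, and the interval spans the markers at positions 20 to 40 around the peak at 30.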

Fig. 2: Single QTL scans for courtship frequency.

a Significant loci for CF toward D. simulans (blue) and D. sechellia females (orange). Significant loci have LOD scores above the dashed lines, which represent alpha = 0.05. Peaks (vertical black lines) and confidence intervals (colored rectangles) are shown above QTL. Asterisks denote significance (*p < 0.05 and ***p < 0.001). The black arrow depicts a secondary peak in LOD score that leads us to hypothesize about the presence of additional QTL on chr3. b The effect of chr3 QTL on CF toward D. simulans females. Black data represents individuals from the mapping population and gray data represents individuals from the separate advanced backcross (N = 186, N = 34, N = 196, and N = 55, from left to right). c The effect of the chr3 QTL on CF toward D. sechellia females (N = 151 for sech/Y and N = 231 for sim/Y). d The effect of chrX QTL on CF toward D. sechellia females (N = 151 for sech/Y and N = 231 for sim/Y). e The combined effect of chr3 and chrX on courtship frequency towards D. sechellia females (N = 74, N = 115, N = 77, and N = 116, from left to right). For all, orange lines represent the phenotype of the D. sechellia parent strain and blue lines represent the phenotype of the D. simulans parent strain. Rectangles of respective colors and whiskers represent the 95% confidence intervals from a binomial test. Effect sizes are estimated on the right.

Results

D. simulans courtship preference alleles are largely autosomal dominant

Drosophila simulans and D. sechellia males preferentially court females of their own species. CF is higher for males paired with their own females vs. females of the other species (Fig. 1d), as previously shown (Shahandeh et al. 2018). We also investigated mating behavior in hybrid males (Fig. 1d). Hybrid males court D. simulans females at high frequencies, regardless of whether the D. simulans parent was their mother (simXF1) or their father (sechXF1, Table S2). However, there was one difference between simXF1 and sechXF1 hybrid males, consistent with some courtship loci residing on chrX (though cytoplasmic inheritance or an effect of chrY cannot be ruled out). A hybrid male is more likely to court a D. sechellia female if chrX is inherited from D. sechellia (p < 0.05, Fig. 1d and Table S2).

BCsim males inherit an average of 25% of their genome from D. sechellia, but behave indistinguishably from simXF1 and D. simulans males (Fig. 1d). Thus, it appears that D. simulans alleles affecting CF are largely autosomal dominant to D. sechellia alleles. BCsech males display increased CF toward D. sechellia females and decreased CF toward D. simulans females relative to sechXF1 males (p < 0.00001 for both, Table S2). These differences in BCsech behavior reflect the effect of making regions of the genome homozygous for D. sechellia sequence, revealing the effects of recessive D. sechellia loci, leading us to select 382 BCsech individuals for QTL mapping (Fig. 1c).

An effect of the X chromosome

We next tested whether these putative X-linked loci are responding to the presence of the 7,11-HD pheromone. This pheromone is present on the D. sechellia female cuticle but absent in D. simulans. Previous results point to this pheromone as the most important cause of male courtship differences between Drosophila species (Billeter et al. 2009). To do so, we observed males with 7,11-HD-perfumed and sham-perfumed D. simulans females.

As expected, males of the parent species respond strongly to the presence of this pheromone (Fig. 1e). D. sechellia males courted 7,11-HD-perfumed females more frequently (CF = 85%) than sham-perfumed D. simulans (CF = 20%, p < 0.001). As when paired with D. sechellia females, D. simulans males never courted 7,11-HD-perfumed D. simulans females, but they did court sham-perfumed D. simulans females (CF = 84%, p < 0.00001). simXF1 hybrid males were similar to D. simulans males in that they courted 7,11-HD-perfumed females less frequently (CF = 28%) than sham-perfumed D. simulans females (CF = 100%, p < 0.00001). These hybrid males are significantly more likely to court 7,11-HD-perfumed D. simulans than the D. simulans parent strain, however, indicating some effect of the autosomal D. sechellia genome (p < 0.05). sechXF1 males behave differently from D. sechellia males, in that they do not discriminate between the two treatments (p = 0.0655). Like simXF1 hybrids, 100% of sechXF1 males courted sham-perfumed D. simulans, but 76.0% also courted 7,11-HD-perfumed D. simulans. Thus, sechXF1 males are significantly more willing to court sham-perfumed D. simulans than the D. sechellia parent strain (p < 0.00001), but are also equally willing to court 7,11-HD-perfumed D. simulans (p = 0.7095). Therefore, males harboring chrX of D. sechellia are not 7,11-HD averse, unlike males harboring chrX of D. simulans, indicating the presence of 7,11-HD aversion loci on the D. simulans X chromosome. The fact that both simXF1 and sechXF1 hybrids court sham-perfumed D. simulans females at high frequencies again suggests the likelihood of autosomal dominant D. simulans loci affecting CF toward D. simulans females.

Single QTL scans reveal regions with large effects on CF

A QTL scan for CF toward D. simulans females produced a single, highly significant QTL on the right arm of chr3 (Fig. 2a, blue; peak = 30.55 Mb, CI = 20.60–34.45 Mb). Using the 382 backcross individuals in the mapping population, we estimate this QTL explains 44.3% of the total phenotypic difference between parent strains (Fig. 2b, black). Our independent estimation of effect size using 89 advanced backcross males not used in the QTL analysis is highly congruent with this measure, at 44.2% of the total phenotypic difference (Fig. 2b, gray). A QTL scan for CF toward D. sechellia identifies a highly overlapping peak on chr3 (Fig. 2a, orange; peak = 31.05 Mb, CI = 20.60–34.45 Mb), with a comparable effect size (40% of the total phenotypic difference, Fig. 2c). These QTL have opposing effects: males with a D. simulans allele in this region court D. simulans females with high frequency, and D. sechellia females infrequently. This result is further supported by the fact that CF toward D. simulans and D. sechellia females among BCsech males was significantly negatively correlated (Pearson’s corr. coefficient = −0.4745, p < 0.00001). Thus, males that court one female type are significantly less likely to court the other female type.

Our QTL scan for CF toward D. sechellia also identifies a second, less significant QTL on chrX (Fig. 2a; peak = 9.60 Mb, CI = 6.40–15.30 Mb), as expected from the behavior of reciprocal hybrid males (simXF1 and sechXF1). We estimate this QTL explains 23% of the total phenotypic difference between species (Fig. 2d). Individuals that inherit D. sechellia alleles within this region court D. sechellia females more frequently than individuals that inherit D. simulans alleles at the same locus, but there is no effect of this locus on courtship toward D. simulans females. This result is congruent with a 7,11-HD aversion locus on chrX (as demonstrated by our perfuming experiments). Males inheriting D. sechellia alleles within this region are more willing to court D. sechellia females, but not less willing to court D. simulans females, as indicated by a lack of overlapping QTL like we observed on chr3. This implies that these same males do not require 7,11-HD to stimulate courtship, and thus likely are not inheriting 7,11-HD preference alleles at this locus. When we calculate the combined effect of both QTL, we estimate that they explain 63.7% of the total phenotypic difference between species (Fig. 2e), indicating a purely additive relationship between these QTL. Two-dimensional QTL scans detect no evidence of epistasis between individual QTL, supporting this result (Table S3).

A multiple QTL model for CF identifies additional QTL

We also performed multiple QTL mapping to estimate the likelihood of additional QTL on chr3. This method includes QTL detected with a single QTL scan as covariates in its initial model, and then scans for additional QTL and interactions that improve the model fit, allowing us to detect additional QTL on the same chromosome (Broman and Sen 2010).

A multiple QTL model for CF toward D. sechellia identified one additional QTL. This locus is also on the right arm of chr3 (Fig. S2A, peak = 44.25 Mb, CI = 41.25–47.85 Mb), aligning with the secondary peak in LOD score from the single QTL scan (Fig. 2a, black arrow). For males paired with D. simulans females, we identified two additional QTL (Fig. S2B). One is on the right arm of chr3 (peak = 44.25 Mb, CI = 44.15–44.45 Mb) also aligning with the additional QTL for courtship toward D. sechellia females and the secondary peak in LOD score from the single QTL scan (Fig. 2a, black arrow). These additional QTL on the right arm of chr3 highlight this region as a potential hotspot of courtship behavior loci. The second additional QTL was detected on the left arm of chr2 (peak = 8.50 Mb, CI = 3.30–10.20 Mb).

Chr3 introgression validates the largest effect CF QTL

We next sought to fine-map the largest effect QTL with opposing effects: the QTL on chr3. To begin, we used marker-assisted introgression to transfer a portion of D. simulans chr3 into a D. sechellia background (Fig. 3a). To create this strain, we used a parent strain of D. simulans, 3R2Y, with fluorescent transgenes that allowed us to track the transmission of autosomes across generations (see “Materials and methods,” Fig. 3a). The behavior of 3R2Y was indistinguishable from the D. simulans parent strain we used for QTL mapping (95% court D. simulans females and 0% court D. sechellia females, p = 1 for both, Table S2).

Fig. 3: The effect of D. simulans chr3.

a The crosses used to introgress D. simulans chr3 into D. sechellia. b The introgressed region. The red arrow denotes the position of RFP. The black arrow denotes the position of the CF QTL peaks. c The effect of the introgressed segment (3R.1, N = 161) compared with control males (N = 163) on CF toward D. simulans and D. sechellia females. Whiskers represent 95% confidence intervals from a binomial test. Rectangles to the right show the difference observed between the parent strains. The total percent of the species difference is shown.

The resulting introgression strain, 3R.1, is heterozygous at the major CF QTL on chr3 (Fig. 3b). This strain, however, does not harbor D. simulans alleles at the locus of the additional 3R QTL detected by our multiple QTL model (Fig. 3b). When we compared the behavior of 3R.1 introgression males to their control siblings (who are raised in the same vials but harbor no D. simulans alleles; Fig. 3a), we found two distinguishing patterns of behavior. First, relative to control males, we observed a significant increase in the frequency of 3R.1 introgression males courting D. simulans (p < 0.01). Second, we also observed a significant decrease in the frequency of 3R.1 introgression males courting D. sechellia relative to their control siblings (p < 0.0001, Fig. 3c). When we compare the difference in CF between control and 3R.1 males to that of the parent strains, we find that the introgressed segment explains 25% and 32.2% of the species difference for CF toward D. sechellia and D. simulans females respectively, a notable decrease from our original estimate of ~42%. This is likely due to the exclusion of the additional QTL detected by our multiple QTL model. In this case, the effects of multiple QTL could not be entirely separated in either a single QTL scan or advanced backcross due to linkage between multiple QTL on the right arm of chr3, leading us to overestimate an individual QTL’s effect.

Fine-mapping a large-effect QTL reveals a complicated genetic architecture

We next allowed recombination to randomly break up the introgressed region of 3R.1 to create additional introgression lines with smaller fragments of D. simulans chr3. Our aim was to use markers to identify flies harboring useful recombination to map the causal locus to a small interval, but we instead found that introgression strains demonstrate that species differences result, in part, from epistatic interactions between loci contained within the chr3 QTL.

After creating six recombinant sublines with portions of the original introgression (Fig. 4a), we compared the behavior of these lines in two ways. First, we compared these strains by considering CF toward D. simulans females. Recombinant introgression strains that still include the causal loci should court D. simulans females significantly more frequently than control males. For CF with D. simulans females, we identified two such strains: 3R.324 and 3R.13 (Fig. 4b). Males from these strains court D. simulans females at frequencies much greater than control males (p < 0.01 for 3R.324; p < 0.0001 for 3R.13). Importantly, 3R.324 and 3R.13 males also court D. simulans females at frequencies comparable to 3R.1 males (71.2% for 3R.324, p = 0.5331; 81.0% for 3R.13, p = 0.0939). Intriguingly, the introgressed regions of these two strains, while largely overlapping themselves, are also contained in part by each of the other introgression strains, such that the entire overlapping region of 3R.13 and 3R.324 is represented in smaller parts among the other four strains (Fig. 4a). The only plausible explanation for the difference in behavior between 3R.13, 3R.324, and the other strains, is that these two strains harbor at minimum two D. simulans loci that nonadditively interact, while none of the other strains carry both of these loci, and thus do not court D. simulans at high frequencies. There are two regions of chr3 present in both 3R.13 and 3R.324 that are not present together in any of the other introgression strains (Table 1 and Fig. 4a), and thus must act in epistasis to affect CF toward D. simulans females.
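The inference above is a set-overlap argument, which can be sketched in Python as follows. This is a hypothetical illustration: the line names and segment labels are invented, the introgressed region is treated as a series of discrete segments, and exactly two interacting loci are assumed. The actual comparison relied on sequence-defined breakpoints (Table 1).

```python
from itertools import combinations


def candidate_epistatic_pairs(strains, courting):
    """Find segments that could explain a phenotype only in combination.

    strains: dict mapping a line name to the set of introgressed segments it carries.
    courting: names of the lines that show the phenotype.
    Returns (single, pairs): segments found only in courting lines, and pairs
    of shared segments that never occur together in a non-courting line.
    """
    shared = set.intersection(*(strains[name] for name in courting))
    others = [segs for name, segs in strains.items() if name not in courting]
    # A segment absent from every non-courting line could act alone.
    single = sorted(seg for seg in shared if not any(seg in o for o in others))
    # Otherwise, look for pairs that no non-courting line carries jointly.
    pairs = [
        (a, b)
        for a, b in combinations(sorted(shared), 2)
        if a not in single and b not in single
        and not any(a in o and b in o for o in others)
    ]
    return single, pairs


# Hypothetical lines; segments A-E tile the introgressed region.
example_strains = {
    "line1": {"A", "B", "C", "D"},  # courts D. simulans
    "line2": {"B", "C", "D", "E"},  # courts D. simulans
    "line3": {"A", "B"},
    "line4": {"C", "D"},
    "line5": {"D", "E"},
    "line6": {"B", "C"},
}
single, pairs = candidate_epistatic_pairs(example_strains, ["line1", "line2"])
```

In this invented example, every segment shared by the two courting lines also occurs in at least one non-courting line, so no single segment suffices, and only segments B and D are never carried together outside the courting lines.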

Fig. 4: Recombinant introgression males.

a The genotype of recombinant introgression males within the QTL region. Blue denotes heterozygous and orange denotes homozygous D. sechellia regions. The numbered regions mark the proposed locations of epistatic loci that increase courtship toward D. simulans females. These regions are both fully contained in 3R.13 and 3R.324, but not in any of the other strains (Table 1). b CF of recombinant introgression males with D. simulans. c CF of recombinant introgression males with D. sechellia. Thick lines represent the behavior of 3R.1 males (blue) and control males (orange), surrounded by their 95% CIs. N = 113 for 3R.492, N = 111 for 3R.322, N = 66 for 3R.324, N = 58 for 3R.13, N = 50 for 3R.395, and N = 66 for 3R.578. d The CF of three male genotypes when paired with sham-perfumed (N = 21, N = 21, and N = 22, for blue bars from left to right) and 7,11-HD-perfumed D. simulans females (N = 21, N = 21, and N = 21, for orange bars from left to right). Whiskers represent 95% confidence intervals from a binomial test.

Table 1 Two regions within the chr3 QTL interact nonadditively to drive courtship toward D. simulans females.

Second, we compared CF of recombinant introgression strains with D. sechellia females to that of 3R.1 and control males. 3R.324 and 3R.13 courted D. sechellia at frequencies indistinguishable from control males (82.0% and 81.2%, respectively, p = 1 for both; Fig. 4c), but significantly higher than 3R.1 males (p < 0.01 for both), suggesting that the alleles affecting attraction to D. simulans females and the alleles affecting attraction to D. sechellia females are genetically separable. The remaining four lines all court D. sechellia at frequencies significantly lower than control males (p < 0.001 for 3R.395, p < 0.01 for 3R.492, and p < 0.05 for 3R.578 and 3R.322). These four lines also court D. sechellia at frequencies statistically indistinguishable from 3R.1 males (p > 0.40 for all). However, three of these four lines, 3R.322, 3R.492, and 3R.578, still displayed significantly greater CFs with D. sechellia females than with D. simulans females. In this way, these three lines still behave more like control males, which also court D. sechellia females with significantly higher frequencies (p < 0.0001).

The role of 7,11-HD in courtship behavior

The previous results identify two introgression strains that court D. simulans females at higher frequencies than their control male counterparts. To test if the absence of 7,11-HD on the D. simulans cuticle drives attraction to D. simulans, we selected one introgression strain that courts D. simulans females significantly more than control males, 3R.13, and one that does not, 3R.322, to observe with perfumed D. simulans females (Fig. 4d). We found that 3R.13 males court 7,11-HD-perfumed D. simulans females just as frequently as sham-perfumed D. simulans females (90.5% and 81.8% of males court, respectively; p = 0.6640). In contrast, we found that 3R.322 males greatly prefer 7,11-HD-perfumed D. simulans females, courting them 90.5% of the time, compared with just 47.6% of the time when paired with sham-perfumed females (p < 0.05). 3R.322 males behave indistinguishably from control males in this respect (p = 1 when comparing CF with both 7,11-HD and sham-perfumed females). Therefore, 3R.322 males are attracted to 7,11-HD on the female cuticle, while 3R.13 males already court D. simulans females at higher frequencies than control males, and thus do not require 7,11-HD to stimulate courtship (Fig. 4b). Importantly, control males behave like D. sechellia males when comparing CF with both 7,11-HD and sham-perfumed females (p = 0.6866 for both).
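The comparisons of courtship frequency above can be illustrated with a short calculation. This section does not name the statistical test used, so the following is only a minimal sketch that assumes a two-sided Fisher's exact test on counts of courting versus non-courting males. The counts (19/21, 18/22, and 10/21) are not reported directly; they are back-calculated here from the percentages in the text and the sample sizes in the Fig. 4d legend. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(courted_a, n_a, courted_b, n_b):
    """Two-sided Fisher's exact test for a 2x2 table (courted vs. did not court).

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table. Integer weights are compared
    directly, so ties are handled exactly.
    """
    total_courted = courted_a + courted_b
    lowest = max(0, total_courted - n_b)
    highest = min(n_a, total_courted)

    def weight(k):
        # Unnormalized hypergeometric probability of k courting males in group a.
        return comb(n_a, k) * comb(n_b, total_courted - k)

    observed = weight(courted_a)
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return extreme / comb(n_a + n_b, total_courted)


# Assumed counts, back-calculated from the reported percentages and Ns:
# 3R.13:  90.5% of 21 (7,11-HD-perfumed) = 19; 81.8% of 22 (sham) = 18
# 3R.322: 90.5% of 21 (7,11-HD-perfumed) = 19; 47.6% of 21 (sham) = 10
p_13 = fisher_exact_two_sided(19, 21, 18, 22)
p_322 = fisher_exact_two_sided(19, 21, 10, 21)

print(f"3R.13:  p = {p_13:.4f}")
print(f"3R.322: p = {p_322:.4f}")
```

Under these assumptions, the 3R.13 comparison returns a p value of about 0.664, in line with the reported p = 0.6640, and the 3R.322 comparison falls below 0.05, as reported.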

Discussion

A single CHC pheromone signal isolates D. simulans and D. sechellia

CHC pheromones and corresponding preferences vary and can act as mating signals both within and between Drosophila species (Cobb and Jallon 1990; Jallon and David 1987; Jallon 1984; Pardy et al. 2018; Pischedda et al. 2014), as well as in other insects (Tregenza and Wedell 1997; Zhang et al. 2014). We have previously shown that, broadly, differences in CHCs drive male-mediated reproductive isolation between D. simulans and D. sechellia (Shahandeh et al. 2018). For D. melanogaster and D. simulans, male-mediated reproductive isolation is driven largely by a single CHC expressed on the D. melanogaster female cuticle: 7,11-HD (Billeter et al. 2009). D. sechellia females also express 7,11-HD (Coyne et al. 1994). Here, through the direct manipulation of 7,11-HD expression on the D. simulans cuticle, we show for the first time that this specific pheromone is also the primary driver of male-mediated reproductive isolation between D. simulans and D. sechellia. Thus, the evolution of a CHC signal is an important reproductive isolating barrier between D. simulans and D. sechellia, which are sympatric in the Seychelles archipelago (Matute and Ayroles 2014; Shahandeh et al. 2018).

The genetic architecture underlying the evolution of the 7,11-HD signal is well understood. QTL analysis identifies three loci affecting 7,11-HD expression (Gleason 2005). Two of these QTL encompass the genes desaturaseF and elongaseF (Gleason et al. 2009), both of which encode enzymes essential for 7,11-HD biosynthesis (Chertemps et al. 2006, 2007; Combs et al. 2018; Shirangi et al. 2009). This simple genetic architecture suggests that this signal may have diverged quickly, via mutation to just a few large-effect loci. However, in order for reproductive isolation to occur, male receivers must be able to discriminate based on the presence or absence of 7,11-HD on the female cuticle. Until now, little was known about the genetic architecture of 7,11-HD receiver preference evolution. Thus, characterizing this architecture is an important step toward understanding the tempo and mode of the evolution of reproductive isolation among these species and understanding the process of speciation as a whole (Shaw and Parsons 2002).

The genetic architecture of male mate discrimination

Hybrid offspring behavior largely resembles that of a single parent, D. simulans, highlighting the potential for a simple genetic architecture with largely dominant D. simulans autosomal alleles. The comparison of reciprocal hybrids, however, also suggests a putative role of chrX in male preference. Our single QTL scan results largely reflect this pattern (Fig. 2b, c). Further, when we detect more than one QTL, they appear to act purely additively (Fig. 2e). These initial QTL results are consistent with what many other QTL studies have identified as a simple genetic architecture underlying behavioral phenotypes (Arbuthnott 2009; Merrill et al. 2019; Yalcin et al. 2004), even without refining QTL to a gene level.

The results of a multiple QTL model contrast with the above results, highlighting a more complex genetic architecture that was not detected by single QTL scans. In total, we identify three QTL affecting CF with D. simulans females, and three affecting CF with D. sechellia. The results of our chr3 introgression support the results of our multiple QTL models. We introgressed the chr3 QTL detected by our single QTL scan for CF, but not those detected by our multiple QTL model (Fig. 3b). When we use this introgression strain, 3R.1, to estimate effect size, our estimates drop significantly (Fig. 3c).

Our understanding of genetic architecture is further complicated by the data from recombinant lines made from our original introgression (3R.1). Our observations of these strains suggest the presence of multiple, nonadditively interacting loci underlying the effect of a single QTL affecting CF toward D. simulans females (Table 1 and Fig. 4a), making it clear that the genetic architecture of male mate choice, even within the single largest effect QTL, is not as simple as initially thought. Taken together, our findings underscore the difficulties inherent in determining genetic architecture from QTL mapping alone.

Distinct QTL affect different aspects of male preference

The behavior of reciprocal hybrids with perfumed D. simulans females shows that chrX has a substantial effect on CF with females expressing 7,11-HD (Fig. 1e). The difference between reciprocal hybrids is likely attributable to the X-linked QTL we detected, which affected CF toward D. sechellia females (that express 7,11-HD) but not CF with D. simulans females (that lack 7,11-HD). This QTL appears to include D. simulans loci affecting 7,11-HD aversion, but not 7,11-HD attraction; males with the D. simulans genotype at this locus court 7,11-HD-perfumed females less because they are averse to 7,11-HD, but males with the D. sechellia genotype at this locus court 7,11-HD-perfumed females as often as sham-perfumed females (Fig. 1e). However, simXF1 males do court 7,11-HD-perfumed D. simulans more frequently than the D. simulans parent strain, suggesting the presence of autosomal 7,11-HD attraction loci as well.

When we observe strains with a part of D. simulans chr3 in a D. sechellia background (3R.13 and 3R.322) with perfumed females, we find evidence for this autosomal locus affecting 7,11-HD attraction. While neither strain is averse to 7,11-HD, presumably due to their D. sechellia X chromosome (see above), one strain (3R.322) is stimulated to higher CF by it. So, autosomal D. sechellia 7,11-HD preference loci must be contained within the nonoverlapping regions of these strains. This is congruent with the results from our QTL mapping: that this locus harbors alleles of opposing effects. When males inherit D. simulans alleles within this region, they are less attracted to D. sechellia females; when males inherit D. sechellia alleles within this region, they are more attracted to D. sechellia females. Taken together, these results suggest that there are two broad loci affecting 7,11-HD response: an X-linked D. simulans aversion locus, and an autosomal D. sechellia attraction locus. Similarly discrete genetic loci affecting separate aspects of behavior have been described in mice (Stacher Hörndli et al. 2019; Weber et al. 2013).

Genetic complexity and the evolution of reproductive isolation

Considering the coevolution of pheromone signals and preferences raises an important question: if the ancestral preference phenotype is for the original pheromone signal, how can newly evolved signals persist? Such traits are expected to experience stabilizing selection by way of preexisting preferences of signal receivers, unless receiver preferences are able to rapidly track large changes in pheromone signals (Baker 1989). While the genetic architecture of signal evolution implies potentially quick divergence due to a few, large-effect mutations (like the retrotransposition of a single gene, desatF; Fang et al. 2009), the genetic architecture of the receiver’s response appears to have required many more changes. The epistasis we identify potentially augments the complexity of receiver evolution even further (Karageorgi et al. 2019; Lee et al. 2019; Weinreich et al. 2006). Considering signal–receiver coevolution, the differences in genetic architecture allow us to form new hypotheses about the evolution of this particular reproductively isolating mechanism.

The difference in genetic architectures makes it less likely that receiver preferences could quickly track a shift in primary CHC pheromone signal if that were its ancestral function, causing any new mutations modifying the signal to be selected against. Considering our results, it seems more plausible that a shift in CHCs occurred for another reason, and the phenotype diverged irrespective of mate choice. Once the CHCs differed between species, divergent male preferences may have been favored as a species recognition mechanism in sympatry, allowing males to avoid costly interspecific courtship and/or mating (Shahandeh et al. 2018). Species recognition could have evolved at a much slower pace once the CHC difference was in place, co-opting it as a sexual signal. We know that sexual selection on mate choice is most important to the evolution of reproductive isolation in such a scenario, particularly when there is significant postmating isolation, as there is in these species (Hudson and Price 2014; Servedio 2001). Indeed, the evolution of such reproductive isolation via reinforcement is thought to be common in Drosophila (Yukilevich 2012).

This hypothesis is reasonable for CHCs, as they also have important ecological functions in Drosophila. Specifically, CHCs are important to desiccation tolerance in insects (Gibbs 1998). CHCs differ in how they respond to temperature, depending on their chemical structure (Gibbs and Pomonis 1995). Underlining this effect, in Drosophila, increases in CHC chain length correlate with increasing environmental temperatures (Gibbs et al. 1998), and quantitative genetic analyses demonstrate a relationship between CHC composition and desiccation resistance (Foley and Telonis-Scott 2011). Thus, the shift in primary pheromones between these species may have first been an adaptation to differences in environment, going unrecognized by males before being co-opted as a species recognition signal. Eventually, D. simulans evolved (at least) an X-linked locus causing 7,11-HD aversion, while D. sechellia separately evolved an autosomal locus causing 7,11-HD attraction. Indeed, adaptation to divergent environments can lead to subsequent reproductive isolation (William and George 1990). However, one study that divergently selected on desiccation resistance for 57 generations, without a significant cost to conspecific mating between populations, still failed to generate premating isolation (Kwan and Rundle 2010), highlighting that divergent ecological selection alone is not always enough to evolve reproductive barriers. We also cannot rule out the possibility that the shift in primary CHC was neutral, in which case no alternative adaptive function is needed to explain it.

Co-localization of mate choice QTL

Another pattern we observed is the co-localization of many QTL to a single region of the genome: the right arm of chr3. For CF, five QTL map to chr3, with the remaining two falling on chr2 and chrX. We know that two of these QTL, on chr3 and chrX, affect 7,11-HD preference. The multiple loci underlying the chr3 QTL affecting 7,11-HD preference only augment this pattern. The remaining five QTL potentially affect separate aspects of male mate choice. Thus, alleles affecting multiple aspects of male mate choice may be maintained in linkage. Simulation has demonstrated that regions of linkage disequilibrium may serve to reduce gene flow when species co-occur (Barton and De Cara 2009). Real-world examples support this hypothesis. For instance, loci controlling threespine stickleback armor traits cluster within the genome, creating freshwater or marine “supergene” regions that may maintain phenotypic divergence among types in the face of gene flow (Miller et al. 2014). Moreover, regions of multiple preference loci maintained in linkage disequilibrium have been identified in Heliconius, and are thought to be important to the maintenance of species boundaries in sympatry (Sekimura and Nijhout 2017). Drosophila simulans and D. sechellia do occur in sympatry, where they exchange a small amount of gene flow (Coyne et al. 1994), so the accumulation of mate choice QTL in a single region may reflect similar selective pressures.

Conclusions

Ultimately, our results highlight the difficulty in determining genetic architecture from QTL mapping results alone. This contradicts the one QTL: one locus ascertainment bias perpetuated in the field (despite mounting evidence to the contrary), in which the QTL that are successfully fine-mapped to causal loci represent a potentially small subset containing loci of substantial effect, making them easier to dissect (Rockman 2012). Instead, even for between-species comparisons, we should perhaps expect genetic complexity, even within relatively small QTL or individual loci. In Drosophila, this has been demonstrated in morphological adaptation (Frankel et al. 2011; Rebeiz et al. 2009; Rebeiz and Williams 2017), intraspecific studies of male mate choice behavior (Moehring and Mackay 2004), and now here with interspecific studies of behavioral evolution (Shahandeh et al. 2020). Although this suggests that the loci of behavioral evolution may be more difficult to identify, we can still glean valuable insights and generate hypotheses, as we have done here, about the tempo and mode of evolution from studies of genetic architecture. In addition, in Drosophila research, our ever-increasing abilities to phenotype behaviors at greater resolutions, genotype organisms en masse, and create transgenic strains in multiple species will continue to improve our ability to dissect complex genetic architectures.