Introduction

Individuals from many species have evolved barriers to reproduction (Dobzhansky 1937) that ensure they find an appropriate species-specific mate. It is well established that a variety of species in several insect orders produce a particular combination of compounds composed of cuticular hydrocarbons and their derivatives bearing various molecular functional groups (henceforth CHCs). The primary role of CHCs is in desiccation resistance (Foley and Telonis-Scott 2011; Makki et al. 2014), with the greatest protection from desiccation coming from long-chain saturated compounds (Gibbs et al. 1997; Chung and Carroll 2015). However, the blend of CHCs can also be a critical factor in sex recognition and mate choice within a species (Blomquist and Bagnères 2010; Everaerts et al. 2010; Thomas and Simmons 2011), as well as recognition between species, including the reinforcement of premating isolation (e.g., Coyne et al. 1994; Higgie et al. 2000; Rundle et al. 2005; Savarit et al. 1999). Cuticular hydrocarbons are primarily produced in the oenocytes in insects, then transported, through an unknown mechanism, to the cuticle (Howard and Blomquist 2005; Billeter et al. 2009).

One of the most widely used systems for studying the genetic basis of CHC production is Drosophila. Drosophila melanogaster females contain at least 59 compounds in their CHC profile, while females from their closely related sympatric sibling species, D. simulans, have 19 CHC compounds (Everaerts et al. 2010; Sharma et al. 2012). The blends and ratios of these compounds within the CHC profile, rather than a single compound, are thought to be important for insect communication (Everaerts et al. 2010; Savarit et al. 1999). These two species are behaviorally isolated from each other, in part due to these different CHC profiles (Carracedo et al. 2003; Moulin et al. 2004). In most populations, the most abundant CHC in D. simulans females is 7-tricosene (7-T), while 7,11-heptacosadiene (7,11-HD) is the most abundant in D. melanogaster females (Pechine et al. 1985). Drosophila melanogaster females also produce 7,11-nonacosadiene (7,11-ND), which is a moderate attractant for D. melanogaster males (Jallon 1984) and may therefore also contribute to sex and species recognition.

Drosophila melanogaster is the insect species in which the genetic basis for CHC production has been most studied. Three genes encoding desaturases, desat1 (Dallerac et al. 2000; Labeur et al. 2002; Marcillac et al. 2005), desat2 (Takahashi et al. 2001; Coyne et al. 1999), and desatF (Chertemps et al. 2006; Legendre et al. 2008), one gene encoding an elongase, eloF (Chertemps et al. 2007), one gene encoding an aldehyde oxidative decarbonylase, Cyp4g1 (Qiu et al. 2012), and one gene encoding a reductase, Cpr (Qiu et al. 2012), have all been shown to be involved in the biosynthesis of cuticular hydrocarbons. Both desatF (Legendre et al. 2008; Shirangi et al. 2009) and eloF (Chertemps et al. 2007) have also been shown to influence interspecific CHC profiles of dienes (the other compounds were not assessed). However, it is unclear which aspects of the species-specific CHC profile contribute to behavioral isolation between these two species (Carracedo et al. 2003; Moulin et al. 2004). Thus, the genetic basis of other steps in the primary CHC pathways, of other CHCs, of CHC divergence between species, and their individual effects on behavioral isolation, remain unknown.

To begin to address these questions, we mapped the location of genes that influence intraspecific production and interspecific divergence in 28 compounds of the female CHC profile. Previous genetic mapping studies have identified the third chromosome as a major contributor to CHC production in the melanogaster species subgroup (Civetta and Cantor 2003; Coyne 1996; Coyne and Charlesworth 1997; Ferveur and Jallon 1996; Gleason et al. 2009). Four of the genes previously identified as affecting CHC production are located on this chromosome, and the right arm of this chromosome was recently mapped for loci contributing to behavioral isolation between D. simulans and D. melanogaster (Laturney and Moehring 2012). We therefore focused on the third chromosome, using deficiency mapping (Fig. 1; Pasyukova et al. 2000) to identify regions of the genome that influence the CHC profile of females in this species pair. The simultaneous testing of pure-species individuals and interspecies hybrids containing both intact genomes and deficient homologs allowed us to identify genomic regions for both intraspecific production and interspecific divergence in CHCs while controlling for background genetic effects. We then used this map as a tool to evaluate: 1) whether genes previously-identified as influencing within-species variation in CHCs also play a role in between-species divergence in CHCs, and 2) whether genes that contribute to female CHCs also contribute to behavioral isolation in this species pair.

Fig. 1

Creation of female offspring used for deficiency mapping to locate genes contributing to cuticular hydrocarbon production. The gray bars represent homologous D. melanogaster 3rd chromosomes; the vertical white bars represent D. simulans homologous 3rd chromosomes. Females from the deficiency strains, which are entirely D. melanogaster, are either (a) crossed interspecifically to D. simulans males (resulting in F1 hybrid offspring) or (b) crossed intraspecifically to D. melanogaster males. Each deficiency strain harbors a dominant visible marker (DVM) on the balancer (Bal) chromosome, and a deleted region (represented by a gap in the chromosome) on the deficiency (Df) chromosome. Intraspecific and interspecific crosses with these deficiency lines produce four F1 genotypes: sim/Bal, sim/Df, mel/Bal, and mel/Df

Materials and methods

Drosophila housing and strains

Wild-type D. melanogaster (BJS1; collected in 2009 in London, ON, Canada by Brent Sinclair), wild-type D. simulans (Florida City; provided by J. Coyne), the two mutant (desat1 and desat2) and 52 deficiency (Df) lines (Table S1; Bloomington Drosophila Stock Center) were maintained in 30 mL (8-dram) food vials containing approximately 7 mL standard recipe agar-cornmeal-yeast media (Bloomington Drosophila Stock Center). Flies and crosses were housed on a 14:10 h light:dark cycle at 25 °C and approximately 80% relative humidity. The interspecific hybrids of three deficiency lines [Df(3R)crb87-5, Df(3R)e-R1 and Df(3R)ry85] had very low survival at 25 °C and so were maintained as above but at 21 °C. While this may have induced a temperature effect on CHCs, it should not remove or induce a genotype × CHC interaction unless a gene influencing CHCs was within the region spanned by the deficiency.

Crosses

Each deficiency (Df) has a deletion of a small segment of one homolog, and flies are therefore hemizygous for this region (Fig. 1). The Df is maintained over a non-deficient homolog, called the balancer (Bal), which contains inversions preventing the recovery of recombinant offspring, as well as a dominant visible marker that can be used to determine which offspring inherit the Df vs. Bal chromosome. When these D. melanogaster Df/Bal strains are crossed to D. simulans, the resulting interspecies female hybrids (sim/Df and sim/Bal) either inherit the Bal chromosome, and are therefore entirely heterospecific throughout the genome, or inherit the Df chromosome, and are therefore entirely heterospecific except for the deficient region in which only the D. simulans alleles are present. Hybrids inheriting the deficiency do not have D. melanogaster alleles for genes that fall within the deficient region, allowing the corresponding D. simulans alleles to be ‘unmasked’ and thus expressed. Therefore, if a gene associated with D. simulans-like CHC production is within the deficient region, then the sim/Df females should display a more D. simulans-like CHC profile for the affected compounds. These unmasked D. simulans alleles would represent the alleles within pure species D. simulans females. A comparison between hybrids with the Df vs. Bal chromosomes, and between pure-species D. melanogaster individuals that inherited the Df or Bal chromosomes, allows us to control for the effect of hemizygosity and the genetic background. The same comparison is made for the two mutant (Mut) strains: flies with a P-element disruption in either desat1 (w1118; P{GT1}desat1BG00955) or desat2 (In(3L)P, desat27-11HD-low) were crossed to introduce a balancer chromosome [to Df(3L)emc-E12/TM6B or w1118; Df(3L)ED228, P{3’.RS5+3.3’}ED228/TM6C, respectively]. These Mut/Bal flies were then crossed to wild-type D. simulans or D. melanogaster to again produce the four genotypes used to compare CHC composition: sim/Mut, sim/Bal, mel/Mut and mel/Bal.

Newly emerged (0–8 h) virgin males and females from each stock were collected under light CO2 anesthesia, stored separately for 7 days for males and 14 days for females, which is after the age of sexual maturity (~4 days old), and then used in crosses. Females from each deficiency line were separately crossed to both D. melanogaster males and D. simulans males. For the intraspecific crosses, five females were paired with five males. An average of three intraspecific crosses was set up for each deficiency line. The interspecific crosses contained approximately 10 females and 25 males. More individuals were used in these crosses because of a lower incidence of mating. An average of 30 interspecific crosses was set up for each deficiency line. Virgin F1 hybrid female offspring were collected using light CO2 anesthesia 0–8 h post-eclosion and separated into the four possible genotypes used to assess CHC production. Five females of each genotype were randomly chosen from these collections for CHC extractions (see below). The replicate intra- and interspecific crosses for a given Df were performed at the same time.

Genomic regions are referred to by their cytological location. This numbering system is based on the banding pattern of the polytene chromosomes, with the 3rd chromosome being labeled 61 at the left telomere to 100 at the right telomere (80 at the centromere). Each number is further subdivided A to F, and each letter is further subdivided into a variable range of number designations (e.g., 1–10).

Quantifying CHCs

CHCs were extracted from mature pure species or F1 hybrid females 8 days after eclosion by washing individual flies in 100 μl hexane for approximately 3 min, then vortexing for 1 min. Flies were then removed and discarded. Octadecane (C18) and n-hexacosane (C26) were added to the extract as internal standards (10 ng/μl) for subsequent gas chromatographic analysis. For each line, 20 individuals were analyzed, five for each genotype. Samples were analyzed on an Agilent Technologies (Wilmington, USA) 6890 N dual channel gas chromatograph (GC) with a fast oven (198–231 V power supply), fitted with an HP5 (5% phenyl methyl siloxane) column (30.0 m × 250.00 μm internal diameter) and a flame ionization detector (at 310 °C). Samples (1 μl) were pulse-injected in splitless mode (at 200 °C with a pulse of 206.8 kPa for the first 1.4 min) and eluted with the following temperature program: 60 °C for 0.5 min, increasing to 190 °C at 120 °C/min, then increasing to 260 °C at 7 °C/min, and finally to 310 °C at a rate of 120 °C/min, where it was maintained for 3.5 min. Hydrogen was used as the carrier gas.
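As a quick arithmetic check on the oven program described above, the following sketch sums the hold and ramp durations. It assumes each ramp is linear and starts as soon as the previous segment ends; the function and variable names are illustrative and do not come from the instrument software.

```python
def program_duration(start_temp, initial_hold_min, ramps, final_hold_min):
    """Total run time (min) of a GC oven program.

    ramps is a list of (target_temp_C, rate_C_per_min) tuples applied in order.
    Assumes linear ramps with no equilibration time between segments.
    """
    total = initial_hold_min
    current = start_temp
    for target, rate in ramps:
        total += abs(target - current) / rate  # time spent on this ramp
        current = target
    return total + final_hold_min


# Values taken from the temperature program in the text.
run_time = program_duration(
    start_temp=60.0,
    initial_hold_min=0.5,
    ramps=[(190.0, 120.0), (260.0, 7.0), (310.0, 120.0)],
    final_hold_min=3.5,
)
print(f"Oven program length: {run_time:.2f} min")
```

Under these assumptions the slow 7 °C/min ramp accounts for most of the run, which is the window in which the C23 to C31 compounds would be expected to separate.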

Individual CHC profiles were determined by integration of the area under 28 peaks, representing all those that could be consistently identified in all individuals. The pattern of peaks corresponded closely to previously published D. melanogaster profiles (Foley et al. 2007; Everaerts et al. 2010), and chemical identities were therefore assigned with reference to these studies (Fig. 2). The internal standards were used as references during integration to aid in aligning profiles.

Fig. 2

Mirrored CHC profiles for female D. melanogaster (a) and D. simulans (b). Compound identity was determined by comparison to previously-published studies. Compounds that were consistently detected are listed from left to right, as follows: n-Heneicosane (n-C21); n-Docosane (n-C22); n-Methyldocosane (23-Br); (Z)-9-Tricosene (9-T); (Z)-7-Tricosene (7-T); (Z)-6-Tricosene (6-T); (Z)-5-Tricosene (5-T); n-Tricosane (n-C23); n-Tetracosane (n-C24); (Z,Z)-9,13-Pentacosadiene (9,13-PD); (Z,Z)-7,11-Pentacosadiene (7,11-PD); 2-Methyltetracosane (25-Br); (Z)-7-Pentacosene (7-P); (Z)-5-Pentacosene (5-P); n-Pentacosane (n-C25); (Z,Z)-7,11-Hexacosadiene (7,11-HeD); 2-Methylpentacosane (26-Br); (Z,Z)-9,13-Heptacosadiene (9,13-HD); (Z,Z)-7,11-Heptacosadiene (7,11-HD); 2-Methylhexacosane (2-MH, a.k.a. 27-Br); (Z)-9-Heptacosene (9-H); (Z)-7-Heptacosene (7-H); n-Heptacosane (n-C27); (Z,Z)-7,11-Nonacosadiene (7,11-ND); 2-Methyloctacosane (29-Br); (Z)-7-Nonacosene (7-N); n-Nonacosane (n-C29); 2-Methyltriacontane (31-Br). The n-hexacosane internal standard = IS

Statistical analyses

To correct for technical error associated with quantifying absolute abundances via gas chromatography, integrated values for each CHC were converted to relative concentrations by dividing each peak area by the total area of all peaks for a given individual. The resulting proportions are a form of compositional data that are represented in the simplex (see Pawlowsky-Glahn and Egozcue 2001) and are associated with a special Aitchison geometry (Billheimer et al. 2001; Pawlowsky–Glahn and Egozcue 2001) to which standard statistical methods should not be applied (Aitchison 1986; Egozcue and Pawlowsky-Glahn 2011). To address this, relative concentrations were transformed to centered log-ratios (CLRs):

$$\mathrm{CLR}_n = \ln\left(\frac{p_n}{\left(\prod_{i=1}^{28} p_i\right)^{1/28}}\right),$$
(1)

where p_n is the proportional area of the nth CHC and the divisor is the geometric mean of the proportional area of all 28 CHCs within an individual (Aitchison 1986).
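The transformation in Eq. 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: the function name and the example peak areas are hypothetical, and the refusal to accept zero or negative areas is an assumption, since the text does not describe how undetected peaks were handled.

```python
import math


def clr_transform(peak_areas):
    """Convert integrated peak areas for one individual to centered log-ratios.

    Step 1: divide each area by the total area (relative concentrations).
    Step 2: subtract the mean log proportion, which is the log of the
            geometric mean used as the divisor in Eq. 1.
    """
    if any(area <= 0 for area in peak_areas):
        # Assumption: the log-ratio is undefined for non-positive areas.
        raise ValueError("CLR is undefined for zero or negative peak areas")
    total = sum(peak_areas)
    proportions = [area / total for area in peak_areas]
    mean_log = sum(math.log(p) for p in proportions) / len(proportions)
    return [math.log(p) - mean_log for p in proportions]


# Hypothetical peak areas for a four-compound profile (the study used 28).
example_areas = [120.0, 30.0, 850.0, 5.0]
clr_values = clr_transform(example_areas)
print([round(v, 3) for v in clr_values])
```

Two properties follow directly from the definition: the CLR values of one individual sum to zero, and they do not change if every peak area is multiplied by the same constant, which is why the transformation removes variation in total extract amount.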

Variation in the CLR-transformed relative concentration was tested separately for each CHC and deficiency, employing a false discovery rate control for multiple comparisons where appropriate, as explained below. In each case, a two-way factorial model was fit to the 20 offspring consisting of the four genotypes that result from the intraspecific and interspecific crosses for a particular Df (Fig. S1):

$$\mathrm{CLR}_n = \mathrm{genotype} + \mathrm{species} + \mathrm{genotype} \times \mathrm{species} + \varepsilon,$$
(2)

where genotype is the identity of the maternally contributed D. melanogaster chromosome (i.e., Df vs. Bal), and species is the identity of the paternally-contributed chromosome (i.e., D. melanogaster vs. D. simulans), thereby denoting whether the offspring is ‘pure’ D. melanogaster or is an F1 hybrid.
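For a balanced design such as the one described here (four genotype classes with equal replication), the F statistics for the three terms in Eq. 2 can be computed from single-degree-of-freedom contrasts among the cell means. The sketch below illustrates this under that assumption; the function name, dictionary keys, and example CLR values are hypothetical, and P-values would still need to be obtained from an F distribution with 1 and 4(n − 1) degrees of freedom using a statistics package.

```python
def two_way_f_statistics(cells):
    """F statistics for genotype, species, and their interaction.

    cells maps (species, genotype) to a list of CLR values for one CHC.
    Assumes a balanced 2 x 2 design (equal n in every cell) and non-zero
    residual variance.
    """
    keys = [("mel", "Bal"), ("mel", "Df"), ("sim", "Bal"), ("sim", "Df")]
    n = len(cells[keys[0]])
    if any(len(cells[k]) != n for k in keys):
        raise ValueError("this shortcut requires equal replication per cell")
    mean = {k: sum(cells[k]) / n for k in keys}

    # Residual (within-cell) sum of squares and mean square.
    ss_error = sum((x - mean[k]) ** 2 for k in keys for x in cells[k])
    df_error = 4 * (n - 1)
    ms_error = ss_error / df_error

    # Contrast coefficients, in the same order as keys.
    contrasts = {
        "genotype": (-1, 1, -1, 1),      # Df vs. Bal
        "species": (-1, -1, 1, 1),       # sim vs. mel paternal chromosome
        "genotype_x_species": (1, -1, -1, 1),
    }
    f_stats = {}
    for term, coeffs in contrasts.items():
        estimate = sum(c * mean[k] for c, k in zip(coeffs, keys))
        ss_term = n * estimate ** 2 / 4  # each contrast has 1 df
        f_stats[term] = ss_term / ms_error
    return f_stats, df_error


# Hypothetical CLR values for one CHC and one deficiency line.
example_cells = {
    ("mel", "Bal"): [0.9, 1.1, 1.0, 1.2, 0.8],
    ("mel", "Df"): [1.0, 1.2, 0.9, 1.1, 1.0],
    ("sim", "Bal"): [1.9, 2.1, 2.0, 2.2, 1.8],
    ("sim", "Df"): [2.9, 3.1, 3.0, 3.2, 2.8],
}
f_values, residual_df = two_way_f_statistics(example_cells)
print(f_values, residual_df)
```

With five females per genotype, as in the study, the residual degrees of freedom are 16.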

To identify candidate regions for CHC production in general (as opposed to those for species-specific differences in CHCs), we tested whether hemizygosity for any of the deficiencies impacted CHC relative concentration via a two-step approach. First, Eq. 2 was fit for all CHCs and Df lines, and we identified all instances in which the main effect of genotype (i.e., Df vs. Bal) was significant while controlling the false discovery rate (FDR; Benjamini & Hochberg 1995) at 5% given 1,456 tests (i.e., 28 CHCs × 52 lines). From this subset, we then removed all CHCs for which the genotype × species interaction was nominally significant (i.e., P < 0.05) as these are candidate regions for interspecific differences (see below). Nominal significance (as opposed to FDR corrected) was used at this stage to obtain a conservative list of deficiencies that affect CHC production in a non-species-specific manner. While we could have performed a simple comparison between Df and Bal within D. melanogaster alone, this approach would not have allowed us to differentiate between the effect of the deficiency and effects due to the genetic background. Inclusion of the two hybrid genotypes (i.e., sim/Df, sim/Bal) allowed us to test for a main effect of the deficiency, thereby eliminating those effects that are significant due to background genetic effects between D. melanogaster homologs. However, it is not possible to differentiate between background genetic effects that are present in both species and effects due to the deficiency.
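The false discovery rate control referred to above follows the step-up procedure of Benjamini and Hochberg (1995). A minimal sketch of that procedure is given below; the function name and the example P-values are hypothetical, and in the study the input would be the 1,456 P-values for the relevant model term.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests are declared significant.

    Step-up rule: sort the m P-values, find the largest rank k such that
    p_(k) <= (k / m) * fdr, and reject the k smallest P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr:
            largest_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[index] = True
    return rejected


# Hypothetical P-values for ten tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(example_p, fdr=0.05)
print(significant)
```

Because the rule is step-up, a P-value that misses its own rank threshold can still be declared significant if a larger P-value meets a later threshold.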

To identify candidate regions affecting species-specific differences in relative concentrations of any of the 28 CHCs, we tested for deficiencies that affected CHCs in hybrids but not in the pure D. melanogaster background. To do this, Eq. 2 was again fit for all CHCs and Df lines and we identified all instances in which the species × genotype interaction was significant while controlling the false discovery rate at 5% (1,456 tests). A significant interaction could be due to epistatic interactions and background genetic effects (e.g., an interaction between the D. simulans genome and a D. melanogaster allele on the second chromosome of the Df line), which were not the focus of this study. We addressed this in three ways: 1) we removed any cases that had a greater difference between the balancer genotypes (mel/Bal − sim/Bal) than between the deficiency genotypes (mel/Df − sim/Df); 2) we removed those for which the difference between the deficiency genotypes was not significant at a nominal threshold of P = 0.05; and 3) we removed those that did not shift the CHC value towards that of D. simulans. The subsequent test of the individual candidate genes desat1 and desat2 was performed as above, with a false discovery rate of 5% given 28 CHCs tested for each gene.
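The three filters applied to significant interactions can be expressed as a short decision function. The sketch below is one possible reading of those criteria, not the authors' code: it assumes that differences are compared as absolute differences between mean CLR values, and that a shift "towards D. simulans" means the sim/Df mean lies closer to the pure D. simulans mean than the sim/Bal mean does. All names and example values are hypothetical.

```python
def keep_interspecific_candidate(cell_means, simulans_mean, df_pair_p, alpha=0.05):
    """Apply the three exclusion criteria to one CHC x deficiency combination.

    cell_means: mean CLR for 'mel/Bal', 'mel/Df', 'sim/Bal', and 'sim/Df'.
    simulans_mean: mean CLR of the same CHC in pure D. simulans females.
    df_pair_p: nominal P-value for the mel/Df vs. sim/Df comparison.
    """
    balancer_gap = abs(cell_means["mel/Bal"] - cell_means["sim/Bal"])
    deficiency_gap = abs(cell_means["mel/Df"] - cell_means["sim/Df"])

    # Filter 1: the balancer genotypes differ more than the deficiency genotypes.
    if balancer_gap > deficiency_gap:
        return False
    # Filter 2: the deficiency genotypes do not differ at the nominal threshold.
    if df_pair_p >= alpha:
        return False
    # Filter 3 (assumed reading): sim/Df must be closer to D. simulans than sim/Bal.
    distance_df = abs(cell_means["sim/Df"] - simulans_mean)
    distance_bal = abs(cell_means["sim/Bal"] - simulans_mean)
    return distance_df < distance_bal


# Hypothetical mean CLR values for a single compound.
example_means = {"mel/Bal": 0.0, "mel/Df": 0.1, "sim/Bal": 0.5, "sim/Df": 1.4}
print(keep_interspecific_candidate(example_means, simulans_mean=2.0, df_pair_p=0.01))
```

In this example the deficiency genotypes differ more than the balancer genotypes, the comparison is nominally significant, and the hybrid carrying the deficiency sits closer to the D. simulans value, so the combination would be retained.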

CHC pathway

We diagrammed an expanded CHC biochemical pathway based on known catalytic steps facilitated by the products of the genes desat1, desat2, desatF, eloF, Cyp4g1, Cpr and the order of desaturation and elongation steps established for Drosophila and other species (reviewed in Blomquist and Bagnères 2010; Wicker–Thomas and Chertemps, 2010). We then overlaid the significant deficiencies onto this pathway, with the assumption that the observed effect of a given deficiency was due to the genes present within the deficient region. The placement was determined by the best fit based on the compounds that were affected, and whether the effect was an increased or decreased relative concentration of the compound.

Results

We consistently detected 28 CHCs in female D. melanogaster, with a subset of 21 of these also being present in D. simulans females (Fig. 2). Compound identity was determined by comparison to previously-published studies. While correspondence with these studies was high, since we did not perform GC-MS for each compound some caution is warranted with respect to their inferred identities, particularly for those compounds present in trace amounts. Note that no deficiencies were tested that spanned the CHC pathway genes eloF, Cpr, or Cyp4g1. As expected, the wild-type D. melanogaster profiles were dominated by 7,11-HD and 7,11-ND, while that of D. simulans was dominated by 7-T. While these particular traits loaded strongly on the multivariate combination of CHCs that best distinguished females of the two species (i.e., all individuals scored for the first canonical variate from a discriminant function analysis of the 21 shared CHCs), several other CHCs also contributed strongly to this (Table S2). We mapped the third chromosome for genes contributing to production of these compounds by utilizing a series of 52 D. melanogaster deficiency strains, covering approximately 55% of the chromosome. Each of these strains has a known region of one homolog that is absent, or deficient (an individual is hemizygous at this region = Df; Fig. 1; Table S1), while the other homolog is present (the Balancer, Bal). We crossed these deficiency strains to both D. melanogaster (mel) and D. simulans (sim), creating four combinations of species and genotype that we could compare: mel/Df, mel/Bal, sim/Df and sim/Bal. For the multivariate combination of shared CHCs that best discriminates these species (Table S2), F1 hybrids had phenotypes that were intermediate between the two species, although displaced slightly towards D. melanogaster (Fig. 3a), suggesting dominance of at least some D. melanogaster alleles. As expected, F1 hybrids carrying deficiencies (i.e., sim/Df) were more D. simulans-like overall than their sim/Bal F1 counterparts (Fig. 3a). However, patterns for specific CHCs varied widely, even among the four most commonly studied compounds. Hybrids are intermediate for 7-T, intermediate but quite D. melanogaster-like for 7,11-HD, intermediate but very D. simulans-like for 7,11-ND, and surprisingly, transgressive for 7-P (Fig. 3b). In this latter case, hybrids exhibit a phenotype more extreme than pure species females, suggesting epistatic interactions between alleles in the two species.

Fig. 3

Variation among deficiency lines and genotypes in relative concentration of (a) the multivariate combination of CHCs that best distinguishes D. melanogaster and D. simulans females, calculated by scoring all individuals for the first canonical variate from a discriminant function analysis of the 21 shared CHCs (discriminating between pure D. melanogaster and D. simulans; see Table S2), and (b) the four most studied CHCs. For the deficiency lines, points represent the mean of all individuals from a given line and genotype. Twenty individual D. simulans females are included for reference. Df(3R)e-R1 is labeled in panel (a) as this deficiency had a notably strong D. simulans-like CHC profile

The genomic basis of intraspecific differences in CHC production

We first wanted to determine if a region impacted the general production of any of the 28 CHCs we could consistently detect by examining the effect of being hemizygous (having only one homolog) for each genomic region in the two species. We found a significant effect on intraspecific amounts of CHCs for 41 of the 52 tested deficiencies (Table 1). In cases where a significant deficiency is entirely encompassed by the region spanned by another deficiency that is not significant for that compound, the significant effect is likely due to background genetic effects that are present in both species; these regions are thus not likely to be of further interest. This was observed in only one case: Df(3R)Exel9012 had significantly less 9-Heptacosene, but this effect was not seen in line Df(3R)BSC56, which encompasses Df(3R)Exel9012 in its entirety. Thus, for each significant effect, additional fine-mapping using deficiencies that have a different genetic background will be necessary to confirm regions as contributing to intraspecific differences in CHCs. With that caveat in mind, a comparison of the overlapping deficiencies revealed 43 genomic regions on the third chromosome that may contain candidate genes contributing to intraspecific amounts of a CHC compound (Table 1). Note that the number of candidate regions (43) does not match the number of significant deficiencies (41). This is because significant deficiencies encompassing a smaller non-significant deficiency were divided into two candidate regions flanking the non-significant region. Further, if the same compound was affected by overlapping deficiencies, only the one region of overlap was considered a candidate region for that compound. Overall, there was no correlation between the number of CHCs that were affected and the number of genes found within each region (r= 0.058, P= 0.34).

Table 1 Cuticular hydrocarbons (CHCs) that showed significant intraspecific differences in accumulation due to having only a single allele within the region spanned by a deficiency (Df), i.e., due to being hemizygous

The type of CHC compound affected by the deficiencies we assayed was relatively evenly-distributed among the four classes of molecules. Twenty-three regions uncovered genes affecting one or more of the eight alkanes (saturated compounds with no double bonds or branches), 28 regions uncovered genes affecting one or more of the nine monoenes (compounds with one double bond), 22 uncovered genes affecting one or more of the six dienes (compounds with two double bonds), and 12 uncovered genes affecting one or more of the five branched-chain alkanes (compounds with a 2-methyl group branch). No region unmasked a gene that affected all of the compounds of a given type (e.g., all of the monoenes).

The genomic basis of interspecific differences in CHC production

We also wanted to determine the genomic basis of interspecific differences in CHC production. We identified 24 deficiencies, representing 23 candidate genomic regions, that significantly contributed to interspecific differences in CHC production between D. simulans and D. melanogaster (Table 2; significance P < 0.05 after FDR correction for multiple tests). The assumption is that the causal alleles within the candidate regions would be fully expressed within a pure species D. simulans female, and affect the amount of a CHC compound(s) in a species-specific manner. Deficiency mapping in interspecies hybrids would unmask these alleles. The majority of the significant deficiencies (16) affected only one or two CHC compounds, and genes within these regions likely act at the latter stages of the pathways producing these compounds. Deficiency Df(3R)e-R1 (designated [14] in Table 2 and Fig. 4) affected nine compounds. This region likely uncovers genes that have a species-specific effect upstream in the CHC pathway, although it is possible that it instead uncovers a number of genes that affect multiple points of CHC production. The latter scenario is suggested by a significant correlation between the number of compounds that were affected by each significant deficiency (Table 2) and the number of genes within a deficiency (r= 0.496, P= 0.0058). In each case, however, the overall effect was in the direction of the CHC profile found within D. simulans, as expected.

Table 2 Cuticular hydrocarbons (CHCs) that showed significant interspecific differences in accumulation due to having only the D. simulans allele within the region spanned by a deficiency (Df), i.e., due to species-specific differences in genes contributing to the CHC profile
Fig. 4

Biochemical pathway overview (a) and specific steps (b) for cuticular hydrocarbon (CHC) production in Drosophila spp. At each step in the pathway, the number of carbons is listed, followed by a colon and the number of double bonds, followed by ω and the position of the double bond. In (b), the final CHC compounds are boxed, with their abbreviated names; full names are listed in Methods. The degree of shading of each box represents the approximate relative quantity of the compound on the cuticle of D. melanogaster, with darker shades indicating greater quantity. Predominant alternative CHC levels in D. simulans that are instead major (***) or minor (*) compounds in this species are denoted as such and outlined with a dashed box. The genes that were previously identified as affecting CHC production are shown at the appropriate steps in the pathway, as are the deficiencies mapped in the study presented here, represented by bold italicized numbers in brackets, as follows: [1] Df(3L)ED4457; [2] Df(3L)ED4486; [3] Df(3L)XS533; [4] Df(3L)BSC284; [5] Df(3L)BSC223; [6] Df(3L)BSC451; [7] Df(3R)ED5177; [8] Df(3R)ED5330; [9] Df(3R)T-32; [10] Df(3R)BSC471; [11] Df(3R)Cha7; [12] Df(3R)Dl-BX12; [13] Df(3R)H-B79; [14] Df(3R)e-R1; [15] Df(3R)Exel9012; [16] Df(3R)BSC137; [17] Df(3R)Exel6196; [18] Df(3R)Exel6187; [19] Df(3R)ED6220; [20] Df(3R)Exel6203; [21] Df(3R)BSC140; [22] Df(3R)BSC547; [23] Df(3R)ED50003. While the gene(s) present in the deficiencies may affect the production of the affected CHC(s), note that they may instead affect the transport of the CHC(s) to the cuticle. The pathway for the production of branched compounds containing a 2-methyl group is not shown; however, it is predicted to be similar to that of the saturated CHC compounds except that valonyl-CoA would be the immediate precursor rather than acetyl-CoA. 
Few deficiency lines showed a significant effect for all of the compounds predicted by a gene having an effect on a particular location in the pathway and a single line may contain multiple genes affecting CHC production, each at a different place within the pathway. Thus these placements should be interpreted with some caution

The three compounds that show the greatest difference in amount between the two species (7-T, 7,11-HD, and 7,11-ND; Fig. 2) could have enhanced our ability to detect genes that significantly affect the species-specific CHC profiles of these compounds, skewing our results in favor of finding genes that influence their production. However, this does not appear to be the case: while 15 of the 23 candidate regions significantly influenced levels of 7-T, there were notably fewer candidate regions that had an impact on the accumulation of 7,11-HD (three) and 7,11-ND (one).

Only five deficiencies that significantly affect interspecific differences in a CHC (Table 2) did not also affect intraspecific amounts of another CHC compound (Table 1); however, 22 deficiencies showed the reciprocal scenario of being significant for intraspecific CHC amounts but not interspecific differences. For example, Df(3R)H-B79 had no impact on intraspecific amounts of CHCs but affected interspecific differences in four CHCs, while Df(3R)BSC321 affected intraspecific amounts of four CHCs, but did not contribute to interspecific differences. Thus, a total of 27 deficiency lines that were tested (out of 52) affected only intraspecific or interspecific variation in CHCs, but not both. There was a high degree of overlap for lines affecting both traits: 19 deficiencies were significant for both, while six were not significant for either.

Genes influencing species-specific differences in monoenes were most often uncovered. Twenty-one deficiencies uncovered genes affecting species-specific levels of CHCs for any of the nine monoenes, while only five uncovered genes for any of the eight alkanes, nine uncovered genes for any of the six dienes, and five uncovered genes for any of the five branched-chain alkanes. As with intraspecific comparisons, no region unmasked a gene that affected all of the compounds of a given type.

A comparison of the overlapping regions unmasked by significant and non-significant deficiencies can help dramatically refine the number of candidate genes contributing to CHC differences (Table 2). For example, the overlapping deficiencies Df(3L)BSC284 and Df(3L)BSC223 both affect the species-specific levels of 7-T, and as such, their effect is likely due to the same locus. A comparison of the overlapping genomic region unmasked by these deficiencies reduces the significant region to 79A3-B1 (base position 3L:21,909,520–22,036,810), which only contains 12 protein-coding genes (FlyBase: Marygold et al. 2013). Similarly, the subtraction of the region spanned by the non-significant deficiency Df(3R)ED5664 from that spanned by a deficiency significant for 7-T, Df(3R)ED5177, leaves the region 88E3-5 (base position 3R:11,054,571–11,075,682) which contains only nine genes (FlyBase: Marygold et al. 2013). In both of the above cases, there are no ‘obvious’ candidate genes within these regions (e.g., those encoding desaturases or elongases), but the small number of candidate genes and the availability of individual gene mutants within D. melanogaster make it feasible to test all (or most) of the candidates within each refined region in the future.
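The two refinement steps described above, intersecting two significant deficiencies and subtracting a non-significant deficiency from a significant one, amount to simple interval arithmetic on base positions. The sketch below illustrates the logic with inclusive integer coordinates; the function names and the coordinates in the example are hypothetical and are not the breakpoints of any deficiency used in the study.

```python
def overlap(region_a, region_b):
    """Shared segment of two (start, end) regions, or None if they do not overlap."""
    start = max(region_a[0], region_b[0])
    end = min(region_a[1], region_b[1])
    return (start, end) if start <= end else None


def subtract(significant, non_significant):
    """Parts of a significant region not covered by a non-significant one.

    Returns zero, one, or two flanking regions (inclusive coordinates).
    """
    if overlap(significant, non_significant) is None:
        return [significant]
    pieces = []
    if non_significant[0] > significant[0]:
        pieces.append((significant[0], non_significant[0] - 1))
    if non_significant[1] < significant[1]:
        pieces.append((non_significant[1] + 1, significant[1]))
    return pieces


# Hypothetical breakpoints on one chromosome arm.
df_a = (100_000, 500_000)  # significant for a compound
df_b = (300_000, 900_000)  # significant for the same compound
df_c = (150_000, 250_000)  # not significant for that compound

shared = overlap(df_a, df_b)
print(shared)                  # candidate region shared by both deficiencies
print(subtract(df_a, df_c))    # flanking candidate regions within df_a
```

When a non-significant deficiency lies entirely inside a significant one, the subtraction returns two flanking regions, which is how a single significant deficiency can contribute two candidate regions.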

Two of the significant deficiencies overlap genes known to be involved in the production of CHCs within D. melanogaster: Df(3L)ED4457 overlaps desatF (also called Fad2), while Df(3R)T-32 overlaps desat1 and desat2. To assess whether these genes also affect interspecific divergence in CHC production, we tested the latter two genes, which were the only ones for which a mutant stock was available (Table S1). We used the same methodology as when testing the deficiencies, but in this case a single allele of D. melanogaster is absent (disrupted) rather than a genomic region: hybrid sim/Mutant flies only have the D. simulans allele of desat1 or desat2, respectively. We found that having only the D. simulans allele of desat1 significantly increases 7-T (P< 0.0001, significant after FDR correction for multiple tests). No other CHCs were significantly affected after correction for multiple tests, but the CHCs that most closely approached significance were 7-P (P= 0.0027) and 31-Br (P= 0.037). Note that the compound 7,11-HD was not significantly affected (P= 0.805). In contrast, while desat2 affects levels of 5,9-HD and 7,11-HD within D. melanogaster (Coyne et al. 1999; Grillet et al. 2012), we did not find that this gene affected interspecific differences in these compounds (P= 0.67 for 7,11-HD; 5,9-HD is not present in the strains of D. melanogaster that we used) or in any of the other CHCs. The compounds that most closely approached significance were 6-T (P= 0.0063), 31-Br (P= 0.010), 7-P (P= 0.021) and 27-Br (P= 0.023), none of which were significant after correction for multiple tests.

Mapping the CHC biosynthesis pathway in Drosophila

Some of the genetic deficiencies that lead to changes in CHC profiles may act directly at the level of CHC compound biosynthesis by encoding enzymes in the CHC biosynthetic pathway. While CHC biosynthesis has been described for insects (see Blomquist and Bagnères 2010 for a review; Wicker-Thomas and Chertemps, 2010), and a small number of individual genes that influence CHCs have been identified in D. melanogaster, the underlying genetic basis of the CHC biochemical pathway in Drosophila remains relatively uncharacterized. Therefore, we overlaid the genes desat1, desat2, desatF, eloF, Cyp4g1, and Cpr on an expanded CHC biochemical pathway based on the catalytic steps these genes facilitate. We then added the presumed location of action for the gene(s) within the significant deficiencies, as they impact both the double bond pattern distribution and chain length specificity (Fig. 4a). For example, the unsaturation patterns (ω5; ω7; ω7,11; ω9; ω9,13; note: the number(s) following ω indicate the number of carbons between the double bond and the terminal methyl group) are presumed to be established before chain elongation (Fig. 4a), since the reverse would result in more variability in double bond location after the final decarboxylation step. This arrangement accounts for the major CHC profile differences between D. melanogaster (predominantly 7,11-HD and 7,11-ND) and D. simulans (predominantly 7-T), since DesatF, which introduces the second double bond, is not expressed in D. simulans (Legendre et al. 2008; Shirangi et al. 2009). However, this general scheme does not adequately depict the origin of each component found in Drosophila CHCs, nor does it allow the placement of deficiencies with more specific effects on CHC profiles.
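The reasoning that double bond positions counted from the methyl end are preserved through elongation can be made concrete with a small worked sketch. It assumes that each elongation cycle adds two carbons at the carboxyl end and that the final decarboxylation step removes one carbon; the function name and the numbers of elongation cycles are illustrative assumptions, not values taken from this study.

```python
def chc_from_precursor(chain_length, omega_positions, elongation_cycles):
    """Return (carbon count, double bond positions from the methyl end) of the CHC.

    Assumes each elongation cycle adds two carbons at the carboxyl end and that
    decarboxylation removes one carbon, so omega positions are unchanged.
    """
    carbons = chain_length + 2 * elongation_cycles - 1
    return carbons, tuple(omega_positions)


# 16:1 omega-7 -> four cycles -> 24:1 -> C23 monoene with a bond at 7 (7-T).
print(chc_from_precursor(16, (7,), 4))
# 16:2 omega-7,11 -> six cycles -> 28:2 -> C27 diene with bonds at 7 and 11 (7,11-HD).
print(chc_from_precursor(16, (7, 11), 6))
```

Under these assumptions, desaturating before elongation fixes the bond position relative to the methyl end once, regardless of how many elongation cycles follow.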

Therefore, we also overlaid these components onto a more refined CHC biosynthetic pathway in which the elongation of various precursor fatty acids (e.g., saturated fatty acids, monoenes and dienes) is shown separately (Fig. 4b). In this pathway, Cyp4g1 functions in the second step in the conversion of fatty acids to alkanes by oxidatively decarboxylating fatty aldehydes produced by Cpr (Qiu et al. 2012), and thus both are placed at every arrow leading to a CHC from a fatty acid precursor. While this figure diagrams the origination of each compound, there may not be distinct elongation pathways for each subclass of CHC in vivo. For this more detailed pathway, the origin of each CHC precursor fatty acid is clearly traced, with the final decarboxylation step depicted for each end product; this is not meant to imply separate decarboxylases for each reaction, but rather to show the origin of the individual CHC components. Nevertheless, several deficiencies can be placed on this pathway. For example, Df(3L)ED4457 and Df(3L)ED4486 (Fig. 4b: [1] and [3]) both result in an increase in monoenes, with a concomitant decrease in diene accumulation, suggesting a role in partitioning between these two lineages. Similarly, there are eight deficiencies (Fig. 4b: [4], [5], [6], [9], [10], [11], [17] and [18]) that all result in an increased accumulation of 7-T, with no impact on dienes. This suggests a role in the elongation of ω-7 monoenes. Df(3R)ED5330 and Df(3R)BSC140 (Fig. 4b: [8] and [21]) result in an increase in 7-P, suggesting an enhanced role in elongating medium chain ω-7 monoenes. Df(3R)Dl-BX12 (Fig. 4b: [12]) has an opposite effect, but on ω-5 monoenes, resulting in a decreased accumulation of 5-P.

Comparison to regions for behavioral isolation

None of the changes in the female CHC profile due to unmasking the D. simulans allele corresponded to a change in the previously reported proportion of females courted by D. melanogaster males (Laturney and Moehring 2012). Indeed, that study found that females carrying almost all of the deficiencies tested were courted with equal speed and frequency by D. melanogaster males, even though they expressed only D. simulans alleles within the deficient region. Five regions that affected the CHC profile also affected female receptivity (Table 3; Laturney and Moehring 2012).

Table 3 Regions tested for their effect on intraspecies and interspecies differences in cuticular hydrocarbons (CHCs) within and between Drosophila melanogaster and D. simulans are compared to a previous study (Laturney and Moehring 2012) that mapped the same regions for interspecies mate preference

Discussion

Individuals from D. melanogaster and D. simulans use cuticular hydrocarbons as one, but not the only, cue for attracting and identifying appropriate conspecific mates. Through the use of deficiency mapping in females, we identified 43 candidate genomic regions on the third chromosome affecting within-species CHC abundance (Table 1) and 23 candidate genomic regions on the same chromosome affecting between-species divergence in female CHCs (Table 2). These regions represent the lower bound on the number of genes on this chromosome affecting CHCs because each region may harbor multiple loci affecting CHCs, and additional loci for CHC production are likely present on the third chromosome within regions we did not test. Further, the method we used is unable to detect loci that act through epistatic interactions. Lastly, the majority of D. melanogaster genes are dominant over D. simulans genes with respect to CHC profiles for some, but not all, compounds (Coyne 1996). Consequently, any D. simulans-specific compounds that are unaffected by D. melanogaster genes would not be detected by deficiency mapping. Even with these caveats, our genetic map provides a strong framework for future fine-mapping: approximately 40% of these regions contain fewer than 20 candidate genes, and ~15% have six or fewer, greatly facilitating the future identification of individual loci underlying variation in CHCs.

To understand correlations among CHC components, it is important to have a sense of how they are related biosynthetically. To date, six CHC biosynthesis genes have been identified in D. melanogaster (Takahashi et al. 2001; Labeur et al. 2002; Chertemps et al. 2007; Legendre et al. 2008; Qiu et al. 2012). Based on these studies, data from housefly and termite CHC biosynthesis (reviewed in Blomquist and Bagnères 2010), and the patterns of CHC compounds that accumulate in Drosophila sp. (Wicker-Thomas and Chertemps, 2010), we postulated the location of action for the causal genes within each deficiency on the biochemical pathway leading to the CHC compounds found in Drosophila (Fig. 4). For the biochemical pathway, it is known that desaturation occurs early, establishing the number and location of double bonds (e.g., ω-5, ω-7, ω-7,11, ω-9,13), and that chain elongation occurs after desaturation. Based on the known location and function of the six previously identified CHC genes (desat1, desat2, desatF, eloF, Cyp4g1, Cpr), we overlaid them onto our pathway: desat1 as an ω-7 desaturase with preference for 16:0 and 18:0 fatty acids, desat2 as an ω-5 desaturase with preference for short-chain (14:0) fatty acids, desatF as an ω-11 desaturase with presumed action on fatty acids longer than 16:1, eloF as being involved in very long chain diene formation, and Cyp4g1 and Cpr as being involved in the conversion of fatty acids to alkanes. Here, as demonstrated in earlier versions of the pathway in Drosophila, the lack of 7-T (and other ω-7-derived CHCs) in D. melanogaster can be explained by an efficient conversion of 16:1ω7 into 16:2ω7,11, followed by an elongase system with diene specificity, such as noted for eloF. Similarly, the high levels of 7,11-HD and 7,11-ND in D. melanogaster (derived from ω-7,11-dienoic acids) are absent in D. simulans, which lacks desatF expression. Instead, 7-T is the predominant CHC in this latter species.

We identified the potential sites in our CHC biosynthetic pathway where each deficiency is most likely having the greatest influence (Fig. 4). Note that the placement is based on the effect and does not necessarily reflect the mechanism or location of action. In other words, a gene may affect the pathway at a different node than we have indicated, but the observed difference in the accumulation of the CHC compounds is most strongly seen at the point we have indicated. The effect may be due to the removal of the D. melanogaster allele, or due to the unmasking (and expression) of the D. simulans allele, and may be enzymatic or non-enzymatic. Further, it is possible that the effect is due to unmasking of a regulatory element, with the causal coding region elsewhere in the genome.

While most of the significant regions do not contain obvious candidate genes, there are a few promising candidates for divergence in CHCs. For example, the candidate region spanning 76B4;77B (Table 2) contains the gene Sterol regulatory element binding protein (SREBP), which is a transcription factor involved in fatty acid biosynthesis (Nohturfft and Losick 2002). The candidate region spanning 93B6;C5 contains the gene Dynein heavy chain at 93AB (Dhc93AB), which is an ATPase involved in microtubule-based movement (Rasmusson et al. 1994). It is possible that this ATPase is used in Drosophila to move CHC compounds from the site of synthesis to the cuticle, as ATP pumps in plants act to move compounds from the site of synthesis in epidermal cells to the exo-cuticle (Pighin et al. 2004). Within the region 95C12;95D8, the gene CG31141 has predicted fatty acid elongase activity (FlyBase: Marygold et al. 2013). Lastly, three candidate genes identified in a recent genome-wide association test for naturally occurring variants affecting female CHCs also fall within our candidate regions: defective proboscis extension response 17 (87A8-a), unkempt (94E1-2), and julius seizure (98F13-99A1) (genes identified as affecting PC1 in Dembeck et al. 2015).

We were more likely to detect significant increases in the most abundant D. simulans compounds than decreases in the most abundant D. melanogaster compounds, since the unmasking of D. simulans genes in the sim/Df lines should shift the CHC profile toward a more D. simulans-like character. Indeed, 16 of the 24 significant deficiency lines influenced levels of 7-T, the most abundant compound in D. simulans females, while only four affected 7,11-HD and 7,11-ND, the primary D. melanogaster female compounds (Table 2). While the genes involved in the production of 7,11-HD and 7,11-ND may be less prevalent in the third chromosome regions we tested, have redundancy elsewhere in the D. melanogaster genome (and thus would not be uncovered by our assay), or simply be fewer in number, it is also possible that genes involved in the partitioning between monoene and diene pools were affected. The latter may involve repressors of the monoene elongation pathway, or subunits with higher diene specificity.

The desat1 and desat2 genes, previously identified as affecting intraspecific variation in CHC production (but see Dembeck et al. 2015), are located within the significant candidate region spanned by the deficiency line Df(3R)T-32 (Table 2). In this sim/Df line, the monoene 7-T accumulates to a greater extent than seen in D. melanogaster, without affecting the accumulation of dienes, suggesting a disruption in the partitioning between these two sub-classes of CHC compounds. We tested these genes directly to determine if desat1 and desat2 could be responsible for interspecific differences in female CHC profiles. Within D. melanogaster, the product of desat1 acts to add a double bond at the ω7 carbon position of the ω-7 monoenoic acid and (presumably) the ω-7,11 dienoic acid precursors (the latter in conjunction with desatF; see below) of both 7-T and 7,11-HD (D. simulans and D. melanogaster female sex pheromones, respectively; Dallerac et al. 2000; Labeur et al. 2002; Marcillac et al. 2005). The desat2 locus produces a desaturase that is expressed by females of the African (z) strain of D. melanogaster (Takahashi et al. 2001). This desaturase adds a double bond to the ω5 position of myristic acid, resulting in myristoleic acid. A series of steps including another desaturation, elongation and decarboxylation results in the z-strain D. melanogaster female pheromone 5,9-heptacosadiene (5,9-HD), which is not produced in cosmopolitan D. melanogaster or in D. simulans females (Coyne et al. 1999; Fig. 2). Interestingly, z-strain D. melanogaster females also produce very low amounts of 7,11-HD (Grillet et al. 2012). It therefore seemed likely that desat1 and desat2 are differentially regulated in D. melanogaster and D. simulans females, and underlie the reduced levels of 7,11-HD observed in D. simulans females.

However, we found that neither of these genes affected interspecific differences in 7,11-HD. While desat1 acts on precursors to both 7,11-HD and 7-T within D. melanogaster, this gene only affects between-species differences in the compound 7-T. The desat2 locus does not affect the species-specific levels of any CHC compounds. Thus, genes that control within-species amounts of 7,11-HD are different than those that underlie species-level divergence in its production. Moreover, desat1 has a different influence on the CHC production pathway in the two species and desat2 does not affect divergence in any of the CHCs between these two species. This provides evidence that genes affecting the amount of some CHCs within a species are not the same as those contributing to between-species divergence. However, the results from assays of desat1 also demonstrate how a single gene can affect both intra- and interspecific variation in a CHC, as seen in this gene’s effect on levels of 7-T. Additional tests on the contribution of individual genes to the same compound are necessary to confirm if this mixed influence is common, but we can gain insight from looking at the effect of the other regions that we tested.

While the means by which we determined significance precluded finding a region significant for both intra- and interspecific variation for the same CHC compound, we found that 19 of the 52 deficiencies tested affected both intra- and interspecific levels of one or more of the CHCs, while 27 affected only one type of variation or the other. Thus, across the regions that we tested, there is a mixed effect, with some regions altering both intra- and interspecific differences while others impact only one or the other.

Another significant deficiency, Df(3L)ED4457, encompasses desatF (also called Fad2). DesatF encodes a desaturase that adds a second double bond to the ω11 position after the 16:1ω7 precursor to 7,11-HD in D. melanogaster (Fig. 4b; Roelofs and Rooney 2003; Chertemps et al. 2006; Shirangi et al. 2009). Although the gene is present in D. simulans, it is not expressed (Legendre et al. 2008). Flies with only D. simulans alleles for this region showed significant changes in the accumulation of six compounds within their CHCs. These included an increase in monoenes (7-T, 5-T and 9-H) and decreases in dienes (7,11-PD and 7,11-HD). In another study, hemizygosity due to a deficiency spanning the desatF gene (Df(3L)lxd6, spanning 67E5-68B4) significantly reduced dienes and increased monoenes within both D. melanogaster and D. simulans (Legendre et al. 2008). Since a mutant stock was not available, we did not test this gene directly for its individual contribution to these changes. However, the overlapping region that we tested (Df(3L)ED4457; 67E2-68A7) had a general reduction in dienes and increase in monoenes due to hemizygosity, and species-specific changes in the levels of 7,11-HD, which is consistent with the previous findings on desatF. We cannot rule out that these effects were instead due to other genes within this region. For example, the closely linked gene Enhancer of zeste [E(z)] encodes a methyltransferase that has also previously been implicated in desaturation in the CHC biosynthetic pathway in D. melanogaster (Wicker-Thomas and Jallon 2000). Disruption of E(z) results in both a decrease in 7,11-dienoic fatty acid-derived CHCs (such as 7,11-HD) and an increase in 7-monoenes (such as 7-T), as observed in our study. Likewise, the nearby gene Elongase 68α (Elo68α) encodes an elongase that is involved in pheromone biosynthesis by extending the carbon chain of fatty acid precursors (Chertemps et al. 2005). These genes are strong candidates for further study of their effects on interspecies CHC divergence.

Another pair of interesting candidate genes for interspecific divergence in CHCs are the only two genes found within the significant region at 96F1: Lipophorin receptor 1 (Lpr1) and Lipophorin receptor 2 (Lpr2). Lipophorins, found in insect hemolymph, are the major lipoproteins responsible for lipid transport, and have been shown to be associated with CHCs in D. melanogaster (Pho et al. 1996; Wicker-Thomas et al. 2015). Therefore, these genes are excellent candidates for the transport of CHCs from the hemolymph to the cuticle, an exciting prospect as the underlying mechanism of this transport remains largely unknown.

While the primary focus of this study was to identify loci for the differential CHC production between female D. simulans and D. melanogaster, we can also address whether the CHC differences we identified here influence behavioral isolation by a comparison to earlier work that tested the effect of 37 of the same deficiencies on behavioral isolation in this species pair (Laturney and Moehring 2012). If alteration of the CHC profile reduces the attractiveness of these females, then we would expect to see a decrease in the amount that these females are courted when paired with D. melanogaster males. Since some genes have pleiotropic roles in both the production and detection of CHCs in D. melanogaster (Bousquet et al. 2012), we also examined whether variation in the profile has a corresponding change in female receptivity towards courting males, as indicated by copulation occurrence. We found that none of the changes in the female CHC profile had a corresponding effect on the proportion of females courted by males (Table 3), as reported in this previous study (Laturney and Moehring 2012). While there may be subtle differences in other aspects of male courtship, such as courtship intensity or latency to copulation, dramatic alterations to the female’s CHC profile (Table 2) do not affect a male’s initiation of courtship of that female. Likewise, these changes in the CHC profile have no relationship to the level of female receptivity (Table 3). This suggests that, although genes within the regions tested may influence the female CHC profile, they do not appear to produce a correlated change in the attractiveness or receptivity of these females to D. melanogaster males. However, exploration of additional aspects of courtship is still needed, as more subtle effects may be present.

Data archiving

All raw data is provided in the supplementary material.