Divergence of cuticular hydrocarbons in two sympatric grasshopper species and the evolution of fatty acid synthases and elongases across insects

Cuticular hydrocarbons (CHCs) play a major role in the evolution of reproductive isolation between insect species. The CHC profiles of two closely related sympatric grasshopper species, Chorthippus biguttulus and C. mollis, differ mainly in the position of the first methyl group in major methyl-branched CHCs. The position of methyl branches is determined either by a fatty acid synthase (FAS) or by elongases. Both protein families showed an expansion in insects. Interestingly, the FAS family showed several lineage-specific expansions, especially in insect orders with highly diverse methyl-branched CHC profiles. We found five putative FASs and 12 putative elongases in the reference transcriptomes for both species. A dN/dS test showed no evidence for positive selection acting on FASs and elongases in these grasshoppers. However, one candidate FAS showed species-specific transcriptional differences and may contribute to the shift of the methyl-branch position between the species. In addition, transcript levels of four elongases were expressed differentially between the sexes. Our study indicates that complex methyl-branched CHC profiles are linked to an expansion of FASs genes, but that species differences can also mediated at the transcriptional level.

Cuticular hydrocarbons (CHCs) are omnipresent on the surface of insects and play a dual role in waterproofing and in chemical communication 1 . In many insect species, CHCs are regarded as a central component of mate recognition systems and thus contribute to behavioral isolation between species [2][3][4][5] . Insects have evolved a vast number of CHCs (> 1000) differing in chain lengths, number and position of double bonds and methyl groups, respectively 6,7 . Comparative studies have demonstrated that CHC profiles tend to be species-specific mixtures ranging in complexity from a couple to more than a hundred compounds 8,9 .
The fundamentals of the CHC biosynthesis in insects are well established 10 . The majority of CHCs are synthesized de novo in oenocytes by a sophisticated network of fatty acid synthases (FASs), elongases, desaturases, reductases, and a decarbonylase [10][11][12] . Methyl-branched CHCs result from the incorporating of methylmalonyl-CoA instead of malonyl-CoA early during chain elongation by a microsomal FAS 10,12 . Animal FASs are single multifunctional enzymes consisting of two identical monomers 13,14 . The FAS monomer contains seven distinct functional domains in the following order (from the N-terminus): β -ketoacyl synthase (KS), malonyl-/acetyl transferase (MAT), β -hydroxyacyl dehydratase (DH), enoyl reductase (ER), β -ketoacyl reductase (KR), acyl carrier protein (ACP), and thioesterase (TE). In most biological systems, the major product released by FASs is palmitic acid (C16:0) 10,13,14 . Subsequently, palmitic acid is further elongated to very long-chain fatty acids by members of the elongase family, characterized by the ELO domain (PF01151; GNS1/SUR4 family), with a conserved LHXXHH histidine box motif 15 . Despite our basic knowledge about the biosynthesis and composition of many CHC profiles (phenotypes) in a broad range of insect taxa we lack understanding of how new phenotypes may evolve.

Results
Composition of cuticular hydrocarbons. The CHC profiles were mixtures of n-alkanes and mono-, di-and trimethyl-branched alkanes (Me-, diMe-, triMeCHCs) with carbon backbones ranging from C 25 to C 39 . n-Alkanes and methyl-branched alkanes were equally abundant (Supplementary Table S1). The n-alkane fraction consisted of a homologous series ranging from C 25 to C 33 , with n-nonacosane (n-C29) as the dominant compound in both species. In contrast to the uniform composition of n-alkanes, both species differed considerably in the composition of their methyl-branched alkanes. In general, the position of the first methyl-branch is shifted by 2 carbon units between the species, i.e. from positions 11 and 13 in C. biguttulus to positions 13 and 15 in C. mollis (Supplementary Table S2). Nevertheless, some C. biguttulus individuals showed the branching pattern typical for C. mollis.
To assess quantitative differences of the hydrocarbon profiles we performed a principal component analysis (PCA) using the relative composition of the CHC profiles. The first five principal components together explained 71.3% of total variance in the CHC phenotypes (PC1 = 39.7%, PC2 = 14.5%, PC3 = 8.6%, PC4 = 4.7%, PC5 = 3.9%). PC1 (39.7%) clearly separated the species, while PC2 (14.5%) separated individuals according to sex (Fig. 1, Table 1). A multivariate analysis revealed a significant effects of species, sex and the interaction between the two (species: F(5, 117) = 145.977, p < 0.001; sex F(5, 117) = 13.842, p < 0.001; interaction of species and sex: F(5, 117) = 11.936, p < 0.001). Linear models showed that PC1, PC3 and PC4 differed significantly between species, while males and females differed significantly in PC2 and PC3 scores (Table 1). We also observed a significant species × sex interaction in all principal components ( Table 1). The PC2 interaction is due to a stronger separation between the sexes in C. biguttulus and the PC1, PC3, and PC4 interaction is due to the fact that males and females of C. mollis were more strongly separated in comparison to C. biguttulus (Fig. 1, Supplementary Fig. S1). The compounds that contributed most to PC1 were diMeCHCs (Table 2), with negative factor loadings for 15,x-diMeCHCs (indicative for C. mollis) and positive factor loadings for 13,x-diMeCHCs (indicative for C. biguttulus). The CHC profiles between the sexes differed mainly in the relative amount of triMeCHCs and diMeC35 (peaks 18 and 19). Females exhibited a greater proportion of 11,x-/9,x-/7,x-diMeC35 (peak 19) and 11,x,x-/9,x,x-triMeCHCs (peaks 22 and 31), while males have higher proportions of 13,x-/11,x-/9,x-diMeC35 (peak 18) and 13,x,x-/11,x,x-triMeCHCs (peaks 21 and 30). Similar to the differences between species, the sexes differed mainly in the position of the first methyl-branch of the major CHCs, i.e. shifted by two carbon units between the species (Supplementary Table S2).
Ortholog assignment of fatty acid synthases and elongases in Chorthippus. We identified five transcripts coding for putative FASs in both Chorthippus reference transcriptomes. The assignment of orthologous genes between both Chorthippus species resulted in five ortholog pairs ( Table 3). The similarities of coding nucleotide and protein sequences, respectively, within ortholog pairs were > 98.6% and 99.2%. One ortholog pair (cluster I, pairs in Chorthippus had no reciprocal best hit with a FAS in D. melanogaster. However, each ortholog pair in Chorthippus has a corresponding ortholog in the gomphocerine grasshopper Stenobothrus lineatus (Fig. 2).
The domain structure analysis revealed that only one ortholog pair (cluster I) contained the full open reading frame (ORF) with all seven functional domains. The other ortholog pairs lacked certain domains, showed truncated domains or contained incomplete ORFs (Fig. 2). Two related ortholog pairs (cluster II-a/c) lacked the MAT domain and another closely related ortholog pair (cluster II-b) had an incomplete ORF that contained only the C-terminal domains. In C. mollis, two FAS transcripts with incomplete ORFs (cluster II-b/c) had short overlapping ends (11 AA) with identical protein sequences, which might be a hint that both transcripts belong to a single gene (Fig. 2). All FAS sequences in cluster III lacked the PP domain and showed modification in DH, ER, KR, and TE domains, but not in the KS and MAT domain (cluster III, Fig. 2, Table 3). A phylogenetic analysis based on the KS or MAT domain revealed higher sequence similarity of MAT and KS domains of cluster II and III within a lineage than between sequences of cluster III from divergent insect orders (Fig. 2, Supplementary Fig. S2).
Using the elongase genes from D. melanogaster, a tblastn search resulted in 12 transcripts coding for putative elongases in each Chorthippus reference transcriptome. Both Chorthippus species shared 11 ortholog pairs, only two transcripts had no corresponding ortholog in the other species (Table 3). In the first case, C. biguttulus had two paralogs in the Elo68 cluster while C. mollis had only a single copy (Fig. 3). However, the coding sequences of  Table 2 for loadings). all three transcripts were identical; the 3′ non-coding region of the mRNA differed between the two paralogs in C. biguttulus and allowed an ortholog assignment of the C. mollis transcript. In the second case, C. biguttulus lacked the ortholog to CG6921 (james bond). All putative elongase transcripts of Chorthippus species could be assigned to orthologs in D. melanogaster, except for a single ortholog pair (Fig. 3, Table 3).
Signature of selection analysis. We calculated dN/dS ratios for nine ortholog pairs (Supplementary Table S3).
Four ortholog pairs showed either no nonsynonymous or no synonymous substitutions, and three sequences had no SNPs. The signature of selection analysis provided no evidence for positive selection acting on FAS and elongases in the two Chorthippus species. The dN/dS ratios of ortholog pairs ranged from 0 to 0.129, indicating that purifying selection acts on these genes (Supplementary Table S3).

Evolution of fatty acid synthases (FASs) and elongases in insects.
The majority of FASs of pterygote insects showed two distinct clusters with regular FASs that contain all seven functional domains (clusters I and II; Fig. 2), while the three FAS copies of the two-pronged bristletail Catajapyx aquilonaris (Diplura) formed a single cluster that represents the sister clade to all other insect FASs (Fig. 2). In most cases, each of the two clusters contained a single copy of a regular FAS per species. However, the termite Zootermopsis nevadensis (Isoptera) and ants showed a noticeable increase in copy numbers in cluster I and II, e.g. the red imported fire ant (Solenopsis invicta) (Hymenoptera) possesses six FAS genes. However, other insect orders, like Orthoptera, Hemiptera, Coleoptera, and Lepidoptera, showed expansions of 'aberrant' FASs that either lack certain domains or show an unusual domain structure. The five FAS ortholog pairs of Chorthippus were allocated to three different clusters (Fig. 2). The ortholog pair in cluster I contained all seven functional domains, while the three ortholog pairs in cluster II either lacked the MAT domain or had incomplete open reading frames. The last ortholog pair fell in cluster III and was characterized by an unusual domain structure. The basal split of the elongase phylogeny separated the two previously characterized S/MUFA and PUFA elongase subfamilies 15 (Fig. 3). The basal S/MUFA cluster (baldspot) contained the insect orthologs of the vertebrate ELOVL3 and ELOVL6 genes, with 43% and 51% protein sequence similarity between Chorthippus and cattle (Bos taurus). All other elongases belonged to the PUFA subfamily. We found ten distinct ortholog clusters in the PUFA subfamily; most of them contained a single elongase copy for each of the studied insect orders. However, D. melanogaster and the pea aphid Acyrthosiphon pisum showed a large increase in copy number in the EloF (8 copies) and james bond (4 copies) cluster, respectively. Eight of the 10 PUFA elongase clusters appeared to be insect-specific. Only the Elo68 and CG31522 cluster contained orthologs from non-insect outgroup species. With exception of the ortholog pair in the CG30008 cluster, each ortholog pair of Chorthippus clustered exactly in the ortholog cluster predicted by the ortholog assignment.
Differential expression of candidate fatty acid synthases and elongase genes. Among the 5 FAS and 11 elongase ortholog pairs of Chorthippus species, we found only a single FAS ortholog pair (cluster II-b) that was differentially expressed between both species, with a 2.9-fold higher expression in C. biguttulus (Table 4). However, the expression levels of this FAS transcript differed not only between species, but also strongly between the sexes (7.6-fold higher in females). In addition, we found two other FASs and three elongases that had significantly higher expression in males than in females ( Table 4). The two putative FAS transcripts (cluster II-a and III) showed higher expression in males (8.4 fold higher in C. biguttulus and 2.4-fold higher in C. mollis). The strong differences between the male-biased expression of the FAS cluster II-a transcripts, resulted in a significant species × sex interaction term. Of the three differentially expressed elongases, the CG30008 orthologs showed the strongest male-biased expression (23.2-fold). The other two elongases had 2.3-fold (CG16905) and 3.6-fold (CG5326) higher expression in males.

Discussion
In addition to their divergent acoustic signals, the sympatric Chorthippus grasshopper species, C. biguttulus and C. mollis, differed significantly in their CHC profiles. The CHC profiles of both species consisted of a series of n-alkanes, followed by a series of various methyl-branched alkanes. Our study demonstrated that C. biguttulus and C. mollis as well as males and females of both species show quantitative differences in their CHC phenotypes. Both the general pattern of hydrocarbons with series of n-alkanes and methyl-branched alkanes and the interspecific variation based on quantitative rather than qualitative differences seemed to be relatively conserved throughout the family Acrididae [38][39][40][41][42] . The most striking difference between the two species was the shift of the first methyl-branch position in multimethyl-branched CHCs, i.e. position 13 in C. biguttulus and position 15 in C. mollis. However, C. biguttulus also showed large variability in CHC phenotype, with some individuals exhibiting the methyl-branching pattern typical for C. mollis. These individuals clustered together with C. mollis in the PCA, illustrating that without this shift, both species are nearly indistinguishable based on their CHC phenotypes. Methyl-branches are incorporated during the fatty acid elongation process by FASs and/or elongases 10 . Thus, we focused on these protein families as candidates for producing the species and sex specific CHC pattern.      We found numerous candidate FAS transcripts in Chorthippus that cluster into three groups on our FAS phylogeny. In general, the phylogenetic analysis of the FAS family indicates a duplication of the ancestral FAS gene in the common ancestor of pterygote insects. Thus, most insect orders possess at least two FAS gene copies. This ancestral duplication resulted in two clusters with regular FASs that contain all seven functional domains (cluster I and II). In many insects orders, these two copies underwent additional lineage-specific duplication events resulting in independent expansions of the FAS family (paralogs: Orthoptera: ≥ 4 (Chorthippus); Isoptera: 5 (Z. nevadensis); Heteroptera: 11 (A. pisum); Hymenoptera: 6 (S. invicta); Coleoptera: 6 (T. castaneum); Lepidoptera: 6 (P. xylostella)). Although the two-pronged bristletail C. aquilonaris (Diplura) has also multiple FAS copies, the three paralogs formed a single cluster that is the sister clade to the FASs of pterygote insects.
Some FAS copies showed significant deviations from the classical FAS domain structure. A frequent modification that evolved independently in at least three insect orders (Orthoptera, Coleoptera, Diptera) was the loss of the MAT domain in cluster II FASs. Two such FAS transcripts (cluster II-a and II-b) in Chorthippus showed sex-biased expression but in opposite directions, i.e. male-biased in FAS II-a and female-biased in FAS II-b. In addition, the FAS transcript FAS II-b showed indications for differential expression between the species and might be a potential candidate involved in the generation of the divergent CHC profiles of these grasshopper species. The FAS II-a was previously identified in a population genomic scan on C. biguttulus and C. mollis, indicating that this locus is under selection 43 . Looking at the coding sequence we found one non-synonymous substitution, but no significant evidence for positive selection (dN/dS = 0.103). All the Chorthippus sequences of cluster II lack the MAT domain. This domain is responsible for substrate recruitment and loading 14 . Thus, it is unclear whether these transcripts code for functional proteins. However, in Tribolium castaneum, an RNAi knockdown of TC15337, that also lacks the MAT domain, leads to a mortality of 60% and 40% after larval and pupal injection, respectively 44 . This suggests that TC15337 codes for a functional protein, but it is yet unknown whether it codes for a FAS or another protein.
The grasshopper FAS transcript of the third cluster III showed female-biased expression. This FAS exhibit modifications in the DH, ER, KR, PP, and TE domains that were either truncated or completely lost. A putative FAS in T. castaneum (TC000238) has a very similar domain structure and RNAi knockdown implies that this protein is functionally active (100% mortality after larval injection) 44 . All FAS sequences in cluster III had modifications in one or more domains, with the exception of the KS and MAT domain. In order to examine whether these FASs are derived from a common ancestor sequence or were derived by lineage-specific duplication of FASs, we looked at the phylogeny of the KS and MAT domains separately ( Supplementary Fig. S2). The results of the phylogeny of these domains, together with the inconsistencies of domain modification of the FASs in cluster III, indicates that these genes are most likely derived independently by lineage-specific duplications from FASs in cluster II.
Our FAS phylogeny has shown that many insect groups showed lineage-specific expansions of FAS copies, especially ants, beetles, and aphids. Ants and beetles are known for their highly diverse and complex CHC profiles. Almost 1000 different CHCs have been characterized in ants, of which dimethyl-branched alkanes is the largest group (> 600 compounds) 6 , and beetles show a similar diversity of methyl-branched CHCs 7 . Thus, the expansions of FAS copies observed in some insect groups might be an indication that multiple FASs are involved in the generation of such a great CHC richness.
Early studies on the fatty acid biosynthesis in insects [45][46][47] and vertebrates 48,49 suggest that a single FAS can synthesize both straight-chain and methyl-branched fatty acids. FASs of the bug Triatoma infestans (Hemiptera) 46 , the housefly Musca domestica 47 , and the fruit fly D. melanogaster 45 can incorporate both malonyl-CoA and methylmalonyl-CoA during chain elongation, resulting in methyl-branched fatty acids. However, a recent study of CHC biosynthesis in Drosophila indicates that methyl-branched CHCs are synthesized by a special FAS 12   The regular FASs release fatty acids with chain length up to 16, with palmitic acid (C16:0) as major product 10 . Thus, the production of long-chained CHCs depends on elongases that elongate the medium-chain fatty acids to very-long chain fatty acids. The elongase family comprises two subfamilies, the S/MUFA and the PUFA subfamily 15 . Members of the S/MUFA subfamily are thought to elongate saturated and monounsaturated fatty acids, while members of the PUFA subfamily elongate polyunsaturated fatty acids. However, this classification is largely based on functional characterization in mammals, whereas the specificity of elongases in insects needs not fit into this classification 50 .
In insects, the S/MUFA subfamily forms a single highly conserved cluster (baldspot) and the PUFA subfamily is expanded into ten distinct clusters, of which eight were insect specific (Fig. 3). In contrast to FASs that showed multiple lineage-specific duplications, elongase clusters contained only a single copy per insect species, except for the EloF and james bond cluster that showed expansions in D. melanogaster and A. pisum, respectively. Beside the 11 elongases in these ancient clusters, Chorthippus grasshoppers possess an additional copy without an ortholog in other insect orders. The expression pattern of elongases was similar in both Chorthippus species, but three elongases (EloF, CG30008, and CG5326 orthologs) showed male-biased gene expression. Interestingly, in D. melanogaster the EloF (CG16905) gene shows female-biased expression and is involved in the biosynthesis of sexually dimorphic CHC profiles 51 . Fruit fly males have CHCs with chain length of C23 and C25 and females with C27 and C29. RNAi knockdown of EloF induced a decrease of C29dienes and an increase of C25dienes. In contrast, the CG18609 gene (EloF cluster) shows a male-biased expression 52 and a RNAi knockdown results in a strong decrease of total CHCs in males but not in females 53 . In the honeybee, Apis mellifera, two elongases, GB54399 and GB40681, are positively correlated with the production of methyl-branched CHCs 50 . The expression of GB54399 (james bond ortholog) is correlated with monomethyl-branched CHCs, while GB40681 (CG30008 ortholog) is highly correlated with dimethyl-branched CHCs (11,9,3,. Thus, the male-biased expression in the EloF and CG30008 orthologs makes both genes candidates for the biosynthesis of a higher proportion of diMeC35 in males of C. biguttulus (3.0-fold) and C. mollis (1.7-fold).
In conclusion, we demonstrated that the CHC profiles of the grasshopper species, C. biguttulus and C. mollis, differ in the first methyl-branch position in multimethyl-branched CHCs. The high sequence similarity of ortholog pairs and the absence of positive selection acting on FAS and elongase genes in Chorthippus species suggest that the variation in CHC profiles in closely related species is mainly mediated at the transcriptional level. Similar conclusions can be drawn from the Drosophila sister species D. serrata and D. birchii 12 . Both species have a functional FASN2 gene, responsible for the biosynthesis of 2-MeCHCs, but D. birchii has lost the FASN2 expression in oenocytes, due to cis-regulatory changes. However, the research about the biosynthesis of internally methyl-branched CHCs and its transcriptional regulation is still in its infancy. Although several hundreds of methyl-branched CHCs are known from insects, the enzymatic machinery behind this diversity is largely unknown. In particular, we need a better functional characterization of the FAS and elongase families in insects. Our study has shown that both the FAS and elongase family exhibit an increase in copy numbers in insects. However, the evolutionary histories of both protein families are distinct different. The elongase family has undergone a rapid expansion in the ancestor of insects resulting in eleven paralogs of which eight are insect specific. After this ancestral expansion, the copy numbers did not further increase, with some exceptions. In contrast, the FAS family showed only a single duplication in the ancestor of pterygote insects that was followed by multiple independent lineage-specific expansions. Interestingly, insect groups known for a high diversity of methyl-branched CHCs, as ants or beetles, have high numbers of FAS copies. At least in Drosophila, the biosynthesis of methyl-branched CHCs can be linked to duplication and neofunctionalization of a FAS gene. However, it remains to be tested whether diversification of methyl-branched CHCs is really driven by the expansion of FAS genes.  (Table S4).

Methods
All individuals were caught as late instar nymphs (3 rd & 4 th ) and were subsequently kept in a common room at 25-30 °C, 25-30% relative humidity, and a 16:8 h light:dark cycle. Grasshoppers were fed ad libitum with a mixture of different grasses (Festuca rubra rubra, Dactylis glomerata, Poa pratensis) (seeds from Revierberatung Wolmersdorf Nindorf, Germany). After the final molt, individuals were separated by sex to ensure virginity.
Individuals used for RNA extraction were killed by decapitation within 7 days after their final molt, their gut was removed, and they were stored in liquid nitrogen or in RNAlater (Qiagen, Limburg, Netherlands), due to storage capacity in the liquid nitrogen tank. For RNAlater storage, samples were cut into pieces and incubated in RNAlater at 4 °C overnight, the tissue was removed from the RNAlater and stored at − 80 °C. Although collection dates, populations, and preservation methods vary, both species were always caught together. Therefore species differences are not affected by difference in collection date, population or preservation method.
Extraction of cuticular hydrocarbons. Grasshoppers were frozen at − 20 °C four to six days after their final molt. Grasshoppers were thawed for 15 min at room temperature before hydrocarbons were extracted by immersing an individual in 1 ml of n-hexane for 5 min 41 . Samples were stored at − 20 °C until further analysis. Cuticular extracts were concentrated under a gentle stream of nitrogen to a volume of 100 μ l. A blank hexane sample was treated the same way to control for potential contamination of samples.
Chemical analysis. In order to examine species or sex specific difference in CHC profile, chemical identification of cuticular extracts was performed on a coupled gas chromatograph-mass spectrometer (GC-MS) system (7890A GC-5975C MSD; Agilent, Waldbronn, Germany) equipped with an Agilent 7693A automatic liquid sampler for injection. An aliquot of 1 μ l of each sample was injected in splitless mode at 300 °C. A fused silica column (ZB-5HT Inferno, 30 m × 0.25 mm × 0.25 μ m, Phenomenex Inc., Torrance, CA, USA) was used for separation with a constant helium flow of 1 ml/min. The oven temperature program was started at 100 °C and then heated to 320 °C at a rate of 10 °C/min (20 min isotherm). Electron impact ionization was 70 eV.
Hydrocarbons were identified by their mass spectra and corroborated by their retention indices 54 . Peak areas relative to total peak area were computed for each compound, and peaks that occurred in less than 10 individual CHC profiles were discarded from further analyses. Prior to multivariate statistics, the data were transformed as follows: z ip = ln[A ip /g(A p )], where A ip is the area of peak i for individual p, g(A p ) is the geometric mean of all peaks for individual p, and z ip is the transformed area of peak i for individual p 55 . As the logarithm is not defined for zero values, a constant of 0.01 was added to each relative peak area 56 .
As internally branched alkanes (first methyl-branch at position ≥ 9) could not be sufficiently separated by GC, we used the ratios of the peak heights of their diagnostic fragment ions, i.e. m/z 140 (position 9), m/z 168 (position 11), m/z 196 (position 13), and m/z 224 (position 15), as an approximation to the relative composition of the respective methyl-branched alkanes (Supplementary Table S2).

Statistical analysis.
For quantitative comparisons of the CHC phenotypes, a Principal Component Analysis (PCA) was performed on 34 variables (peaks) and 125 individuals using "FactoMineR" package 57 in R 58 . By using the PC scores for each individual on PC 1-5 we tested for differences between the two species, the sexes within species and the interaction of species and sex. We first ran multivariate analysis of all five PCs and then continued the statistical analysis by running 5 linear models with the pc scores as dependent variable and species and sex as explanatory variables with the interaction of species × sex in R with the lm() function. Tukey's HSD post hoc tests were used for pairwise comparisons of males and females within a species and across species with TukeyHSD() function. All analyses were performed in R (version 3.2.2).

Identification and ortholog assignment of fatty acid synthases and elongases in Chorthippus.
We took a transcriptomic approach to identify candidate genes for CHC synthesis. Based on a literature search, 22 reference protein sequences from Drosophila melanogaster related to CHC biosynthesis (i.e. 3 FASs and 19 elongases) were downloaded from FlyBase (http://flybase.org) (Supplementary Table S5). In order to identify homologs in Chorthippus grasshoppers, we used tblastn to compare our set of 22 reference proteins to a reference transcriptome of C. biguttulus and C. mollis respectively (Mayer et al. unpublished). We retained up to 10 hits per protein with a cut-off e-value of 10 −5 . Best hit transcripts (putative homolog) for each candidate were determined based on highest sequence identity and lowest e-value. Orthologs were then assigned by reciprocal best hits, using the C. biguttulus and C. mollis candidates 59 .
RNA extraction and sequencing. We wanted to determine if any of our candidate genes were differentially expressed between sexes or species. We collected 12 individuals of each species originating from two populations (three males and three females per population). Whole body samples were individually homogenized in TriFast using a MINILYS homogenizer with the Precellys ceramic kit (1.4/2.8 mm) (all from peqlab, VWR International GmbH, Erlangen, Germany). Total RNA was extracted from the samples following the manufacturer's instructions (for peqGOLD TriFast) except that samples that had been stored in RNAlater were precipitated with isopropanol that had been diluted 1:2 with nuclease free water. All total RNA samples were checked for purity and quality using a NanoDrop spectrophotometer (NanoDrop Products, Wilmington, DE, USA) and a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Total RNA samples were determine as pure with a 260/280 value of ~2.0 and a slightly higher 260/230 value associated. If total RNA samples showed strong differences in absorbance, a re-extraction with 1 ml peqGOLD TriFast was performed. All samples showed no visible RNA degradation at Agilent RNA 6000 Pico Assay electropherogram. For mRNA isolation and to decrease ribosomal RNA contamination, an mRNA enrichment was performed using the Dynabeads mRNA Purification Kit (Life Technologies, Carlsbad, CA, USA).
For Illumina sequencing, we prepared directional, strand specific RNA libraries using the NEXTflex

Differential expression analysis.
After sequencing, we determined if any of our candidate genes were differentially expressed between species or sexes using the Trinity differential expression pipeline 60 . Three biological replicates per sex per species (24 total) were used in the Trinity pipeline for differential expression analysis. For abundance estimation, reads from all samples were aligned against the subset of candidate transcripts from the C. biguttulus reference using bowtie 61 . Then, expression values were estimated using RSEM 62 . Differentially expressed transcripts were extracted using the DESeq2 algorithm 63 with a trimmed mean of M-values normalization. Only contigs with an absolute value of log 2 fold change > 1 and a P-value < 0.05 were classified as differentially expressed and P-values were corrected for multiple testing 64 . We used counts as dependent variable and species and sex as explanatory variables with the interaction of species × sex. We compared the outcome of the DESeq2 package with the results of the egdeR 65 algorithm. Both methods revealed identical differentially expressed contigs, although P-values differed. For the sake of clarity, results are shown only for the DESeq2 algorithm, because this algorithm is more conservative than the edgeR algorithm 66 .
Scientific RepoRts | 6:33695 | DOI: 10.1038/srep33695 Coding sequence divergence analyses and estimation of substitution rates. In addition, we wanted to test whether our candidate FAS and ELO genes have undergone purifying or positive selection. To do this we estimated rates of nonsynonymous (dN) and synonymous (dS) substitutions between C. biguttulus and C. mollis. Based on the tblastn results of C. biguttulus and C. mollis (see Identification of FAS and ELO orthologs above) we calculated dN and dS substitutions for the FAS and ELO orthologs (Table 3) which we had identified before. Reads from all 12 C. biguttulus and 12 C. mollis (see differential expression analysis above) were pooled by species in silico then aligned to the C. biguttulus reference transcriptome (Mayer et al. unpublished). SNPs were called as described in Berdan et al. 43 and used to create two "species-specific" transcriptomes using the FastaAlternateReferenceMaker from GATK 67 . We then used 'transdecoder' (part of the TRINITY package 59 ) to determine Open Reading Frames (ORFs) and estimated dN/dS following the Yang & Nielsen approximate method 68 implemented in KaKs_Calculator (Version 1.2) 69 .
Phylogenetic analysis of fatty acid synthases and elongases. Finally, we wanted to examine FAS and ELO evolution at the level of class (insect). We reconstructed a phylogeny of the fatty acid synthase and elongase families. We first translated the nucleotide sequences of C. biguttulus and C. mollis into amino acid sequences using the translate tool server at http://www.expasy.org/tools/dna.html/. The open reading frames (ORFs) of these protein sequences were then used to extract homologs from selected representatives of all available insect orders from OrthoDB v8 70 , the NCBI database and AphidBase (http://www.aphidbase.com/aphidbase/) by a blastp search (see Supplementary Table S5 for selected species and Genbank accession numbers). The amino acid sequences were aligned using MAFFT version 7 71 . Prior to the phylogenetic analysis, poorly aligned positions were eliminated using GBlocks software 72