Comparative analysis of phenotypic plasticity sheds light on the evolution and molecular underpinnings of locust phase polyphenism

Locusts exhibit one of nature’s most spectacular examples of complex phenotypic plasticity, in which changes in density cause solitary and cryptic individuals to transform into gregarious and conspicuous locusts forming large migrating swarms. We investigated how these coordinated alternative phenotypes might have evolved by studying the Central American locust and three closely related non-swarming grasshoppers in a comparative framework. By experimentally isolating and crowding during nymphal development, we induced density-dependent phenotypic plasticity and quantified the resulting behavioural, morphological, and molecular reaction norms. All four species exhibited clear plasticity, but the individual reaction norms varied among species and showed different magnitudes. Transcriptomic responses were species-specific, but density-responsive genes were functionally similar across species. There were modules of co-expressed genes that were highly correlated with plastic reaction norms, revealing a potential molecular basis of density-dependent phenotypic plasticity. These findings collectively highlight the importance of studying multiple reaction norms from a comparative perspective.


Behavioural assays
To quantify reaction norms in behaviour in response to rearing density, we used the assay system established by Roessingh, et al. 3 initially for the desert locust. In short, the assay utilized a rectangular arena (57 × 31 × 11 cm) with a chamber at each end: an empty non-stimulus chamber to simulate a low-density condition and a stimulus chamber containing 30-50 conspecific last instar nymphs to simulate a high-density condition. For each assay, a test subject, reared in either isolated or crowded condition, was placed into a blackened 50 ml Falcon tube for two minutes to reduce the effect of handling and then slowly introduced into the arena via a hole in the center of the arena floor. A video camera suspended directly above the arena was used to record behavioural responses of the test subject to the stimuli for 12 minutes. We performed this assay 3-5 days after nymphs molted to the last nymphal instar, as previously described 1,2 with a few modifications. We quantified 50 individuals per density condition per species, thus a total of 400 individuals. For piceifrons, we used either a Panasonic HDC-TM900 (704x480, 29.97 frames/second) or a Basler acA1300 -60gc (1280x1024, 60 frames/second), while for americana, cubense, and nitens, the assays were recorded with either a Basler acA1300 (720x480, 29.97 frames/second) or a Basler acA1300 -60gc (1280x1024, 60 frames/second). We used EthoVision XT 12 (Noldus Information Technology Inc., Leesburg, VA) software to video-track behaviour, analysing 12 minutes of each assay at a frame rate of 30 frames per second. The arena was divided into three equal zones using the EthoVision software: a stimulus zone (the third adjacent to the stimulus chamber), a non-stimulus zone (the third adjacent to the non-stimulus chamber), and a neutral zone (the central third). We also designated all four walls (top, bottom, left, right) as a wall zone to measure climbing activity. The optimized detection settings differed for each species and camera used, but consistently used 'differencing' as a detection method. The subject colour was set as 'darker and lighter' than the background, so that both light-coloured and dark-coloured individuals could be detected, except for nitens, where the optimal detection was obtained when choosing 'darker'. Sensitivity and contour erosion/dilation values were highly dependent on both the species and the camera the videos were taken with. For each behavioural assay, raw position data were acquired and all tracks were manually inspected and corrected where necessary. Before exporting data, tracks were smoothed using the 'Minimum Distance Moved' (MDM) filter. This filter removes inherent noises from the video tracks, so that only true movement is captured. It was optimized based on the effect of different MDMs on the 'distance moved' variable, as noted in a previous study 4 . For all species in this study, the optimal MDM was 0.2 cm. Filtering by the MDMs resulted in filtering out actual movement in certain frames, leading to a spotty pattern of movement, and as a direct result, several dependent variables that EthoVision output as default, including 'heading', 'turn angle', 'meander', 'mobility', and 'angular velocity', were deemed to not accurately represent the behavioural variables for the test subjects, and were thus excluded from subsequent analyses. After excluding these variables, we kept a set of five variables as activity-related variables in our final analysis: 'distance moved' describing the distance moved in centimeters, 'movement' describing the time spent moving in seconds, 'velocity' describing the average velocity (cm/second) of the test subject, 'rotation frequency' describing the number of rotations and 'wall climbing' describing the time spent (in seconds) in any of the arena zones designated as wall. For two of these variables, we made the following adjustments before statistical analyses. The 'movement' was calculated as the number of frames representing movement above a certain velocity threshold. To cope with the disjointed pattern of movement caused by the MDM-filter, the 'movement' variable was further optimized by visually comparing it to the actual video. The best estimate of actual movement was found by averaging the movement over 10 samples. The 'velocity' variable exported by EthoVision did not accurately assess the actual average velocity during the assay (see Kilpatrick,et al. 2 ), so we calculated a new 'velocity' variable by manually dividing the 'distance moved' by the 'movement'. At last, a total of five attraction-related behavioural variables were exported: 'stimulus zone time' describing the time spent (in seconds) in the third of the arena close to the stimulus and the walls around it, 'neutral zone time' describing time spent (in seconds) in the middle third of the arena and its walls, 'non-stimulus zone time' describing time spent (in seconds) in the last third of the arena closed to the nonstimulus chamber and the walls around it, 'stimulus wall time' describing the time spent (in seconds) on the wall facing the stimulus chamber, and 'final distance to stimulus wall' describing the linear distance (in cm) between the test subject and the stimulus wall, bringing the total of behavioural variables to ten.

Statistical analysis of behavioural reaction norms
To test whether the behavioural responses differed among the species and the density treatment combinations, we used the following statistical methods. First, we used nonparametric multivariate analyses as implemented in the R package nparMD 5 . The behavioural data we collected were often resistant to transformation and highly skewed by zero values, so these nonparametric techniques enabled us to analyse these non-normal data. We split analysis of our behavioural variables according to measurement unit (i.e., separate analyses for seconds and centimeters). Secondly, we analysed each behavioural variable separately using generalized linear models (GLMs). Due to the aforementioned zero inflation found in several of these behavioural variables, we used a hurdle model approach that first used logistic regression of zero and nonzero data and then a model for all of the nonzero data. We used species, density, and the interaction between species and density as independent variables for each GLM. During the logistic regression step of the hurdle model, we partitioned data into nonzero and zero categories and used a binomial distribution. As our nonzero data were continuous, positive, and skewed, we used a gamma distribution with a log link for the second part of our hurdle models. All results were adjusted for multiple comparisons using the Bonferroni correction. Lastly, we determined the overall similarity in behaviours among all of the density-species treatment pairs. We reduced the number of behavioural parameters with a principal components analysis (PCA) and used the first three axes totaling 69.59% of the variation to create hypervolumes for each species in the overall, isolated, and crowded situations using the hypervolume package 6 . For each hypervolume, we calculated pairwise distance to centroid and similarity according to the Sorenson index. We created three sets of hypervolumes for these comparisons: one using all behavioural values, one using only behavioural values from the crowded individuals, and one using behavioural values from the isolated individuals. All statistical analyses were performed in R version 3.3.3 7 with all results included in an RMarkdown output file (Supplementary Data .

Analysis of black patterning
The amount of black patterning was quantified from 30 individuals for each species and rearing condition. Isolated-reared and crowd-reared nymphs were frozen in their last nymphal instar and stored at -20°C. High-resolution digital images were taken with a LK Imaging System (Visionary Digital) with SpyderCHECKR 24 software for colour correction (Datacolour, Lawrenceville, NJ). For each specimen, images of four different body parts were captured: the frontal view of the head, the lateral view of the pronotum, the left hind femur and the left wing pad. The extent of black patterning in each body part was quantified using the R package patternize 8 in R version 3.6.2 (R Core Team 2017), as previously described in Kilpatrick et al. 2 . Patternize is a pattern identification program that can define homology between patterns across specimens either through manually placed homologous landmarks or through the simple alignment of the patterns themselves 8 . Subsequently, it uses various image manipulation techniques to obtain the exact distribution of each colour of interest 8 . Each digital image was first reduced in size to a JPEG file with a width of 1,000 pixels, after which the polygon tool in ImageJ 9 was used to place between eight and 12 landmarks on each of the investigated body regions, which were saved as xy-coordinates. Additionally, the outlines of the different structures were saved as xy-coordinates by drawing their complete circumference in one representative image in ImageJ, again using the polygon tool. Using the set of landmarks and the respective structure outline, images were aligned with the PatLanRGB-function of patternize, and heatmaps of the extent of black patterning were generated separately for each body region, species and rearing condition by using a set RGB-value of (0,0,0) with a cutoff of 0.15. Finally, a linear PCA was used to better show clustering of specimens. First, patternize generates specimen-specific matrices of pixel coordinates in which coordinates that have the colour of interest have a value of one, while all other coordinates are assigned a value of zero. Subsequently, the variancecovariance matrix obtained from these binary matrices are used for a PCA that also allows visualizing the main variations in colour patterns between samples, as well as showing the predicted colour pattern changes along the principal component (PC) axis 8 .

Analysis of body size and morphometric ratio
For each species and rearing condition, three linear dimensions were measured from 15 males and 15 females. Images were calibrated in ImageJ using a ruler present in each image and three independent measurements were performed for each linear dimension, after which the average was used in all further analyses. The three measured dimensions were the length of the pronotum, the length of the hind femur (F) and the maximum width of the head (C). Additionally, the F/C ratio was calculated because this morphometric ratio is considered a reliable predictor of phase in locusts 10,11 . For the linear dimensions, statistical significance was assessed with a three-way ANOVA, with Tukey's test as a post-hoc test in R version 3.6.2 (R Core Team 2017).

RNA extractions
Isolated-reared and crowd-reared female nymphs were marked on their abdomen with a ceramic marker after molting to their last nymphal instar, and were dissected around 72 hours later. The ceramic marker did not have any influence on the behaviour (unpublished observation) and the marked tissues were not used for RNA extractions to further eliminate any unforeseen effect on the experiment. Only the specimens that molted in the morning were used, and all dissections occurred between 8 and 9 AM. Head and thorax tissues were preserved in RNAlater (Qiagen, Valencia, CA) at -20°C, following the manufacturer's guideline. A total of 40 specimens (5 biological replicates for each density condition x 4 species) were dissected for each tissue, bringing the total number of sequenced samples to 80. RNA was extracted using a Trizolchloroform extraction, followed by clean-up with a RNeasy mini kit using an on-column DNAse treatment with an RNAse-free DNAse set (Thermofisher Scientific, Waltham, MA, and Qiagen, Valencia, CA). RNA concentrations were measured with a spectrophotometer (DS-11, DeNovix, Wilmington, DE), and RNA integrity was analysed with a Fragment Analyser (Agilent Technologies, Ankeny, IA). RNA extracts were only used for sequencing if the ratios of absorbance values of 260 to 280 nm and of 260 to 230 nm were above 2 12 , and if RNA Quality Number (RQN) values were over 3.9 [13][14][15][16] .

Sequencing and read processing
Library preparation, sequencing, and pre-processing steps up to demultiplexing base call files were all performed at Texas A&M AgriLife Research Genomics and Bioinformatics Service. For library preparation, Illumina's TruSeq Stranded Total RNA Library Prep Kit was used and paired-end sequencing (150 bp) was performed using 8.5 lanes on an Illumina HiSeq4000 (San Diego, CA). Cluster identification, quality prefiltering, base calling and uncertainty assessment happened in real time with Illumina's software (HCS 2.2.68; RTA 1.18.66.3; default parameter settings). Demultiplexing base call files and formatting them into FASTQ files was carried out using Illumina's bcl2fastq script (version 2.17.1.14). For further processing, raw reads were imported into a personalized Galaxy environment 17 on a supercomputing cluster of the High-Performance Research Computing group of Texas A&M University (Ada, https://hprc.tamu.edu). We transformed reads to Sanger format with FastQ Groomer 18 and filtered them using Trimmomatic 19 . In Trimmomatic, bases were trimmed at both ends if their quality score was lower than 30, whole reads were trimmed with a sliding window of 3 bases and a minimum average quality score of 30, and finally all reads of less than 30 bp were discarded. Subsequently, FastQ Screen 20 was used to filter out reads from bacterial and other contaminating sources (UniVec core (June 6, 2015), PhiX (NC_001422.1), Illumina adapters, Gregarina niphandrodes genome (GNI3), Encephalitozoon romaleae genome (ASM28003v2), Escherichia coli genome (K12), Methylobacterium sp., Bosea sp., Bradyrhizobium sp., Klebsiella pneumoniae, Sphingomonas sp., Rhodopseudomonas sp. and Propionibacterium acnes). We also used FastQ Screen to assess the percentage of reads mapping to genomes of related species; consistently about 40% of all reads mapped to the genome of the migratory locust Locusta migratoria, while 5% mapped to the termite Zootermopsis nevadensis. Less than 5 % of all reads mapped to the human genome. Reads were re-paired after FastQ Screen using a Galaxy-tool based on the repair.sh script of bbtools (https://sourceforge.net/projects/bbmap/files/). The amount of reads for each sample, before and after filtering, can be found in Supplementary Table 11.

Transcriptome assembly and annotation
The filtered reads were used for transcriptome assembly using Trinity v2.2.0 21 (default settings, in silico normalization using the default value of 50 as max read coverage enabled). Transcriptomes were assembled separately for the head and thorax tissue of each species, resulting in eight different transcriptomes. Similar contigs were removed using CD-hit-EST 22,23 with a threshold of 0.9. An additional filtering step was performed with Transrate 24 , filtering out all contigs scoring under the suggested cut-off. The quality of these final transcriptomes was analysed with Trinitystats 21 and BUSCO (Benchmarking Universal Single-Copy Orthologs) 25 by comparing to single-copy orthologs of the lineage Insecta. The fraction of reads mapping back to their transcriptome was obtained with bowtie2 26,27 in the preset mode 'very sensitive, end-to-end' and flagstat [28][29][30] . We subsequently imported all eight final transcriptomes into a Geneious environment (R10.2.6; BioMatters, Ltd, Auckland, New Zealand), where they were annotated using a Blast2GO-plugin 31 . CloudBlast from Blast2GO was used to blast transcriptome sequences to the nr-database for arthropods. Mapping to protein domains was performed with Interproscan 32,33 on the supercomputing cluster of the High-Performance Research Computing group of Texas A&M University (Ada, https://hprc.tamu.edu). The RepeatMasker Web server 34 with RMBlast and 'other arthropods' selected, was used to discover repeats and transposons in our transcriptomes.

Differential expression analysis
Reads were mapped back to their respective transcriptomes using Bowtie2 in the preset mode 'very sensitive end-to-end alignment', with 'no-mixed behaviour' and 'no-discordant behaviour' disabled. We filtered the resultant mapping files with a threshold quality score of 3. The idxstats tool from Samtools 29 was used to generate count files. Differential expression analysis was performed with SARTools 35 , using both edgeR 36 and DEseq2 37 , in R 3.5.3 7 . For all three programs, all settings were kept at their defaults; the expression in crowd-reared individuals was compared to that of conspecific isolated-reared individuals. As such, we did a total of eight different comparisons. Five samples were removed from all further analysis after a preliminary investigation of gene expression. One crowd-reared and one isolated-reared cubense thorax samples were removed as they showed almost identical gene expression, suggesting possible contamination between the two. Furthermore, two crowded nitens thorax samples and one crowded nitens head sample were identified as extreme outliers, and thus removed. Thus, a total of 39 head samples and 36 thorax samples were used for all further analyses. Genes with an adjusted p value of ≤ 0.05 and an absolute value log2foldchange ≥ 1 in either one of the two programs were considered to be differentially expressed.
In an attempt to assess the overlap of differentially expressed genes between the different species, the reads from the head and thorax tissues of all species were mapped to the head and thorax transcriptome of piceifrons, respectively, using bowtie2 with settings as described above. Production of count files and discovery of differentially expressed genes were performed as described above. The overlap between differentially expressed genes in different species was assessed using an adapted Fisher exact test in the SuperExactTest package 38 in R.

Co-expression network analysis
We used weighted gene co-expression network (WGCNA) analysis 39 to analyse transcript co-expression patterns in the 75 samples. This method allows the clustering of genes with similar expression patterns in the analysed samples into modules, that are expected to share pathway membership or other functionalities 39 . Read counts for each sample, obtained by mapping reads to the appropriate piceifrons transcriptome, were normalized using the TPM (transcripts per million) method in R. Contigs for which the average raw read count was under 5 were removed, after which we retained only the top 60% of contigs with the most variable expression. Remaining contigs were used to generate a one-step unsigned co-expression network. Based on the scale-free topology criterion 40 , we set the soft-threshold power for calculating the adjacency matrix to 14 and 8 for respectively head and thorax, both resulting in an R 2 value of over 0.8. Subsequently, genes were hierarchically clustered based on the TOM (Topological Overlap Measure)-based dissimilarity. Modules were detected using DynamicTreeCut 41 , with a minimum module size of 30 and values of respectively 0.2 and 0.15 for head and thorax to separate branches in the dendrogram. Subsequently, we calculated the correlation between the eigengene of each module and several behavioural variables. The values for the behavioural variables ('distance moved', 'stimulus zone time', and 'non-stimulus zone time') were the average values obtained for these variables from the 50 specimens of its respective species and density treatment. In addition, we created an arbitrary variable named 'swarming', for which we gave a value of '1' for all crowd-reared nymphs of piceifrons, and a value of '0' for all others, as only piceifrons is the swarming locust. Gene networks were visualized in VisANT visualization software 42 , which was also used to select hub genes. Enrichment of gene ontology terms for sets of differentially expressed genes as well as for WGCNA modules was analysed with fatiGO 43 , which is integrated in Blast2GO.

Hexamerin sequence analysis
Hexamerin and hemocyanin sequences for L. migratoria, S. gregaria and other orthopterans were obtained from Genbank (https://www.ncbi.nlm.nih.gov/genbank/), Kang, et al. 44 and Locustmine 45 (http://www.locustmine.org: 8080/locustmine), and were imported in Geneious. Using a combination of Megablast and tblastx with default settings in Geneious, we obtained hexamerin-like sequences from our eight transcriptomes. The obtained sequences were aligned using MUSCLE version 3.8.24 46 , again with default settings, followed by manual curation to obtain full length sequences for each species. To examine the patterns of evolution for the nine hexamerin-like proteins that we obtained from the four Schistocerca species, we first aligned these 36 sequences with the 58 orthopteran hexamerin and hemocyanin sequences based on the conservation of amino acids using MUSCLE in Geneious. The aligned matrix (94 terminals and 2,933 aligned nucleotides) was analysed in a maximum likelihood framework by applying the GTRCAT model using RAxML 8.2.12 47 on XSEDE (Extreme Science and Engineering Discovery Environment, https://www.xsede.org) through the CIPRES Science Gateway 48 . Nodal support was evaluated using 1,000 replications of rapid bootstrapping implemented in RAxML. A summary of these sequences, their orthologues in S. gregaria and L. migratoria, and the Genbank accession numbers for the four species included in this study can be found in Supplementary Table 13.

Data Availability statement
All BioSample information and Raw sequencing data can be found under BioProject PRJNA633949 on the NCBI website. The transcriptomes produced during this project have been deposited as Transcriptome Shotgun Assembly projects at DDBJ/ENA/GenBank under the accessions GIOR00000000, GIOS00000000, GIOT00000000, GIOU00000000, GIOV00000000, GIPC00000000, GIOW00000000, and GIPD00000000. The versions described in this paper are the first versions, respectively GIOR01000000, GIOS01000000, GIOT00000000, GIOU01000000, GIOV01000000, GIPC01000000, GIOW01000000, and GIPD01000000. All sequences for hexamerin-like sequences were deposited in GenBank, and the associated accession numbers can be found in Supplementary Table 13. All raw data used for the analyses of behaviour, nymphal colouration, morphology, differential gene expression, and correlated gene networks, as well as R scripts used for these analyses have been published in Dryad Digital Repository [https://doi.org/10.5061/dryad.dz08kprwz].

Supplementary Figure 1:
Graphs showing all behavioural data for crowd-reared and isolatedreared nymphs of all four species. A series of raincloud plots for attraction-related behaviours (top row) and movement-related behaviours (bottom row) in crowded-reared and isolated-reared nymphs of piceifrons, americana, cubense and nitens. Density curves show the general patterns for each rearing condition, while jittered points detail the observed data.

Supplementary Figure 2:
Results of the hurdle model for all behavioural variables. a. Plots showing the results from the logistic regression portions of the hurdle models for all behavioural variables. b. Plots showing the results from gamma distribution portions of the hurdle models for all variables. For both plots, the x-axis depicts the species and species-density interactions, with the isolated-reared condition of nitens used as reference. The y-axis depicts the back-transformed model coefficients for each of the model parameters. For a, the y-axis represents the probability of finding a nonzero value for each variable, ranging from 0 to 1. For b, the y-axis represents the expected value of each variable. Error bars depict 95% confidence intervals.

Supplementary Figure 3:
Low overlap in density-responsive genes of the head tissue between four different species.
Heatmap showing all genes that show differential expression in the head tissue of at least one of the four species. For all species, the isolated-reared condition was used as reference, and for each gene, the highest log2foldchange of edgeR or DEseq2 was used. Blue colours represent genes that are downregulated in crowded-reared individuals while yellow colours represent genes upregulated under crowded-reared conditions. Green represents genes for which no significant expression differences were found.

Supplementary Figure 4:
Low overlap in density-responsive genes of the thorax tissue between four different species.
Heatmap showing all genes that show differential expression in the thorax tissue of at least one of the four species. For all species, the isolated-reared condition was used as reference, and for each gene, the highest log2foldchange of edgeR or DEseq2 was used. Blue colours represent genes that are downregulated in crowded-reared individuals while yellow colours represent genes upregulated under crowded-reared conditions. Green represents genes for which no significant expression differences were found. Figure 5: WGCNA correlation plot for the head tissue. Heatplot depicting the correlations between different modules of co-expressed genes of the head tissue and 11 behavioural traits. Stronger correlations are shown by more intense colours, with red signifying a positive correlation and blue an inverse correlation. The upper number in each square represents the correlation value between the eigenvalue of the module and the behavioural trait, calculated by WGCNA, while the lower value refers to the associated p value. Figure 6: WGCNA correlation plot for the thorax tissue. Heatplot depicting the correlations between different modules of co-expressed genes of the thorax tissue and 11 behavioural traits. Stronger correlations are shown by more intense colours, with red signifying a positive correlation and blue an inverse correlation. The upper number in each square represents the correlation value between the eigenvalue of the module and the behavioural trait, calculated by WGCNA, while the lower value refers to the associated p value. Figure 7: Hexamerin gene tree inferred in a maximum likelihood framework. We included all hexamerin-like proteins from our transcriptome data, as well as all known hexamerin sequences of Orthoptera and other representative insects as ingroup. We have recovered all orthopteran species forming a monophyletic group, implying that all orthopteran hexamerin can be traced to a single origin. Within Orthoptera, hexamerin went through many gene expansions, and within Schistocerca, we recovered 9 hexamerin-like proteins, shown in magenta, suggesting at least 9 expansion events. Figure 8: Differential expression of hexamerin-like proteins Graph representing the log2foldchange values of the 9 hexamerin sequences for both the head tissue and the thorax tissue for all four species. All values are based on the output of DEseq2, with as reference the isolated-reared condition. Significance was based on the adjusted p value of DEseq2 (*: p < 0.05, **: p < 0.01 and ***: p < 0.001.) Figure 9: Differential expression of JH-related proteins. Graph representing the log2foldchange values of genes involved in JH metabolism or known to interfere with or be regulated by JH, for both head and thorax tissue of piceifrons. All genes were identified based on reciprocal BLAST hits with annotated nucleotides present in Genbank. All values are based on the output of DEseq2, with as reference the isolated-reared condition. Significance was based on the adjusted p value of DEseq2 (*: p < 0.05, **: p < 0.01 and ***: p < 0.001).

SUPPLEMENTARY DATA FILES AND DESCRIPTIONS PROVIDED VIA THE DIGITAL REPOSITORY DRYAD
Files can be found here: https://doi.org/10.5061/dryad.dz08kprwz Supplementary Data File 1: Differentially expressed genes for the head tissue of piceifrons, and their annotation. This table lists all differentially expressed genes for the head tissue of piceifrons, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.

Supplementary Data File 2:
Differentially expressed genes for the thorax tissue of piceifrons, and their annotation. This table lists all differentially expressed genes for the thorax tissue of piceifrons, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.

Supplementary Data File 3:
Differentially expressed genes for the head tissue of americana, and their annotation. This table lists all differentially expressed genes for the head tissue of americana, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.

Supplementary Data File 4:
Differentially expressed genes for the thorax tissue of americana, and their annotation. This table lists all differentially expressed genes for the thorax tissue of americana, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.

Supplementary Data File 5:
Differentially expressed genes for the head tissue of cubense, and their annotation. This table lists all differentially expressed genes for the head tissue of cubense, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.
Supplementary Data File 6: Differentially expressed genes for the thorax tissue of cubense, and their annotation.
This table lists all differentially expressed genes for the thorax tissue of cubense, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.

Supplementary Data File 7:
Differentially expressed genes for the head tissue of nitens, and their annotation. This table lists all differentially expressed genes for the head tissue of nitens, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.

Supplementary Data File 8:
Differentially expressed genes for the thorax tissue of nitens, and their annotation. This table lists all differentially expressed genes for the thorax tissue of nitens, discovered by either DEseq2 or edgeR, and their annotation by Blast2GO. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were given for each contig. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.

Supplementary Data File 9:
Differentially expressed genes for all head tissues using the piceifrons transcriptome as reference, and their annotation. This table lists differentially expressed genes for the head tissues of piceifrons, americana, cubense and nitens, using the piceifrons transcriptome as reference. After aligning all filtered reads for the different species to the piceifrons head transcriptome, differentially expressed genes were calculated with both DEseq2 and EdgeR for each species separately. Any gene that is differentially expressed in at least one species, was included in the table. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were listed. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.
Supplementary Data File 10: Differentially expressed genes for all thorax tissues using the piceifrons transcriptome as reference, and their annotation. This table lists differentially expressed genes for the thorax tissues of piceifrons, americana, cubense and nitens, using the piceifrons transcriptome as reference. After aligning all filtered reads for the different species to the piceifrons thorax transcriptome, differentially expressed genes were calculated with both DEseq2 and EdgeR for each species separately. Any gene that is differentially expressed in at least one species, was included in the table. For both DEseq2 and edgeR, the Log2FoldChange (Log2FC) and adjusted p value (padj) were listed. Additionally, gene annotation was obtained with Blast2GO, and the following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.
Supplementary Data File 11: Enrichment of Gene Ontology terms of differentially expressed genes, for each species and tissue combination.
Fisher exact tests were performed in Blast2GO by comparing the Gene Ontology terms in the set of differentially expressed genes to the ones in the whole transcriptome. Results obtained for the four different species and two different tissues were all pooled together in a single table. The first three columns contain the full name (GO name), ID (GO ID) and category (GO category) of each GO term. For each species and tissue combination, the table contains the False Discovery Rate associated with a certain GO term (FDR), the amount present in the differentially expressed genes (# in DE genes) and in the whole transcriptome (# overall), and last the percentage of genes that were annotated with that specific annotation term within the set of all DE genes (% in DE genes) or within the full transcriptome (% overall).
Supplementary Data File 12: WGCNA module membership for each gene in the head tissue. Modules of co-expressed genes for the head tissue were obtained with WGCNA. For each contig, the module for which it obtained the highest membership value was listed in the column 'Module'. As an estimate of the importance of the particular contig in explaining the variation of phenotypic traits, a gene significance (GS) value and an associated p value was calculated for each tested behavioural trait. Additionally, the module membership (MM) of each gene was calculated for each module, in addition to the associated p value. Finally, the table contains the gene annotation of Blast2GO for each gene. The following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.
Supplementary Data File 13: WGCNA module membership for each gene in the thorax tissue. Modules of co-expressed genes for the thorax tissue were obtained with WGCNA. For each contig, the module for which it obtained the highest membership value was listed in the column 'Module'. As an estimate of the importance of the particular contig in explaining the variation of phenotypic traits, a gene significance (GS) value and an associated p value was calculated for each tested behavioural trait. Additionally, the module membership (MM) of each gene was calculated for each module, in addition to the associated p value. Finally, the table contains the gene annotation of Blast2GO for each gene. The following outputs were given: the gene description, based on BLAST hits (Gene description), contig length (Length), the Expect value associated with the best BLAST hit (E value), the mean similarity at the nucleotide level to its BLAST hits (Similarity), the Gene Ontology IDs (GO IDs) and Gene Ontology names (GO names), and finally the annotation confidence level, for which 'interpro' refers to sequences with a hit to the interpro database, 'blasted' refers to sequences with at least one BLAST hit to the arthropod nr database, 'mapped' was given to sequences for which gene ontology information could be retrieved from its BLAST hit(s), and 'annotated' was only given to sequences for which the mapped GO terms also scored above the default thresholds for annotation.