Introduction

When organisms independently evolve similar phenotypes in similar environments, parallel (or convergent) evolution is typically inferred (Langerhans and DeWitt, 2004; Schluter et al., 2004; Arendt and Reznick, 2008; Losos, 2011; Wake et al., 2011). By quantifying the degree of this parallelism, it is possible to assess the role of deterministic natural selection, as opposed to more random processes such as genetic drift (or idiosyncratic selection or genetic architecture) (see, for example, Stuart et al., 2017). This exercise has been central to the long-standing debate in evolutionary biology regarding the relative contributions of contingent versus deterministic processes in evolutionary trajectories (see, for example, Gould, 1989, 2002; Travisano et al., 1995; Conway Morris, 2003; Burbrink et al., 2012; Mahler et al., 2013; Elmer et al., 2014a). If determinism is high, phenotypic parallelism should be high, and so too should be parallelism in the genetic pathways underlying the selected phenotypes. Theoretically, with the correct measurement of predictor variables, the outcome of evolution can be similarly forecast at both the phenotypic and genetic levels. If, on the other hand, chance historical events or other idiosyncratic effects are the stronger determinants of evolution, parallelism should be low; in this case, evolution appears to be a contingent process that is not as easily anticipated or predicted.

Over the past decade, many studies have sought to detect the effects of repeated natural selection at the genomic level, often attempting to specifically relate parallel phenotypic changes to parallel genetic changes (see, for example, Wilding et al., 2001; Storz and Nachman, 2003; Rogers and Bernatchez, 2007; Egan et al., 2008; Schluter et al., 2010; Kaeuffer et al., 2012; Roesti et al., 2012; Roda et al., 2013; Marques et al., 2016). Although these genetic changes can be either protein coding or regulatory, the relative contributions of these two possibilities are not often investigated (but see Jones et al., 2012). Yet, understanding the role of regulatory differences, which can be shaped by both genetic and plastic effects, is important because gene expression levels are subject to natural selection and can produce a phenotype independent of protein-coding sequences (Pritchard et al., 2017). Indeed, a genome scan of marine versus freshwater threespine stickleback (Gasterosteus aculeatus) revealed that divergent loci found in multiple, independent pairs of populations are more common in regulatory than in coding regions of the genome (Jones et al., 2012). Gene expression analysis can also reveal regulatory differences between populations that do not manifest as visible phenotypes (such as physiological differences), and therefore can be overlooked (Pavey et al., 2010). In addition, when gene expression differences are due to adaptive divergence, expression levels in hybrids and immigrants could produce fitness deficits that would contribute to ecologically based reproductive isolation (Pavey et al., 2010). In short, focusing solely on gene sequences can miss important differences between ecotypes that can reveal how natural selection is driving adaptive divergence and speciation. Looking for parallel gene expression, where a gene is repeatedly and significantly upregulated (or downregulated) in one environment type as compared with another environment type, thus allows for a more complete evaluation of the degree of determinism in adaptive divergence.

Studies of gene expression in populations undergoing adaptive divergence (and, putatively, ecological speciation) have become common in the past decade. In studies of single pairs of populations, differences in gene expression have been found, to give just a few examples, in dwarf versus normal whitefish (Coregonus clupeaformis) (Jeukens et al., 2010), upper versus lower shore periwinkle snails (Littorina saxatilis) (Martínez-Fernández et al., 2010), low- and high-predation guppies (Ghalambor et al., 2015) and lake versus stream threespine stickleback (G. aculeatus) (Lenz et al., 2013). These studies are valuable for understanding adaptive divergence in the specific studied populations; however, the lack of replication (only a single population pair was considered in each) means that the results cannot be generalized to adaptive divergence between those ecotypes as a whole. By instead studying multiple ecotype pairs, the degree of parallelism can be used to evaluate how important the deterministic process of natural selection is in promoting adaptive divergence in gene expression. A number of studies have taken this expanded ‘parallelism’ approach based on data from microarrays (see, for example, Derome et al., 2006; Lai et al., 2008; Filteau et al., 2013; Morris et al., 2014). Microarrays provide useful information but, compared with whole-transcriptome methods (for example, RNA sequencing), will miss expression differences in genes not included on the array. Whole-transcriptome methods thus provide powerful tools for detecting parallel gene expression differences.

A growing number of studies avoid these first two limitations by analyzing whole transcriptomes for multiple population pairs, but a limitation remains. In particular, these studies have usually examined gene expression in wild-caught individuals (see, for example, Galindo et al., 2010; Manousaki et al., 2013; Westram et al., 2014), in which expression reflects an unknown combination of plastic and gene sequence effects. Plasticity in such situations can either promote or constrain adaptive divergence (Pfennig et al., 2010; Moczek et al., 2011; Fitzpatrick, 2012; Ghalambor et al., 2015; Oke et al., 2016); but, importantly, plastic expression changes are not encoded by sequence changes, and therefore cannot be used to predict evolutionary trajectories. Yet, other studies have shown that gene expression variation often can be heritable and has contributed to adaptive divergence in several species (Pritchard et al., 2017). Indeed, a study of expression quantitative trait loci in stickleback found that up to 98% of transcripts expressed were under additive genetic control (Leder et al., 2014). In this study, we use a common garden design to remove plastic effects in multiple independent ecotype pairs, thereby allowing us to evaluate heritable parallelism in gene expression.

Study system

The threespine stickleback has become a model organism for studying parallel adaptive divergence because of the replication of independent ecotype pairs in a variety of environments (review: McKinnon and Rundle, 2002). One type of ecotype pair that has proven particularly useful in this regard is that formed by parapatric lake–stream populations (see, for example, Moodie, 1972; Reimchen et al., 1985; Lavin and McPhail, 1993; Deagle et al., 1996; Reusch et al., 2001; Hendry et al., 2002; Aguirre, 2009; Berner et al., 2010; Eizaguirre et al., 2011; Kaeuffer et al., 2012; Roesti et al., 2012; Lucek et al., 2013; Ravinet et al., 2013; Oke et al., 2016; Stuart et al., 2017). Throughout its range, the lake ecotype has generally evolved a shallower body and more numerous gill rakers, adaptations for sustained swimming in open water while feeding on limnetic prey (Caldecutt and Adams, 1998; Berner et al., 2008; Kaeuffer et al., 2012). In contrast, the stream ecotype has generally evolved a deeper body and fewer gill rakers, adaptations for swimming in complex, flowing environments while feeding on benthic macroinvertebrates (Caldecutt and Adams, 1998; Berner et al., 2008; Kaeuffer et al., 2012).

Recently, a number of studies have tested for parallelism in lake–stream stickleback genetic sequences, finding—as for phenotypes—a combination of parallel and nonparallel divergence patterns (see, for example, Thompson et al., 1997; Hendry and Taylor, 2004; Deagle et al., 2011; Kaeuffer et al., 2012; Roesti et al., 2012; Feulner et al., 2015; Marques et al., 2016; Stuart et al., 2017). In contrast, only two studies have investigated whole-transcriptome differences between lake and stream stickleback. The first (Lenz et al., 2013) examined gene expression in the head kidneys of a pair experimentally infected with parasites in a common garden. Although that study found significant overall (multivariate) expression differences in some fish, importantly, it did not test for expression differences of individual genes between the control (uninfected) lake and stream fish, nor did it test for parallelism. The second study looked at whole-transcriptome patterns in head kidneys and spleens of four lake–stream pairs (Huang et al., 2016). That study found 139 genes that were consistently and significantly upregulated in one habitat as compared with the other. However, its use of wild-caught individuals prevents insight into whether expression differences were genetic as opposed to plastic responses.

We build on these previous studies by assessing the level of parallelism in both overall and individual gene expression for common garden-raised stickleback from two lake–stream pairs. We also test for patterns of antiparallelism, when gene expression is negatively correlated between replicate pairs; for example, a gene that is upregulated in the lake (relative to the stream) population in one watershed but is downregulated in the lake (relative to the stream) population in another watershed. This additional exploration is not often taken in studies of parallelism, genetic or otherwise, possibly because of the relatively more straightforward interpretability of parallelism as opposed to antiparallelism (but see Derome et al., 2006). However, antiparallel differential expression patterns could influence parallel phenotypic patterns (Derome et al., 2006). Overall, the results of our study will address how genomic parallelism is reflected at the level of gene expression, an important step toward the ultimate goal of identifying genes that play a part in adaptive divergence via expression differences.

Materials and methods

Animal collection and rearing

Between May and June 2013, we used unbaited minnow traps to sample stickleback from both the lake and stream of the Misty and Robert’s watersheds on Vancouver Island. These populations are in independent, isolated watersheds and have lake–stream average FST values for neutral loci of 0.121 and 0.045, respectively (Kaeuffer et al., 2012). Phenotypically, the two watersheds show pronounced parallelism between lake and stream fish (Kaeuffer et al., 2012; Stuart et al., 2017). Males in breeding condition and gravid females were retained and kept in coolers with air pumps for up to 8 h. We then killed the males and stripped eggs from the females in order to produce crosses using in vitro fertilization (each male was used to fertilize the eggs of only one female). We produced a pure-type cross from lake and stream of both Misty and Robert’s watersheds and kept fertilized egg masses in individual tubes at 4 °C for up to 4 days before they were shipped on ice to McGill University (Montreal, Canada). Once arrived (within 24 h), we transferred each egg mass to a separate 20 gallon aquarium in common garden conditions (identical husbandry conditions). Upon hatching, we fed the fish daily with brine shrimp and blood worms. We maintained the water in the tanks at 17 °C for the entire duration of the experiment. Light schedules were adjusted throughout the experiment to match the appropriate sunrise–sunset cycles on Vancouver Island in order that development would progress at a similar rate to that found in the natural habitat.

RNA sampling and library preparation

When the fish reached 2 years of age (range 663–703 days, because of differences in fertilization and birth dates), we used tricaine methanesulfonate to kill three randomly chosen fish from one stream family and one lake family from each of Misty and Robert’s watersheds, for a total of 12 fish. All fish were processed on the same day within 15 min of one another. Immediately after death, we dissected the liver from each fish and flash-froze it in liquid nitrogen. The liver was chosen because it is involved in a number of important processes in stickleback, including metabolism (Leder et al., 2009), cold tolerance (Orczewska et al., 2010), energy storage (Chellappa et al., 1989; Huntingford et al., 2001), immune function (Kurtz et al., 2006) and response to hypoxia (Leveelahti et al., 2011), all of which are related to the ecological differentiation between lake and stream sticklebacks. We then used the TRIzol Plus RNA Purification Kit (Thermo Fisher Scientific, Waltham, MA, USA) to extract and purify total RNA. We prepared individual libraries using the Illumina TruSeq Stranded mRNA Prep Kit (Illumina, San Diego, CA, USA), and 100 bp, single-end reads were sequenced on two lanes of an Illumina HiSeq 2000, with libraries spread randomly over the two lanes.

Genome alignment

Raw reads were quality filtered before read mapping using the following steps. All raw reads output to fastq files were 100 bp in length. We used Trim Galore! 0.4.2 (Krueger, 2014) to remove sequencing adaptors and trim read tails with a PHRED quality score below 20. We kept reads that were longer than 20 bp after trimming. We then aligned trimmed reads to the stickleback genome (version 86) downloaded from www.ensembl.org using HISAT2 2.0.5 (Kim et al., 2015; Pertea et al., 2016) with at most one distinct, primary alignment for each read. Output SAM files from HISAT2 were then sorted and converted to BAM files using SAMtools 1.3 (Li et al., 2009; Li, 2011). Finally, we counted gene hits of aligned reads from BAM files using HTSeq 0.6.1p1 (Anders et al., 2014).

Differential expression analysis

All further analyses were conducted in R (R Core Team. R Foundation for Statistical Computing: Vienna, Austria, 2015). We analyzed gene counts using the Bioconductor R Package edgeR 3.16.3 (Robinson et al., 2010). First, weakly expressed genes were filtered out if they had <1 count per million reads in three samples (Anders et al., 2013). All libraries were then simultaneously normalized with the TMM (trimmed mean of the M-value) method (Robinson and Oshlack, 2010) implemented in edgeR. The TMM method computes scaling factors that adjust the library sizes used in subsequent model fitting (see below). After applying the TMM method, most genes should have a unified expression level across all samples, and the scaling factors for all libraries should be close to 1 (Dillies et al., 2013). Scaling factors for our libraries ranged from 0.90 to 1.34. Next, the dispersion of the negative binomial distribution for the expression of each gene was estimated in edgeR. The square root of this dispersion represents the biological coefficient of variation of a gene’s expression. The dispersion was used to evaluate the expression variance, where a high dispersion value indicates high variance in a gene's expression among samples. Finally, we produced a multidimensional scaling plot using the pairwise biological coefficient of variation as a distance measure to visualize the overall relationships between individuals (Figure 1).
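The filtering step can be illustrated with a minimal Python sketch. It is not the edgeR code used in the study: the function names are hypothetical, CPM is computed from raw library sizes only, and the rule "keep a gene if it reaches 1 CPM in at least three libraries" is an assumed reading of the threshold described above.

```python
def counts_per_million(counts):
    """Convert one library's raw gene counts to counts per million (CPM)."""
    library_size = sum(counts.values())
    return {gene: 1e6 * n / library_size for gene, n in counts.items()}


def expressed_genes(libraries, min_cpm=1.0, min_samples=3):
    """Return the genes reaching `min_cpm` in at least `min_samples` libraries.

    `libraries` maps a sample ID to a {gene: raw count} dict. Both thresholds
    are assumptions made for illustration, not values taken from the authors'
    scripts.
    """
    cpm = {sample: counts_per_million(c) for sample, c in libraries.items()}
    all_genes = set().union(*(c.keys() for c in libraries.values()))
    kept = set()
    for gene in all_genes:
        # Count the libraries in which this gene passes the CPM threshold.
        n_pass = sum(1 for s in cpm if cpm[s].get(gene, 0.0) >= min_cpm)
        if n_pass >= min_samples:
            kept.add(gene)
    return kept
```

In this sketch a gene absent from a library is treated as having zero counts there.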

Figure 1

Multidimensional scaling (MDS) plot between individual samples. Distances between individuals correspond to the leading biological coefficient of variation (BCV) that represents the biological (that is, nontechnical) variation. ML, Misty lake; MS, Misty stream; RL, Robert’s lake; RS, Robert’s stream.

We first used a non-parametric permutational multivariate analysis of variance to look for overall differences in gene expression between the two watersheds (D'haeseleer, 2005; Zapala and Schork, 2006; Lenz et al., 2013). We constructed a distance matrix using Pearson’s correlation in the R package amap (Lucas, 2014) on log counts per million (log2(CPM)) values for each gene. We then used this matrix as the response variable in a model with habitat (lake or stream), watershed (Misty or Robert’s) and their interaction as the independent variables. We ran 25 000 permutations using the adonis function in the R package vegan (Oksanen et al., 2015). We then tested for correlation between the log2 fold change (log2FC) between lake and stream in Misty as compared with Robert’s across all genes using Pearson’s product-moment (PPM) correlation coefficient. For this test, a significant positive correlation would indicate that generally genes that are up- or down-regulated in one watershed will show expression differences in the same direction in the other watershed, indicating a trend of parallelism, whereas a nonsignificant or negative correlation would support a lack of parallelism in overall gene expression between the two watersheds. More specifically, a significant negative correlation would indicate a general trend of antiparallelism, where genes up- or down-regulated in one watershed will show expression differences in the opposite direction in the other watershed.
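The logic of the correlation test can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the significance test applied in the study is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists.

    Here x and y would hold, gene by gene, the lake-versus-stream log2FC
    values in the two watersheds.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A coefficient near +1 corresponds to parallelism (same direction in both watersheds), a coefficient near −1 to antiparallelism, and a coefficient near 0 to no overall trend.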

We then used two methods to look for genes significantly differentially expressed in parallel (hereafter ‘DEP’) between the two watersheds. In the first (hereafter ‘GLM method’), we fit a negative binomial generalized linear model (GLM) with habitat (lake or stream), watershed (Misty or Robert’s) and their interaction as the explanatory variables. This model uses the Cox–Reid profile-adjusted likelihood method to estimate both the common and gene-wise dispersions. After model fitting and testing using likelihood ratio tests, we obtained lists of genes that were significantly differentially expressed (at a false discovery rate of 0.05) for the habitat term in Misty and Robert’s. We then found the overlap of those two lists and evaluated the probability of the observed overlap from the background of all expressed genes (after filtering) using the phyper function in R (which uses the hypergeometric distribution to calculate the probability of overlap without replacement). We then subtracted from the overlap list those genes that were significant for the habitat by watershed interaction to obtain a list of genes that should be DEP between lake and stream. As this method could exclude genes that have parallel expression in the two watersheds but with a larger difference in one than the other, leading to a significant interaction term, we used a second method to look for DEP genes that did not rely on the use of interaction terms.
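The overlap test can be illustrated with a short Python sketch of the upper tail of the hypergeometric distribution. The function name is hypothetical, and the exact arguments passed to phyper in the study are not stated; the sketch computes the quantity that R returns for phyper(overlap - 1, n_list_a, n_background - n_list_a, n_list_b, lower.tail = FALSE).

```python
from math import comb


def overlap_upper_tail(overlap, n_list_a, n_list_b, n_background):
    """P(X >= overlap) for the overlap X between two gene lists.

    Models list B as `n_list_b` genes drawn without replacement from
    `n_background` expressed genes, of which `n_list_a` belong to list A.
    """
    max_overlap = min(n_list_a, n_list_b)
    numerator = sum(
        comb(n_list_a, k) * comb(n_background - n_list_a, n_list_b - k)
        for k in range(max(overlap, 0), max_overlap + 1)
    )
    # Exact integer arithmetic up to the final division.
    return numerator / comb(n_background, n_list_b)


# Gene counts reported in the Results for the GLM method.
p_glm = overlap_upper_tail(59, 384, 667, 12300)
```

Exact integer binomial coefficients are used so that very small tail probabilities are not lost to floating-point rounding before the final division.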

In the second method (hereafter ‘Two-model method’), we fit two separate single-factor models for each watershed using the quantile-adjusted conditional maximum-likelihood method and tested for differential expression using the exact test. From both models we obtained a list of differentially expressed genes (with false discovery rate of <0.05), then extracted from those lists the genes that were found on both lists (the probability of the observed overlap was again tested using the phyper function in R) and showed expression differences in the same direction (for example, upregulated in stream in both cases).

We then extracted genes that were found on the DEP lists from both methods and considered these to be our final list of DEP genes. To further confirm that these genes showed similar expression patterns in both watersheds we tested for correlation in log2FC values between the two watersheds using the PPM correlation coefficient. To determine whether more genes were found to be DEP than expected by chance, we constructed a null distribution using permutations. To build the distribution, we worked with the normalized read counts per gene and randomly permuted the habitat labels (lake or stream) among libraries (that is, all reads for one individual) within the same watershed while maintaining the sample ID of each library. This was repeated 1000 times, and each permuted data set was analyzed by the same steps used on the actual data. We then calculated the probability of observing the number of genes that were shared between the two methods given this distribution.
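The label-shuffling step can be sketched as follows. The differential expression pipeline applied to each permuted data set is not reproduced, all names are hypothetical, and defining the P-value as the fraction of permuted data sets with a count at least as large as the observed one is an assumption.

```python
import random


def permute_habitats(samples, rng):
    """Shuffle habitat labels among libraries within each watershed.

    `samples` is a list of (sample_id, watershed, habitat) tuples. Sample IDs
    and watersheds are left untouched; only the habitat labels move.
    """
    labels_by_watershed = {}
    for _, watershed, habitat in samples:
        labels_by_watershed.setdefault(watershed, []).append(habitat)
    for labels in labels_by_watershed.values():
        rng.shuffle(labels)
    # Hand the shuffled labels back out in the original sample order.
    cursors = {w: iter(labels) for w, labels in labels_by_watershed.items()}
    return [(sid, w, next(cursors[w])) for sid, w, _ in samples]


def empirical_p(observed, null_counts):
    """Fraction of permuted data sets with a count >= the observed count."""
    return sum(1 for c in null_counts if c >= observed) / len(null_counts)
```

In use, each permuted sample table would be passed through the full GLM and two-model analyses, and the resulting number of shared genes appended to `null_counts`.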

We next tested for antiparallel gene expression. Using both the GLM and two-model methods, we looked for genes that showed patterns of significant negative correlation. Using the GLM method, we found the intersection between the list of genes that were differentially expressed for the habitat term in both Misty and Robert’s, and the list of genes that had a significant interaction term. As this method could include genes that do not show negatively correlated expression patterns but rather genes that show a larger degree of differential expression in one watershed than the other, we also used the two-model method. We first took the list of genes that were found to have differential expression between lake and stream in both watersheds. We then extracted from this list genes that showed opposite expression patterns in the two systems. We then made a list of the genes that were identified using both methods to be antiparallel, differentially expressed genes (hereafter ‘APDE’). The overall correlation between the log2FC values of these genes in both watersheds was quantified using the PPM correlation coefficient. Using the same permutations as described above, we constructed a null distribution of the number of genes expected to be APDE under random processes.
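The direction-based classification in the two-model method can be summarized with a minimal Python sketch. The function and gene names are hypothetical, and the inputs are assumed to contain only genes already found significantly differentially expressed in each watershed.

```python
def classify_shared_genes(log2fc_misty, log2fc_roberts):
    """Split genes significant in both watersheds by direction of change.

    Each argument maps a gene ID to its lake-versus-stream log2FC in one
    watershed. Returns (parallel, antiparallel) sets of gene IDs.
    """
    shared = log2fc_misty.keys() & log2fc_roberts.keys()
    parallel = {g for g in shared if log2fc_misty[g] * log2fc_roberts[g] > 0}
    antiparallel = {g for g in shared if log2fc_misty[g] * log2fc_roberts[g] < 0}
    return parallel, antiparallel


# Toy values for illustration only.
misty = {"geneA": 2.1, "geneB": -3.0, "geneC": 1.2}
roberts = {"geneA": 1.7, "geneB": 2.5, "geneD": -0.9}
parallel, antiparallel = classify_shared_genes(misty, roberts)
```

The final DEP and APDE lists would then be the intersections of these sets with the corresponding lists from the GLM method.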

Functional analyses

We tested for the enrichment of Gene Ontology (GO) terms in our DEP and APDE gene sets with the Bioconductor R package topGO 2.26.0 (Alexa et al., 2006), based on Fisher’s exact tests. The gene pools against which we compared the DEP or APDE gene sets were the genes having passed the filtering step and entering the differential expression analyses (see above). Overrepresented GO terms were those with a multiple test corrected P-value (Benjamini–Hochberg’s false discovery rate) below 0.05.

Results

Transcriptome mapping

After trimming for quality, each library was composed of an average of 33 710 122 reads. On average, 86.24% of reads aligned to the stickleback genome; 13.21% of these mapped to multiple regions of the genome and were subsequently excluded from further analyses. Out of the 22 455 annotated genes retrieved from the stickleback genome (Ensembl version 86), an average of 16 543 genes were found expressed, and 12 300 genes were found expressed across all samples after filtering out weakly expressed genes as described above, and hence were kept for further analysis.

Overall gene expression

The first axis of the multidimensional scaling plot separated the two watersheds from one another, whereas the second axis separated habitats, although to a different degree in each watershed (Figure 1). The results of the permutational multivariate analysis of variance concur with the patterns displayed in the multidimensional scaling plot, with watershed explaining 33.86% of the variation (F=6.584, P=8.00 × 10^−5), habitat explaining 12.94% of the variation (F=2.516, P=0.037) and the interaction explaining a further 12.05% of the variation (F=2.35, P=0.049). The PPM coefficient on log2FC values for Misty versus Robert’s found a weak but significant positive correlation (ρ=0.08, P=2.2 × 10^−16) (solid line in Figure 2).
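As a rough consistency check, the reported F values can be recovered from the reported percentages of variation. The Python sketch below assumes one degree of freedom per model term and 8 residual degrees of freedom (12 libraries minus 4 fitted parameters); these degrees of freedom are not stated in the text, and the function name is hypothetical.

```python
def pseudo_f(r2_term, r2_residual, df_term=1, df_residual=8):
    """Pseudo-F for a model term from proportions of explained variation."""
    return (r2_term / df_term) / (r2_residual / df_residual)


# Proportions of variation reported above.
r2 = {"watershed": 0.3386, "habitat": 0.1294, "interaction": 0.1205}
r2_residual = 1.0 - sum(r2.values())
f_values = {term: pseudo_f(value, r2_residual) for term, value in r2.items()}
```

Under these assumptions the computed values agree with the reported F statistics to within the rounding of the published percentages.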

Figure 2

Scatterplot of log2FC values in Misty versus Robert’s. Colors for differential expression in Misty only, Robert’s only and neither are based on results from the two-model method. Colors for DEP and APDE are based on agreement between both methods. Gray points represent genes that were found to be DEP in the two-model method but not the GLM method. Solid line indicates correlation between all genes, dashed line indicates correlation between DEP genes and dotted line indicates correlation between APDE genes. DE, significantly differentially expressed.

Individual genes differentially expressed in parallel

Using the GLM method, we found 384 and 667 genes significantly differentially expressed between Misty lake and stream, and Robert’s lake and stream, respectively. Of these genes, 59 were found on both lists. The probability of this amount of overlap or more occurring by chance is 6.95 × 10^−14. Of these 59 genes, 32 also had a significant interaction term, meaning that the expression difference between habitats was different in the two watersheds, indicating nonparallelism. This left a list of 27 genes that were DEP between the two watersheds (Figure 3).

Figure 3

Venn diagrams showing the number of genes found to be DEP (shown in bold) in both methods. In GLM method, circles show the number of genes significantly differentially expressed in the Misty and Robert’s watersheds, and the number of genes found to have a significant habitat by watershed interaction (thereby indicating nonparallelism). The overlap between the two upper circles therefore represents DEP genes. In the two-model method, circles show the number of genes significantly differentially expressed in the Misty and Robert’s watersheds, and the number of genes with directional expression differences (thereby indicating parallelism). The overlap between all three circles therefore represents DEP genes. Lower diagram shows the DEP gene list overlap between the DEP lists derived from each method as shown in the upper diagrams.

Using the two-model method, we found 321 and 616 genes differentially expressed between Misty lake and stream, and Robert’s lake and stream, respectively. There were 53 genes in common between the two watersheds. The probability of this overlap or greater by chance is 2.17 × 10^−15. Of these genes, 28 were differentially expressed in the same direction (DEP) in both watersheds (Figure 3).

The intersection of the GLM and two-model methods gave us our final list of 22 DEP genes (Figure 3 and Supplementary Table S1). These genes showed a significant correlation in log2FC between Misty and Robert’s (dashed line in Figure 2; ρ=0.95, P=2.67 × 10^−11). This number is much greater than expected by chance; the P-value for having the same 22 genes shared by both methods was 0.004. Half of the 22 genes had a negative log2FC value, meaning they were upregulated in lakes, whereas the remaining 11 genes were upregulated in streams. Differential expression between habitats was greater for those genes upregulated in the lake: the mean log2FC of these genes was −4.79 as compared with 2.81 for genes upregulated in streams. Genes upregulated in lakes also tended to be more strongly expressed: their mean lake expression was 511 CPM, whereas mean stream expression for the 11 genes upregulated in streams was 65 CPM.

We also found slight differences in expression between watersheds; in the Misty watershed, mean DEP gene expression in lake and stream fish was 299 and 67 CPM, respectively, whereas in the Robert’s watershed, mean DEP gene expression in lake and stream fish was 220 and 83 CPM, respectively. Across all 12 300 genes (not just those that were DEP), Misty stickleback had mean gene expression of 80 and 89 CPM in the lake and stream, respectively, whereas Robert’s stickleback had mean gene expression of 80 and 78 CPM in the lake and stream, respectively. The magnitude of differential expression was similar between watersheds; for genes upregulated in lake, the mean log2FC was −5.47 in Misty and −4.11 in Robert’s and for genes upregulated in stream, mean log2FC in Misty was 2.84 and 2.78 in Robert’s.
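To put the log2FC values on a linear scale, the short Python sketch below converts them to fold changes and checks that the per-watershed means average to the overall means reported earlier. The function name is hypothetical, and because the averages were taken on the log scale, the converted values are geometric-mean fold changes.

```python
def fold_change(log2fc):
    """Magnitude of the expression ratio implied by a log2 fold change."""
    return 2.0 ** abs(log2fc)


# Mean log2FC values reported above (negative = upregulated in lake).
lake_up_mean = (-5.47 + -4.11) / 2    # Misty and Robert's, lake-upregulated genes
stream_up_mean = (2.84 + 2.78) / 2    # Misty and Robert's, stream-upregulated genes

lake_fold = fold_change(lake_up_mean)      # roughly a 28-fold difference
stream_fold = fold_change(stream_up_mean)  # roughly a 7-fold difference
```

On this scale, the genes upregulated in lakes differ between habitats by about four times as much as the genes upregulated in streams.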

Of the 22 DEP genes, 9 are novel and thus do not have known functions. The remaining 13 genes reflect a range of functions (Supplementary Table S1).

Antiparallel gene expression

Using the GLM method, we found that 32 of the 59 genes that were differentially expressed between lake and stream in both watersheds also had a significant interaction term, and hence were identified as being APDE. Using the two-model method, 25 of the 53 genes that were differentially expressed between lake and stream in both watersheds were expressed in opposite directions, and were classified as APDE. The 24 genes identified by both methods constituted the final list of APDE genes (Supplementary Table S2). There was a strong negative correlation between the log2FC values between watersheds of these 24 genes (dotted line in Figure 2, ρ=−0.92, P=1.54 × 10^−10). Finally, the number of APDE genes was significantly greater than would be expected by chance; the P-value for having the same 24 genes shared by both methods was 0.004.

Of the 24 APDE genes, 13 were novel and do not have any known function in fish (Supplementary Table S2). Of the 11 remaining genes, most do not have known functions in teleosts.

Functional analyses of DEP and APDE genes

Neither the DEP nor the APDE genes showed significant GO term enrichment (at a false discovery rate of 0.05). The top GO terms for DEP genes were ionotropic glutamate receptor activity (GO:0004970) in Molecular Function, intracellular (GO:0005622) in Cellular Component and branchiomeric skeletal muscle development (GO:0014707) in Biological Process. The top GO terms for APDE genes were adenosine deaminase activity (GO:0004000) in Molecular Function, intermediate filament (GO:0005882) in Cellular Component and neural tube formation (GO:0001841) in Biological Process.

Discussion

Our goal was to determine the level of parallelism in heritable gene expression for repeated cases of adaptive divergence. This work is an important complement to the many studies seeking to quantify parallel variation in genomic sequences, and thereby infer if the same mutations are repeatedly used during the process of adaptive divergence. These studies have generally found equivocal or variable answers; some genomic differences (typically evaluated by scanning for FST-outlier loci) are repeated, but many others are not (Nosil et al., 2009; Elmer and Meyer, 2011; Roda et al., 2013; Soria-Carrasco et al., 2014; Elmer et al., 2014b). As an example of parallelism, 1.4% of markers were FST outliers in all four pairs of dwarf and normal lake whitefish (Campbell and Bernatchez, 2004), and the same was true for 5% of the markers in four pairs of upper- and lower-shore ecotypes of the rough periwinkle (Littorina saxatilis) (Wilding et al., 2001). As other examples, between 0.2% (Jones et al., 2012) and 2.5% (Hohenlohe et al., 2010a) of the genome shows repeated differentiation between marine and freshwater populations of threespine stickleback. As an example in our study system (lake–stream stickleback), 0.2% of single-nucleotide polymorphisms were located in parallel ‘islands of genomic differentiation’ in multiple pairs (Marques et al., 2016).

Yet, other studies, often focusing on the same species in which parallel genomic signatures were found (as above), find little or no evidence for parallel differentiation; no overlap was found for outlier loci between two parallel cichlid radiations (Kautt et al., 2012), or in five normal and dwarf lake whitefish pairs (Renaut et al., 2011). This lack of parallel outliers could be because of the low number of markers used: 1030 amplified fragment length polymorphism markers and 112 single-nucleotide polymorphisms in the cichlid and whitefish studies, respectively. However, improved marker density does not always lead to the detection of parallel genetic differentiation; studies using 8417 single-nucleotide polymorphisms (Roesti et al., 2012) and whole-genome sequences (Feulner et al., 2015) found that none of the significant FST-outlier peaks found in lake–stream stickleback pairs were shared among all 4 and 5 populations, respectively, that were studied.

Overall gene expression differences

Given the wide range of genomic parallelism described above, we expected to find some parallelism in gene expression for lake versus stream stickleback, but did not know how large a fraction of expression it would represent. First looking for parallelism in overall gene expression, we found that habitat (lake versus stream) explained 13% of the total variation, but that the magnitude of the difference between lake and stream stickleback was larger in the Robert’s system than in the Misty system (Figure 1). This pattern for gene expression (some parallelism in direction but considerable differences in magnitude) is similar to the patterns of phenotypic parallelism found in other studies of lake and stream stickleback. For example, Berner et al. (2010) compared lake–stream stickleback pairs from Vancouver Island (Canada) and Switzerland, finding that body shape differed in a consistent direction between lake and stream fish, whereas the magnitude of the difference was much greater on Vancouver Island than in Switzerland. Even in populations that are much closer geographically, differences in the magnitude of divergence can be dramatic. For instance, Berner et al. (2009) examined phenotypic divergence in six Vancouver Island lake–stream pairs and found that, although distance downstream from a lake was a significant predictor of body depth and gill raker number, the strength of this distance–trait association varied dramatically among watersheds.

The parallel aspect of overall gene expression divergence documented here was also evident in the significant positive correlation between the log2FC values in Misty versus Robert’s. This trend indicates that a gene upregulated in one habitat type (lake or stream) in the Misty watershed tended (on average) to be upregulated in that same habitat type in the Robert’s watershed. Testing for such parallelism in overall gene expression is not common, with the focus typically being on just the genes showing significant differential expression (see, for example, Derome et al., 2006; St‐Cyr et al., 2008; Manousaki et al., 2013). Focusing only on individual genes could underestimate the true degree of expression parallelism if many of the expression differences between ecotypes are small, as might be expected if adaptive divergence is driven by (the expression of) many genes of small effect. A similar inability to detect small-effect loci affects studies of sequence differentiation between populations that rely on outlier genome scans; loci of small effect, especially those selected by soft sweeps (selection on standing genetic variation), can be missed in such scans (Storz, 2005; Teshima et al., 2006; Hohenlohe et al., 2010b), leading to an underestimate of parallel loci.
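As an illustration of this genome-wide test, the following minimal sketch correlates per-gene log2FC values between two watersheds and reports the fraction of genes that change in the same direction. The values are simulated and are not the data from this study; the function names (pearson_r, same_direction_fraction) and the simulation settings are assumptions made only for this example.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    return cov / math.sqrt(var_x * var_y)


def same_direction_fraction(x, y):
    """Fraction of genes whose log2FC has the same non-zero sign in both watersheds."""
    return sum(1 for a, b in zip(x, y) if a * b > 0) / len(x)


# Simulated lake-versus-stream log2FC values (hypothetical, for illustration only).
# Each gene gets a small shared habitat effect plus larger watershed-specific noise.
rng = random.Random(0)
n_genes = 2000
shared = [rng.gauss(0, 0.5) for _ in range(n_genes)]
misty = [s + rng.gauss(0, 1.0) for s in shared]
roberts = [s + rng.gauss(0, 1.0) for s in shared]

r = pearson_r(misty, roberts)
frac = same_direction_fraction(misty, roberts)
print(f"r = {r:.3f}, same-direction fraction = {frac:.3f}")
```

Because every expressed gene contributes to the correlation, a weak but consistent shared effect can be detected even when few individual genes pass a significance threshold.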

Our finding of overall differentiation in gene expression between lake and stream stickleback is contrary to the results found by Lenz et al. (2013), where overall gene expression did not differ between the lab-raised lake and stream fish that served as experimental controls (that is, had not been exposed to parasites). Many possible reasons exist for these contrasting results, including the different tissues used: head kidneys for Lenz et al. (2013) versus liver in our study. Although the liver plays a role in many processes in stickleback (see Materials and methods), head kidneys are frequently used to investigate immunological functions because of their high degree of specialization (see, for example, Kurtz et al., 2006; Bolnick et al., 2015; Stutz et al., 2015). Other potential reasons for the difference between studies could be the use of different populations (northern Germany versus Vancouver Island) that have different genetic histories (including standing genetic variation in the marine ancestor and time since colonization and divergence) and presumably different selection regimes (prey and predator communities, degree of human influence, temperature and so on). Differences could also be because of the ages of the fish (8.5 months for Lenz et al., 2013 versus 24 months in our study). More work will be necessary to establish how differences in overall gene expression might be context specific.

Differential expression of individual genes

Examination of the individual genes that show parallel differential expression is valuable, because these outliers are likely important in divergent adaptation. We found 22 such genes, representing 0.10% of all stickleback genes (22 of 22 455) and 0.18% of those genes that were minimally expressed (22 of 12 300; see Materials and methods). In comparison, a study of wild-caught lake–stream pairs found 73 genes differentially expressed in parallel in head kidney tissue and 74 in spleen tissue, representing 0.33% of all stickleback genes in each case (Huang et al., 2016). That the number of DEP genes is roughly triple in wild-caught fish as compared with lab-raised fish is not surprising; exposure to parasites, interactions with predators, changing abiotic variables and varied diets are all stimuli not encountered in the benign lab environment that could promote gene expression differences (either heritable or plastic) in wild fish. Indeed, Lenz et al. (2013) found that lab-raised lake and stream fish showed differential gene expression only when exposed to parasites multiple times, suggesting that many expression differences occur only in response to particular stimuli. Studies of parallel gene expression in other species have also shown variability in the percentage of genes found to be DEP. In microarray studies, 0.06% (Lai et al., 2008), 1.35% (Derome et al., 2006) and 2.39% (St‐Cyr et al., 2008) of genes were differentially expressed in parallel between multiple population pairs. Of course, caution must be used in comparing microarray experiments to the present whole-transcriptome study because genes chosen a priori to be included on a microarray might not be representative of the whole transcriptome.

How do the above numbers for gene expression parallelism compare with estimates of genomic parallelism at the sequence level? Of the two studies that have looked for genomic parallelism in lake–stream stickleback, one found three loci (0.31 to 0.35% of all single-nucleotide polymorphisms examined, depending on watershed) that were outliers in all three watersheds examined (Deagle et al., 2011), whereas the other found no outliers shared across all four watersheds examined (Roesti et al., 2012). Considering that the fish in our study expressed only 55% of all possible stickleback genes after weakly expressed genes were filtered out (12 300 of 22 455), a value of 0.18% does not seem unexpectedly low. More studies on sequence divergence between lake and stream ecotypes would allow for a more thorough exploration of the relationship between parallelism in sequence and gene expression.
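The percentages quoted in this section depend on which denominator is used (all annotated genes or only the genes retained after filtering). The short sketch below recomputes them from the counts given in the text; the variable names and the percent helper are introduced here only for illustration.

```python
# Counts taken from the text above.
TOTAL_GENES = 22455      # all stickleback genes
EXPRESSED_GENES = 12300  # genes retained after filtering weakly expressed genes
DEP_GENES = 22           # genes differentially expressed in parallel (this study)
HUANG_HEAD_KIDNEY = 73   # DEP genes in head kidney (Huang et al., 2016)
HUANG_SPLEEN = 74        # DEP genes in spleen (Huang et al., 2016)


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


expressed_share = percent(EXPRESSED_GENES, TOTAL_GENES)
dep_of_expressed = percent(DEP_GENES, EXPRESSED_GENES)
dep_of_all = percent(DEP_GENES, TOTAL_GENES)
huang_head_kidney = percent(HUANG_HEAD_KIDNEY, TOTAL_GENES)
huang_spleen = percent(HUANG_SPLEEN, TOTAL_GENES)

print(f"expressed genes:        {expressed_share:.0f}% of all genes")
print(f"DEP, of expressed:      {dep_of_expressed:.2f}%")
print(f"DEP, of all genes:      {dep_of_all:.2f}%")
print(f"Huang et al. (kidney):  {huang_head_kidney:.2f}% of all genes")
print(f"Huang et al. (spleen):  {huang_spleen:.2f}% of all genes")
```

Comparisons across studies are most meaningful when the same denominator is used on both sides.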

Potential functional roles of DEP genes

Of the 13 DEP genes with known function, two are from protein families shown to have a role in immune functions in fish: sigirr (single immunoglobulin and Toll-interleukin 1 receptor domain), which plays a role in inhibiting hepatic inflammation in zebrafish (Danio rerio) (Feng et al., 2016), and a TRIM gene, trim35-12 (tripartite motif containing 35-12), implicated in antiviral innate immunity in rainbow trout (Oncorhynchus mykiss) (van der Aa et al., 2009). Interestingly, four TRIM genes from the same TRIM family as that found in the present study (35-12) were found to show signatures of balancing selection in three freshwater and two oceanic stickleback populations (Hohenlohe et al., 2010a). The increase in polymorphism produced by this balancing selection could play a role in stickleback immunity similar to that seen in other teleosts (Hohenlohe et al., 2010a). Finally, in zebrafish, TRIM genes have been found in the same genetic regions as the major histocompatibility complex and its paralogs (Boudinot et al., 2011), immune genes that have the potential to drive reproductive isolation between lake and stream stickleback populations (Eizaguirre et al., 2011).

Of course, the hypothesis that these DEP loci are associated with immune function depends on the gene expression being constitutive, as part of the innate immune system. The fish were raised in a common garden and not exposed to parasites, and thus should not have experienced any immune challenge that would activate adaptive immunity (at least not differently between lake and stream). Indeed, studies in stickleback have found significant genetic variation in innate parasite immunity (Wegner et al., 2007). Regardless, most of the genes upregulated in lake fish, including myom1b and pacsin1b, which are involved in myogenesis (Lo et al., 2003) and embryonic notochord development (Edeling et al., 2009), respectively, are unlikely to have immune functions. The reason for the upregulation of these genes in lake fish is therefore an open question, but this study provides a basis for considering them as candidate genes in determining the genetic basis of divergent adaptation.

Similarly, the 10 genes that were upregulated in stream fish do not have obvious functional roles in the phenotypic divergence between lake and stream populations. The top two differentially expressed genes, akr1b1 and ret, do not have known functions in adult teleost fish. However, two of the other genes do: as3mt (arsenite methyltransferase) and cry1ab (cryptochrome circadian clock 1ab). The gene as3mt is involved in arsenic detoxification and has been found to be upregulated in livers of zebrafish exposed to arsenic (Hamdi et al., 2012). Arsenic, which accumulates in the livers of fish (Maher et al., 1999; Mason et al., 2000; Kirby and Maher, 2002), tends to be high in benthic-feeding fish (Kirby and Maher, 2002; Bordajandi et al., 2003; de Rosemond et al., 2008). Stream stickleback on Vancouver Island have been shown to have diets higher in benthic prey items than lake fish (Berner et al., 2008, 2009; Kaeuffer et al., 2012), and thus may be exposed to higher levels of dietary arsenic than lake fish. The gene cry1ab is part of the cryptochrome protein family that is involved in regulation of the circadian clock in zebrafish (Kobayashi et al., 2000; Lahiri et al., 2005; Liu et al., 2015) and Atlantic cod (Lazado et al., 2014). Sexual maturation in stickleback is dependent on photoperiod (Borg, 1982; Baggerman, 1985; Borg et al., 2004) and the optimal time for breeding could differ between lake and stream (but see Hanson et al., 2016). Overall, the functional roles of the stream-upregulated DEP genes are unclear, but future experiments examining expression of these genes in more detail would be valuable.

Antiparallel genes

Another goal of our study was to formally test for genes differentially expressed in an antiparallel manner. We found 24 such genes in our two lake–stream pairs, more than expected by chance. This result intuitively suggests that some aspect of lake–stream divergence (that which is influenced by these antiparallel genes) must also be antiparallel. Indeed, antiparallel phenotypic divergence for some morphological traits is evident for lake–stream stickleback in the Robert’s versus Misty watersheds (Oke et al., 2016), for lake–stream stickleback in other watersheds (Hendry and Taylor, 2004; Kaeuffer et al., 2012) and for many other fishes (Oke et al., 2017). Recent work has attributed some of this lake–stream phenotypic antiparallelism to antiparallel divergence in lake–stream habitat features (Stuart et al., 2017) and therefore, presumably, antiparallel divergence in lake–stream natural selection.

Alternatively, antiparallel gene expression may produce parallel phenotypes (Manousaki et al., 2013). This could occur if genetic constraints drive different balances of up- and down-regulation of genes within the same or similar pathways that have functionally similar effects (Derome et al., 2006). This scenario is plausible given that parallel phenotypes are often underlain by expression of different genes. As examples, decreased Pitx1 expression causes pelvic reduction in some, but not all, populations of stickleback; similarly, decreased expression of Agouti is associated with dark coat color in some, but not all, populations of Peromyscus maniculatus mice (Linnen et al., 2009). We did not find obvious functional similarities across the different APDE genes, which would have suggested that they contribute to the same phenotype through different expression; however, many of the APDE genes are not well characterized functionally. For instance, 12 of the 24 APDE genes are novel, and the majority of the remainder have not been functionally characterized in teleosts. As with the DEP genes, these genes represent possible candidates on which to focus future research.

Limitations

Finally, we note several limitations that could be addressed in future work. A first issue was the limited number of population pairs: finding the same DEP (or APDE) genes in more pairs would strengthen claims of their adaptive importance. Of course, nonparallelism can increase with an increasing number of ecotype pairs. For example, the aryl hydrocarbon receptor signaling pathway was the only strong candidate for parallel evolution in four pairs of pollutant-sensitive and -tolerant Atlantic killifish (Reid et al., 2016). A second issue was the low number of individuals per population; however, the three individuals per site did have similar expression in most cases (Figure 1). A third issue was the use of one family per population, whereas more families would inform family-level variation, including differences in social behavior between families (or populations) and parental effects. However, previous gene expression work in lake–stream stickleback found no significant family effect (Lenz et al., 2013), and the trends shown in Figure 1 suggest overall expression levels separate mainly by habitat and watershed, a pattern that would be unlikely if family-level variation were more important. Fourth, fish from the different populations were of slightly different ages at sampling because of differences in fertilization date and days until hatching. However, all fish in this study were past the major developmental stages (Swarup, 1958), and hence should not have major differences in gene expression due to development. Finally, we used only liver tissue from adult individuals, whereas gene expression is known to be specific to tissue and developmental stage. Thus, using a different tissue, or sampling from embryos or juvenile fish, might have given us different lists of genes. Taking these limitations into account, this work represents a first but necessary step toward a more comprehensive evaluation of parallel gene expression.

General conclusions

Our study contributes toward a greater understanding of the relationship between parallelism in genomic sequences and gene expression patterns that are associated with those sequences. Specifically, we examined heritable, total gene expression patterns in multiple pairs of adaptively divergent populations. We also introduce a novel method of evaluating the number of genes found to be expressed in parallel (or antiparallel) using a permutation approach to construct a null distribution that allows for an empirical assessment of significance. Although we found some parallel gene expression, the vast majority of gene expression was nonparallel (that is, nondivergent or antiparallel), in line with expectations based on genome-wide patterns of sequence variation. This result suggests that, at this molecular level, deterministic natural selection plays a relatively small role in shaping evolutionary trajectories. Alternatively, it could be that genetic or environmental factors are sufficiently different between the Misty and Robert's systems that parallel evolution of gene expression is unlikely, even if natural selection is playing a strong role. For example, if the two systems started with different standing genetic variation, had dissimilar levels of gene flow or have different environmental pressures such as parasites, predators and abiotic factors, we would not expect parallel gene expression patterns. In addition, the potential exists for gene expression to be influenced by genotype–environment interactions (G × E) that could reduce parallelism (Oke et al., 2016). Future work could test these hypotheses. Parallel genetic mechanisms, at both the expression and sequence levels, appear to be present at similarly low levels in populations that exhibit repeated adaptive divergence, a pattern that is important for our understanding of the role of natural selection in parallel evolution.
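The logic of a permutation-based null distribution for the number of parallel (or antiparallel) genes can be sketched as follows. This is a simplified stand-in and not the procedure used in this study: it assumes each gene has already been called as upregulated in lake (+1), upregulated in stream (-1) or not differentially expressed (0) within each watershed, and it builds the null by shuffling gene labels in one watershed. The function names, the toy calls and the choice of 1000 permutations are all assumptions for illustration.

```python
import random


def count_parallel_antiparallel(calls_a, calls_b):
    """Count genes called in the same (parallel) or opposite (antiparallel) direction."""
    parallel = antiparallel = 0
    for a, b in zip(calls_a, calls_b):
        product = a * b
        if product > 0:
            parallel += 1
        elif product < 0:
            antiparallel += 1
    return parallel, antiparallel


def permutation_pvalues(calls_a, calls_b, n_perm=1000, seed=0):
    """Empirical one-sided p-values for an excess of parallel and antiparallel genes."""
    rng = random.Random(seed)
    obs_par, obs_anti = count_parallel_antiparallel(calls_a, calls_b)
    shuffled = list(calls_b)
    ge_par = ge_anti = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the gene-to-gene pairing between watersheds
        par, anti = count_parallel_antiparallel(calls_a, shuffled)
        ge_par += int(par >= obs_par)
        ge_anti += int(anti >= obs_anti)
    # Add one to numerator and denominator so the p-value is never exactly zero.
    p_par = (ge_par + 1) / (n_perm + 1)
    p_anti = (ge_anti + 1) / (n_perm + 1)
    return (obs_par, p_par), (obs_anti, p_anti)


# Toy direction calls for 1000 hypothetical genes (not data from this study).
calls_misty = [1] * 30 + [-1] * 30 + [0] * 940
calls_roberts = [1] * 20 + [0] * 10 + [1] * 10 + [0] * 20 + [-1] * 20 + [0] * 920

(obs_par, p_par), (obs_anti, p_anti) = permutation_pvalues(calls_misty, calls_roberts)
print(f"parallel: {obs_par} genes, p = {p_par:.4f}")
print(f"antiparallel: {obs_anti} genes, p = {p_anti:.4f}")
```

Other null models are possible, for example permuting sample labels before differential expression is called, and they can give different answers when expression is correlated among genes.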

Data accessibility

R code and gene count files: https://github.com/barrettlabecoevogeno/Analysis-of-parallel-evolution-in-lake-stream-sticklebacks.