Genotyping-by-sequencing supports a genetic basis for alpine wing-reduction in a New Zealand stonefly

Wing polymorphism is a prominent feature of numerous insect groups, but the genomic basis for this diversity remains poorly understood. Wing reduction is a commonly observed trait in many species of stoneflies, particularly in cold or alpine environments. The widespread New Zealand stonefly Zelandoperla fenestrata species group (Z. fenestrata, Z. tillyardi, Z. pennulata) contains populations ranging from long-winged (macropterous) to vestigial-winged (micropterous), with the latter phenotype typically associated with high altitudes. The presence of flightless forms on numerous mountain ranges, separated by lowland fully winged populations, suggests wing reduction has occurred multiple times. We use Genotyping by Sequencing (GBS) to test for genetic differentiation between fully winged (n=62) and vestigial-winged (n=34) individuals, sampled from a sympatric population of distinct wing morphotypes, to test for a genetic basis for wing morphology. We found no population genetic differentiation between these two morphotypes across 6,843 SNP loci, however we did detect several outlier loci that strongly differentiated morphotypes across independent tests. This indicates small regions of the genome are likely to be highly differentiated between morphotypes, indicating a genetic basis for morphotype differentiation. These results provide a clear basis for ongoing genomic analysis to elucidate critical regulatory pathways for wing development in Pterygota.

All reads were trimmed, filtered and analyzed using the STACKS pipeline 74 in order 2 5 3 to create catalogues of comparable SNP loci. We optimized the pipeline according to 2 5 4 the recommendations of Paris, et al. 75 . Initially, the PROCESS_RADTAGS module 2 5 5 was used to separate reads by their barcode, remove low-quality reads (any read with 2 5 6 an average Phred score < 10 in any sliding window of 11bp), trim all reads to 70 base 2 5 7 pairs in length, and remove any reads that did not contain the enzyme recognition 2 5 8 sequence. Next, the USTACKS module was used for the de novo assembly of raw 2 5 9 reads into RAD tags. The minimum number of reads to create a stack was set at 3 (-m 2 6 0 parameter in USTACKS), and the maximum number of pairwise differences between 2 6 1 stacks was 2 (-M parameter in USTACKS). A catalogue of RAD tags was then 2 6 2 generated using the 25 highest coverage individuals from each ecotype in CSTACKS.

6 3
The distance allowed between catalogue loci (-n in CSTACKS) was increased to 2, 2 6 4 after different trials were run to ensure loci were not inaccurately called as separate 2 6 5 stacks. The execution of these components was accomplished using the STACKS 2 6 6 denovo_map.pl script; in running this script, the optional -t flag was used to remove 2 6 7 highly repetitive RAD tags during the USTACKS component of the pipeline.

6 8
Following assembly and genotyping, the data were further filtered to maximize data We investigated the number of populations (or clusters) represented in our data using 2 7 7 FASTSTRUCTURE 78 and the putatively neutral SNP dataset, default parameters, a 2 7 8 logistic prior, and K from 1 to 6. The appropriate number of model components that 2 7 9 explained structure in the dataset was determined using the chooseK.py function 78 .

8 0
Results for the identified optimal values of K were visualized using DISTRUCT 79 .

8 1
We also estimated the number of clusters using the find.clusters command in 2 8 2 ADEGENET, with optimization based on the Bayesian Information Criterion (BIC).

8 3
Finally, we created a Euclidian distance matrix between individuals in the R package 2 8 4 ADEGENET 80 , which we then displayed using a neighbor-joining tree produced in vestigial-winged specimens being approximately twice the number of fully winged 2 9 7 specimens, we performed two independent BAYESCAN runs, both including all fully 2 9 8 winged individuals, but each with a different half of the vestigial-winged group.

9 9
These two comparisons therefore each had a balanced design, and can be used to

5 0
Given these results we conclude that there is no neutral population structure between 3 5 1 fully winged and vestigial-winged individuals when sampled from the same location,  loci that were found to be significant in these largely independent comparisons.  Table 2). Of these, three loci were identified in both comparisons,

7 3
with one locus (14459_12) identified as the most significantly differentiated SNP in 3 7 4 both comparisons, with q-values of (0.00570 and <0.00000). In independent 3 7 5 comparisons with random differences between groups with loci differentiation 3 7 6 distributions to those observed, one would expect 0.03 loci to be detected as outliers 3 7 7 in both comparisons, and the probability that the most differentiated locus would be 3 7 8 identical would be <0.0001.

7 9
In the randomized BAYESCAN runs, an average of 10.6 outlier loci were detected at The observed differentiation between fully winged and vestigial-winged individuals at 3 9 2 these outlier loci strongly suggests that there are regions of the genome highly 3 9 3 differentiated between these two morphotypes. Due to the paucity of genomic data 3 9 4 published for Plecoptera, we were unable to map these outlier loci via BLAST-n to 3 9 5 genomic regions to identify the genes present in the surrounding regions.

9 6
Discussion 3 9 7 In this study, we tested for a genetic basis for wing reduction in the New Zealand isolation between morphotypes; the low dispersal abilities of Z. fenestrata may be the 4 3 0 mechanism that helps maintain this isolation in most streams.

3 1
One question that remains to be addressed is why the Black Jacks Creek Z. fenestrata  resources, it is also impossible to speculate as to the potential underlying genes that 4 9 1 may be responsible for these two phenotypes. With further work creating a genome 4 9 2 assembly for this species we will be able to look at the specific genomic regions 4 9 3 linked to the outlier SNPs defined in this study.

1 7
Evolution 40, 1009-1020, doi:10.2307/2408759 (1986   We have no competing interests of any sort. All processed data from Stacks will be included on Dryad entry # XXXXXXX