Introduction

Intraspecific polymorphisms are defined by the presence of two or more genetically distinct forms that co-occur in both time and space, with the rarest form too common to be attributed solely to recurrent mutation (Mayr 1970). These polymorphisms interest evolutionary biologists for their potential to elucidate the interplay between natural and sexual selection, as well as the incipient stages of speciation (Gray and McKinnon 2006, Forsman et al. 2008). Moreover, because these polymorphisms often have a relatively simple genetic basis, they offer outstanding opportunities to connect genotypic and phenotypic variation and promise insights into long-standing questions in evolutionary biology, particularly those concerning the phases of speciation and adaptive radiation (Stinchcombe and Hoekstra 2007; Vijay et al. 2016; Riesch et al. 2017; Wolf and Ellegren 2017).

Color polymorphisms are a widespread phenomenon with extensive documentation in taxa across the tree of life (Cain and Sheppard 1950; Oxford and Gillespie 1996; Wellenreuther et al. 2014). The long-term documentation of the ecological consequences of color polymorphisms in the peppered moth Biston betularia (L.) has provided numerous insights into the role of selection in maintaining variation in natural populations (Cook and Saccheri 2013). By establishing the genetic basis of color pattern over decades, this work allowed current studies to identify a tandem repeat insertion in the gene cortex that produces the carbonaria–typica polymorphism and ultimately gives rise to the industrial melanism of the peppered moth in Britain (Van’t Hof et al. 2016). Intriguingly, cortex has also been implicated in regulating pigmentation patterning in multiple species of Heliconius butterflies, demonstrating that in some cases the same genomic mechanisms are employed across different groups of organisms (Nadeau et al. 2016). Current research taking advantage of long-studied systems such as Darwin’s finches (Abzhanov 2010, Lamichhaney et al. 2015, 2017), Heliconius butterflies (Van Belleghem et al. 2017, Merrill et al. 2015), and Timema stick insects (Soria-Carrasco et al. 2014, Riesch et al. 2017) is employing highly integrative approaches, combining years of ecological, phenotypic, genetic crossing and, more recently, genomic data. In this way, work on these systems has moved towards a deeper understanding of the genomic basis of phenotypic variation and offers insights into how novel adaptive variation evolves and spreads.

Understanding how polymorphisms have evolved and how they are maintained in natural populations first requires an understanding of their frequency of occurrence in nature and their mode of inheritance. To this end, we explored polymorphisms within the “tortoise beetle”, Chelymorpha alternans. The genus Chelymorpha (family Chrysomelidae, subfamily Cassidinae, tribe Mesomphaliini) includes approximately 100 species located primarily in South America, with some species ranging through Central and North America. A significant but still undetermined number of species include two or more phenotypes, the genetics of which have barely been explored (Vasconcellos-Neto 1988). Chelymorpha alternans is distributed widely throughout Central America and northern portions of the Andean region of South America. On the Isthmus of Panama there are multiple, distinct phenotypic variants differing strongly in elytral and pronotal pattern and coloration. Some phenotypes co-occur in the wild, feeding on a range of vining host plant species in the morning glory family, Convolvulaceae (Morrison and Windsor 2018). Variants are conspicuously colored, with bright red, low-intensity orange, black, and metallic stripes on the elytra. Intriguingly, the color variation in this system involves both pigments (red and black coloration) and structural coloration (metallic stripes).

Red, black and metallic coloration produce a stark contrast between body color and background, forming highly conspicuous signals that have typically been associated with aposematism, serving as a warning to predators of the signaler’s unpalatability (Stevens and Ruxton 2012, Arenas et al. 2014). Preliminary trials with C. alternans have shown that they are unpalatable to some predators (personal observations, LS), and their highly specialized diet suggests that individuals could be sequestering secondary metabolites, which many species in the family Convolvulaceae have been shown to contain (Eich 2008). The polymorphic nature of color pattern in this species suggests that discrete phenotypes may persist through heterozygote advantage, negative frequency dependence and/or equal fitness of alternative phenotypes (Stevens and Ruxton 2012). Future studies in this system will help to explain how this phenotypic variation is maintained.

Here, using extensive field collections, controlled genetic crosses and broad genomic scans, we record the full range of color polymorphisms found in C. alternans in Isthmian populations. First, we investigated the geographic distribution of the different phenotypes and color pattern alleles. Secondly, we performed crossing assays to elucidate the genetic basis for the color polymorphism found in C. alternans. Lastly, we used RAD-sequencing genomic scans to gain an understanding of the degree of genetic differentiation between phenotypes of C. alternans in a single population.

Materials and methods

Beetle collection and stock preparation

Wild populations of C. alternans were sampled broadly across the Isthmus at 28 sites and assigned unambiguously to one of five distinct phenotypes (metallic, rufipennis, veraguensis, militaris-a and militaris-b) based on elytral and pronotal pattern and coloration (Fig. 1a). In total, 3819 individuals were collected from the western border with Costa Rica to the eastern border with Colombia, including sites on or near the Caribbean and Pacific coasts, from 1998 through 2015 (Fig. 1b and Table 1SM). A subset of these individuals was kept for crossing and fertility studies. Beetle stocks were housed in 1-pint plastic “Solo” containers with mesh netting lids and provisioned with individual leaves removed from potted vines of Merremia umbellata (L.) Hallier f. Containers were kept in lab conditions that mimicked the wet season (12:12 h light:dark cycle, misted daily to maintain humidity). Beetles were moved weekly to recycled containers that had been washed and bleach-sterilized. Coffee filters were placed in each new container to absorb excess moisture and collect feculae. Fungal pathogens were minimized by a light spraying of aqueous potassium sorbate.

Fig. 1

Phenotypes and distribution. a) Phenotypic variation within C. alternans. The four phenotypes, clockwise left to right: veraguensis, metallic, darien, and rufipennis. The four panels on the right depict the phenotypic variation within the darien, militaris-a (top two panels) and militaris-b (bottom two panels). b) Geographic distribution of phenotypes. Phenotypes of all individuals sampled from 28 sites across Panama from 1998–2016. The size of the pie chart denotes sample size from that site. The dashed line in the middle of the Isthmus represents the Canal

Genetic cross experiments

We performed 13 of the 15 possible crosses between phenotypes, all but one of which were replicated from two to 25 times (a total of 108 crosses; Table 2SM). Arenas were established with virgin male-female pairings of known phenotypes and given the same daily maintenance. Each egg clutch produced was then isolated and reared until emergence as adults (6–8 weeks after oviposition of the egg clutch), when sex ratios and phenotype could be assessed. Crosses to determine genotypes were performed sequentially, beginning with the homozygous metallic phenotype (metallic x metallic gave 100% metallic offspring, Table 1, cross type 10), the males of which were later crossed with other phenotypes to identify homozygous males that could then be used in additional crosses. Differences between the observed and expected number of offspring phenotypes under the simplest inheritance model of a single, multiallelic locus were evaluated with a Chi-Square test using the MASS package (version 7.3–47) in R version 3.3.2 (Table 1). If all traits were controlled by different, independently segregating loci, it would be possible to obtain 12 phenotypes (combinations of pronotal and elytral coloration and markings). However, in the field and laboratory only 4 phenotypes are found (considering the variegated pronotum as a single character) (Fig. 1a).

Table 1 Summary table of F1 offspring from all crosses of C. alternans

Additionally, cross types were analyzed for gender bias. All offspring were sexed using observed morphological differences between the sexes (males are smaller and have broader elytra), and a Chi-Square test using the MASS package (version 7.3–47) in R version 3.3.2 was performed to test for deviations from an expected 50:50 sex ratio.
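As an illustration of the sex-ratio test described above, the following Python sketch computes a chi-square goodness-of-fit statistic against a 50:50 expectation. It is not the R code used in the study: the function name and the offspring counts are hypothetical, and the p-value relies on the closed form that holds for one degree of freedom.

```python
import math


def sex_ratio_chi_square(n_males, n_females):
    """Chi-square goodness-of-fit test against an expected 50:50 sex ratio.

    Returns (statistic, p_value). With two classes there is one degree of
    freedom, for which the upper-tail probability is erfc(sqrt(x / 2)).
    """
    total = n_males + n_females
    if total == 0:
        raise ValueError("no offspring counted")
    expected = total / 2.0
    statistic = (
        (n_males - expected) ** 2 + (n_females - expected) ** 2
    ) / expected
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, for illustration only (not data from this study).
stat, p = sex_ratio_chi_square(60, 40)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

A brood would be flagged as deviating from 50:50 when the p-value falls below the chosen significance threshold (0.05 in the tests reported here).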

Genome size estimate

Chelymorpha alternans (Coleoptera: Chrysomelidae) were prepared by dissection to separate the head from the thoracic and abdominal segments. The head was further dissected to remove the mouthparts and divided in half, enabling two replicates per individual. Each head portion was placed in 1 mL Galbraith buffer along with the head of one Callosobruchus maculatus male and ground with a pestle in a 2 mL Dounce tissue grinder (Kimble Kontes NJ) for 15 strokes to release the nuclei. Samples were maintained on ice throughout preparation. The mixture of C. alternans and C. maculatus tissues was then filtered through 45 µm nylon mesh into 2 mL centrifuge tubes. Samples were brought to 1 mL volume by addition of Galbraith buffer, stained with 25 µL of propidium iodide, mixed with a vortex mixer and stained under refrigeration for 1 h. The relative fluorescence of the sample and standard mixture was measured with a Beckman Coulter Cytoflow Flow Cytometer. The amount of DNA in each tissue sample was determined as the ratio of the average relative fluorescence of the diploid sample nuclei to the relative fluorescence of the diploid nuclei of the standard, multiplied by the 1 C amount of DNA (1200 Mbp) in the C. maculatus standard.
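The calculation in the last sentence can be sketched in a few lines of Python. The function name and the fluorescence values below are hypothetical and serve only to show the arithmetic; the 1200 Mbp default is the 1 C value of the C. maculatus standard given above.

```python
def genome_size_mbp(sample_fluorescence, standard_fluorescence,
                    standard_1c_mbp=1200.0):
    """Estimate 1 C genome size (Mbp) from co-prepared sample and standard.

    Both fluorescence values are mean relative fluorescence of diploid
    nuclei measured in the same run; their ratio scales the known 1 C
    size of the standard.
    """
    if standard_fluorescence <= 0:
        raise ValueError("standard fluorescence must be positive")
    return sample_fluorescence / standard_fluorescence * standard_1c_mbp


# Hypothetical peak means, for illustration only (not measured values).
estimate = genome_size_mbp(sample_fluorescence=400.0,
                           standard_fluorescence=300.0)
print(f"Estimated 1 C genome size: {estimate:.0f} Mbp")
```

Because the sample and the standard are ground, stained and measured together, the ratio is largely insensitive to run-to-run differences in staining or instrument gain.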

RAD-sequencing genomic scans

A genomic scan analysis was performed on 32 adult beetles collected from a single plant growing on a berm of Santa Rita Ridge Road (9.38 N, 79.69 W; Colon Province, Panama) on January 30, 2015. Individuals were divided into three distinct phenotypes, “metallic” (n = 11), “rufipennis” (n = 11), and “darien” (n = 10). Beetles were photographed with a color scale to preserve phenotypic documentation for later comparison with genomic data, then stored in ethyl alcohol and frozen at −80 °C. DNA was extracted from whole bodies using DNeasy Mini spin columns (Qiagen, CA), with equal amounts of tissue from each beetle. Libraries were prepared according to the RAD-sequencing protocol by Etter et al. (2011), using 250 ng of DNA from each individual and the restriction enzyme Sbf1. Individuals were identified by a combination of 16 different P1 adaptor sequences and two different R primer tags, all of which had a sequence length of 8 bp, generating two libraries of 16 individuals each. The DNA was sheared using a Covaris sonicator with a duty cycle of 10%, intensity of 4, and 200 cycles per burst, for 48 seconds. Sheared DNA was then run on an agarose gel, and the fragments in the 300–650 bp range were selected and manually cut from the gel. These were then purified and amplified, with 16 cycles for library one and 18 cycles for library two. Both libraries were pooled and sequenced on one lane of an Illumina HiSeq 2000 (paired end, 2 × 101 bp).

Raw sequence data were filtered and demultiplexed using the process_radtags module in STACKS (Version 1.47; Catchen et al. 2011, 2013). Through this process, only reads with a raw phred score higher than 10 were retained. The module clone_filter in STACKS was used to filter out identical reads resulting from PCR clones. Reads were assembled de novo using the denovo_map.pl module in STACKS (Catchen et al. 2011, 2013). The number of raw reads required to form a stack (m) was set to 3, and the number of nucleotide mismatches allowed between stacks (M) was set to 2. To test the robustness of our results, we repeated the analysis with three additional parameter combinations: (i) m = 3 M = 3, (ii) m = 4 M = 3, and (iii) m = 5 M = 4. Notably, all assemblies gave similar results (Table 3SM).

Clustering and FST “outlier” analysis

To explore genomic grouping among phenotypes, we performed a PCA using the Adegenet package in R version 3.3.2. The analysis was restricted to stacks with ≥ 10x coverage in ≥ 7 individuals per phenotype and used the first SNP identified per stack (Fig. 3).

Fig. 2

Hypothesized genotypes with resulting phenotypes, phenotype characteristics, and proposed order of allelic dominance for color pattern in Chelymorpha alternans. Color pattern phenotype in this species can largely be explained by a single locus with four alleles and a clear dominance hierarchy, as well as a codominant interaction between two alleles (R and T)

Genomic regions that provide the strongest starting points for locating candidate genes contributing to color pattern are likely those that are strongly associated with particular phenotypes. To identify these loci, we ran “outlier” scans on pairwise comparisons between phenotypes. SNPs with an FST estimate ≥ 0.3 and a Fisher’s p-value less than 0.001 in pairwise comparisons were identified as outliers using a custom script written in Python 2.7.10. Consensus “outlier” loci were extracted using the flag ‘--fasta_strict’ in the populations module of STACKS (Catchen et al. 2011, 2013). A Benjamini-Hochberg correction for multiple testing was applied to the identified p-values using the FSA package in R version 3.3.2 in order to control the proportion of false discoveries. The sequences were first run in a general search using the BLASTN algorithm on the NCBI server (https://blast.ncbi.nlm.nih.gov) in an attempt to characterize functionality. Following this, the sequences were searched against known beetle genomes: the red flour beetle, Tribolium castaneum; the mountain pine beetle, Dendroctonus ponderosae; the Taurus scarab, Onthophagus taurus; and the chrysomelid beetle, Leptinotarsa decemlineata.
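The outlier thresholds and the multiple-testing correction described above can be sketched as follows. This is not the custom script from the study (which was written for Python 2.7 and is archived separately), nor the FSA routine used in R: the function names, the record layout, the example values and the adjusted-p cutoff `alpha` are all assumptions made for illustration, as is the choice to adjust across every SNP supplied.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def find_outliers(snps, fst_min=0.3, p_max=0.001, alpha=0.05):
    """Keep SNPs with FST >= fst_min, raw p < p_max and adjusted p < alpha.

    `snps` is a list of dicts with keys 'id', 'fst' and 'p' (hypothetical
    layout). `alpha` is an assumed cutoff, not a value from the study.
    """
    adjusted = benjamini_hochberg([s["p"] for s in snps])
    return [
        s["id"]
        for s, q in zip(snps, adjusted)
        if s["fst"] >= fst_min and s["p"] < p_max and q < alpha
    ]


# Hypothetical SNP records, for illustration only.
example = [
    {"id": "snp1", "fst": 0.45, "p": 0.0001},
    {"id": "snp2", "fst": 0.35, "p": 0.0008},
    {"id": "snp3", "fst": 0.05, "p": 0.0002},
    {"id": "snp4", "fst": 0.50, "p": 0.2},
    {"id": "snp5", "fst": 0.01, "p": 0.9},
]
print(find_outliers(example))
```

Requiring both a high FST and a small p-value guards against SNPs that look differentiated only because few individuals were genotyped at that site.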

Results

Color polymorphism in wild populations

Individuals collected across Panama typically fell within four major phenotypic classes (Fig. 1a, b). The “rufipennis” and “veraguensis” phenotypes both display red elytra but are distinguished by a black pronotum in the rufipennis phenotype and a red pronotum in the veraguensis phenotype. The “metallic” phenotype is characterized by a red pronotum tinged with gold in the center, red elytra with two confluent metallic stripes and small black spots along each stripe. The “darien” phenotype is distinguished by the presence of black and red stripes. Within this form, we recognize two additional phenotypic variants. The “militaris-a” form has a red and black striped pronotum, as well as red and black striped elytra. The “militaris-b” form also displays red and black striped elytra but has an entirely black pronotum; within each of these two forms, individuals can be further classified by whether elytra are clearly black-striped or contain a single black region.

The different phenotypic variants were not distributed randomly across Panama. The “metallic” phenotype is the single most widely distributed, occurring in nearly every collection site, including two monomorphic populations, Bocas del Toro and Coclé. Notably, this phenotype is absent from the drier Azuero peninsula, where only the “veraguensis” and “rufipennis” phenotypes are found. The “rufipennis” phenotype was also widely distributed, occurring from the eastern to western borders of Panama and crossing over the Panama Canal (Fig. 1b). The “veraguensis” phenotype is limited to West Panama, extending from the Costa Rican border to the western border of the Canal, with frequencies up to 46% occurring in the Pacific slope populations of Coclé and Veraguas provinces of Panama. Both forms of the “darien” phenotype are found from the eastern bank of the Canal area in central Panama to the Colombian border, reaching greatest frequencies within the Darien National Park.

Laboratory crosses and the genetic basis of phenotypic variation

The data presented here are from 130 controlled mating experiments with wild caught individuals of all the major phenotypic classes resulting in over 10,000 offspring. These data suggest that color pattern is controlled by a single locus with at least 4 alleles and a predictable dominance hierarchy. Broods from these crosses produce offspring whose phenotypes are consistent with one of the following Mendelian ratios: [a] 1:0 (expected in an F1 cross between a dominant and recessive individual, Cross Type [CT] = 1, 6, 9, 10, 13 and 15; p > 0.05; Table 1), [b] 1:1 (expected in an F1 cross between a heterozygote and homozygous recessive individual, CT = 2, 4, 5, 7, 11, 14 and 16; p > 0.05; Table 1), [c] 3:1 (expected in an F1 cross between two phenotypically similar heterozygotes, CT = 3 and 12; p > 0.05; Table 1), [d] 1:1:1:1 (expected in an F1 cross between two phenotypically different heterozygotes, CT = 17; p > 0.05; Table 1 and Figure 1ASM), and [e] 1:2:1 (expected in an F1 cross if two alleles display codominance, CT = 18; p > 0.05; Table 1 and Figure 1BSM). Overall, our cross data is consistent with 4 segregating alleles, with a clear dominance hierarchy. We refer to these alleles as R, which represents the “rufipennis” phenotype; T, which represents the “militaris” phenotype; V, which represents the “veraguensis” phenotype; and r, which represents the “metallic” phenotype (Fig. 2). R and T are codominant (see below) and both are dominant to V, which is dominant to r. Codominance between R and T was best exemplified in the phenotypic ratio of offspring in crosses between heterozygous individuals with the darien type 1 (militaris-a) and rufipennis phenotypes (Fig. 2ASM) and between two individuals of the darien type 2 (militaris-b) phenotypes (Figure 2BSM). 
From these cross types we find a 1:1:1:1 ratio (expected in an F1 cross between two phenotypically different heterozygotes) and a 1:2:1 ratio (expected in an F1 cross between heterozygous parents if the two alleles display codominance), respectively. The difference in the genetic basis of the two militaris phenotypes became clear only after the crossing experiments were performed; consequently, the two militaris phenotypes were not distinguished during initial field collections, and any differences in their geographic distribution were not recorded during sampling.

Overall, the results from our crossing experiments indicate that a major locus with four alleles controls much of the color variation between the five mentioned phenotypes: 1) the militaris-b phenotype which is a heterozygote and produced from a codominant state between two alleles (R and T), in which a black pronotum and striped elytra are both expressed; 2) the militaris-a phenotype that is observed when allele T is homozygous or heterozygous with the r or V alleles (genotype TT, Tr, or TV); 3) the rufipennis phenotype that is seen when allele R is homozygous or heterozygous with r and V (genotype RR, Rr, or RV); 4) the veraguensis phenotype which is observed when allele V is homozygous or heterozygous with r (genotype of VV or Vr); and 5) the metallic phenotype which is only inherited in the homozygous recessive state (rr) (Fig. 2).
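The single-locus model summarized above can be written out as a short Python sketch that maps genotypes to phenotypes and enumerates the expected offspring ratios for any cross. The function names are hypothetical, and the sketch assumes equal transmission of both parental alleles and equal viability of all genotypes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ALLELES = {"R", "T", "V", "r"}


def phenotype(genotype):
    """Map a two-allele genotype (e.g. 'Rr') to its color phenotype.

    R and T are codominant, both are dominant to V, and V is dominant to r.
    """
    if len(genotype) != 2 or not set(genotype) <= ALLELES:
        raise ValueError(f"invalid genotype: {genotype!r}")
    alleles = set(genotype)
    if {"R", "T"} <= alleles:
        return "militaris-b"
    if "T" in alleles:
        return "militaris-a"
    if "R" in alleles:
        return "rufipennis"
    if "V" in alleles:
        return "veraguensis"
    return "metallic"


def cross(parent1, parent2):
    """Expected offspring phenotype proportions for a cross of two genotypes."""
    counts = Counter(
        phenotype(a + b) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {ph: Fraction(n, total) for ph, n in counts.items()}


# Two militaris-b parents (RT x RT): the codominant 1:2:1 case.
print(cross("RT", "RT"))
# militaris-a (Tr) x rufipennis (Rr): the 1:1:1:1 case.
print(cross("Tr", "Rr"))
```

Comparing these expected proportions with observed brood counts is what the Chi-Square tests in Table 1 evaluate.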

These crosses provide little evidence for gender bias and most cross types showed no difference in the male and female offspring produced, with the exception of “veraguensis” x “rufipennis” crosses (CT = 4 and 11, Table 1). In nearly all of these crosses, there were more male than female offspring (Table 2SM).

Genome size, polymorphic loci and association

Genome size estimated from flow cytometry was 1 C = 1.601 Gb (± 13 Mbp) for females and 1 C = 1.608 Gb (± 7.5 Mbp) for males. These estimates fell within genome sizes reported for Coleoptera, which vary from 160 Mbp (Oryzaephilus surinamensis) to over 5 Gb (Aramigus tessellatus) (McKenna 2018, data from http://www.genomesize.com). For our RAD screen, the restriction enzyme Sbf1 generated a total of 89,038,612 reads of 101 bp each obtained from 32 individuals: metallic (n = 11), rufipennis (n = 11) and darien (n = 10). Of these, 73,447,420 sequences (82.5% of raw reads) were retained after quality filtering (reads were removed for ambiguous barcodes, low quality, and ambiguous RAD-tags). At 10x coverage, the assembly with stack depth parameter m = 3 and mismatch parameter M = 2 generated 4464 retained loci.

The sampled phenotypes had a similar stack coverage (number of sequences merged to form individual stacks). Nucleotide diversity (π) and observed heterozygosity across all three phenotypes were low (Table 2). Pairwise FST values were 0.009508 for metallic/rufipennis, 0.015271 for metallic/darien, and 0.015971 for rufipennis/darien comparisons. The frequency distribution of pairwise FST estimates is characterized by a large degree of clustering around zero, with 88% of values being less than 0.1 and relatively few values at 0.4 or greater (Fig. 4). Moreover, the PCA of individuals did not show any obvious clustering by color pattern phenotype (Fig. 3). Further, because all individuals were collected from the same location (a large berm along Santa Rita Ridge Road) the observed variation is likely due to sampling from multiple families rather than geographic variation.

Table 2 Genomic descriptive statistics
Fig. 3

Principal Component Analysis. PCA of phenotype relatedness based on FST estimates across 30 individuals. Metallic individuals are represented by gold points, Rufipennis individuals by red points, and Darien individuals by black points

Significant levels of differentiation were identified using our outlier SNP analysis (FST estimate above 0.3 and Fisher’s p-value < 0.001) in 54 unique SNPs occurring in 57 unique RAD loci. After the Benjamini-Hochberg correction was performed to control for the expected proportion of false discoveries, 41 SNPs in 39 RAD loci met our criteria. Of these, 15% (n = 6) occurred in the metallic-rufipennis comparison, 56% (n = 23) in the metallic-darien comparison, and 29% (n = 12) in the rufipennis-darien comparison. Of these, nine SNPs in nine RAD loci were shared by two comparisons but there were none that were shared between all three comparisons.

Since the metallic phenotype is only expressed if the homozygous recessive state is inherited, candidate loci contributing to color pattern are hypothesized to be homozygous in all metallic individuals, but polymorphic in any of the other phenotypes. No SNPs in our dataset met this expectation. However, there were three SNPs that were homozygous in every metallic individual and mostly heterozygous or absent in the rufipennis and darien phenotypes (Table 3). BLASTN searches for all these loci, either against reference beetle genomes or across arthropod genomes, did not return any significant hits.

Table 3 Identified outlier SNPs

Discussion

We present the first comprehensive analysis of color pattern variation in Chelymorpha alternans. Our analysis has identified four distinct phenotypic classes occurring in a nonrandom geographic distribution across the Isthmus, yet any relationship to abiotic factors (temperature, humidity, precipitation) remains largely unclear. Even less resolved is the importance of historical and biotic processes (interspecific interactions, habitat, predators) (reviewed in McLean and Stuart-Fox 2014). Preliminary studies using predator bioassays have shown that phenotypes of this species are unpalatable to some predators (mantids, ants, and chickens), with some phenotypes (metallic) eliciting a stronger aversion response (personal observations, LS). Numerous crosses performed among phenotypes revealed a surprisingly simple genetic basis to the polymorphism. Evidence suggests that phenotypic variation is controlled by a single locus with four alleles. No evidence was found of sterility or inviability in crosses among phenotypes. An initial genome scan on different co-occurring phenotypes uncovered few strong genetic differences, identifying 41 markers that were significantly associated with phenotype, three of which met more restrictive criteria for phenotype-genotype associations.

The genetic basis of color pattern phenotypes in C. alternans is similar to other color polymorphisms where phenotypes can be explained by a few alleles acting at a small number of loci (Rankin et al. 2016). Genetic crossing studies of the coqui frog, Eleutherodactylus coqui, endemic to Puerto Rico, found that five phenotypes (four with various stripes and one unstriped) were largely under the control of one autosomal locus with five alleles. In this case, the four alleles that produce striped phenotypes are all codominant and the one unstriped morph is homozygous recessive. In these studies, phenotypic ratios obtained in the offspring followed 1:0, 1:1, 3:1 and 1:1:1:1 ratios (O’Neil and Beard 2010). In cases of female color polymorphism in Ischnura damselflies, where males display one phenotype and females can display one of three phenotypes (a male mimic, an alternative form, and the common form), a strict dominance hierarchy can be found; however, this hierarchy is not always the same between species. In the closely related species I. geneii, I. elegans and I. graellsii, the mode of inheritance is via three alleles at a single autosomal locus which display codominance. In each of these studies offspring phenotypes were obtained in ratios of 1:1, 3:1 and 2:1:1 (Cordero 1990, Sánchez-Guillén et al. 2005, Sanmartín-Villar and Cordero-Rivera 2016).

Studies on the genetic basis of color variation within the genus Chelymorpha are restricted to two other species, C. variabilis and C. cribraria. Described by Vaio et al. (1975), the color polymorphism in C. variabilis is proposed to be controlled by two interacting loci, with one phenotype being dominant and the other being inherited only in the homozygous recessive state. The mechanism of inheritance for the extensive color polymorphisms found in C. cribraria is described as more complex, with at least five loci which segregate as “supergenes”, producing 8 phenotypic classes and at least 14 different forms (Vasconcellos-Neto 1988). This species also displays “metallic” and “rufipennis” phenotypes, as well as a form similar to the “militaris-b” phenotype described here in C. alternans. In that study, C. cribraria is proposed to be part of a mimicry ring with other beetles from the families Coccinellidae and Chrysomelidae, all found throughout Central and South America and all of which display variations of red, black, and sometimes metallic coloration. Many of the proposed species in this mimicry ring have strong preferences for host plants which have been found to contain secondary compounds (mostly in the family Convolvulaceae) and some have been shown to be unpalatable to predators, including the coccinellid beetles Epilachna cacica and Cycloneda sanguinea (Vasconcellos-Neto 1988; reviewed in Pasteels and Gregoire 1983). No further studies on the potential of mimicry for these and other phenotypically similar beetle species in this area exist.

The most striking aspect of this color polymorphism is that the locus exerting major control produces a phenotype which displays structural coloration. The gold, non-iridescent stripes seen in the metallic phenotype are produced by multilayer reflectors made from thin, parallel layers of chitin located within the cuticle of the elytra. Differences in the thickness of each layer cause light to scatter at particular angles, producing the resulting gold coloration (Seago et al. 2009). It is likely that the metallic coloration is produced by regulatory pathways which cause additional chitin layers to develop in specific areas of the elytra.

The distribution of C. alternans is poorly documented in South America, but specimen collections (STRI insect collection) indicate that metallic, militaris-a and militaris-b phenotypes are found in lowland populations of Central Ecuador on both sides of the Andes. Many species have shown geographic structuring along latitudinal gradients, typically with darker morphs increasing in frequency at higher latitudes, as shown in the coccinellid beetles Cheilomenes sexmaculata and Adalia bipunctata (Kawakami et al. 2013; Brakefield 1984b). Although sites from this study did not differ greatly in longitude or latitude (Table 1SM), abiotic factors such as humidity and precipitation change drastically along the rainfall gradient (and seasonally) from the Caribbean to the Pacific sides of the Isthmus, and often play a role in the geographic distribution of populations (Condit et al. 1998, Kawakami et al. 2013).

The metallic phenotype has the widest geographic distribution, as seen in other invertebrate color polymorphisms in which the least dominant (most recessive) allele occurs at the greatest frequency throughout geographic populations (Gillespie and Tabashnik 1989, Innes and Haley 1977, Sanmartín-Villar and Cordero-Rivera 2016). Notably, this phenotype is nearly absent along the drier Azuero peninsula and nearly fixed in populations along the wet and aseasonal Caribbean coast. This could be an indication that climate-related factors play a role in determining the distribution of C. alternans elytral color-determining alleles. Intriguingly, metallic coloration has been suggested to absorb solar radiation more efficiently and decrease penetration of ultraviolet light (Mikhailov 2001); however, whether this plays a role in distribution remains to be tested.

The distribution of the veraguensis, militaris-a, and militaris-b phenotypes appears to be restricted by the presence of the Panama Canal, a landscape feature dating to the early 20th century. The distinctly seasonally deciduous forests on the Pacific side may have served as a barrier for the militaris phenotypes, which seem to prefer wetter habitats. A reciprocal transplant study with the militaris and veraguensis phenotypes on either side of the Canal would bring further insights to this hypothesis. The presence of the metallic and rufipennis phenotypes on both sides of the Canal suggests that this partitioning may be historic or the result of founder effects. If the ‘V’ and ‘T’ alleles occurred at a low frequency on one end of the Isthmus, a large structural barrier coupled with low dispersal rates could have inhibited further admixture. A genetic analysis of these phenotypes in multiple geographic populations across Panama would elucidate whether phenotypes from the western and eastern ends of the Isthmus show distinct genetic clustering or still exhibit high levels of admixture. Further research into the mechanisms of selection acting on this species, as well as increased sampling throughout areas of Central and South America, will help to unravel the factors maintaining these color polymorphisms as well as geographic structuring.

Data from RAD-sequencing show no evidence of clustering by phenotype, suggesting low levels of differentiation between phenotypes. The distribution of SNP FST estimates for phenotype pairwise comparisons showed a distinct mode at 0 with more than 94% of SNPs at or less than 0.1 in all phenotype comparisons (Fig. 4), indicating that individuals are genetically very similar. However, part of this is likely due to sampling from a few families, as individuals were collected from a single area and many from the same plant. The particularly low levels of nucleotide diversity (compared with a median of 1.25% in arthropods) may come from sampling closely related individuals in the population, as well as sampling from a particularly isolated population (Leffler et al. 2012). Notably, there were a few areas of the genome ( < 1%) displaying high levels of genetic differentiation, indicating that they may be linked to loci that determine color pattern (Wu 2001). These areas serve as potential starting points for further investigations into areas of the genome contributing to the observed differences between phenotypes.

Fig. 4

Distribution of FST estimates between all phenotype pairwise comparisons. Arrows indicate SNPs that were significantly more differentiated than would be expected by chance. We focus specifically on “outlier” SNPs that were identified in at least two of the pairwise comparisons

Our results indicate that C. alternans is a single species with five phenotypes resulting from the interactions of four color-coding alleles segregating according to well-established patterns of Mendelian inheritance. While no backcrosses were performed in this study, current work with inbred lines of two phenotypes (metallic and rufipennis) supports the hypothesis proposed here. Cassidinae (tortoise beetles) are a highly specialized group of tropical and subtropical beetles, with unique life history traits and complex ecological interactions. The discrete character of the polymorphism and its apparently simple genetic basis make this a promising system for investigating questions related to the fungal endosymbionts of host plants, the role of sexual selection and Müllerian mimicry, population genetics, and the genomics and development of color pattern. Through an understanding of the ecological variables that influence phenotypic variation, along with the underlying genomic and developmental factors, we can start to build a holistic view of the vast diversity we see in nature.

Data archiving

Sequence data have been submitted to Sequence Read Archive, accession number: SRP145148.

The Python script is available at: https://github.com/slynn731/PythonScript_FindingSigSNPs