Introduction

The existence of intraspecific variation in ecologically important traits is one of the cornerstones of Darwin’s theory of evolution by natural selection. Understanding the evolution and maintenance of such variation is not simply a question of genes per se, but rather of genetic architecture (minimally defined as the number, genomic location, and phenotypic effect of the loci responsible for a given phenotype, along with an understanding of how the interactions between loci affect the phenotype (Hansen 2006; Mackay 2001)). The genetic architecture of a trait determines whether a given genome can produce and maintain potentially adaptive phenotypic variants (Hansen 2006), and thus its ability to respond to natural selection (Rajon and Plotkin 2013).

Some models have suggested that traits with simple genetic architectures (i.e., controlled by a few loci of large effect) can respond more readily to selection than those with complex architectures, and are thus more likely to lead to adaptive divergence and subsequent speciation (Arnegard and Kondrashov 2004; Gavrilets and Vose 2007; Gavrilets et al. 2007; Hayashi et al. 2007). It is clear that variation in some discrete traits (e.g., morphology (Colosimo et al. 2004; Greenwood et al. 2011), coat color (Steiner et al. 2007), and pheromone production (Groot et al. 2013)) is controlled by a few loci of large effect, and such traits could potentially diverge quite rapidly. But what of more complex, multifactorial traits such as host plant use, which involves physiological and behavioral adaptations across multiple developmental stages? The genetic architecture of such traits is likely to be more complex, as it is improbable that a single locus (or even a few loci) could regulate all the phenotypic traits that contribute to host plant use. Nonetheless, rapid divergence has been observed in such traits (Price et al. 2017), suggesting that architecture may not constrain evolvability in the expected manner. For some complex traits, this can be explained by “supergenes,” non-recombining genetic regions that contain multiple linked, functionally related loci (Purcell et al. 2014; Thompson and Jiggins 2014), but other complex traits, including herbivore host plant use, clearly involve loci spread across the genome (Oppenheim et al. 2012).

While most herbivorous insects are specialists on a narrow set of host plants (Forister et al. 2015), others are generalists that feed on a taxonomically diverse set of hosts. Over evolutionary time, changes in both diet breadth (the numeric and taxonomic diversity of host species) and diet content (the specific set of hosts attacked) are common (Winkler and Mitter 2008), and rapid adaptation to novel hosts has frequently been observed on ecological timescales (Forister et al. 2009; Garcia-Robledo and Horvitz 2012; Gompert et al. 2015; Hoang et al. 2015; Messina and Jones 2011). This is a pressing issue for Homo sapiens, because herbivorous insects are undergoing rapid geographic expansions in response to climate change (Lancaster 2016) and many of the newly acquired host species are agricultural crops (Stastny et al. 2006). To date, it is unclear whether generalists and specialists are equally labile in shifts of diet breadth and content. Addressing this question requires a comparative assessment of the genetic architecture of host plant adaptation in species with diverse host ranges, which we undertake in this report.

The lepidopteran subfamily Heliothinae (Noctuidae), in which closely related species have diverged markedly in their host use patterns, is a useful model for examining this issue. The Heliothinae includes both narrow specialists, like Chloridea subflexa (Lepidoptera: Noctuidae: Heliothinae), a specialist on a single plant genus, and broad generalists, like Chloridea virescens (Lepidoptera: Noctuidae: Heliothinae), which feeds on plants in 14+ families (Sheck and Gould 1993). Within this system, we are studying the genetic architecture of intraspecific and interspecific variation in host plant use to address two related questions: whether host plant adaptation can evolve readily in response to selection, and whether identical host use phenotypes depend on the same genetic architecture in generalists and specialists.

In previous experiments, we studied the genetic architecture of differences in host plant use between the specialist, C. subflexa, and the generalist, C. virescens (Oppenheim et al. 2012). There are striking behavioral and physiological differences between these species in their use of Physalis angulata, which is the preferred host of C. subflexa but a novel host for C. virescens. These differences include larval willingness to feed on P. angulata fruits, larval behavior on the fruit and surrounding calyx, and the ability of larvae to convert ingested fruit into larval biomass (i.e., assimilation efficiency). By mapping quantitative trait loci (QTL) in crosses between C. subflexa and C. virescens, we found that the performance of C. subflexa on P. angulata depends on many loci of small effect distributed throughout the genome and that most of these loci have pleiotropic effects and interact epistatically (Oppenheim et al. 2012).

This complex and distributed architecture suggested to us that incremental gains in fitness on P. angulata could eventually produce the phenotype seen in present-day C. subflexa. The most recent common ancestor of C. subflexa and C. virescens was a generalist (Cho et al. 2008; Fang et al. 1997; Mitter et al. 1993; Poole et al. 1993), meaning that the extreme specialization of C. subflexa arose from a generalist genome. We were curious whether selection on present-day C. virescens could produce C. subflexa-like host use phenotypes, and, if so, whether the genetic architecture of those phenotypes would resemble that of C. subflexa.

To address these questions, we have conducted experiments to measure the response to selection on C. virescens for performance on P. angulata, and mapped the QTL associated with differences in performance between selected and unselected C. virescens. Although artificial selection experiments cannot be expected to precisely reproduce the genetic response to natural selection (Rockman 2012; Stern and Orgogozo 2008), they can provide empirical data about the pattern, rate, and phenotypic limits of trait evolution in response to selection (Wray 2013) and thus help generate testable hypotheses about the genomic patterns we might expect to see under different evolutionary scenarios.

Materials and methods

Study system

Chloridea virescens is a major pest of many agricultural crops, and has been the subject of much research; C. subflexa is not considered a pest, but is closely related to C. virescens, with which it has 99% coding sequence similarity in the genes for which comparisons have been made (Cho et al. 1995; Fang et al. 1997). The geographical ranges of C. virescens and C. subflexa overlap broadly (Poole et al. 1993), and the two species are thought to have evolved about 2.5 million years ago from a shared, generalist ancestor (Cho et al. 2008; Fang et al. 1997; Mitter et al. 1993; Poole et al. 1993). Chloridea virescens and C. subflexa have never been found to hybridize under natural conditions (Teal and Tumlinson 1997), but in no-choice laboratory arenas occasional hybrid matings do occur. These produce fertile F1 females and sterile F1 males, whose fertility is restored after several backcross generations (Karpenko and Proshold 1977).

Despite their genetic similarity and ability to hybridize, these species differ greatly in host plant use. Chloridea virescens has a very broad host range, feeding on at least 37 plant species in 14 families (Sheck and Gould 1993), while C. subflexa is narrowly specialized on plants in the genus Physalis (Solanaceae: Solanoideae) (Laster et al. 1982), and even within this genus, not all species are used (Bateman 2006). Chloridea virescens is not known to feed on Physalis (or any other Solanoideae genera) in the field, but in the laboratory, C. virescens larvae will accept P. angulata. The survival of C. virescens from neonate to 3rd instar is around 5% on P. angulata fruit, as compared to 55% for C. subflexa larvae (Oppenheim et al. 2012). This willingness to feed on a novel, suboptimal host is consistent with earlier findings that sensitivity to feeding deterrents is much lower in C. virescens than in C. subflexa (Bernays et al. 2000).

The genus Physalis lacks the best-known defense compounds found in Solanaceae (e.g., nicotinoids, capsaicinoids, steroid alkaloids) (Wink 2003), but it does have several unusual secondary metabolites. In particular, Physalis is characterized by the presence of withanolides, a group of steroidal lactones of unusual ergostane-skeleton structure (Eich 2008). Within the Solanaceae, withanolides are found only in the subfamily Solanoideae (Misico et al. 2011), primarily in the genera Withania, Jaborosa, Datura, and Physalis. Withanolides, and more particularly physalins (a group of withanolides found only in Physalis), have a wide range of bioactive properties, including anti-tumor, anti-inflammatory, trypanocidal, and immunoregulatory effects (Chen et al. 2011). The effects of withanolides on insect herbivores have been examined in moths (Ascher et al. 1987), flies (Bado et al. 2004; Mareggiani et al. 2000), and beetles (Ascher et al. 1987; Mareggiani et al. 2001). In all cases, the results are consistent with anti-feedant effects but not with acute toxicity.

Insect strains and rearing

In North America, C. virescens forms a homogeneous, panmictic metapopulation with very little genetic differentiation; indeed, even populations that have been held in culture for several years are genetically indistinguishable from wild-caught individuals (Groot et al. 2011). Thus, phenotypic variation in C. virescens may be determined by environmental rather than genetic factors (Groot et al. 2011). To capture as much environmental variation as possible, our selection line (hereafter referred to as CVselection) was started with wild C. virescens collected from seven geographical locations and four host plant species. Eggs were collected from tobacco (Nicotiana tabacum) in North Carolina (three sites) and South Carolina, from cotton (Gossypium hirsutum) in Louisiana, from velvetleaf (Abutilon theophrasti) in Mississippi, and from garbanzo bean (Cicer arietinum) in Texas. A total of 786 C. virescens larvae hatched from the field-collected eggs and were transferred to artificial diet before entering the selection regime described below.

Chloridea subflexa (hereafter referred to as subflexa) and unselected C. virescens (hereafter referred to as virescens) originated from colonies maintained at North Carolina State University (Groot et al. 2009; Sheck et al. 2006). To mirror the selection line, subflexa and virescens populations used in this study were each started from a set of 50 randomly selected individuals of each sex; progeny were individually reared on artificial diet as described in Sheck and Gould (1995). CVselection larvae were reared either on diet or on the fruits of P. angulata, as described below. All insects were maintained in a 23 °C rearing room under a 16:8 light-dark cycle at 50–70% relative humidity.

Plants and fruits

Although many species of Physalis will support subflexa development, larvae do particularly well on P. angulata (Bateman 2006), and we used this species for all experiments. Physalis angulata plants were grown from seeds collected from naturally occurring P. angulata in Orangeburg County, South Carolina. Seeds were planted in flats in the greenhouse and allowed to grow until 5 cm tall, at which time they were transplanted into 8 litre pots (which are large enough to allow plants to attain sizes typical in the field). When possible, seedlings were moved outdoors after transplantation; in winter months all plants were grown in the greenhouse.

Physalis angulata fruits were collected by cutting the stem of the mature fruit as close as possible to the plant. This procedure minimizes negative effects on the plant, allowing each plant to be used as a fruit source for as long as a month. The entire fruit, calyx, and stem were either used immediately or stored in closed paper bags at ambient indoor temperature and humidity (20–24 °C, 40–60% humidity) for up to 4 months.

Selection for performance on Physalis angulata

Our goal was to generate a population of virescens that was phenotypically indistinguishable from subflexa in terms of survival, feeding behavior, and assimilation efficiency on P. angulata. We carried out 12 generations of selection for these traits. When possible, we also selected for other traits consistent with improved performance on P. angulata: faster development time, higher pupal weight, and increased preference for oviposition on P. angulata. However, because of the constraints imposed by the need for simultaneously available males and females for mating, the only universally applied selection criterion was survival on P. angulata. The number of individuals tested each generation ranged from 54 to 678, and the number of single-pair matings used to produce each generation ranged from 7 to 47 (Supplemental Table 1).

Generation 0

In a first round of selection, larvae from field-collected eggs were maintained individually on artificial diet until the 2nd instar, and then transferred to a small container with one P. angulata fruit. Although selection on 2nd instars is less intense than selection on neonates, we used 2nd instars in this first round of selection to ensure sufficient survival. Fruits were checked daily for consumption and replaced as needed, and larvae were maintained on P. angulata until they either died or pupated. For each larva, we recorded the following: time from hatching to pupation (days); pupal weight (measured 4–6 days after pupation); and time from pupation to emergence (days). For each of the seven geographic populations, we recorded the percent of larvae that survived to 3rd instar, to pupation, and to adulthood. Ninety-four individuals out of the original 786 larvae (12%) survived to adulthood and were used as parents for the next generation. These survivors came from all seven populations, and the number of survivors per population ranged from 9 to 17. Because development time varied widely among individuals, single-pair matings were set up whenever a male and female were available.

Generations 1–12

Starting with the first generation of progeny from our selection line matings, we conducted selection by placing single newly hatched larvae on P. angulata fruits in 30 ml plastic cups closed with a paper lid and maintaining them on fruit until pupation or death. In each generation, we recorded time from hatching to pupation, pupal weight, and time from pupation to emergence for each surviving insect. Within each generation, we recorded survival from neonate to 3rd instar on P. angulata. In addition, we measured the oviposition preference of the female in each single-pair mating: mated pairs were placed in a 4 inch by 8 inch PVC pipe section with cheesecloth at each end as an oviposition substrate. Because we use cheesecloth as a substrate for maintaining the Chloridea colonies, we know it is acceptable to females. At one end of the pipe, the cheesecloth was coated with a mashed P. angulata fruit; at the other end, the cheesecloth was untreated. The percentage of eggs laid on each substrate was used to categorize oviposition preference as P. angulata or not P. angulata.

In each generation, we also tested larval performance on diet. We randomly chose 5–10% of the progeny of single-pair matings, and measured their phenotypes for each of the traits described above (except for survival to 3rd instar on P. angulata). Comparison of their phenotypes to those of their fruit-reared siblings allowed us to determine whether selection for performance on P. angulata had affected their performance on diet, and whether larval conditioning affected oviposition preference.

Assays of larval performance on Physalis angulata

In addition to the variables measured during selection, we conducted further assays of larval performance on P. angulata for specific selection generations and crosses to gain more precise information. During the sixth and twelfth generations of selection, the performance of subflexa, virescens, and CVselection was measured. After the 12th generation of selection, when we crossed CVselection with virescens, we measured the performance of F1 and backcross larvae. Larval performance was evaluated by allowing larvae to feed on a single P. angulata fruit for 72 h. Newly hatched larvae were reared on artificial diet and checked daily to determine developmental stage. Larvae were assayed 4–8 h after molting to the 3rd instar. Although we used 2nd instars in our previous interspecific study (Oppenheim et al. 2012), we were concerned that mortality in the intraspecific assays would be high, so we used 3rd instars here.

Fresh P. angulata fruits were collected less than 1 h before the start of each assay. Because the ability to feed on P. angulata involves both behavioral and physiological traits (Oppenheim and Gould 2002a; Oppenheim and Gould 2002b), larvae were presented with fruits that were still within their calyces. To feed on the fruits, larvae had first to bore an entry hole through the calyx. Variation in fruit size and maturity can affect larval performance (Bateman 2006), so only fruits of similar size (range: 0.8 to 1.8 g) and maturity were used. At the beginning of each assay, we recorded larval and fruit weight. All weights were measured to the nearest 0.1 mg on a Mettler Toledo microbalance. At the conclusion of each assay, we recorded larval weight, fruit weight, and whether any feeding had occurred (judged by damage to the fruit and recorded as 0 or 1). From these data, the following were calculated: change in larval weight (larval end weight–larval start weight, mg); change in fruit weight (fruit start weight–fruit end weight, g); proportion change in larval and fruit weights (weight change divided by start weight); and assimilation efficiency (mg change in larval weight divided by g change in fruit weight). After the assay, larvae were maintained on artificial diet. Insects were held until adult emergence before freezing at −80 °C. The following traits were measured after the feeding assay: time from hatching to pupation (days); pupal weight (measured 4–6 days after pupation); sex (determined at the pupal stage); and time from pupation to emergence (days).
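
The derived measures defined above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the analysis code used in the study; the function name, the dictionary keys, and the choice to return None when no fruit was consumed are assumptions made for the example.

```python
def assay_metrics(larva_start_mg, larva_end_mg, fruit_start_g, fruit_end_g):
    """Derived performance measures for one 72 h feeding assay.

    Larval weights are in mg and fruit weights in g, as in the text.
    """
    larval_gain_mg = larva_end_mg - larva_start_mg   # change in larval weight
    fruit_eaten_g = fruit_start_g - fruit_end_g      # change in fruit weight
    return {
        "larval_gain_mg": larval_gain_mg,
        "fruit_eaten_g": fruit_eaten_g,
        # proportion change = weight change divided by start weight
        "prop_larval_change": larval_gain_mg / larva_start_mg,
        "prop_fruit_change": fruit_eaten_g / fruit_start_g,
        # assimilation efficiency = mg larval gain per g fruit consumed;
        # returning None when nothing was eaten is an assumption of this sketch
        "assimilation_efficiency": (
            larval_gain_mg / fruit_eaten_g if fruit_eaten_g > 0 else None
        ),
    }


# Hypothetical larva: 10 mg -> 25 mg while the fruit goes from 1.5 g to 1.0 g
example = assay_metrics(10.0, 25.0, 1.5, 1.0)
```

With these made-up weights the larva gains 15 mg from 0.5 g of fruit, giving an assimilation efficiency of 30 mg per g.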

Backcross matings

To map the QTL associated with subflexa-like performance on P. angulata in CVselection, we crossed CVselection with virescens and backcrossed the progeny to virescens (Fig. 1). To do this, two selection-line females from the same family were crossed to male virescens after 12 generations of selection. Meiotic recombination does not occur in the females of Lepidoptera (Marec 2010), so a backcross map using F1 females can only resolve linkage groups to the level of chromosome. Interspecific crosses (subflexa × virescens) produce sterile F1 males (Karpenko and Proshold 1977), so only female-informative, non-recombinant backcrosses are possible. In an intraspecific study, however, it is possible to first use female-informative backcrosses to unambiguously assign markers to chromosomes and then use male-informative backcrosses to determine the location of QTL within chromosomes.

Fig. 1

Single-pair matings leading to female- and male-informative backcross families. Sex chromosome states: Ws and Zs are the selection line female (W) and male (Z) sex chromosomes, Wu and Zu are from unselected virescens. Circles = females, squares = males. CvSel = C. virescens selection line; CvSF = female-informative backcross; CvSM = male-informative backcross

We used the F1 progeny from our CVselection × virescens crosses to generate one female-informative family in which an F1 female was crossed to a virescens male (family CvSF), and two male-informative families in which an F1 male was crossed to a virescens female (families CvSM1 and CvSM2).

AFLP Markers

DNA was prepared as described previously (Oppenheim et al. 2012). We used the commercially available Qiagen (Chatsworth, CA) DNeasy 96 extraction kit to extract DNA from frozen adults, following the animal tissue protocol with some modifications to ensure complete digestion and removal of insect scales and cuticle. After extraction, DNA was prepared for AFLP genotyping using a modified version of the procedure described by Vos et al. (1995).

Selective amplification was carried out using 33 different primer pairs, and the resulting fragments visualized using fluorescently labeled selective primers. Fragments were separated by capillary electrophoresis on a CEQ 8000 (Beckman-Coulter, Jersey City, NJ), and the resulting electropherograms were first analyzed with the CEQ AFLP software (version 9). Final scoring of all fragments was done manually to ensure that all legitimate peaks were included, and spurious peaks excluded.

Linkage mapping

For linkage mapping, we selected markers that were absent in virescens, present in CVselection, and segregating approximately 1:1 in the backcross families. Because all grandparental crosses were of a CVselection female to a virescens male, the F1 mother of female-informative family CvSF had a W sex chromosome from CVselection and a Z sex chromosome from virescens. Backcrossing the CvSF F1 female to a male virescens resulted in female backcross progeny with a W chromosome from CVselection and a Z chromosome from virescens, and male backcross progeny with both Z chromosomes from virescens. In the two male-informative backcross families, CvSM1 and CvSM2, the F1 fathers had one Z chromosome from CVselection and one Z chromosome from virescens. Backcrossing these F1 males to virescens females resulted in a Z chromosome from CVselection in 50% of the female progeny (all of whom had a W chromosome from virescens) and 50% of the male progeny (all of whom had at least one Z chromosome from virescens) (Fig. 1).

The program JoinMap (Van Ooijen 2001) was used in a two-step process to sort our AFLP markers into linkage groups. First, using a LOD threshold ≥10 and a threshold of recombination ≤0.5, we identified the linkage groups that originated from the CVselection grandmothers of our backcross families. The LOD (logarithm of odds) score compares the likelihood of the observed data if the two loci are linked, to the likelihood of observing these data purely by chance, and LOD ≥10 represents 10^10 to 1 odds of linkage (van Ooijen 1999). In the CvSF progeny, linkage between markers on the same chromosome should be complete and the level of recombination between them should be zero with an infinitely high LOD score. In practice, however, missing data and errors in determining marker genotypes combine to reduce the association between markers. Thus, small departures from the ideal values were treated as experimental error.
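
To make the LOD scale concrete, the following sketch computes the textbook two-point LOD score for a backcross in which each progeny can be classified as recombinant or non-recombinant for a pair of markers. It is an illustration under that assumption and is not claimed to reproduce the statistic JoinMap computes internally; the function name and arguments are hypothetical.

```python
import math


def two_point_lod(n_progeny, n_recombinant):
    """Textbook two-point LOD: log10 of L(theta_hat) / L(theta = 0.5).

    theta_hat is the observed recombination fraction, capped at 0.5.
    """
    theta = min(n_recombinant / n_progeny, 0.5)
    n_parental = n_progeny - n_recombinant

    def k_log10(k, p):
        # k * log10(p), using the convention 0 * log10(0) = 0
        return k * math.log10(p) if k > 0 else 0.0

    log_l_linked = k_log10(n_recombinant, theta) + k_log10(n_parental, 1.0 - theta)
    log_l_unlinked = n_progeny * math.log10(0.5)
    return log_l_linked - log_l_unlinked


# With complete linkage (no recombinants), every scored progeny adds log10(2)
lod_40 = two_point_lod(40, 0)
```

Under this formula, complete linkage with no genotyping errors gives a LOD of n × log10(2), so at least 34 scored progeny are needed for a marker pair to reach LOD ≥10 even when no recombinants are observed.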

In the second step of linkage mapping, we determined the order of loci within a chromosome. Using the progeny of our male-informative crosses, we identified the linkage groups that occurred in the male-informative families. Because recombination does occur in males, we used a less stringent LOD ≥4 threshold (10^4 to 1 odds of linkage) for determining linkage in these families. Once the linkage groups based on male-informative loci were constructed, we looked for loci that were present in both the male-informative and the female-informative maps. These were used as anchors to determine homology between the two kinds of maps. We used the regression mapping algorithm in JoinMap to determine the order of loci within male-informative linkage groups. The Kosambi mapping function was used to convert recombination frequencies into map distances in centiMorgans (cM).
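
For readers unfamiliar with it, the Kosambi mapping function converts a recombination fraction r into a map distance d = 25 × ln[(1 + 2r) / (1 − 2r)] cM. The sketch below implements this standard formula and its inverse; the function names are chosen for the example and are not part of JoinMap.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse: recombination fraction expected for a distance of d_cm cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


# A recombination fraction of 0.10 maps to slightly more than 10 cM
distance = kosambi_cm(0.10)
```

For small r the distance in cM is close to 100 × r, and the two diverge as r approaches 0.5 because the function accounts for undetected double crossovers.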

QTL mapping: statistical analysis

We regressed phenotypes on chromosome state (for the female-informative backcross) or marker state (for male-informative backcrosses) for assimilation efficiency, the proportion of larvae that fed on Physalis, and the percent change in larval weight. To determine empirical significance thresholds for chromosome and marker-phenotype associations, we randomly permuted the phenotype values among genotypes (Churchill and Doerge 1994). We performed 1000 permutations for each phenotypic variable, and recorded the maximum F statistic generated in each replicate. The resultant population of F statistics was then sorted from lowest to highest, and the 900th and 950th values in this ordering (which correspond to p = 0.1 and p = 0.05 experiment-wise Type I error rates) were used as the critical thresholds for declaring whether the observed F statistic for each marker indicated a significant (p < 0.05) or “suggestive” (0.05 < p < 0.1) QTL effect (Kruglyak and Lander 1995). Because stringent significance tests can eliminate causal QTL when the amount of variance they explain is small (Yang et al. 2010), we retained both suggestive and significant QTL for further analyses. Although suggestive QTL are at an increased risk of being false positives, we feel their inclusion is important if we wish to understand the overall genetic architecture of host plant use.
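
The permutation procedure can be sketched as follows. This is a minimal standard-library illustration, not the code used for the analysis: it assumes markers are coded 0 (absent) or 1 (present), uses a one-way ANOVA F statistic for each marker, and all function names, the data layout, and the fixed random seed are assumptions of the example.

```python
import random


def f_statistic(phenotypes, genotypes):
    """One-way ANOVA F for a two-state marker (0 = absent, 1 = present)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    ss_between = 0.0
    ss_within = 0.0
    for state in (0, 1):
        group = [y for y, g in zip(phenotypes, genotypes) if g == state]
        if not group:
            continue
        group_mean = sum(group) / len(group)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((y - group_mean) ** 2 for y in group)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return ss_between / (ss_within / (n - 2))  # 1 and n - 2 degrees of freedom


def permutation_thresholds(phenotypes, marker_genotypes, n_perm=1000, seed=0):
    """Return the (suggestive, significant) experiment-wise F thresholds."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_f = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any phenotype-genotype association
        max_f.append(max(f_statistic(shuffled, g) for g in marker_genotypes))
    max_f.sort()  # lowest to highest
    suggestive = max_f[(90 * n_perm) // 100 - 1]   # 900th value when n_perm = 1000
    significant = max_f[(95 * n_perm) // 100 - 1]  # 950th value when n_perm = 1000
    return suggestive, significant
```

Recording the maximum F across all markers in each replicate is what makes the resulting thresholds experiment-wise rather than per-marker.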

To determine whether the effect of the QTL selected in the previous step varied between the sexes or among backcross families, we conducted a mixed-model analysis of variance for each QTL and trait. We used PROC GLIMMIX (SAS 9.2; SAS Institute 2008) to evaluate the effect of QTL (fixed), sex (fixed), family (random), and their interactions on the observed phenotype. Effects that were not significant were dropped from the model, and a reduced model was used to estimate the effect of chromosome on phenotype. Because our sample sizes were unequal, we used least-squares means to examine differences between means. We corrected for multiple comparisons within each dependent variable by using the SIMULATE option in the LSMEANS statement. SIMULATE is a simulation-based method for controlling the family-wise error rate by estimating the precise value of the adjusted p-values given the number of tests performed, and is both more precise and more liberal than Bonferroni correction (Edwards and Berry 1987).

Phenotypic differences associated with QTL state within each level of sex and family were examined using the LSMEANS SLICE option in GLIMMIX. SLICE performs a partitioned analysis of a given factor at different levels of the other factors (i.e., simple main effects (Winer 1971)), allowing us to evaluate the statistical significance of a QTL’s effect at each level (e.g., in males versus females). We used the SIMULATE option to obtain p-values corrected for the number of tests performed.

QTL effects

The effect of a QTL is the average difference in the phenotype of the trait between alternative states of the QTL. We evaluated QTL effects in several contexts: (1) the amount of backcross variation explained; (2) the amount of variation between subflexa and virescens explained; and (3) the amount of variation between virescens and CVselection explained. The percent of variation explained (PVE) within a backcross family was determined by comparing the phenotypes of individuals with and without a given QTL, using regression analysis to estimate PVE (expressed as r^2) for each separate QTL.
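
As an illustration of the PVE calculation, the sketch below computes the squared correlation for a regression of phenotype on a single two-state QTL marker. For a binary predictor this equals the between-group sum of squares divided by the total sum of squares. The function name and the 0/1 coding are assumptions of the example, which is not the analysis code used in the study.

```python
def pve_single_marker(phenotypes, marker_present):
    """Proportion of phenotypic variance explained by one 0/1-coded marker."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    ss_total = sum((y - grand_mean) ** 2 for y in phenotypes)
    if ss_total == 0:
        raise ValueError("phenotype has no variance")
    ss_between = 0.0
    for state in (0, 1):
        group = [y for y, m in zip(phenotypes, marker_present) if m == state]
        if group:
            group_mean = sum(group) / len(group)
            ss_between += len(group) * (group_mean - grand_mean) ** 2
    return ss_between / ss_total


# Hypothetical phenotypes for four backcross progeny, two carrying the marker
pve = pve_single_marker([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
```

Multiplying the returned proportion by 100 gives the percent of variation explained.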

To examine QTL effects in the context of whether intraspecific differences (CVselection versus virescens) reflected interspecific differences (subflexa versus virescens), we calculated the percent of the phenotypic gap between (1) subflexa and virescens (percent interspecific difference explained) and (2) CVselection and virescens (percent intraspecific difference explained) accounted for by a given QTL (Fishman et al. 2002; Lexer et al. 2005; True et al. 1997). The equation used to determine the percent of interspecific variation explained by each QTL was: Percentage species difference = (average effect of QTL ÷ average difference between subflexa and virescens) × 100. The percent of intraspecific variation within virescens explained by each QTL was estimated as: Percentage intraspecific difference = (average effect of QTL ÷ average difference between CVselection and virescens) × 100.
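
The two percentages defined above share the same form and can be computed with one small function. The sketch below is illustrative only; the function name and the numerical values in the usage lines are hypothetical and are not data from this study.

```python
def percent_difference_explained(qtl_effect, mean_a, mean_b):
    """(average QTL effect / average difference between two groups) x 100."""
    gap = mean_a - mean_b
    if gap == 0:
        raise ValueError("the two group means are identical")
    return qtl_effect / gap * 100.0


# Hypothetical trait means (arbitrary units) and a hypothetical QTL effect of 5
interspecific = percent_difference_explained(5.0, mean_a=60.0, mean_b=20.0)
intraspecific = percent_difference_explained(5.0, mean_a=45.0, mean_b=20.0)
```

Because the same QTL effect is divided by different phenotypic gaps, a single QTL can account for a different percentage of the interspecific and intraspecific differences.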

Results

Response to selection

Since our interest is in the genetic architecture of traits that distinguish subflexa from virescens, we first examined interspecific differences in these traits. The phenotypes of virescens also serve as a baseline against which changes in CVselection can be evaluated. The traits described below were measured in all generations, but we report results only for CVselection generations 1, 6, 9, and 12 because these illustrate the changes (or lack thereof) that occurred. See archived data file “Phenotypes” for complete results.

Neonate survival on P. angulata fruits is much higher in subflexa than in virescens (Table 1). Although CVselection survival rates never equaled those of subflexa, they did increase steadily over the course of selection. At generation 9, survival to the 3rd instar was 48% and by generation 12 it had reached 73% (Fig. 2a).

Table 1 Average phenotypes of subflexa, virescens, and CVselection for traits measured in backcross insects for QTL mapping. Means and standard deviations shown

Fig. 2

Larval phenotypes over the course of 12 generations of selection for performance on P. angulata. Means and standard errors are shown for Chloridea subflexa, C. virescens, and CVselection. a Percent of larvae surviving from neonate to 3rd instar on P. angulata fruits; b Percent of larvae that fed on P. angulata fruits; c Larval percent weight gain on P. angulata fruits; d Larval assimilation efficiency on P. angulata fruits

The oviposition preference of females (measured as a binary trait where preference = P. angulata or Not P. angulata) differs between subflexa and virescens, with 87% of subflexa and 33% of virescens ovipositing on P. angulata (p = 0.002). Over the 12 generations of selection, oviposition preference did not change. However, fruit-reared and diet-reared insects differed in oviposition preference: 64% of diet-reared insects oviposited on P. angulata, while only 38% of fruit-reared insects did so (p = 0.004, r^2 = 0.03), suggesting a negative effect of larval experience on P. angulata. This trend was consistent within each generation (Fig. 3).

Fig. 3

Percent of Chloridea subflexa, C. virescens, and CVselection females ovipositing on P. angulata-treated substrate over 12 generations of selection. Oviposition tests were conducted in an open tube with P. angulata-treated cheesecloth covering one end, untreated cheesecloth covering the other. Fruit and diet reared larvae were full siblings randomly assigned to either P. angulata fruit or artificial diet. Means and standard errors shown

Assimilation efficiency (mg change in larval weight per g fruit consumed) was almost three times higher in subflexa than in virescens, and responded strongly to selection (Fig. 2d). By generation 6, assimilation efficiency phenotypes in CVselection were indistinguishable from those of subflexa, and by generation 12, average assimilation efficiency in CVselection was higher than in subflexa.

Both subflexa and CVselection were more likely than virescens to feed on P. angulata, and by generation 6 this trait had reached subflexa-like values (Fig. 2b). Over subsequent generations, the proportion of larvae feeding on P. angulata increased slightly, though not significantly. In the backcross progeny, feeding varied slightly between families and was affected by larval start weight (heavier larvae were more likely to feed) (Supplemental Fig. 1).

There was no difference between subflexa and virescens in the amount of fruit a larva consumed, nor did the sexes differ. Although CVselection larvae ate more than subflexa or virescens, this difference was not significant and the amount eaten did not change over 12 generations of selection.

As expected, subflexa had the highest values for percent change in larval weight [(change in larval weight ÷ larval start weight) × 100] when feeding on P. angulata, but CVselection showed little evidence of a response to selection (Fig. 2c) and in fact had lower average values than virescens (Table 1). The percent change in larval weight did not differ between the backcross families, and was not affected by sex or fruit start weight. Larval start weight did have an effect, with larger larvae gaining more.
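The two growth measures defined above can be written out as a minimal sketch. The function names and the example larva are hypothetical and are not taken from the study's data files; only the formulas themselves follow the definitions given in the text.

```python
def percent_change_in_weight(start_mg: float, end_mg: float) -> float:
    """Percent change in larval weight: (change in weight / start weight) x 100."""
    return (end_mg - start_mg) / start_mg * 100.0


def assimilation_efficiency(start_mg: float, end_mg: float, fruit_eaten_g: float) -> float:
    """Assimilation efficiency: mg change in larval weight per g of fruit consumed."""
    return (end_mg - start_mg) / fruit_eaten_g


# Hypothetical larva (illustrative values only): 20 mg at the start,
# 50 mg at the end of the feeding period, 0.5 g of fruit eaten.
pct = percent_change_in_weight(20.0, 50.0)
eff = assimilation_efficiency(20.0, 50.0, 0.5)
print(pct, eff)  # 150.0 60.0
```

Both measures share the same numerator but are scaled by different denominators (start weight versus fruit consumed), which is one way a line could shift in one measure without a matching shift in the other.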

Although we were primarily interested in traits associated with use of P. angulata as a host plant (e.g., willingness to feed on P. angulata and assimilation efficiency), we also examined the correlation between all measured traits. Since we do not fully understand the basis of performance on P. angulata, it was impossible to predict how traits affecting use of P. angulata might relate to more canonical life history traits (e.g., development time). Thus, we tested all 25 possible pairwise correlations between traits in the selection, backcross, and unselected populations (Table 2). CVselection and backcross progeny showed far more inter-trait correlation (17 and 20 correlations, respectively) than virescens and subflexa (which each had six correlations). Selection on survival, pupal weight, development time on P. angulata, and maternal preference for oviposition on P. angulata appears to have caused correlated responses in almost all the traits we measured.
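As an illustration of the kind of rank correlation reported in Table 2, the sketch below computes a Kendall coefficient for one pair of traits. It uses the simple tau-a form, which does not correct for ties; whether the published coefficients used this or a tie-corrected variant is not stated, so this is an assumption. The trait values are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical measurements for five larvae (not from the study).
efficiency = [10, 20, 30, 40, 50]        # assimilation efficiency
development_days = [30, 28, 29, 25, 24]  # development time

tau = kendall_tau_a(efficiency, development_days)
print(tau)  # -0.8
```

A negative coefficient here would mean that larvae with higher assimilation efficiency tended to develop faster in this made-up sample.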

Table 2 Kendall correlation coefficients between traits measured for QTL mapping. Correlations are shown for subflexa, virescens, CVselection, and backcross families

Linkage mapping

The 19 primer pairs yielded a total of 330 informative markers in the three CVselection families. Thirty-five markers occurred in all three families, and 144 markers occurred in at least two families (Supplemental Table 2). On average, each marker occurred in 52% of the families.

In the female-informative family CvSF, 207 markers mapped to 31 linkage groups (see archived data files “Marker Genotypes” and “Linkage Groups” for details). Because there is no recombination in females, each of these linkage groups corresponds to a CVselection chromosome. The true autosome number in heliothines is 30, and there is one pair of homologous sex chromosomes. We could not map the male sex chromosome in CvSF, because all males were homozygous for the virescens Z chromosome, but were able to map the female sex chromosome (Chromosome W) (Fig. 1).

In the two male-informative families we first determined linkage within each family, finding 29 linkage groups in CvSM1 and 33 linkage groups in CvSM2. We next looked for markers that occurred in both families, and used these as anchors to establish chromosome homology. Linkage groups with anchoring markers were combined using the JoinMap command “Combine groups for map integration,” while groups without anchoring markers remained separate. Fourteen groups occurred in both families, fifteen groups occurred only in CvSM1, and nineteen groups occurred only in CvSM2.
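The anchoring step can be sketched as a set-intersection problem: two linkage groups from different families are treated as homologous when they share at least one marker. This is only an illustration of the bookkeeping, not the JoinMap procedure itself, and the group and marker names below are hypothetical.

```python
def match_groups(family_a, family_b):
    """Pair linkage groups from two families that share at least one marker.

    Each family is a dict mapping a linkage-group name to a set of marker names.
    Returns (shared, only_a, only_b).
    """
    shared, only_a = [], []
    matched_b = set()
    for name_a, markers_a in family_a.items():
        partners = [name_b for name_b, markers_b in family_b.items()
                    if markers_a & markers_b]
        if partners:
            shared.append((name_a, partners))
            matched_b.update(partners)
        else:
            only_a.append(name_a)
    only_b = [name_b for name_b in family_b if name_b not in matched_b]
    return shared, only_a, only_b


# Hypothetical linkage groups (marker names are invented for illustration).
cvsm1 = {"LG1": {"m01", "m02", "m03"}, "LG2": {"m04", "m05"}, "LG3": {"m06"}}
cvsm2 = {"LGa": {"m02", "m07"}, "LGb": {"m08", "m09"}, "LGc": {"m05", "m10"}}

shared, only_1, only_2 = match_groups(cvsm1, cvsm2)
print(shared)  # [('LG1', ['LGa']), ('LG2', ['LGc'])]
print(only_1)  # ['LG3']
print(only_2)  # ['LGb']
```

The counts reported above are internally consistent under this bookkeeping: 14 shared plus 15 unique groups gives the 29 groups in CvSM1, and 14 shared plus 19 unique groups gives the 33 groups in CvSM2.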

After independently mapping the female- and male-informative families, we integrated the two sets of maps (Supplemental Table 3). We screened the male-informative groups against the linkage map based on female-informative markers and used anchoring markers to establish homology. Of the 30 autosomes mapped in family CvSF, 24 were present in one or both of the male-informative families. The six remaining autosomes from CvSF did not have male-informative homologs, possibly because we did not have enough markers to anchor the remaining linkage groups. One linkage group, representing the CVselection male sex chromosome (Chromosome Z), was present only in the male-informative families, where it occurred in 50% of females and 49% of males.

In the twenty-four male-informative linkage groups that could be homologized with female-informative chromosomes, chromosome size ranged from 15 to 91 cM, with an average length of 55 cM (Fig. 4). Although the smaller chromosomes could be a result of depauperate marker coverage, similar chromosome size ranges have been found in Bicyclus anynana (14–122 cM (Beldade et al. 2009)) and in Papilio canadensis (14–99 cM (Winter and Porter 2010b)), so the size range we detected may reflect actual variation in the genetic size of the chromosomes. By extrapolating from the average chromosome size, we found a total genome size of 1658 cM. The average distance between markers was 11 cM, ranging from 1 to 48 cM. Linkage mapping in other lepidopterans has shown that genome size is quite variable, ranging from 1167 cM in Papilio (Winter and Porter 2010a) to 2542 cM in Colias (Wang and Porter 2004), so our estimate seems reasonable. The genome size of virescens is estimated to be 401 Mbp (Taylor et al. 1993), meaning that the recombination rate in our mapping population was 243 Kb/cM, somewhat higher than the rates seen in Heliconius butterflies (H. erato: 165 Kb/cM (Tobler et al. 2005); H. melpomene: 180 Kb/cM (Jiggins et al. 2005)), but quite similar to that of Bombyx mori (297 Kb/cM (Yamamoto et al. 2008)).
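The conversion from map length to a physical-per-genetic distance ratio is a single division, sketched below using only the two figures quoted in the text. The variable names are illustrative. The result lands within about 1% of the 243 Kb/cM quoted above; the small gap may reflect rounding in the inputs.

```python
genome_size_mbp = 401.0  # physical genome size of virescens (Taylor et al. 1993)
map_length_cm = 1658.0   # total genetic map length estimated in the text

# Convert Mbp to Kb, then divide by the map length in cM.
kb_per_cm = genome_size_mbp * 1000.0 / map_length_cm
print(f"{kb_per_cm:.0f} Kb/cM")  # 242 Kb/cM

# Values quoted in the text for other Lepidoptera, for comparison.
other_species_kb_per_cm = {
    "Heliconius erato": 165,
    "Heliconius melpomene": 180,
    "Bombyx mori": 297,
}
lower = [name for name, value in other_species_kb_per_cm.items() if value < kb_per_cm]
print(lower)  # ['Heliconius erato', 'Heliconius melpomene']
```

Note that a larger Kb/cM value means more physical DNA per unit of genetic distance, that is, fewer crossovers per base pair.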

Fig. 4 Recombinant linkage groups from male-informative backcross families CvSM1 and CvSM2. QTL affecting traits in male-informative lines are shown as circles, with filled circles denoting CVselection-like phenotypic effects and empty circles denoting virescens-like effects. Chromosomes affecting traits in the female-informative family CvSF are indicated with vertical lines. Solid lines = CVselection-like effects, dashed lines = virescens-like effects. Affected trait indicated by color. See Tables 3–5 for phenotypic effect of each QTL

Marker-phenotype associations

A total of thirty-nine QTL showed some level of association with the phenotypes measured (Tables 3–5; Fig. 4). Six of these were unlinked, and the remainder were distributed across sixteen autosomes and the male sex chromosome. We calculated the effect of each QTL as (1) the amount of backcross variation explained; (2) the amount of variation between subflexa and virescens explained; and (3) the amount of variation between virescens and CVselection explained.

Table 3 QTL associated with variation in the assimilation efficiency of larvae feeding on Physalis angulata
Table 4 QTL associated with variation in the occurrence of larval feeding on Physalis angulata
Table 5 QTL associated with variation in the proportion change in larval weight after 72 h of feeding on Physalis angulata

Assimilation efficiency

Assimilation efficiency was much higher in subflexa than in virescens, and even higher in CVselection. Four chromosomes affected assimilation efficiency in CvSF, one of which also affected assimilation efficiency in the two male-informative families. See Table 3 for a summary of the effects and significance levels of all QTL affecting variation in assimilation efficiency.

Three chromosomes affected assimilation efficiency in CvSF only. Two of these had positive effects, one had negative effects. In the male-informative families, nine QTL from seven chromosomes had significant effects. Three QTL were located in a 35 cM region of Chromosome 18 (which has a total length of about 69 cM) and all had positive effects. QTL from the remaining chromosomes had a mixture of positive and negative effects.

Larval feeding

Larval feeding occurred more frequently in CVselection and subflexa than in virescens, and fourteen QTL (two unlinked) from ten chromosomes were associated with backcross variation in larval feeding (Table 4). Of these, five had positive effects (i.e., their presence was associated with an increased occurrence of larval feeding) and nine had negative effects.

Percent change in larval weight

After feeding on a P. angulata fruit for 72 h, the weight of subflexa larvae increased by a greater percentage than did that of virescens larvae. As described above, the proportion change in larval weight was lower in CVselection than in virescens. Thus, QTL that increased the proportion change in larval weight are subflexa-like in their interspecific effect (subflexa > virescens), but virescens-like in their intraspecific effects (because virescens > CVselection). About half of the QTL had negative effects (Table 5), and two of these were located on Chromosomes 13 and 23, which also harbor positive-effect QTL.

Discussion

The evolution of ecologically adaptive phenotypes depends upon their underlying genetic architecture, which determines whether and how quickly a trait can respond to natural selection (Rajon and Plotkin 2013). We used artificial selection and QTL mapping to examine the architecture of performance on a novel host plant, P. angulata, by the generalist C. virescens. In contrast to models suggesting that traits with simple genetic architectures will respond more readily to selection than those with complex architectures (Arnegard and Kondrashov 2004; Gavrilets and Vose 2007; Gavrilets et al. 2007; Hayashi et al. 2007), performance on P. angulata improved rapidly despite a complex architecture involving QTL on multiple chromosomes. Many previous studies have documented rapid phenotypic responses to natural and artificial selection in complex traits (Messer et al. 2016), and, where the architecture of this response has been examined, it appears to involve many loci of small effect distributed throughout the genome (e.g., (Burke et al. 2010; Turner et al. 2011)).

One mechanism that might explain the rapid phenotypic adaptation we observed is the retention of genetic variation for ancestral host plant use. Several authors have suggested that an organism’s ability to respond to selection is influenced by its evolutionary history, so that lineages whose predecessors evolved in variable environments retain more “adaptive potential” (Kopp and Matuszewski 2014). This potential can result from the retention of variation for the use of ancestral host plants (Janz and Nylin 2008; Li et al. 2001), or from a more general effect of selection for increased plasticity (Hansen 2006). Phylogenetic evidence shows that changes in diet breadth are reticulate over evolutionary time, with lineages experiencing repeated expansions and contractions in the set of hosts they feed on (Winkler and Mitter 2008). In the Heliothinae, diet breadth has gone from narrow (in early diverging lineages) to broad (in the “mega-pest lineage” that includes Chloridea) to narrow again (as demonstrated by reversions to monophagy in C. subflexa and other specialists) (Cho et al. 2008; Gordon et al. 2009). Thus it is likely that P. angulata and other Solanaceae species are part of the collective host repertoire that Chloridea has experienced over evolutionary time. Although empirical evidence regarding the retention of allelic variation for bygone phenotypes is scarce, data from selection experiments in Drosophila melanogaster demonstrate that even hundreds of generations of selection may not lead to elimination of alleles for the “old” phenotype (Burke et al. 2010), suggesting that even when there is no apparent pressure to retain genetic variation, selection fails to expunge it.

If the retention of unused allelic diversity is responsible for the rapid response to selection we observed, then the causal genetic variation was in place long before we started selection. Looking at the distribution of physiological and behavioral phenotypes in virescens (Fig. 5), there does appear to be substantial standing variation for performance on P. angulata. For two of the component traits that contribute to assimilation efficiency (Fig. 5a, b), the highest-performing unselected virescens already exceeded the average subflexa score. For assimilation efficiency itself (Fig. 5c), which measures the ability to convert ingested fruit into body mass, there is nonetheless a sizeable gap between unselected virescens and subflexa. This gap suggests that the optimal combination of variants for some (as-yet-unidentified) component trait is rare or absent in unselected virescens. Turning to standing variation for behavioral traits, larval willingness to feed on P. angulata is quite common in virescens (Table 1; Fig. 2b). For more complex behavioral traits, we previously measured a set of fruit colonization behaviors that contribute to subflexa’s ability to use the calyx of P. angulata as a refuge from natural enemies (Oppenheim and Gould 2002a). Briefly, by fully entering the calyx before feeding, subflexa spend less time exposed to specialist parasitoids. As shown in Fig. 5d, while most virescens took much longer than subflexa to fully colonize a fruit, a few were almost as fast as subflexa. Similarly, the number of entry holes a larva bores in the calyx of P. angulata (a laboratory-based measure of the behaviors required for subflexa-like colonization behavior (Oppenheim et al. 2012)) is consistently higher in virescens than in subflexa (Fig. 5e), but about 30% of virescens show subflexa-like phenotypes.

Fig. 5 Phenotype distributions for Chloridea subflexa and C. virescens. For physiological traits measured in this study: a amount of fruit eaten by larva, b amount of weight gained by larva, c larval assimilation efficiency. For behavioral traits measured in previous studies: d time required for larva to fully colonize P. angulata fruit; e number of entry holes in the calyx surrounding P. angulata fruit. Vertical axis = percent of larvae

The acquisition of a novel host critically depends on a suite of possibly unrelated traits that span multiple life stages, but improved performance on the novel host need not require simultaneous optimization of all these traits. A small change in one trait (for example, larval willingness to feed on the novel host) could change the adaptive value of standing variation in other traits (such as larval behavior). Under such a “many ways to skin the cat” model, fitness can increase along many different trajectories, and a fortuitous combination of existing variants could allow for rapid phenotypic change.

The genetic architecture of intraspecific and interspecific variation

In an earlier study, we investigated the genetic architecture of variation between subflexa and virescens in the use of P. angulata (Oppenheim et al. 2012). As in the present study, we identified QTL involved in assimilation efficiency, the proportion change in larval weight, and the occurrence of larval feeding on P. angulata. Although we have not yet determined the homology of the particular loci responsible for differences in use of P. angulata between virescens and subflexa with those affecting use of P. angulata within virescens, we can compare the genetic architecture of P. angulata use at the chromosomal level.

Although the interspecific study involved almost five times as many backcross progeny and thus should have had greater power to detect QTL (Nicod et al. 2016), the number of chromosomes that harbored QTL was the same (seventeen out of thirty-one). This surprising absence of a sample size effect may result from a limitation peculiar to female-informative crosses in Lepidoptera: Because recombination does not occur in females, whole chromosomes are inherited intact. Thus, complementary QTL (those with opposing effects) on a given chromosome may cancel each other out. (In male-informative crosses, QTL from the same chromosomes can segregate independently and their individual effects can be estimated.) A more direct estimate of sample size effects can be gained by comparing the interspecific study (which used only female-informative crosses) to the female-informative intraspecific backcross reported here. The results corroborate the effect of sample size on QTL detection: In the current female-informative intraspecific analysis (where N = 173) we identified nine chromosomes affecting variation in at least one of the three traits, compared to seventeen chromosomes in the interspecific backcrosses (N = 1462).
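The cancellation argument can be made concrete with a toy calculation. The sketch below assumes two QTL of equal and opposite effect on one chromosome and, for the male-informative case, four equally frequent haplotype classes; both the effect sizes and the equal class frequencies are simplifying assumptions chosen for illustration, not estimates from the study.

```python
# Hypothetical additive effects of two QTL on the same chromosome.
EFFECTS = {"A": 2.0, "B": -2.0}


def phenotype(alleles):
    """Sum of the effects of the selected-line alleles a larva carries."""
    return sum(EFFECTS[q] for q in alleles)


# Female-informative cross: no recombination, so offspring inherit either
# the whole chromosome (both QTL) or none of it.
female_contrast = phenotype(("A", "B")) - phenotype(())
print(female_contrast)  # 0.0

# Male-informative cross: recombination separates the two QTL, giving four
# haplotype classes (assumed equally frequent here for simplicity).
male_classes = [("A", "B"), ("A",), ("B",), ()]


def contrast(qtl):
    """Mean phenotype of carriers minus mean phenotype of non-carriers."""
    carriers = [phenotype(c) for c in male_classes if qtl in c]
    others = [phenotype(c) for c in male_classes if qtl not in c]
    return sum(carriers) / len(carriers) - sum(others) / len(others)


print(contrast("A"), contrast("B"))  # 2.0 -2.0
```

In this toy case the whole-chromosome contrast is zero even though each QTL has a detectable effect once recombinants are available.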

In both studies, we found some QTL with effects in the wrong direction: Three interspecific and six intraspecific chromosomes harbored QTL associated with phenotypes that were virescens-like rather than subflexa-like. The proportion of wrong-direction QTL varied between traits: 60% of the QTL affecting larval feeding had virescens-like effects, as did 43% of the QTL associated with assimilation efficiency. The high frequency of virescens-like effects for larval feeding on P. angulata is particularly surprising, given that larvae that did not feed during the selection regime would die, and thus the non-feeding phenotype was strongly selected against.

The explanation for this may lie in an overall difference between subflexa and virescens in the willingness of larvae to feed: Previous studies have found that a surprisingly large fraction of subflexa larvae simply fail to initiate feeding, even on favored hosts (Bateman 2006; Laster et al. 1982; Sheck and Gould 1993). It has been suggested that one important distinction between generalists and specialists is that generalists are indiscriminate feeders while specialists feed only upon plants that they recognize as hosts (Forister et al. 2007). If this is true in our system, selection for performance on P. angulata may have been accompanied by an overall decrease in willingness to feed that is unrelated to the particular host plant but reflective of a typical specialist phenotype.

In any case, for assimilation efficiency, which is the trait that most clearly distinguishes subflexa from virescens, the genetic architecture of intraspecific and interspecific variation was remarkably similar. We found ten chromosomes associated with interspecific differences between subflexa and virescens, and eight of these together accounted for 37% of the backcross variance and 170% of the interspecific difference. In the intraspecific study reported here, we found 12 QTL distributed across ten chromosomes associated with differences between virescens and CVselection, and six of these together accounted for 49% of the backcross variance and 90% of the intraspecific difference.

These strikingly similar patterns might suggest that the same genes are responsible for phenotypic variation within and between Chloridea species, but evidence for this is lacking. Results from insect courtship studies demonstrate that very similar interspecific and intraspecific genetic architectures underlie reproductively isolating traits in different populations and species, but that different traits, mechanisms, and genes are involved (Arbuthnott 2009). Of greater interest than whether similar architectures imply parallel genotype–phenotype relationships is whether architectural differences have consequences for the evolution of ecologically adaptive phenotypes. While architecture as a trait in itself is poorly studied, some work in model systems does suggest that different modes of selection can lead to substantial variation in trait architectures. Variation in craniofacial morphology, which has been subject to anthropogenic selection in dogs and sheep, has a very simple architecture in domesticated species (Boyko et al. 2010). In wild species, however, where the only selection pressure is inter- and intraspecific recognition, variation is controlled by many genes of small effect and has the same architecture within and between species (Pallares et al. 2016). It seems possible that, irrespective of the specific genes involved, the genetic architecture of a trait can reveal the selection pressures that gave rise to it, as well as the potential to evolve new phenotypic variants.

Implications for the evolution of ecologically important traits

What implications do our results have for the evolution of ecologically important traits? One striking result from our research is that the response to selection in virescens for performance on P. angulata was relatively rapid and unconstrained, suggesting that standing variation in virescens would allow for an evolutionary shift onto this novel host plant, consistent with recent findings in Lycaeides butterflies (Gompert et al. 2015). Selection for increased survival on P. angulata produced a response in other traits, including assimilation efficiency and willingness to feed on P. angulata. Thus, although host plant use involves many component traits, it appears that simple selection for survival on a novel host can drive adaptation in a suite of related host use traits.

The architecture of adaptation to P. angulata is complex, in that it involves QTL from more than half of the 31 Chloridea chromosomes, but the path to adaptation may be fairly simple. In contrast to systems where adaptation involves resistance to plant defense toxins, it seems that adaptation to P. angulata does not proceed as a single leap to a new adaptive peak. Instead, adaptation to this host is a mixture of many interchangeable loci with incremental effects on the ability to use P. angulata, suggesting that gradual adaptation in response to varying ecological selection pressure would be possible in the field. An environment rich in natural enemies, or one in which Physalis species were more abundant (or reliable) than other potential hosts, would exert pressure for larvae to adopt Physalis as a host, and the results we report here suggest that standing variation is sufficient to allow a phenotypic response.

Although we do not know whether the response to selection we observed in C. virescens is based on the same selection pressures that led C. subflexa to specialize on P. angulata, we can now say that the genetic architecture of intraspecific and interspecific variation is quite similar in terms of the number, distribution, and effect size of the loci involved. A full understanding of the genetic basis of host plant use, which is a multifactorial trait involving quantitative variation across a range of component traits, may require the identification of tens to hundreds of different loci (Flint and Mackay 2009). While increasing the number of markers used (e.g., SNP genotyping or RAD sequencing) can help reduce QTL interval sizes (and thus the number of base pairs that must be evaluated in the search for causal loci), there is no simple way to determine the genetic basis of a complex trait. We are currently engaged in genomic and transcriptomic analyses of C. subflexa and C. virescens and hope through a combination of approaches to identify candidate genes and regions responsible for intraspecific and interspecific variation in host plant use.

Data availability

All data files (Larval Phenotypes, Linkage Groups, and Marker Genotypes) are available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.q2932