Introduction

Host plant adaptation traits are of fundamental importance to herbivorous insects. Most herbivores are adapted to a specific host plant or a narrow range of host plants on which they rely for food or other resources. Some host plant–insect associations are evolutionarily ancient, whereas others result from recent colonization events due to introduction of either plant or insect to the range of the other (Tabashnik, 1983; Fox, 2006). Herbivorous insects may broaden their host range and include new hosts, shift to a novel host or narrow the range to exclude a former host (Via, 1990; Thompson and Pellmyr, 1991). Recent host shifts or range expansions provide interesting scenarios for the study of the genetics of adaptation. Although this field has been studied and discussed intensively over the past decades (reviewed in Orr and Coyne, 1992; Orr, 2005), there is still much controversy about the genetic basis of adaptation. Much debate centers on whether adaptation primarily arises from a few genes with large effect each (that is, mono- or oligogenic) or from many genes with small effect each (that is, polygenic) (reviewed in Orr, 2005). Similarly, questions remain as to whether adaptation arises from new mutations or standing genetic variation, whether novel adaptive alleles are generally dominant or recessive and whether adaptations arising from human disturbance differ in their genetic architecture from those adaptations that arise under natural conditions. An understanding of the genetic architecture (that is, mode of inheritance, the number of genes involved) of a trait that allows for survival on a novel host plant is important because it dictates its evolutionary potential.

Successful adaptation to a host plant requires that the adult female accepts it for oviposition, and that larvae are able to feed on and develop to maturity on the host. The genetics of host plant adaptation or host-associated performance have been studied in several insects and a range of genetic architectures have been observed: Hawthorne and Via (2001) detected polygenic inheritance in pea aphids, whereas in other cases evidence for the influence of a limited number of genetic factors (oligo- to monogenic inheritance) was found (Jones, 1998; Sezer and Butlin, 1998; de Jong et al., 2000). For the mode of inheritance, a general pattern seems to apply: genes affecting larval performance consistently map to the autosomes (Tang et al., 2006). Genes controlling oviposition preference are less consistent: they are sex-linked in Papilio butterflies (Thompson, 1988), but autosomal in moths, Heliothis virescens and H. subflexa (Sheck and Gould, 1995), and Yponomeuta species (Hora et al., 2005).

A major limitation in studying the genetic basis underlying novel host plant colonization in herbivorous insects is the difficulty of identifying systems in which adaptation is a recently completed or still-ongoing process. A suitable system would be one that enables crosses between individuals that recently specialized on different hosts, such as two strains of a species that vary in host use. The crucifer-specialist diamondback moth (DBM), Plutella xylostella (Lepidoptera: Plutellidae), provides a unique opportunity to study the genetic basis of a very recent host plant colonization. DBM feeds on crucifers (Brassicaceae), a plant family characterized by the glucosinolate–myrosinase defense system toxic to most herbivores but harmless to P. xylostella larvae that circumvent this defense system with a specific enzyme, glucosinolate sulfatase (Ratzka et al., 2002). Owing to the success of this detoxicative strategy, DBM is a major pest of cultivated cruciferous vegetables in tropical and semitropical regions worldwide. However, in 1999 a population in the Kenyan Rift Valley was reported to feed on sugar snap pea (Pisum sativum L. var. macrocarpon, cultivar Oregon Sugar Pod (Fabaceae)) (Löhr, 2001; Löhr and Gathu, 2002). This surprising discovery led to further studies; larvae from this population were collected in 2001 and 2002, and reared on the same pea plant cultivar in the lab since then (Löhr, 2001). As this population can survive and develop fully on pea plants, it is referred to as the pea host-strain (P-strain). Löhr and Gathu (2002) showed that P-strain larvae can develop equally well on both host plants, cabbage and pea. In a separate selection experiment with a typical crucifer-feeding strain of DBM, they were able to increase larval survival on pea from 2.4 to 49.7% within six generations (Löhr and Gathu, 2002), suggesting one or a few major pre-existing genes being responsible for adaptation to pea.

So far, nothing is known about the genetics of the recent larval adaptation of the DBM P-strain to sugar pea. A first step in studying the genetic basis of host plant adaptation is to determine the mode of inheritance. Considering the rapid evolution of the trait, we hypothesized a simple genetic basis (for example, a single dominant or recessive gene). To understand the genetic basis of larval adaptation to pea, we hybridized the P-strain with a C-strain (cabbage-feeding but not pea-adapted strain) of P. xylostella, examined the survival rates of F1 hybrids, F2 hybrids and backcrosses on pea plants and performed linkage analysis using amplified fragment length polymorphism (AFLP) markers in female-informative backcross families to establish a linkage map. Because DBM is a non-model organism, no homologized genetic linkage map exists for it so far; developing such a map would help in identifying genomic regions contributing to host performance. We found that the trait is mainly autosomal and oligogenic, with additional maternal but not sex-linked effects, and with a surprising degree of genetic heterogeneity still present in the P-strain. We discuss our results in the light of other studies on host plant adaptation and the genetics of adaptation in general.

Materials and methods

Insect strains

Two P. xylostella strains, Waite and DBM-P, were used for crosses. The Waite strain is a cabbage-adapted strain that is unable to survive on pea. It derives from a field collection in South Australia, was originally obtained from Waite Campus, Adelaide, and was maintained as a laboratory culture for many generations in the laboratory of Dr Nancy Endersby, Victorian Department of Natural Resources, who provided it to us. At the Max Planck Institute for Chemical Ecology in Jena, it has been reared since 2005 for more than 50 generations. The DBM-P strain (P-strain) originates from Kenya and was sent to the Max Planck Institute for Chemical Ecology by Bernhard Löhr from the International Centre of Insect Physiology and Ecology (ICIPE) in Nairobi, Kenya, in 2005. DBM-P was originally collected in 2002 from the infested pea field in Naivasha, Rift Valley Province, Kenya, where the host shift had been observed in 1999, and was maintained as a laboratory culture at ICIPE from then on. At the Max Planck Institute for Chemical Ecology in Jena, the DBM-P strain has subsequently been reared for more than 40 generations. Both strains are kept under the same conditions in a Viessmann climatic room at 21 °C, 55% relative humidity and a photoperiod of 16:8 light–dark.

Rearing of the Waite C-strain of P. xylostella took place in a rearing cage (40 × 40 × 40 cm) with kale leaves (Brassica napus) as stimulus and substrate for egg-laying and 5% honey solution provided as adult food source. The leaves and the eggs laid thereon were transferred to smaller plastic containers (18 × 18 × 6 cm), and hatched larvae were provided with freshly cut B. napus leaves. Pupae were collected from plastic containers and relocated to rearing cages for emergence, mating and oviposition.

Rearing procedure for the DBM-P was similar to the Waite C-strain, except that the mating occurred in plastic containers (18 × 18 × 6 cm) with cut pea plants offered as egg laying substrate and 5% honey solution as adult food source. For larval development, pupation and emergence of adult moths plant material together with the eggs laid thereon was transferred to a rearing cage (60 × 60 × 60 cm) with fresh leaves of Pisum sativum var. Oregon Sugar Pod. Plants used for the maintenance of insect colonies were reared in the greenhouse at 21–23 °C, 50–60% relative humidity and 14:10 light–dark.

Crossing strategies

To assess survival on pea, matings between the cabbage-adapted Waite C-strain (C) and the pea-adapted DBM-P strain (P) were performed to generate F1 progeny. We chose a C-strain from a different continent than the African P-strain to maximize the amount of AFLP polymorphism segregating in the backcrosses. All crosses were single pair matings between virgin males and females; to this end, individuals were confined in small tubes and kept therein until emergence. In the first crossing design, hereafter referred to as cross 1 (C1), C and P males and females were crossed in each direction. Cross-types are abbreviated by writing the paternal strain first; for example, CP denotes a cross between a C-strain male and a P-strain female. The F1 offspring of each family were divided equally between kale and pea plants and reared to adulthood. The sexes of F1 adults were determined upon emergence, and these were backcrossed with the respective backcross partner from the Waite strain, producing a backcross generation. F1 adults were also intercrossed, producing an F2 generation.

Since in the C1 crosses the number of offspring that survived on pea was very low, we conducted a second crossing design, referred to as cross 2 (C2). In this case, the parental generation (C- and P-strain) was raised on kale to eliminate host-associated maternal effects and derived from intrastrain single pair matings to reduce the genetic heterogeneity within subsequent backcrosses. This strategy was used instead of inbreeding parental lines, to avoid inbreeding depression known to cause severe effects (infertility, inviability, and so on) in Lepidoptera (for example, Roush, 1986). Similar to C1, C- and P-strain males and females were crossed in each possible direction to establish an F1 generation. However, this time F1 offspring were exclusively raised on kale. Resulting F1 adults were either backcrossed to the P-strain males, P-strain females (these backcross partners derived from intrastrain crosses also reared on kale) or intercrossed in single pair matings so that all parents and grandparents were known. An overview of the mating design of both C1 and C2 is given in Figure 1.

Figure 1

Mating design for cross C1 (a) and cross C2 (b). Males and females of the pea-adapted host strain (P) and cabbage-adapted strain (C) were crossed in every possible direction. In C1, F1 and backcross progenies were distributed equally on pea (½) and kale (½) plants. In cross C2, all F1 and P-strain individuals used in the backcross were reared on kale. F2 and backcross progenies were distributed in a 2:1 ratio (2/3 on pea and 1/3 on kale).

Feeding assay

A feeding assay was conducted with F1, backcross and F2 progeny from C1 and backcross and F2 progeny from C2 to assess the phenotype ‘survival on pea’. In all, 10–15 eggs were transferred with a fine brush onto a leaf of a potted kale or pea plant, and the survival rates per family and host plant were determined by counting the emerging adults. In C1, progeny from each family (in the F1, backcross and F2 generations) were divided equally between kale and pea plants. In C2 the feeding assay was slightly modified. The whole F1 generation was raised on kale to minimize any maternal host plant effect, and offspring from backcross and F2 families were unequally apportioned to host plants, with 2/3 on pea and 1/3 on kale (Figure 1b). This served to increase the number of surviving offspring on pea for later genetic analysis. Three backcross families showed high survivorship on pea and were chosen for AFLP analysis; these were BC_01 with 23 of 37 individuals surviving on kale and 35 of 74 on pea, BC_02 with 33 of 46 surviving on kale and 45 of 94 on pea and BC_03 with 56 of 56 surviving on kale and 59 of 110 on pea (Figure 2). Pearson's product–moment correlation was calculated to assess the correlation between survival rates on kale and pea for each F1 and backcross family using R (R Development Core Team, 2010).

Figure 2

Survival rates of larvae per family in (a) F1 generation of cross C1; (b) backcross generation of cross C1; and (c) backcross generation of cross C2. Diamonds: survival rates on kale plants; squares: survival rate on pea plants. Each vertical pair of a square and a diamond, connected with a dotted line, represents the survival rate of larvae from one family on pea and kale, respectively.

AFLP template preparation and analysis

For genetic analysis genomic DNA was extracted according to a modified protocol from Reineke et al. (1998), using CTAB and a TissueLyser (Qiagen, Hilden, Germany) for mechanical disruption. The quality of genomic DNA was verified using agarose gel electrophoresis and concentration was measured using a Nanodrop ND1000 (Thermo Scientific, Wilmington, DE, USA).

AFLP analyses were performed according to Vos et al. (1995). Extracted genomic DNA (±200 ng) was digested with EcoRI and MseI (New England Biolabs, Schwalbach, Germany). Sequence information of adapters and primers used for AFLP analysis can be found in Supplementary Table 1. Each polymerase chain reaction (PCR) contained two differently labeled EcoRI-selective primers, one labeled with IRDye 700 and the other with IRDye 800. This multiplexing allowed two different AFLP reactions to be detected simultaneously on the same gel, because fragments are detected in two different channels of the LI-COR DNA Analyzer 4300 (LI-COR Biosciences, Bad Homburg, Germany). To isolate AFLP bands for sequencing, gels were re-run using only one EcoRI primer.

Amplified products were separated based on size with an LI-COR DNA Analyzer 4300. A formamide-dye stop solution was added to the AFLP reactions and samples were heat-denatured before electrophoresis. For separation, a 6.5% polyacrylamide gel (KB-PLUS, LI-COR) was chosen. A labeled size standard was loaded at each end. The gels were run for 2.5 h and the images were collected automatically in a computer file. Gels were scored using the image analysis program SagaMX Version 3.3 (LI-COR).

Families analyzed consisted of grandparents, parents and F1 progeny that survived on pea and kale. Bands present in the F1 female, absent in the recurrent backcross father and segregating in the backcross progeny were scored. In this way, only female-informative AFLP bands were used to identify linkage groups (LGs). From C1, four families were analyzed, and from C2, three families were analyzed (BC_01, BC_02 and BC_03). The establishment of LGs was only performed on C2-derived backcross families. Because there is no crossing-over in meiosis in Lepidopteran females (Heckel, 1993), all markers on the same chromosome will co-segregate as a single unit. A Pascal program written by DGH (DBM3Lnk.p) was used to identify groups of co-segregating AFLPs in those female-informative backcross families. After LGs were identified, we determined the source of the chromosome passed on from the F1 mother (that is, inherited from the grandfather or grandmother) to her progeny for each of the LGs.
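The grouping step can be illustrated with a minimal sketch. This is not the DBM3Lnk.p program, whose internals are not described here; it only assumes that, without recombination, markers on one chromosome show either identical or exactly complementary presence/absence patterns (bands inherited from the same or from the opposite grandparent). The marker names and scores are hypothetical, and real data would additionally require tolerance for scoring errors and missing values.

```python
from collections import defaultdict


def group_cosegregating(markers):
    """Group markers with identical or exactly complementary 0/1 patterns.

    markers: dict mapping a marker name to a tuple of 0/1 band scores,
    one score per backcross progeny (hypothetical input layout).
    """
    groups = defaultdict(list)
    for name, pattern in markers.items():
        pattern = tuple(pattern)
        complement = tuple(1 - score for score in pattern)
        # A pattern and its complement describe the same chromosome pair,
        # so the smaller of the two tuples serves as a canonical key.
        key = min(pattern, complement)
        groups[key].append(name)
    return sorted(sorted(names) for names in groups.values())


# Hypothetical scores for four markers in six backcross progeny.
example_markers = {
    "m1": (1, 0, 1, 1, 0, 0),
    "m2": (1, 0, 1, 1, 0, 0),  # same pattern as m1
    "m3": (0, 1, 0, 0, 1, 1),  # complement of m1 (opposite grandparent)
    "m4": (1, 1, 0, 0, 1, 0),  # unrelated pattern, separate group
}
linkage_groups = group_cosegregating(example_markers)
print(linkage_groups)
```

In this toy input, m1, m2 and m3 fall into one group and m4 into another, mirroring how each LG collects all markers that travel together through the F1 mother.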

We used a two-step procedure to estimate the contribution of chromosomes to differential survival on the two hosts. First, for each LG in each backcross family, Fisher's exact test was computed from the 2 × 2 contingency table showing the numbers of backcross progeny feeding on pea carrying alleles from the P-strain (a) or C-strain grandparent (b) in the first row, and numbers feeding on kale carrying alleles from the P-strain (c) or C-strain grandparent (d) in the second row. Because of the expectation that alleles from the P-strain with differential survival would confer an advantage on pea-feeding individuals, we computed the one-tailed probability corresponding to tables with the observed value of a or greater, using the FREQ procedure in SAS version 9.1.
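As an illustration of the first step, the one-tailed probability can be computed directly from the hypergeometric distribution. The sketch below is not the SAS FREQ procedure used in the analysis; it is a plain Python equivalent under the same table layout (a, b in the pea row; c, d in the kale row), and the example counts are hypothetical.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher's exact test for a 2 x 2 table.

    Returns the probability, with all margins fixed, of observing a
    top-left cell count of a or greater.
    """
    row1 = a + b          # progeny feeding on pea
    col1 = a + c          # progeny carrying the P-strain homologue
    total = a + b + c + d
    denominator = comb(total, row1)
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 3 of 4 pea survivors and 1 of 4 kale survivors
# carry the P-strain homologue.
p_value = fisher_one_tailed(3, 1, 1, 3)
print(round(p_value, 4))
```

A small P-value from this function corresponds to over-representation of the P-strain homologue among pea survivors; tables in which the P-strain homologue is under-represented on pea give values close to 1.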

In the second step, data from homologous chromosomes were pooled over as many families as possible. Homologous chromosomes were identified by bulked segregant analysis using AFLPs as described below, and assigned numbers in sequence. For Chromosome 1, corresponding LGs could be identified in all three families, using an AFLP that was segregating in two families and scored as a co-dominant marker in the third. Chromosome 3 likewise could be identified in all three families, and Chromosomes 2, 4 and 6 in two families by bulked segregant analysis using AFLPs. Only Chromosome 5, which was identified in BC_01 in the first step, failed to be matched with LGs from either of the other two families. Data for each chromosome were pooled across all the families for which LGs could be identified, and subjected to Fisher's exact test as before. This time the criterion for significance was P<0.0083, by applying the Bonferroni correction for six independent tests.
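The pooling and the corrected threshold can be sketched as follows. We assume here that pooling means summing the four cells of the 2 × 2 tables of homologous LGs across families, and that the nominal significance level is 0.05 (implied by the threshold of 0.0083 for six tests); the counts shown are hypothetical.

```python
def pool_tables(tables):
    """Sum 2 x 2 tables, each given as (a, b, c, d), cell by cell."""
    return tuple(sum(cells) for cells in zip(*tables))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


# Hypothetical tables for one chromosome identified in two families.
family_tables = [
    (12, 5, 6, 9),
    (10, 7, 8, 8),
]
pooled = pool_tables(family_tables)
threshold = bonferroni_threshold(0.05, 6)
print(pooled, round(threshold, 4))
```

The pooled table is then tested in the same way as the per-family tables, and its P-value is compared against the corrected threshold rather than against 0.05.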

Excision, reamplification and sequencing of AFLP fragments

To develop markers that could be used to homologize LGs between backcross families and/or to assess whether any of the AFLP fragments derive from specific coding regions, AFLP bands from the four LGs of BC_02 with the smallest P-values by Fisher's exact test were excised from AFLP gels for sequencing and further analysis. For this purpose, the selective AFLP reaction that gave rise to the band of interest was repeated as a non-multiplex-selective AFLP reaction and the PCR product was loaded on the gel. Recovery of the bands of interest followed the LI-COR protocol (LI-COR AFLP manual, 2007). PCR conditions were 35 cycles of 94 °C for 30 s, 56 °C for 1 min and 72 °C for 1 min, with a final extension step of 2 min. The amplified fragments were analyzed by agarose gel electrophoresis. When reamplification was successful, 30 ng of the PCR product was treated with ExoSAP (USB Corporation, Cleveland, OH, USA) to remove primers and nucleotides. In cases where agarose gel electrophoresis showed only a faint band of the PCR product, a second reamplification was performed with 5 μl of the first reamplification as template (20 cycles of 94 °C for 30 s, 56 °C for 1 min and 72 °C for 1 min). Reamplified AFLP fragments were directly sequenced on an automated sequencer (Applied Biosystems 3730/XL/96 capillary DNA analyzer, PE Applied Biosystems, Foster City, CA, USA). Sequences were deposited in GenBank (accession nos. GU594729–GU594732).

Amplification of fragment M-CGA_E-AAG in BC_03

AFLP fragment M-CGA_E-AAG_483 occurred in one LG of BC_01 and one of BC_02; therefore, these LGs were considered to be the same and named Chromosome 1. However, this fragment did not appear as an AFLP marker in BC_03. Specific primers (PxCG10501-F1 and PxCG10501-R2; Supplementary Table 1) were designed to amplify a portion of this fragment in BC_03. PCR was carried out in a total reaction volume of 20 μl with 50 ng template DNA, 10 × mi-Taq buffer, 2.0 mM dNTPs, 20 mM each of forward and reverse primer and 5 U μl−1 Taq polymerase. PCR conditions were 2 min at 95 °C for denaturing, followed by 30 cycles of 95 °C for 30 s, 56 °C for 30 s and 72 °C for 1 min, with a final extension step of 2 min. The PCR product was cleaned with the QIAquick PCR purification kit (Qiagen) and directly sequenced, which identified an SNP that was used to map the fragment in BC_03.

Bulked segregant analysis

To efficiently identify homologous LGs from the three female-informative backcross families of C2, we used a modified form of bulked segregant analysis (Michelmore et al., 1991). The basic strategy was to create bulks separately for each family–chromosome combination by pooling DNA from backcross individuals receiving P- vs C-chromosomes from their F1 parent, and then to screen these bulks simultaneously with many AFLP primer pairs. We created two bulks per family for each of several selected LGs: one bulk consisting of individuals in which the AFLP marker band for this LG was present (plus) and another in which this specific band was absent (minus). LGs chosen for bulking were LG2, LG3, LG4, LG5 and LG6 of BC_01; LG2, LG3 and LG4 of BC_02; and LG6 and LG7 of BC_03. To create each bulk, the pre-amplification products of eight individuals (half males and half females; half pea and half kale survivors) were combined. All bulks were subjected to selective AFLP amplifications and run together on a polyacrylamide gel. Those pairs of bulks that showed the same presence–absence pattern for the same-sized AFLP band in two or more families were scored as the same LG between these families and assigned the same chromosome ID.

Results

Crossing experiment

Interstrain crosses yielding F1, F2 and backcross progeny are depicted in Figure 1 and the survival rates on pea and kale per family are shown in Figure 2. The number of families per type of cross and the overall survival rates of larvae in cross 1 (C1) and 2 (C2) are given in Table 1. In the F1 generation, we would expect a 100% survival rate on pea under dominant inheritance of the trait ‘survival on pea’, and no survival if the trait is inherited recessively. Overall, in the F1 generation (of C1) the survival rate on kale was higher than on pea, irrespective of the direction of cross whether C-father × P-mother (CP) or P-father × C-mother (PC). The survival rates on pea of the two crossing types differed: CP-type offspring, with a P-strain mother, had higher survivorship on pea plants (up to 30%; Figure 2a) than F1 progeny from PC-type, that is, with a C-mother, where almost no survival occurred on pea. Thus, the offspring of P-strain mothers performed better on pea, indicating a maternal effect. (Possible maternal effects include those due to DBM strain origin or host plant consumed by the larva, which are confounded in this cross since all P-strain mothers consumed pea in this cross.) The survival rates on kale were highly heterogeneous in the CP- as well as PC-type, ranging from 30 to 90% survival, with no obvious pattern. The fact that survivorship in the F1 generation occurred at all led us to first investigate the assumption of a partially dominant inheritance of the trait ‘survival on pea’, and thus directed our choice of the C-strain as backcross partner in C1.

Table 1 Number of families of C1 and C2 per type of cross (male × female), generation and corresponding overall survival rates on kale and pea

Cross 1, backcross and F2 generation

F1 survivors descending from the CP cross (C1) that survived on kale and pea were either backcrossed to the C-strain to obtain a backcross generation, or intercrossed to establish an F2 generation. The survival rates of backcross and F2 progeny were assayed on the two host plants, kale and pea. Under the assumption of a single locus, the expected ratio of offspring from the backcross would have been 50% homozygous for kale feeding and 50% heterozygous for pea feeding. Under our predicted (partial) dominant inheritance of the trait ‘survival on pea’, we would expect a 50% survival rate among backcross progeny and 75% survivorship in F2 families. However, when assaying the survival rates of the (CP) × C and C × (CP) backcross progeny, the survival rate on pea plants was uniformly low (⩽10%; Figure 2b). The survivorship on kale was not as heterogeneous as seen in the F1 generation and could be divided into two classes: one in the range of ±90% and the other clustering at ±60% of surviving offspring. However, the single F2 family (CP × CP) showed a higher survivorship on pea (38%), which was not significantly different from 25% (χ2=0.05, d.f.=1). The survival rate in the F2 resembling a 1:3 ratio and the overall low numbers of pea survivors in the backcross generation suggested a recessive inheritance of the gene(s) responsible for the trait ‘survival on pea’ rather than the previously assumed dominant inheritance.
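The expected survival rates quoted above follow from simple Mendelian segregation at one locus. The sketch below reproduces them under strong simplifying assumptions that the observed data do not fully meet: a single autosomal locus, complete penetrance, no maternal effects and no mortality unrelated to the host. The allele labels "P" and "C" denote strain of origin and are used for illustration only.

```python
from fractions import Fraction
from itertools import product


def expected_pea_survival(parent1, parent2, dominant):
    """Expected fraction of offspring able to survive on pea.

    parent1, parent2: two-letter genotypes built from the alleles "P"
    (from the pea-adapted strain) and "C" (from the cabbage strain).
    dominant: True if one "P" allele suffices, False if two are needed.
    """
    offspring = [a + b for a, b in product(parent1, parent2)]
    if dominant:
        survivors = [g for g in offspring if "P" in g]
    else:
        survivors = [g for g in offspring if g == "PP"]
    return Fraction(len(survivors), len(offspring))


crosses = {
    "F1": ("PP", "CC"),
    "backcross to C": ("PC", "CC"),
    "backcross to P": ("PC", "PP"),
    "F2": ("PC", "PC"),
}
for name, (p1, p2) in crosses.items():
    dom = expected_pea_survival(p1, p2, dominant=True)
    rec = expected_pea_survival(p1, p2, dominant=False)
    print(f"{name}: dominant {dom}, recessive {rec}")
```

Under these assumptions, a backcross to the C-strain yields no pea survivors if the trait is recessive, whereas a backcross to the P-strain yields half, which is why the choice of backcross partner matters for distinguishing the two modes of inheritance.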

Cross 2, backcross and F2 generation

In C2 the two parental strains (C and P), the entire F1 progeny (from PC and CP) and the P-strain individuals used in the backcross as parents were reared on kale to minimize maternal effects that may be generated by the type of host plant the mother consumed (for the mating design of C2 see Figure 1b). To further investigate the putative recessive inheritance indicated by the first cross (C1), F1 progeny of C2 were backcrossed to the P-strain as well as intercrossed. Overall, the survival rates on pea plants among backcross progeny of C2 were higher (Figure 2c) than in backcross progeny of C1, which had been backcrossed to the C-strain, supporting a recessive type of inheritance of the trait ‘survival on pea’. Apart from a generally higher survival rate on kale than on pea, the survival rates on both plants were highly heterogeneous with no obvious pattern of positive or negative correlation in survivorship on the two hosts (Figure 2c).

Overall, we did not detect a significant correlation between survival on kale and survival on pea; that is, a high survival rate on kale was neither positively nor negatively correlated with a high survival rate on pea (Figures 2a–c). The strongest correlation was shown by the (CP) × C-type of cross, but it was not significant (R=0.81; t=1.94, P=0.19).
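The relation between the correlation coefficient and its test statistic can be made explicit with a short sketch. The number of families entering this correlation is not stated in the text; the value n=4 used below is an assumption, chosen because it gives two degrees of freedom and reproduces the reported statistics. The analysis itself was run in R, so this Python version is only an illustration.

```python
from math import sqrt


def pearson_t(r, n):
    """t statistic for testing a Pearson correlation r from n pairs."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def two_tailed_p_df2(t):
    """Two-tailed P-value of a t statistic with 2 degrees of freedom.

    Uses the closed form of the t distribution for df = 2:
    F(t) = 1/2 + t / (2 * sqrt(2 + t^2)).
    """
    return 1 - abs(t) / sqrt(2 + t * t)


r_reported = 0.81   # from the text
n_assumed = 4       # assumption, not stated in the text
t_value = pearson_t(r_reported, n_assumed)
p_value = two_tailed_p_df2(t_value)
print(round(t_value, 2), round(p_value, 2))
```

With the rounded value R=0.81 the sketch gives t of about 1.95, close to the reported 1.94 (the small difference presumably reflects rounding of R), and P=0.19. It shows that even a coefficient as high as 0.81 cannot reach significance with so few families.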

Linkage analysis

The segregation patterns of AFLP markers were used to identify LGs in both sets of female-informative backcrosses, in which the mother was always an F1. Because of the absence of crossing-over in female Lepidoptera (Heckel, 1993), LGs can be identified as non-recombinant blocks of AFLPs with the same segregation pattern across backcross progeny sets. Four female-informative backcross families of C1 (C × (CP)-type: BC_07, BC_09, BC_12 and BC_19) were scored for AFLP bands inherited from the F1 mother that were absent in the C-father. Each family consisted of grandparents, parents and 12 backcross progeny. Previously published AFLP analyses for P. xylostella utilized one large family that was scored per AFLP gel (Heckel et al., 1999; Baxter et al., 2005); however, owing to the small number of offspring in any given family, offspring of the four families were scored in parallel on a single gel. Because of this limitation and the high genetic diversity within each of the P. xylostella strains (as revealed in genetic analyses; unpublished data), a polymorphic band scored in one family was generally not polymorphic in the other three families. AFLP analysis with 23 primer combinations resulted in the scoring of 239 different AFLP markers, which were assembled into a 0/1 matrix. The number of markers per primer combination ranged from 19 (E-ATG_M-CTA) to three (E-ATG_M-CTT). Out of the 239 scored markers, we found only two markers that were informative in all four families (E-AAC_M-CGA at 465 bp and E-ACA_M-CTG at 229 bp). Eighteen informative markers were present in three families, 115 informative markers in two families and the remaining 104 markers were only present in one of the four families. The low fraction of co-informative markers made it impossible to combine data from the families into a single linkage map. On the basis of previous studies (Heckel et al., 1999; Baxter et al., 2005), about 250 AFLP markers informative in all four families would have been required to identify all 31 chromosomes present in P. xylostella.

The number of surviving backcross individuals per family in C2 was higher and therefore female-informative families could be scored individually for the establishment of LGs. Two families of type P × CP and one family of type P × PC were analyzed. Per family, about 200 or more female-informative AFLP markers were identified and these were grouped according to common segregation patterns to identify LGs. Family BC_01 was scored with 44 primer combinations, resulting in 247 informative markers and 31 LGs; BC_02 with 48 primer combinations, resulting in 199 informative markers and 31 LGs; and BC_03 with 29 primer combinations, resulting in 203 informative markers and 29 LGs (the two missing LGs are likely marked by a single marker each).

Association of LGs with the trait ‘survival on pea’ in C2

Backcross progeny that survived on pea were expected to inherit alleles promoting the survivorship on pea from the P-strain to a greater degree than backcross progeny that survived on kale. For each LG, a P-value was calculated using a one-tailed Fisher's exact test, to compare P-strain- vs C-strain-derived homologues among pea and kale survivors. In this approach, the frequency of P-alleles among pea survivors vs the frequency of P-alleles among kale survivors was compared; this measures the relative contribution of the P-alleles towards ‘survival on pea’. Over-representation of a P-derived LG among pea survivors thus yields a positive association with the trait ‘survival on pea’ indicated by a significant P-value. In this case, a factor (gene) conferring adaptation to pea found in the P-strain is likely to be located on this chromosome, enabling larvae to feed and survive on the new host plant pea. Under-representation among pea survivors of the P-strain homolog with concomitant over-representation of the LG inherited from the cabbage-adapted C-strain produces a negative association with the trait ‘survival on pea’ and nonsignificant P-values. BC_01 had five LGs and BC_03 had two with P<0.05 (Figure 3). To test the significance of these LGs overall, we first needed to identify homologous LGs in all three backcross families.

Figure 3

Association between chromosomes of the three analyzed backcross families BC_01, BC_02 and BC_03 in cross C2 and the trait ‘survival on pea’. The P-values were calculated using Fisher's exact test. Significant P-values indicate over-representation of chromosomes inherited from the P-strain grandparent via the F1 female in pea survivors relative to kale survivors. P-values corresponding to chromosomes based on pooling homologous linkage groups over families are shown below, with the families that were combined indicated with an X. The Bonferroni-corrected probability value for six independent tests is shown as P=0.0083.

Assignment of LGs—common AFLP fragments

AFLP fragments belonging to the four LGs in BC_02 with the lowest P-values were sequenced (GenBank accession nos. GU594729–GU594732). AFLP fragment M-CGA_E-AAG_483 (GenBank accession no. GU594729) showed similarity to a DOPA-decarboxylase-like gene (Drosophila melanogaster CG10501; E=1e–11). The same fragment (same primer combination at same size) was also found in one of the LGs of BC_01. As individuals of both families showed an AFLP band at the same size for the same primer combination, these LGs were considered to be homologous and named Chromosome 1. In BC_03 this AFLP fragment was not present; however, specific primers designed from the AFLP sequence successfully amplified the corresponding gene fragment from this family. Sequencing this PCR product revealed that at one nucleotide position, the F1 mother and some backcross progeny were heterozygous, whereas the P-strain father and other offspring were homozygous. The pattern of hetero- and homozygous individuals was the same as the pattern of one of the LGs in BC_03, which was therefore considered to be homologous to Chromosome 1 of BC_01 and BC_02. On the basis of sequence similarity of the AFLP to the Bombyx mori predicted protein BGIBMGA002958 (Xia et al., 2008), this would correspond to Chromosome 4 of B. mori.

Assignment of LGs—bulked segregant analysis

To find additional homologies among LGs between families BC_01, BC_02 and BC_03, we used a modified form of bulked segregant analysis (Michelmore et al., 1991), using AFLP segregation patterns of selected LGs within families to define the bulks. We created two different bulks for each such LG (Figure 4): one bulk with individuals showing the band (plus) and another bulk from individuals in which the band was absent (minus). If the bulks of two or three families showed the same presence–absence pattern, for a given primer combination and band size, these two LGs were considered homologous, thus representing the same chromosome. With this approach we were able to assign the following LGs to the same chromosome: LG2 of BC_02 and LG2 of BC_01 (Chromosome 2), LG3 of BC_02, LG3 of BC_01 and LG6 of BC_03 (Chromosome 3), LG4 of BC_02 and LG4 of BC_01 (Chromosome 4) and LG6 of BC_01 and LG7 of BC_03 (Chromosome 6) (Figure 4 and Supplementary Table 2). LG5 of BC_01 represented an additional chromosome that we were unable to homologize to LGs in the other two BC families (Chromosome 5; Figure 4).

Figure 4

Bulked segregant analysis of AFLPs to identify homologous linkage groups. Each LG in each family is represented by two bulks: one with individuals that carry the AFLP band and the other where the band is absent. Linkage groups within families are listed across the top of the figure. AFLP markers with concordant patterns in two of the bulks are listed down the side, grouped according to chromosome. Primer combinations and band sizes (in bp) are given along with a slice of the gel image showing the AFLP pattern. Circles placed immediately below the relevant band denote where a specific presence/absence pattern (filled circle vs open circle) occurs, indicating the correspondence between the same linkage group in two different families. Only LG5 in BC_01 fails to match with any of the other patterns shown.

Tests on data pooled over families

We applied Fisher's exact test to the five chromosomes for which homologous LGs could be identified, to assess overall significance with the larger sample sizes obtained by pooling the families (Figure 3). For Chromosomes 2–4 and 6, pooling the data confirmed the trends shown by individual families, that is, the P-derived homologues were over-represented in pea survivors relative to kale survivors, and the overall significance increased as data were combined from separate families. For Chromosome 1, pooling BC_01 and BC_02 also greatly increased the significance, but adding BC_03 reduced significance because BC_03 showed no effect of this LG (Figure 3). Chromosomes 1, 2, 3 and 4 were significant at the level of P=0.0083 (Bonferroni correction of P=0.05 for six independent tests).
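The pooling procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the two families and all counts are hypothetical, and the two-sided P-value is computed directly from the hypergeometric distribution using only the Python standard library.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts per family: (pea survivors, kale survivors), each given as
# (carrying the P-derived homologue, carrying the C-derived homologue).
families = {
    "family_A": ((9, 3), (4, 8)),
    "family_B": ((8, 3), (4, 7)),
}

p_values = {name: fisher_exact_two_sided(*pea, *kale) for name, (pea, kale) in families.items()}

# Pool by summing the corresponding cells over families, then test once more.
pooled_pea = tuple(sum(f[0][i] for f in families.values()) for i in range(2))
pooled_kale = tuple(sum(f[1][i] for f in families.values()) for i in range(2))
pooled_p = fisher_exact_two_sided(*pooled_pea, *pooled_kale)

alpha = 0.05 / 6  # Bonferroni threshold for six tests, about 0.0083
significant = pooled_p <= alpha
```

With these invented counts, both families show the same trend and the pooled P-value is smaller than either single-family value, which is the behaviour described for Chromosomes 2–4 and 6.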

Discussion

This study provided a first approach towards understanding the genetic basis of the recent adaptation to sugar pea in the newly evolved P-strain of the DBM, P. xylostella. The heterogeneity of larval survivorship on pea as well as on kale between and within the different types of crosses eliminates any simple genetic hypothesis, and this new trait appears to have a complex genetic basis. Our data show that the genes significantly affecting the trait ‘survival on pea’ in the P-strain of DBM are all autosomal. Differences among reciprocal crosses indicate the presence of maternal effects, not sex-linkage. Survival on pea also shows characteristics of a recessive rather than a dominant mode of inheritance. Further, the trait is not inherited monogenically but under multi-locus control, with the involvement of two to five different loci, depending on the family, which thus contributes to its complex nature.

Recessive inheritance

The conclusion of recessive inheritance of the trait ‘survival on pea’ is based on crosses and backcrosses between the newly evolved P-strain of DBM and a cabbage-adapted strain and the assessment of the survival rates of F1, F2 and backcross progeny on pea. The occurrence of some survivorship on pea in the F1 generation in the first series of crosses (C1), although low, suggested partially dominant inheritance. However, the low numbers of survivors in the backcross generation, together with the higher survivorship of F2 progeny of C1 on pea, contradicted this assumption and are instead consistent with the hypothesis that one or more recessive genes are responsible for the trait ‘survival on pea’. In addition, the survival rate on pea in the F1 generation was significantly higher among progeny from pea-reared P-strain mothers than among those from C-strain mothers. Thus, the F1 survivorship on pea that we found in the first cross appears to be due to a maternal effect (discussed below) rather than to partial dominance.

In the second backcross (C2), where F1 progeny were hybridized with the P-strain, we found a much higher, but overall still low, survival rate on pea. Given the P-strain's rapid adaptation to pea, we initially assumed dominant inheritance of a recently derived mutation; yet the current picture is more consistent with recessive inheritance of standing variation. Interestingly, Orr and Betancourt (2001) have shown that the traditional principle of Haldane's sieve (Turner, 1981), that is, that recessive alleles have a lower chance of fixation because they predominantly occur in heterozygotes where they are shielded from selection, holds when fixation of an adaptive trait comes from new mutations, but not when it results from standing genetic variation, for which the degree of dominance has only a marginal influence. In host races of Mitoura butterflies, the ability to successfully use cedar as a host in the larval stages was expressed as a recessive trait as well (Forister, 2005). Recessive inheritance of loci controlling adaptation might be common in DBM, as resistance to Bt toxins was also found to be completely or partially recessive in this species (Hama et al., 1992; Tabashnik et al., 1997; Tang et al., 1997). Strong selection was exerted for survival on pea as no other food source was available at that time, a scenario comparable to the development of insecticide resistance.
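The qualitative expectations behind this reasoning can be made explicit with a small idealized sketch. It assumes, purely for illustration, that both strains are fixed for alternative alleles, that the loci are unlinked and fully recessive, and that only individuals homozygous for the P allele at every locus could survive on pea. None of these assumptions is established by our data, and the number of loci used below is arbitrary.

```python
from fractions import Fraction
from itertools import product

def offspring_distribution(parent1, parent2):
    """Single-locus Mendelian segregation; genotypes are two-letter strings such as 'CP'."""
    dist = {}
    for a, b in product(parent1, parent2):
        genotype = "".join(sorted(a + b))
        dist[genotype] = dist.get(genotype, Fraction(0)) + Fraction(1, 4)
    return dist

def fraction_homozygous_p(parent1, parent2, n_loci=1):
    """Expected fraction of offspring that are PP at all of n unlinked loci."""
    return offspring_distribution(parent1, parent2).get("PP", Fraction(0)) ** n_loci

# Idealized crosses (P = pea-adapted allele, C = cabbage-strain allele).
CROSSES = {
    "F1 (P x C)": ("PP", "CC"),
    "F2 (F1 x F1)": ("CP", "CP"),
    "backcross to C-strain": ("CP", "CC"),
    "backcross to P-strain": ("CP", "PP"),
}

expected = {name: fraction_homozygous_p(*parents, n_loci=2) for name, parents in CROSSES.items()}
for name, value in expected.items():
    print(f"{name}: {value}")
```

Under this idealized model the F1 and the backcross to the C-strain produce no fully homozygous progeny, the F2 produces a small fraction and the backcross to the P-strain the largest fraction. This ordering matches the ranking of survival rates discussed above, although the observed values also reflect maternal effects and residual variation within the P-strain.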

Autosomal inheritance

We conclude that the trait ‘survival on pea’ is autosomally inherited because all predictions of sex-linkage were rejected. Differences between reciprocal F1 families in the first crosses (C1) may be caused in principle either by maternal effects or by sex-linked genes. Taking into account the fact that P. xylostella, like most Lepidoptera, has a ZZ/ZW type of sex determination and female Lepidoptera are the heterogametic sex (ZW), effects of loci on the W-chromosome are confounded with extra-chromosomal maternal effects. Under the hypothesis of W-linkage, as only females carry the W-chromosome, any W-linked genes responsible for pea adaptation can only be passed from mother to daughter, and consequently, females would have a higher survivorship on pea, which we did not find. The chance of detecting a W-linked trait in Lepidoptera seems very low as the W-chromosome harbors almost no (known) genes (Traut, 1999). The only published examples are W-linked copies of the period gene in the silkmoth Antheraea pernyi (Gotter et al., 1999).

We also considered the possibility of Z-linkage, again taking into account that the trait ‘survival on pea’ is inherited recessively but still evolved quite rapidly in the field. A rare recessive allele has a lower chance of fixation when it is autosomal, because it is shielded from selection when heterozygous, than when it occurs on the sex chromosome, where it is completely exposed to selection in the hemizygous sex (Haldane, 1922). Furthermore, many Z-linked traits in Lepidoptera are known to account for inter- and intraspecific differentiation (Prowell, 1998; Janz, 2003). However, this is not what we found. Under Z-linkage, all F1 females descending from the PC-cross would carry a pea-adaptation-conferring Z-chromosome (ZPWC) inherited from their P-strain father, but F1 females from the CP-cross (with C-strain father; ZCWP) would lack it. The fact that CP females survived on pea plants and, moreover, were used in single-pair matings for back- and intercrosses in which segregation for the trait occurred, rejects the hypothesis of Z-linkage of the trait ‘survival on pea’.
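The predictions for the two reciprocal crosses follow directly from the ZZ/ZW system and can be written out as a short sketch. The function and variable names are hypothetical; the cross labels follow the genotype notation used above, in which PC daughters are ZPWC (P-strain father, C-strain mother) and CP daughters are ZCWP.

```python
def f1_sex_chromosomes(mother, father):
    """Strain of origin of the sex chromosomes in F1 offspring of a ZW female x ZZ male.

    Daughters (ZW) receive their single Z from the father and the W from the mother;
    sons (ZZ) receive one Z from each parent.
    """
    return {
        "daughter": {"Z": father, "W": mother},
        "son": {"Z_paternal": father, "Z_maternal": mother},
    }

crosses = {
    "PC": {"mother": "C", "father": "P"},
    "CP": {"mother": "P", "father": "C"},
}

predictions = {}
for name, parents in crosses.items():
    daughter = f1_sex_chromosomes(**parents)["daughter"]
    predictions[name] = {
        "P-strain Z": daughter["Z"] == "P",  # required for survival under Z-linkage
        "P-strain W": daughter["W"] == "P",  # required for survival under W-linkage
    }
    print(name, predictions[name])
```

Only PC daughters carry a P-strain Z, so survival of CP daughters on pea is incompatible with a trait determined solely by a Z-linked locus.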

Maternal effect

Besides the genetic inheritance of pea adaptation, maternal effects were evident in C1. F1 progeny descending from a pea-fed P-strain mother had a higher survival rate on pea than did F1 hybrids with a kale-fed C-strain mother. An effect due to the origin of the maternal strain cannot be separated from an effect due to the maternal diet in this cross; one or both may be operative. It would have been interesting to assign the mothers of the P- and C-strain to each other’s host plant, thereby taking the maternal effect into full account. However, this is not possible because the cabbage-adapted strain cannot survive on pea. Instead, we aimed at minimizing possible maternal effects by rearing the P- and the C-strain on kale before the F1 cross, as well as the F1 generation and all P-strain backcross partners in cross C2.

Although their mechanisms are poorly understood, non-genetic maternal effects on offspring phenotype appear to be widespread and are often of profound importance (Kirkpatrick and Lande, 1989; Agrawal, 2001). They may be adaptive for organisms in heterogeneous environments, such as phytophagous insects, whereby mothers produce offspring that are physiologically ‘acclimated’ to the maternal rearing host (possibly by induction of enzymatic activity in the offspring) (Fox et al., 1995; Bernardo, 1996). In quantitative genetic studies, maternal effects have frequently been treated as a troublesome nuisance to be overcome by experimental design, rather than as a target of experimental study (Falconer and Mackay, 1996; Wolf et al., 1998; Andersen et al., 2005). Only recently have the pervasiveness and ecological and evolutionary significance of maternal effects become appreciated (Mousseau and Fox, 1998; Bonduriansky and Day, 2009). Fox et al. (1995) found that maternal rearing host affected offspring in the seed beetle Stator limbatus: mothers reared on Cercidium floridum produced larger offspring that developed faster than offspring of mothers reared on Acacia greggii, due to a non-genetic maternal effect. Although we know that pea adaptation has a genetic basis, the maternal effect might have contributed to the rate of population adaptation. Considering the recessive nature of the trait, the presence of a maternal effect could have accelerated the rate of evolution. Exploring the possible mechanism behind the maternal effect (for example, transmission via the eggs of secondary plant compounds or mRNA that enhance activity or transcription of digestive gut enzymes) would contribute to the newly evolving field of maternal effect studies.

Oligogenic basis

The complex pattern of larval survival rates in the backcrosses and the results of the LG analysis indicate oligogenic inheritance. Thus, inheritance of pea adaptation neither involves a very large number of genes of small effect, as would be seen if loci contributing to pea adaptation were spread over a large number of chromosomes, nor is it controlled by a single gene. At least two genes are involved, because we found that factors contributing to pea adaptation are located on at least two chromosomes. It is very likely that more than a few genes are involved in larval adaptation to a new host plant, because the successful development of a larva on its host plant, which was recorded as survivorship in our study, requires a suite of mechanisms. The larva must be able to recognize, digest and fully develop on the newly acquired host plant (Thomas et al., 1987), and failure to do so can have multiple causes. Some larvae died because they did not initiate feeding, others initiated feeding but died before molting to the next larval stage, and still others passed through three larval instars but were not able to pupate successfully (Henniges-Janssen, pers. obs.). It seems unlikely that a single gene controls such complex patterns, whereas it seems plausible that each of these steps is under complex genetic control. Therefore, adaptation to pea is likely to be oligogenic and our finding of multiple chromosomes associated with pea adaptation is not surprising. There has been a long-standing debate concerning the number of genes involved in adaptation. Historically, it has been argued that most adaptations result from numerous small changes (Fisher, 1930). This view was challenged by more recent laboratory studies (Orr and Coyne, 1992; Orr, 2005) and theoretical models (Orr, 1998), which led to the conclusion that a few major genes account for a large portion of adaptation.
The genetic basis of pea adaptation is probably best explained as a mixture of a few genes with major effects (detected in our linkage analysis) plus the influence of many more loci with minor effects, as reflected in the complex pattern of inheritance. Earlier we made the comparison to the genetic basis and inheritance of insecticide resistance, that is, to adaptive changes in human-disturbed environments. However, insecticide resistance arising in over-sprayed field populations is typically a monogenic trait (Roush and McKenzie, 1987). In contrast, the multiplicity of factors that govern the genetics of host adaptation has been stressed by several other authors (Bernays and Graham, 1988; Lu et al., 2001). Sheck and Gould (1996) showed that different host-associated feeding behaviors in Heliothis virescens and H. subflexa were controlled by multiple and likely different loci.

Trade-offs in adaptation

Our evidence that a particular chromosome harbors genes contributing to successful growth and development on pea is based on the over-representation of alleles from the P-strain in surviving backcross progeny feeding on pea. To control for possible segregation distortion, that is, the preferential transmission of an allele during meiosis (Sandler and Novitski, 1957), it would have been ideal to also score backcross progeny feeding on pea that failed to survive. However, this was not feasible owing to the inability to obtain sufficient DNA from tiny larvae that died at various times throughout the experiment. Therefore, we were confined to comparisons among groups of individuals that had survived at least to the pupal stage, and we reared additional backcross progeny on kale to provide a control group for each family. Significance in the Fisher's exact test comparing these two groups is sensitive to two effects: selection for P-strain alleles on pea and selection against P-strain alleles on kale. With sufficiently large sample sizes, these two effects could be distinguished statistically by comparison to the 1:1 segregation ratio expected in the absence of any selection or segregation distortion; even with pooling over families, however, our sample sizes were too small to detect any but the most extreme deviations. Nevertheless, qualitative comparison of the segregation ratios of pea vs kale survivors (Figure 3) shows that for each of the four chromosomes with the P-strain allele over-represented among pea survivors, it was also under-represented among kale survivors. This provides some evidence of a trade-off in adaptation to different hosts at the gene level, even though there was no correlation among families for performance on pea vs kale.
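The sample-size limitation can be illustrated with an exact binomial test against the 1:1 expectation. The sketch below is a hedged illustration only: the sample sizes and the assumed true transmission frequency of 0.7 are hypothetical and were chosen merely to show how power depends on the number of scored individuals.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def two_sided_p(k, n):
    """Exact two-sided P-value for k of n individuals carrying the P-derived allele,
    tested against the 1:1 segregation ratio (p = 0.5)."""
    deviation = abs(k - n / 2)
    # Under p = 0.5 the distribution is symmetric, so sum both tails.
    total = sum(binom_pmf(i, n, 0.5) for i in range(n + 1) if abs(i - n / 2) >= deviation)
    return min(1.0, total)

def power(n, true_p, alpha=0.05):
    """Probability of rejecting 1:1 segregation when the true frequency is true_p."""
    return sum(binom_pmf(k, n, true_p) for k in range(n + 1) if two_sided_p(k, n) <= alpha)

# Hypothetical sample sizes and a hypothetical true frequency of 0.7.
for n in (20, 50, 100):
    print(n, round(power(n, 0.7), 3))
```

Power rises steeply with sample size in this example, which is why only extreme deviations from 1:1 are detectable in small groups of survivors.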

Heterogeneity in P-strain

In comparing the overall survival rates of the different types of crosses, obvious patterns were a generally higher survival rate on kale than on pea, irrespective of generation or type of cross, and a higher survivorship on pea in the P-strain backcross than in the backcross to C-strain. Furthermore, almost no larval survivorship on pea was detectable in the PC-type F1 offspring from C1 but up to 30% survived in the CP-type. However, no such pattern was obvious when studying the survival rates of the individual families within and between the types of backcrosses. Moreover, we found no significant positive or negative correlation between the survival rate on pea and kale of families from the same direction of cross, that is, a high survival rate on kale was not associated with a low survival rate on pea or vice versa. These findings allow for several conclusions: (1) High survival on pea was not just due to general vigor, whereby some families simply survived better regardless of the host plant. (2) Loci affecting survivorship on kale are not linked to loci affecting survivorship on pea. (3) There is no evidence for a trade-off at the family level, such that adaptation to one host results in a relatively poorer performance on alternative hosts (Agrawal, 2000).

We conclude that the highly variable survivorship on pea among replicate backcross families is evidence that considerable genetic variation still exists in the P-strain, with those families showing higher survivorship on pea segregating for more pea-adapted genes than families with lower survivorship. The set of chromosomes showing significant effects also varied across the three high-surviving backcross families. Low-surviving backcross families, which were not analyzed owing to lower sample sizes, would be predicted to be segregating for even fewer of these chromosomes. Therefore, the P-strain is not yet homozygous for all the genes that contribute to pea adaptation, but is likely in a transition phase towards complete host expansion and homozygosity for the trait ‘survival on pea’. We hypothesize that an individual does not have to be homozygous at all pea-adapted loci, but that being homozygous for a core number of alleles enables successful development on pea.

The potential to adapt to a novel host plant on a short evolutionary timescale, as seen in the P-strain's rapid spread to sugar pea, is increased when adaptation results from standing genetic variation, whereas more time is needed when it must await a new beneficial mutation (Barrett and Schluter, 2008). Adaptation from standing genetic variation arises faster because the advantageous allele is already present in multiple copies and not only as a single mutation. There is evidence that some standing genetic variation for the ability to feed on legumes and on other host plant species exists in DBM populations: DBM has occasionally been found on plants other than Brassicaceae, among them plants from the Fabaceae family (Robinson et al., 2010). Gupta and Thorsteinson (1960) showed that some DBM larvae were able to survive on legumes under laboratory conditions. Moreover, an unrelated C-strain responded to laboratory selection on pea by increasing its survivorship to nearly 50% over six generations (Löhr and Gathu, 2002). This suggests that sufficient pre-existing genetic variation existed in Kenya for the unusually strong and extended selection pressure for survivorship on pea, after destruction of suitable crucifer hosts, to result in the observed sudden host range expansion.

Conclusion and perspective

Our results show that adaptation to sugar pea in DBM P-strain larvae cannot be explained by simple Mendelian inheritance. Instead, it follows an intricate genetic pattern of autosomal, oligogenic inheritance combined with a maternal effect, and can thus be considered a complex trait. Considering the suite of mechanisms necessary for successful larval host plant adaptation (for example, host perception, adequate digestion and detoxification), it is not surprising to find more than one underlying gene responsible. The challenge now is to identify candidate genes underlying the P-strain's mechanisms of adaptation to sugar pea. In our follow-up study on the transcriptional response to pea feeding in DBM larvae (Henniges-Janssen et al., in preparation), we have identified transcripts with roles in gustation and perception (for example, odorant binding proteins), detoxification (for example, cytochrome P450 monooxygenases, glutathione S-transferases), digestion (proteinases) and stress responses. Future mapping efforts will assess whether any of the LGs identified here harbor genes that affect these expression changes, which would provide a genetic mechanism for the host expansion of P. xylostella to pea.