Introduction

Human-mediated plant domestication, the introduction of non-native crop plants and insect invasion of new habitats have greatly shaped the evolutionary history of many phytophagous insect species by providing new niches to explore (Kim and McPheron 1993). Plant domestication, the breeding for desired agronomic traits, provides new niches because it involves tradeoffs between human-desired characteristics and plant defenses (Chen et al. 2015; Mitchell et al. 2016), known as the domestication syndrome (Hammer 1984; Gross and Olsen 2010). Breeding may also have facilitated insect adaptation; for example, the original range of the western corn rootworm, Diabrotica virgifera virgifera, expanded following maize (Zea mays subsp. mays) domestication in Mexico and Central America and, later, with intensive maize cropping in the US from the Rocky Mountains to the Atlantic Ocean (Smith 1966; Kim and Sappington 2005; Gray et al. 2009). The introduction of domesticated plants to novel areas provides new niches, leading to adaptation, host shifting and subsequent host-associated differentiation (HAD; Medina et al. 2012; Antwi et al. 2015) by native insects (Bush 1969; Shirai and Morimoto 1999; Calcagno et al. 2007). The apple maggot, Rhagoletis pomonella, shifted from its native host plant, hawthorn (Crataegus spp. mollis), ≈160 years ago in the Northeastern United States to the introduced and domesticated apple, Malus domestica (Feder et al. 2003), using genetic variation from ancestral populations in Mexico (Michel et al. 2007). These agricultural impacts have certainly affected insect evolution for the past 10,000 years.

The sugarcane borer, Diatraea saccharalis, is a lepidopteran found throughout the Americas, including the southern United States, the Caribbean, Central America, and the warmer portions of South America, including parts of northern Argentina (Dyar and Heinrich 1927; Box 1931, 1951, 1956). The Delta region of the Orinoco River, Venezuela, is assumed to be its center of origin, near dense stands of its original host plants, aquatic and semi-aquatic grasses of the genera Paspalum and Echinochloa (Myers 1932; Uvarov 1964). The sugarcane borer can feed on a broad range of Poaceae members (i.e., grasses) such as sugarcane (Saccharum spp.), maize (Zea mays L.) and rice (Oryza sativa L.), as well as non-crop grasses (Myers 1932; Box 1950a, b; Roe et al. 1981; Smith et al. 1988). On agriculturally important crops, the sugarcane borer causes substantial losses through direct feeding and facilitates infestations by pathogens and secondary pests (Capinera 2001; Cruz 2007; Dinardo-Miranda et al. 2012; Wilson and VanWeelden 2017).

The sugarcane borer’s adaptation to the agricultural crops maize and sugarcane provides an opportunity to understand how ancestral plant domestication and the introduction and production of non-native crop species can influence the evolutionary history and HAD of insect species. One hypothesis suggests that the sugarcane borer adapted to early maize landraces after this crop’s domestication and spread (≈7500 years ago) (Pashley et al. 1990; Silva-Brandao et al. 2015). This adaptation was followed by a second expansion or host shift after European colonizers introduced sugarcane to South America (Pashley et al. 1990) and to the Orinoco River Delta region in the mid-seventeenth century (Myers 1935). Portuguese colonizers replaced large forested areas along the Northeast coast of Brazil with sugarcane plantations. Sugarcane cultivation continued along the Brazilian coastline until the eighteenth century, when coffee displaced the less profitable sugarcane away from the coast and towards the interior. With the agricultural modernization of the early 1950s, sugarcane returned as a major crop, but concentrated in São Paulo and Minas Gerais states instead of the Northeastern coast (McNeill 1986). Maize production was concentrated in small farms until 1970. However, the agricultural technological revolution of the 1990s shifted the agricultural landscape to the Midwest (The Economist 2010), where maize reached its highest production in Brazil with the introduction of transgenic varieties in 2008 (James 2015).

Changes in Brazilian agriculture likely impacted the evolutionary genetics of insect pests; however, this impact is not completely understood. We posit that the evolution of the sugarcane borer is associated with the changes in the Brazilian landscape from historical and modern agricultural activity. We generated data from 17 populations using 10 microsatellite loci. Populations were collected from three host plants (maize, sugarcane and sorghum) throughout the range of the sugarcane borer in Brazil. We hypothesized that the conversion of large areas to sugarcane during Portuguese colonization facilitated a host shift to sugarcane, whereas the more recent agricultural expansion of maize and sugarcane production has provided sufficient evolutionary pressure to generate population structure and HAD. The implications of this study are also discussed in terms of the insect resistance management strategies commonly used to delay the evolution of resistance to Bt maize in Brazil.

Materials and methods

Sampling, DNA extraction and microsatellite genotyping

We collected sugarcane borer larvae in seven states corresponding to the most representative areas of maize and sugarcane production in Brazil. In states that produce different crops, larvae were taken from multiple host plants (e.g., maize and sugarcane; Table 1 and Figure S1). We collected individuals from maize fields after the introduction of transgenic varieties; however, we sampled only in non-transgenic fields. At each site, individuals were randomly collected from several individual plants. Collecting one larva from distant plants reduces the likelihood of sampling siblings. All larvae were fed artificial diet during transport to the laboratory and until pupation. Pupae from each population were placed in separate cages until adult emergence, after which the adults were stored in 80% alcohol. With one exception (sgMG), larvae from sugarcane populations were placed on artificial diet for transportation and stored in 80% alcohol for DNA extraction. For consistency, collections from an individual field are considered one population in the subsequent analyses (in some cases we considered the clustering of individuals identified a posteriori after initial analyses, see below).

Table 1 Population ID, sample locality, latitude, longitude, collection date, host, sample size (N) and microsatellite summary intra-population statistics: average number of alleles (nA), mean expected (HE ± sd) and observed (HO ± sd) heterozygosities, and Weir and Cockerham inbreeding coefficient (f)

We extracted genomic DNA from each individual using the Wizard® DNA extraction kit (Promega, Madison, WI, USA) following the provided instructions. DNA from adults was extracted from the thorax, and DNA from larvae was extracted from the whole individual. Before extraction, all larvae were dissected to remove parasitoid pupae.

Microsatellite genotyping was performed following Pavinato et al. (2013) using the loci Dsc1, Dsc2, Dsc3, Dsc5, Dsc7, Dsc9, Dsc10, Dsc11, Dsc19 and Dsc20 (the annealing temperature of Dsc5 was changed to 56 °C). Each microsatellite locus was genotyped separately and the alleles were scored on a 6% polyacrylamide gel with a Li-Cor 4300 S analyzer (Li-Cor Biosciences, Lincoln, NE, USA).

Intra-population summary statistics

Intra-population parameters (average number of alleles (nA), expected (HE) and observed (HO) heterozygosity and inbreeding coefficient (f)) were calculated using GDA software (Weir 1996). The significance of the inbreeding coefficient estimates (f) was assessed using 1000 two-sided permutation tests with FSTAT software (Goudet 1995). Fisher’s exact test for deviations from Hardy–Weinberg (H-W) proportions for each locus in each population was calculated with 1000 Monte Carlo permutations using the R package pegas version 0.8–2 (Paradis 2010). For loci that showed significant deviation from H-W proportions in a majority of populations, we estimated the null allele frequency with a maximum likelihood estimator through an EM algorithm using the software FREENA (Chapuis and Estoup 2007; Chapuis et al. 2008). Linkage disequilibrium among loci was estimated using the Genepop package version 4.0 (Rousset 2008). Nominal significance levels for multiple tests were corrected using the Bonferroni procedure (Šidák 1967).
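As an illustration of what these per-locus summary statistics measure, the following minimal Python sketch computes nA, HO, HE and a simple inbreeding coefficient for one locus in one population. It is not the GDA or FSTAT implementation: the function name `locus_summary` is hypothetical, HE uses a small-sample correction, and f is taken as 1 − HO/HE, which is a simplification of the Weir and Cockerham estimator reported in Table 1.

```python
from collections import Counter


def locus_summary(genotypes):
    """Summarize one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Returns nA, HO, HE and a simple f = 1 - HO/HE (an assumption of this
    sketch, not the Weir and Cockerham estimator).
    """
    n = len(genotypes)
    total = 2 * n  # number of gene copies sampled
    allele_counts = Counter(allele for pair in genotypes for allele in pair)

    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity under H-W proportions, with small-sample correction.
    sum_sq = sum((count / total) ** 2 for count in allele_counts.values())
    h_exp = (total / (total - 1)) * (1 - sum_sq)

    f = 1 - h_obs / h_exp if h_exp > 0 else float("nan")
    return {"nA": len(allele_counts), "HO": h_obs, "HE": h_exp, "f": f}


# Made-up allele sizes for four individuals at one locus.
example = locus_summary([(100, 100), (100, 102), (102, 102), (100, 102)])
```

A positive f, as in this toy example, indicates fewer heterozygotes than expected under H-W proportions, which is also the pattern produced by null alleles or by the Wahlund effect.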

Genetic structure

F-statistic estimates (FST, FIS and FIT) were obtained through the Weir and Cockerham (1984) estimators (θ for FST, f for FIS and F for FIT). An overall estimate was obtained using GDA software (Weir 1996) and the significance of each estimate was assessed by 95% confidence intervals (CIs) obtained with 1000 bootstraps. Pairwise estimates of population differentiation (pairwise FST = pFST) were calculated using FSTAT software (Goudet 1995). Pairwise tests for population differentiation were conducted with 2720 permutations and nominal levels were adjusted using the Bonferroni procedure (Šidák 1967).
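The permutation count can be checked with a short calculation. The sketch below is our reading of the numbers, not something the text states: with 17 populations there are 136 pairwise comparisons, and 2720 permutations is the smallest number for which a permutation p-value of 1/2720 can reach the Bonferroni-adjusted 5% level. The function name `pairwise_correction` is hypothetical, and the Šidák-adjusted level is shown only for comparison.

```python
from math import comb


def pairwise_correction(n_populations, alpha=0.05):
    """Adjusted significance levels for all pairwise population tests."""
    n_pairs = comb(n_populations, 2)
    return {
        "n_pairs": n_pairs,
        # Bonferroni: divide the nominal level by the number of tests.
        "alpha_bonferroni": alpha / n_pairs,
        # Sidak: exact adjustment for independent tests, slightly less conservative.
        "alpha_sidak": 1 - (1 - alpha) ** (1 / n_pairs),
        # Smallest permutation count whose minimum p-value (1/n) reaches the
        # Bonferroni-adjusted level, rounded to the nearest integer.
        "min_permutations": round(n_pairs / alpha),
    }


correction = pairwise_correction(17)
```

For 17 populations this gives 136 pairs and 2720 permutations, matching the number used for the pairwise tests.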

We identified significant biological clusters by running a model-based classification analysis implemented in STRUCTURE (Pritchard et al. 2000). This analysis allowed us not only to identify the most likely number of biological groups that best represented the data, but also to assign individuals to each cluster (Falush et al. 2007; Hubisz et al. 2009). Analyses were performed assuming the admixture model with correlated allele frequencies between groups. We discarded the first 5 × 10^5 iterations of each run to limit bias associated with starting values (i.e., burn-in) and Markov Chain Monte Carlo sampling was set to 10^6 iterations; 20 independent runs were made for each K from K = 1 to K = 22 (K = 22 corresponds to the N sampled populations + 5 additional Ks). We defined the most likely number of groups following Evanno’s method (Evanno et al. 2005), implemented in the Structure Harvester web page (Earl and vonHoldt 2011). Cluster label normalization and averaging of posterior probabilities among independent runs were performed using CLUMPP software (Jakobsson and Rosenberg 2007). We also assessed the relationship among populations through neighbor-joining (NJ) tree reconstruction on Nei’s genetic distance (Nei 1978), calculated using the software MSA version 4.05 (Dieringer and Schlötterer 2003). The significance of nodes was calculated by bootstrapping loci and re-estimating the genetic distance matrix 1000 times. NJ clustering and the consensus tree were obtained using the PHYLIP package (Felsenstein 2005).
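To make the ΔK criterion concrete, the following Python sketch computes it from replicate log-likelihoods, ln P(X|K), in the way we understand Structure Harvester to summarize them: the absolute second-order difference of the mean log-likelihood, divided by the standard deviation across replicate runs at that K. The function name `evanno_delta_k` and the input layout are assumptions made for illustration, and the numbers in the example are invented.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of ln P(X|K) values from replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # Delta K is undefined at the smallest and largest K
        # |L''(K)| = |L(K+1) - 2 L(K) + L(K-1)| on the mean log-likelihoods.
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp_by_k[k])
        if spread > 0:
            delta_k[k] = second_diff / spread
    return delta_k


# Invented log-likelihoods for two replicate runs per K.
runs = {1: [-1000, -1002], 2: [-900, -902], 3: [-890, -894], 4: [-889, -891]}
delta = evanno_delta_k(runs)
best_k = max(delta, key=delta.get)
```

Because ΔK needs both neighbours, it cannot be evaluated at K = 1, which is one reason the range of tested K values extends beyond the number of sampled populations.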

Contemporary gene flow

We assessed the contemporary dispersal rates among populations using the software BAYESASS, v. 3.0.1 (Wilson and Rannala 2003). Fifteen million (1.5 × 10^7) Markov Chain Monte Carlo iterations with a burn-in of 1.5 × 10^6 were used to calculate the posterior mean migration rate between each pair of populations. Markov Chain Monte Carlo convergence was assessed by plotting total log-likelihood scores vs. iterations using the software TRACER v. 1.5 (Rambaut et al. 2004). We obtained a circular plot representing the number of individuals exchanged among populations using R version 3.1.0 and a modified script from Abel and Sander (2014).

Isolation-by-environment or distance

Higher values of pairwise FST (pFST) between populations from different host plants suggest introgression barriers among hosts (e.g., host preference, selection against migrants or hybrids), a pattern known as isolation-by-environment (IBE) (Nosil 2012). The alternative hypothesis is that genetic distance between populations is correlated with geographic distance (isolation-by-distance or IBD). Both IBE and IBD can produce similar patterns when geography is correlated with environmental variables (Meirmans 2012; Orsini et al. 2013; Wang and Bradburd 2014). In contrast, when they operate in opposition, IBD can constrain or accelerate IBE (Nosil et al. 2008, 2009), thus preventing its detection or leading to false positives if IBD is not properly controlled for in statistical tests (Thibert-Plante and Hendry 2010).

We tested the hypothesis of host affiliation using data from only the populations of São Paulo, Minas Gerais and Goiás (dashed ellipse in Figure S1) because, in these locations, we have collections at different distances and from different host plants. We performed a partial Mantel test in the framework of causal modeling (Cushman et al. 2006; Castillo et al. 2014). This framework allowed us to jointly test the main hypothesis of IBE and the alternative hypothesis of IBD. This joint procedure is more robust than conducting a partial Mantel test for each hypothesis independently (Cushman et al. 2013). Causal modeling was conducted with a permutation partial Mantel test with 1000 replicates using the R package PopGenReport (Adamack and Gruber 2014).
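The logic of a partial Mantel test can be sketched in a few lines of Python with NumPy. This is a generic illustration rather than the PopGenReport implementation: the function name `partial_mantel` is hypothetical, the statistic is the correlation between the residuals of two distance matrices after regressing each on a third, and significance comes from permuting the rows and columns of the genetic matrix. The simulated data are made up.

```python
import numpy as np


def _upper(matrix):
    """Return the off-diagonal upper triangle of a square matrix as a vector."""
    rows, cols = np.triu_indices_from(matrix, k=1)
    return matrix[rows, cols]


def _residuals(y, x):
    """Residuals of a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


def partial_mantel(genetic, predictor, control, n_perm=1000, seed=0):
    """One-sided partial Mantel test of genetic ~ predictor, controlling for control."""
    genetic, predictor, control = (
        np.asarray(m, dtype=float) for m in (genetic, predictor, control)
    )
    rng = np.random.default_rng(seed)
    ctrl = _upper(control)
    pred_res = _residuals(_upper(predictor), ctrl)

    def statistic(gen):
        return np.corrcoef(_residuals(_upper(gen), ctrl), pred_res)[0, 1]

    observed = statistic(genetic)
    n = genetic.shape[0]
    hits = 0
    for _ in range(n_perm):
        order = rng.permutation(n)  # relabel populations in the genetic matrix
        if statistic(genetic[np.ix_(order, order)]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Simulated example: differentiation driven by host, not by distance.
rng = np.random.default_rng(1)
xy = rng.uniform(0, 100, size=(8, 2))
geo = np.sqrt(((xy[:, None] - xy[None]) ** 2).sum(-1))
host = np.array([0, 1] * 4)
env = (host[:, None] != host[None]).astype(float)  # 1 = different host plants
noise = rng.normal(0, 0.005, size=(8, 8))
fst = 0.02 + 0.08 * env + (noise + noise.T) / 2
np.fill_diagonal(fst, 0)

r_ibe, p_ibe = partial_mantel(fst, env, geo, n_perm=199)  # IBE given IBD
r_ibd, p_ibd = partial_mantel(fst, geo, env, n_perm=199)  # IBD given IBE
```

In the causal modeling framework, IBE is supported when the first test is significant and the second is not; running both tests on the same matrices is what allows the two hypotheses to be weighed jointly.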

Demographic history

We used two approaches to infer past and/or recent changes in effective population size (Ne). The first is a full-likelihood Bayesian approach (the hierarchical Bayesian analysis of microsatellite variation, MSVAR; Beaumont 1999; Storz and Beaumont 2002) and the second is a moment-based method that uses two summary statistics to test the hypothesis of changes in population size (the BOTTLENECK method; Cornuet and Luikart 1996). The analyses were performed at two population levels: (1) all collected populations were analyzed independently; and (2) individuals were pooled into clusters based on the STRUCTURE analysis.

MSVAR method

For each MSVAR analysis (each population and cluster), we performed three independent runs of 1 × 10^9 steps, varying the starting values and means for priors and hyper-priors (values in Table S1). Each independent run was set to simulate one of three possible scenarios: (a) population expansion; (b) a stable condition with no changes in Ne; and (c) a population decline. To express time (ta) in years (T), we considered that the generation time of D. saccharalis was 0.1 years (≈10 generations per year) (Capinera 2001). Parameters were thinned with an interval of 1 × 10^4 steps, resulting in output files with 1 × 10^5 values. Chain convergence (each independent run is considered a chain) was checked for each analysis with the Gelman and Rubin diagnostic (Gelman and Rubin 1992) implemented in the R package coda (Plummer et al. 2006).
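The convergence check and the unit conversions in this paragraph can be illustrated with a minimal Python sketch. It implements the basic potential scale reduction factor from within- and between-chain variances, not the corrected version that coda reports, and the function names `gelman_rubin` and `generations_to_years` are hypothetical. The simulated chains stand in for real MSVAR output.

```python
import numpy as np


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one parameter.

    chains: array of shape (n_chains, n_samples), one row per independent run.
    Values close to 1 suggest the chains sample the same distribution.
    """
    chains = np.asarray(chains, dtype=float)
    n_samples = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()            # mean within-chain variance
    between = n_samples * chains.mean(axis=1).var(ddof=1)  # between-chain variance
    pooled = (n_samples - 1) / n_samples * within + between / n_samples
    return np.sqrt(pooled / within)


def generations_to_years(t_generations, generation_time_years=0.1):
    """Convert a time in generations to years (0.1 years per generation here)."""
    return t_generations * generation_time_years


# Thinning arithmetic from the text: 10^9 steps kept every 10^4 steps.
retained_values = 10**9 // 10**4  # 10^5 values per run

# Three simulated chains that agree, and three that do not.
rng = np.random.default_rng(0)
agreeing = rng.normal(0.0, 1.0, size=(3, 5000))
disagreeing = agreeing + np.array([[0.0], [3.0], [6.0]])
r_hat_ok = gelman_rubin(agreeing)
r_hat_bad = gelman_rubin(disagreeing)
```

Because the three MSVAR runs start from priors representing expansion, stability and decline, agreement among them is a stricter check than agreement among runs that differ only in their random seed.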

We tested the hypothesis of a more recent (<100 years ago) vs. a historical (>100 years ago) signal of demographic change in the data using the posterior distribution of the time of change, ta. We followed the procedures and calculations presented by Sousa et al. (2010). Our objective with this analysis was to test whether changes in Ne were consistent with (1) changes imposed by modern agricultural practices in the last 60 years (e.g., intensification of pest management and maize cropping); (2) changes ≈60–500 years ago (e.g., the introduction of sugarcane and its spread, and the land use change from sugarcane to coffee production); or (3) ancestral population changes that took place before European colonization, >500 years ago (e.g., maize domestication). We used Bayes factors (BFs) to measure the weight of evidence for alternative time intervals. The BFs were computed for time periods in a sliding window composed of two time intervals: (1) from 1 to 500 years ago, in steps of 5 years, and (2) from 600 to 5000 years ago, in steps of 100 years. The test contrasted two hypotheses: H1, that the population started to decrease or increase within a given window, ti < T < ti + 5 (ti = 5, 10, 15, …) or tj < T < tj + 100 (tj = 600, 700, …), where ti and tj index the windows of the two time intervals; and H2, that the change started in any other period. A BF of 1.0 indicates that the two hypotheses are equally probable, whereas higher values (>1.0) provide stronger support for H1 than for H2.
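One common way to obtain a Bayes factor for a time window from MCMC output is to divide the posterior odds that T falls in the window by the corresponding prior odds. The Python sketch below follows that idea; whether it matches the exact calculations of Sousa et al. (2010) is an assumption, as are the function names, the window endpoints (495 and 4900 as the last starting points) and the simulated prior and posterior samples.

```python
import numpy as np


def window_bayes_factor(posterior_t, prior_t, lower, upper):
    """BF for H1: lower < T < upper versus H2: T in any other period.

    posterior_t, prior_t: samples of T (years) from the posterior and the prior.
    Returns posterior odds divided by prior odds, or NaN when undefined.
    """
    posterior_t = np.asarray(posterior_t, dtype=float)
    prior_t = np.asarray(prior_t, dtype=float)
    post_in = np.mean((posterior_t > lower) & (posterior_t < upper))
    prior_in = np.mean((prior_t > lower) & (prior_t < upper))
    if prior_in <= 0 or prior_in >= 1 or post_in >= 1:
        return float("nan")  # odds are undefined at these boundaries
    return (post_in / (1 - post_in)) / (prior_in / (1 - prior_in))


def sliding_windows():
    """Windows of 5 years up to 500 years ago, then 100 years up to 5000."""
    recent = [(t, t + 5) for t in range(5, 500, 5)]
    older = [(t, t + 100) for t in range(600, 5000, 100)]
    return recent + older


# Simulated samples: a flat prior and a posterior concentrated near 100 years.
prior = np.linspace(1, 5000, 100001)
posterior = np.linspace(90, 110, 1001)
bayes_factors = {
    window: window_bayes_factor(posterior, prior, *window)
    for window in sliding_windows()
}
```

Plotting the BF against the window start gives a profile whose peaks mark the periods best supported as the onset of the demographic change.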

We also estimated the magnitude of demographic changes by calculating the effect size Hedges’ d (Hedges and Olkin 1985) and its 95% CI for each independent run (chain) of MSVAR (Paz-Vinas et al. 2013). A mean effect size (MES) whose 95% CI contains zero indicates that the population did not show significant demographic changes; significant negative values correspond to a strong bottleneck; and significant positive values indicate a significant population expansion. Comparisons of pairs of MES values (e.g., between pairs of populations) rely on the overlap of their 95% CIs.
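For readers unfamiliar with Hedges’ d, the Python sketch below shows the standard formula: a standardized mean difference with a small-sample correction and an approximate 95% CI. How Paz-Vinas et al. (2013) apply it to MSVAR chains is not reproduced here; comparing log10(N0) with log10(N1) samples, treating them as independent, and the function name `hedges_d` are assumptions of this illustration, and the input values are invented.

```python
import numpy as np


def hedges_d(current, ancestral):
    """Hedges' d for current vs. ancestral size samples, with a 95% CI.

    Negative values mean the current size is smaller than the ancestral size.
    """
    x = np.asarray(current, dtype=float)
    y = np.asarray(ancestral, dtype=float)
    n1, n2 = len(x), len(y)
    pooled_sd = np.sqrt(
        ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)
    )
    correction = 1 - 3 / (4 * (n1 + n2 - 2) - 1)  # small-sample correction J
    d = (x.mean() - y.mean()) / pooled_sd * correction
    # Large-sample approximation of the sampling variance of d.
    variance = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    half_width = 1.96 * np.sqrt(variance)
    return d, (d - half_width, d + half_width)


# Invented log10 values: current size near 200, ancestral size near 100,000.
d, ci = hedges_d([2.0, 2.2, 2.4, 2.6], [4.8, 5.0, 5.2, 5.4])
```

A CI that lies entirely below zero, as in this toy example, corresponds to the bottleneck interpretation described above.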

BOTTLENECK method

The EWCL (Ewens, Watterson, Cornuet and Luikart) method implemented in the software BOTTLENECK v 1.2.02 (Cornuet and Luikart 1996) is a moment-based method that uses two summary statistics (the number of alleles, nA, and the expected heterozygosity, HE) to test for departures from the pattern of genetic diversity expected for a demographically stable population (i.e., under mutation-drift equilibrium) (Watterson 1986; Cornuet and Luikart 1996). Three mutation models were used to obtain the distribution of HE conditional on nA and on the sample size for each population: the infinite allele model (IAM), the stepwise mutation model (SMM) and the two-phase model (TPM). Values of HE obtained with the real data set were then compared with values obtained by coalescent simulations. For the TPM, we allowed for 5% multi-step mutation events and 95% SMM. Two statistical tests recommended for the number of loci and sample size were performed: the sign test and the Wilcoxon test, both implemented in BOTTLENECK. These tests calculate the probability that the number of loci with heterozygosity excess or deficiency in the real data set departs from the number expected under mutation-drift equilibrium. Departures from the null hypothesis (no changes) suggest that a population underwent selection, expansion or decline. Past and/or recent demographic events leave a genome-wide signal consistent with a majority of loci departing from the null model. We considered a signal of population size change significant if more than one mutation model had significant p-values.

Results

Intra-population summary statistics

The calculation of intra-population summary statistics allowed us to estimate the genetic diversity of each sugarcane borer population and, for the first time, determine the broad-scale genetic variability of this species in Brazil. All populations showed low genetic variability, as measured by the average number of alleles (Table 1). Two populations showed significant values of the mean inbreeding coefficient, but the majority of populations were in H-W proportions. Fisher’s exact test for H-W proportions, in combination with null allele estimation for each locus/population, allowed us to determine whether all loci were suitable for the subsequent analyses. None of the loci showed deviation in all populations (Table S2), although loci Dsc3, Dsc9 and Dsc19 showed significant departure from H-W proportions in >40% of populations. We retained all loci in the data set because it is difficult to differentiate between the Wahlund effect (e.g., fine-scale population structure) and true null alleles (Chapuis and Estoup 2007). No locus departed significantly from H-W proportions in all populations, and mean null allele frequencies within each population were low (Table S3).

Genetic structure related to geography and host

Clustering methods based on individual assignment identified the number of biological groups that best represented the sampled allele frequencies. Calculation of ΔK following Evanno’s method showed that the most strongly supported number of biological groups was K = 2, followed by K = 3 (Figure S2). Fig. 1 shows the result of the Bayesian classification performed with STRUCTURE, with pie charts representing the average proportion of each k cluster in each sampled population (see the average Q in Table S4). At K = 2, individuals mostly clustered by geographic proximity, reflecting spatial differentiation. At K = 3, one of the previously identified clusters was split into two “sub-clusters” according to host plant (e.g., sugarcane or maize) in close geographic proximity (sugarcane fields in São Paulo and Minas Gerais). The calculation of pairwise genetic distances between populations served as a cross-validation of the number of clusters. An NJ tree based on genetic distance (Fig. 1) supported the STRUCTURE results; two clusters are apparent, with further “sub-clustering” within São Paulo and Minas Gerais states.

Fig. 1 Output from the analysis of genetic relationships among populations. The NJ tree based on the Nei’s genetic distance (1978) matrix (left) shows the relationships among populations; colored branches indicate significant relationships after 1000 bootstrap replicates. Pie charts on the map (right) correspond to the average individual posterior assignment (Q) of each population recovered by STRUCTURE, and colors within pie charts represent the groups corresponding to K = 3. Significant branches of the NJ tree are also represented on the map and corroborate the individual clustering recovered by the assignment analysis. Pie chart radii are proportional to the number of individuals collected at each sample site

The estimates of Wright’s F-statistics using the W&C analysis of variance allowed us to quantify the overall genetic structure among sugarcane borer populations (Table S5). The FST estimate among populations showed a significant and moderate level of genetic differentiation: 0.078 (95% bootstrap CI 0.053–0.102). Following the recommendation of Meirmans (2015), we estimated overall and pairwise θ for the K = 2 and K = 3 clustering possibilities by pooling individuals with cluster membership higher than 60%. The overall estimate decreased when using biological groups, and θ for K = 2 (0.054; 95% CI 0.018–0.092) was slightly lower than θ for K = 3 (0.057; 95% CI 0.031–0.082).

Contemporary gene flow

We predicted that recent individual movement, whether natural or a consequence of unintentional dispersal mediated by agricultural activity, may occur among sugarcane borer populations despite potential selection for host plants in Brazil. The Bayesian method implemented in BAYESASS, in combination with microsatellite markers, provides the power to detect individual immigrants and directly estimate recent migration rates (Wilson and Rannala 2003). The method also provided a framework to confirm the previous clustering, as it can identify migrants that dominate each cluster with larger gene pool contributions. The BAYESASS circular plot (Figure S3) showed that most individuals were resident, as indicated by thick lines that depart from and arrive at the same population, suggesting low gene flow. The movements supported by the data also indicated that most emigrants belonged to the major STRUCTURE cluster (k1); most of the immigrants that arrived at k2 and k3 were from this major cluster. However, there was limited gene flow between k2 and k3, which is also supported by the high pairwise FST estimates between these two clusters.

IBE and causal modeling

To test whether higher values of pairwise FST (pFST) between populations from different host plants indicate the existence of introgression barriers among hosts (e.g., host preference, selection against migrants or hybrids), we used a partial Mantel test in the framework of causal modeling. This framework tested the competing hypotheses that genetic differentiation (pFST) is determined by host affiliation (IBE) or by geographic distance (IBD), taking into account the influence of each on the pFST estimates. Patterns of genetic differentiation determined by host affiliation can be inflated when there is a strong influence of IBD; therefore, this method accounts for IBD when assessing IBE. Our analysis suggested that IBE is the major source of genetic differentiation among the populations highlighted in Figure S1, as higher values of pFST (Table S6) were found between populations collected from different host plants (empirical p-value = 0.027); the IBD pattern among those populations was not significant.

Demographic history

We posited that agricultural activities in the past 200 years, including the shift in crop production across geographies and the expansion into the Brazilian Midwest (Séguy and Bouzinac 1994; The Economist 2010), were the major factors shaping the genetic variability and structure of the sugarcane borer in Brazil. In this section, we present results from two different but complementary approaches to infer changes in population size (Ne): MSVAR and BOTTLENECK.

MSVAR method

The MSVAR approach consistently showed no overlap between the posterior distributions of current Ne (log10(N0)) and ancestral Ne (log10(N1)), indicating significant population size changes. In all populations and clusters, there was evidence for a population decrease, as mean values for N0 were lower than those for N1 (Figures S4A and S4B). The ancestral Ne did not differ among populations, with an average log10(N1) = 5 (100,000 individuals). However, the current Ne varied among populations; whereas most populations had a log10(N0) distribution centered on ≈2.3 (≈200 individuals), six populations showed a more dramatic decrease in population size (Figure S4A). With respect to the clusters, the signal of population change indicated that the ancestral Ne of 100,000 decreased to ≈1000 individuals in k1, ≈300 in k3 and ≈100 in k2 (Figure S4B). The 95% CIs for each estimate of the Hedges’ d effect size indicated that a significant and moderate to strong bottleneck occurred in each population and cluster. However, the magnitude of the bottleneck was similar among populations, as their 95% CIs overlapped (Figure S5).

Using a generation time for the sugarcane borer of ≈10 generations per year, our estimates of the time since the population size change (T) fall between ≈50 and 300 years ago, although the distributions of log10 T contain values from 0 (recent) to 1000 years ago (Figure S6). As indicated by the analysis, populations with the more drastic size contraction had a shorter time since the population size change (<100 years ago, Figure S6). The BF analysis estimated the time since the population decrease, T, by measuring the weight of evidence for alternative time intervals (i.e., more ancestral vs. more recent changes). Our results were consistent with a more recent bottleneck, with the majority of significant peaks for each population around 100 years ago, although some populations (sgSP, mzSP, mzSPj, mzMTple, mzMTsr and mzMTca) showed significant T around 50 years ago (Figure S7A). In the case of the k clusters, the BF analyses indicated that the population size changes occurred within the past 200 years (Figure S7B).

BOTTLENECK method

Most of the populations departed from the expected number of alleles (nA) and expected heterozygosity (HE) under one or two of the mutation models (IAM, SMM and TPM), with the test statistics (sign test and two-tailed Wilcoxon test) indicating deviations from the null model of mutation-drift equilibrium (Table 2). The signal of population size change captured by these tests indicated a recent population expansion rather than a contraction in the majority of the populations, except for population sgMG, which had a signal of a population bottleneck. It is not possible to estimate the time since the population change using the heterozygosity model implemented in BOTTLENECK, although it is more likely to reveal recent demographic changes than historical ones.

Table 2 Summary results for BOTTLENECK method

Discussion

By combining a specific sampling scheme across the sugarcane borer’s distribution with genetic data obtained by microsatellite genotyping, we identified the major factors that contributed to population differentiation in Brazil, which appear to be related to historical (sixteenth–nineteenth centuries) and contemporary (≈60 years ago) agricultural practices. The Bayesian model-based assignment method identified two (K = 2) and three (K = 3) significant genetic clusters that corresponded to (i) a strong signal of spatial genetic structure and (ii) a cryptic signal of host differentiation. Within an area delimited by three states (São Paulo, Goiás and Minas Gerais), we found intermediate to strong genetic differentiation between pairs of populations collected from different host plants but no signal of IBD. Finally, we detected a strong signal of a recent (≈200 years ago) population contraction using coalescent simulations. We inferred that (i) genetic differentiation may have been fostered by the historical and recent agricultural expansion of sugarcane production and (ii) host differentiation was accelerated by the adoption of monocultures, further facilitated by the recent landscape changes imposed by the expansion of maize fields into areas that were traditionally used for sugarcane. Currently, the co-occurrence of different crops with different phenology and nutritional characteristics is leading to divergent selection and increasing differences in niche dimensionality.

The co-existence of two complementary models of population differentiation

The domestication of wild species to crop production and the introduction of non-native crop species greatly impact insect communities and are well known to open new niches (Gray et al. 2009; Bourguet et al. 2014). The sugarcane borer was ancestrally restricted to non-crop grasses, but is now a major pest in Brazil and South America on three crop species: maize, sugarcane and rice.

Maize has been planted in South America, including Brazil, since its domestication ≈8000 years ago; however, widespread maize production occurred only after European colonization, 500 years ago (Bonavia 2013), and maize was still planted in small farms until 1970 (McNeill 1986), when modern agricultural technology led to the expansion of maize production. Sugarcane production, in contrast, dominated the agricultural landscape of the early 1600s after its introduction to Brazil. Sugarcane plants brought from Africa by Portuguese colonizers were first planted along the coast, from Northeast Brazil to Rio de Janeiro, in large areas dedicated to producing sugar for export. From the late 1700s to about 1950, coffee production replaced sugarcane, and production of the latter shifted towards the Brazilian interior (McNeill 1986). In ≈1950, large areas of sugarcane appeared in the states of Minas Gerais and São Paulo and, in 1960, the agricultural landscape shifted to Midwestern Brazil (Goiás and Mato Grosso) (McNeill 1986; Séguy and Bouzinac 1994). This complex agricultural history provides important context for the sugarcane borer’s evolutionary history.

Model-based individual assignment suggested two clustering scenarios: (i) two clusters (K = 2) related to spatial differentiation imposed by recent agricultural systems; and (ii) three clusters (K = 3) indicating incipient host differentiation. For K = 2, the clusters were linked with demographic and colonization events and represented most of the actual crop production areas. These two clusters may represent: (1) the increase of agricultural activity in the Midwest, and/or (2) the expansion of corn production in areas that historically were used for sugarcane production.

This species is considered a short-distance flyer (Guagliumi 1972), and non-specialist phytophagous insects tend to disperse less (Peterson and Denno 1998; Oliver 2006). Other studies found strong IBD and population differentiation when distant populations of sugarcane borer were compared (Pashley et al. 1990; Joyce et al. 2014). Despite IBD, migration does occur among clusters (Figure S3), likely due to unintentional dispersal, local dispersal among fields in a mostly contiguous landscape, or the reallocation and expansion of maize fields that retained polymorphism. In the last 30 years, most maize production was concentrated in Minas Gerais and more recently shifted to Goiás state in the Midwest. As maize production shifted and expanded, Ne grew and genetic diversity was retained, such that migration-drift equilibrium may not have been reached. Further and more detailed analyses with additional populations and markers may be needed to fully distinguish migration from retained polymorphism.

The identification of a third cluster may indicate, however, an incipient signal of divergent selection. For K = 3, two competing hypotheses of HAD are possible: (1) it may represent an ancestral signal of association between sugarcane borers and sugarcane plants that was hidden by more recent demographic and dispersion events; in this model we assume that HAD within sugarcane borer is arising between sugarcane and maize plants; (2) it may represent a recent association to sugarcane plants, as the clustering signal is weak; in this model we can assume reproductive barriers emerged following a recent ecological adaptation to sugarcane. Some behavioral studies suggest there is indeed a reproductive barrier among sugarcane borers collected from different hosts (Fogliata et al. 2016; Joyce et al. 2016). Identifying the most likely hypothesis of ecological divergence would help us understand how agricultural intensification can facilitate insect divergence and potentially speciation by increasing differences in niche dimensionality. Sugarcane borer is an interesting model since, in this system, the domestication of maize could have facilitated the host shift in ancestral stages of its evolutionary history, with a more recent divergence onto introduced crops such as sugarcane and rice accelerating reproductive isolation.

It is difficult to address the impact of non-crop grasses on the population genetic structure of sugarcane borer because it is hard to predict how these plants affect population growth and infestation of crop fields (Capinera 2005). Nonetheless, the population density of sugarcane borer on non-crop grasses is likely to have a small effect on crop infestation (Beuzelin et al. 2011), as larval development is slower on non-crop grasses than on crops (Cruz 2007). This delay may reinforce assortative mating within sugarcane borer populations on crop habitats. However, systematically collecting sugarcane borer populations from nearby non-crop grass fields, and assessing the dispersal and mating of individuals from those grasses, would help identify how these plants influence contemporary genetic structure, affect the delay of insect resistance evolution, and facilitate or constrain HAD.

Population decline and expansion in the Brazilian agroecosystem

Silva-Brandao et al. (2015), based on haplotype variability found in partial mitochondrial DNA genes, suggested that one explanation for the signal of genetic structure in sugarcane borer populations in Brazil is the demographic events that took place in South America. The signal of demographic changes could be explained by two plausible hypotheses. First, due to the ancestral demographic history, one or more selective bottleneck events were followed by population expansions: one after maize domestication ≈7500 years ago and the second linked to the more recent selection and expansion onto sugarcane ≈400 years ago (Pashley et al. 1990). Second, sugarcane borer populations could have been reduced by management practices such as insecticide applications, transgenic maize and/or cultural management. The advantage of using nuclear molecular markers such as microsatellites, however, is that they allow the identification of more recent demographic events, as mutations are more frequent in microsatellites than in mitochondrial DNA, and microsatellites are subject to recombination (Sunnucks 2000). The signal of demographic changes captured by the microsatellite data indicated two strong bottlenecks and a recent population expansion. We did not reject the possibility of one or more events of population expansion that followed: (1) maize domestication and (2) expansion of sugarcane fields during the colonization; however, we assume the latter as a hypothetical event to link the strong signal of population contraction observed ≈200 years ago (Fig. 2). We posit that the bottleneck signal of ≈200 years ago was associated with intense human intervention that dramatically changed the Brazilian landscape at the beginning of the nineteenth century, such as the reduction of sugarcane plantations and removal of potential ancestral plant hosts (McNeill 1986).
Such drastic changes, associated with increases in coffee production and substantial pest management, greatly reduced sugarcane borer effective population size (Ne).

Fig. 2

Our hypothesis of how the observed demographic changes in sugarcane borer populations were linked with historical events of Brazilian agriculture. Blue triangles represent the historical events. Orange triangles represent the estimated time of population contraction obtained by MSVAR. We placed the signal of population expansion from BOTTLENECK (yellow triangle) where it is more likely given the historical event of agricultural intensification. The dashed triangles represent hypothetical expansion (pink) and contraction (orange) given the widespread production of sugarcane during the colonization and the recent planting of Bt maize

By 1970, sugarcane production had expanded into the states of Goiás, Mato Grosso and Mato Grosso do Sul. In this scenario, MSVAR simulations also indicated a more recent population contraction, ≈50 years ago, in São Paulo that may be related to the widespread use of biological control (Parra and Zucchi 2004). We did not exclude another bottleneck imposed by the recent planting of Bt maize, which is effective against sugarcane borer. Most of our collections occurred before its widespread use; however, the population contraction of the Mato Grosso populations may suggest this possibility, as they were collected in 2011, 3 years after Bt maize production began.

How resistance management can benefit from pest evolutionary history

Since 2008, transgenic maize plants that express Bt toxins have been planted in Brazil to control fall armyworm, Spodoptera frugiperda, and other caterpillar pests including the sugarcane borer (Blanco et al. 2016). The high-dose refuge strategy was implemented in order to delay insect resistance evolution (Bernardi et al. 2015). For this strategy to work in the field, random mating among resistant insects from transgenic plants and susceptible insects from non-transgenic plants should occur.
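The dependence of this strategy on random mating can be illustrated with a minimal sketch. It assumes a single recessive resistance locus, equal fecundity, Bt-field survivors that are all homozygous resistant (rr), and refuge insects that are all homozygous susceptible (SS). The function name and the `mixing` parameter are hypothetical and are not part of this study's analyses; the counts in the usage lines are illustrative values, not data.

```python
def rr_offspring_fraction(n_resistant, n_susceptible, mixing):
    """Expected fraction of offspring that are homozygous resistant (rr).

    Illustrative assumptions (not from the study):
      - Bt-field survivors are all rr; refuge insects are all SS.
      - mixing = 1.0: every individual picks a mate at random from the
        pooled population (random mating).
      - mixing = 0.0: every individual mates only within its own group
        (complete assortative mating, e.g. by host or emergence time).
      - All matings produce the same number of offspring.
    """
    if not 0.0 <= mixing <= 1.0:
        raise ValueError("mixing must be between 0 and 1")
    total = n_resistant + n_susceptible
    if total <= 0:
        raise ValueError("population must not be empty")
    p = n_resistant / total  # share of rr adults in the pooled population
    # Only rr x rr matings yield rr offspring. An rr individual gets an rr
    # mate with probability (1 - mixing) + mixing * p.
    return p * ((1.0 - mixing) + mixing * p)


# Hypothetical counts: 10 resistant survivors, 990 susceptible refuge insects.
random_mating = rr_offspring_fraction(10, 990, mixing=1.0)
no_mixing = rr_offspring_fraction(10, 990, mixing=0.0)
print(random_mating, no_mixing)
```

Under these assumptions, full random mating reduces the rr share of the next generation to the square of the resistant share (the Hardy-Weinberg expectation), whereas without mixing the share is unchanged. Because heterozygous offspring are expected to die on high-dose Bt plants, any factor that lowers mixing, such as limited dispersal or host-associated mating, would weaken the refuge effect.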

Our findings reinforce the need to include information about pest evolutionary history during the development of insect resistance management strategies. Our gene flow estimates suggest that the majority of individuals remain in the field and do not disperse long distances, indicating that the arrangement of Bt and refuge areas should respect the dispersal ability of sugarcane borer (Guagliumi 1972). Our estimates of population structure identified a cryptic signal of host adaptation, indicating that the use of alternative crop hosts, such as sugarcane, may decrease random mating. In Goiás, where sugarcane and maize fields are planted almost the entire year, it is unlikely that phenological differences exist between sugarcane borers emerging from maize and those emerging from sugarcane; however, the two plants have distinct growing seasons in Minas Gerais and São Paulo (Figure S8), potentially leading to allochronic divergence and limited gene flow among resistant and susceptible individuals. Additional crossing and fitness experiments between insects from maize and sugarcane would be necessary to fully understand gene flow and its impact on managing Bt resistance.

Conclusions

Genetic evidence suggests that the historical introduction of sugarcane production, along with the more recent expansion of maize and sugarcane fields, greatly impacted the genetic diversity and evolutionary dynamics of the sugarcane borer in Brazil. The combination of microsatellite markers and coalescent models indicated that demographic events (bottleneck followed by an expansion) coincided with known agricultural events in the Brazilian landscape. In addition, we observed signals of incipient divergence among hosts (sugarcane and maize). This evidence highlights the importance of understanding the evolutionary history and dynamics of pest species in order to develop sustainable pest control methods.

Data archiving

Microsatellite data set, R scripts for analyzing genetic data and MSVAR outputs are available from Dryad: https://doi.org/10.5061/dryad.9p7b3.