Introduction

Thrips comprise a single insect order with around 5000 described species. While most thrips species remain inconspicuous, some show all the features that predispose them to be major pest species, by causing direct feeding damage and by spreading viral diseases to food, fiber and ornamental crops. Most thrips are host-plant specific, but some economically important species are polyphagous. The diversity of ecosystems in which these latter species are encountered raises the question of genetic variation in the use of resources by different populations (reviewed by Futuyma and Peterson, 1985). From an evolutionary point of view, it is interesting to assess the genetic structure of thrips populations relative to their host plants. Moreover, pest management through the use of natural enemies or the application of specific pesticides would certainly benefit from the ability to discriminate between thrips genotypes on different hosts.

Thrips tabaci (Thysanoptera; Terebrantia) is a widespread pest that has attracted special interest as the first identified vector of tomato spotted wilt virus (TSWV), a tospovirus with a wide host range that is capable of causing serious epidemics and crop losses. Interestingly, the effectiveness of T. tabaci as a vector for TSWV and host-plant preference can vary dramatically among populations (eg Zawirska, 1976). Given this ecological diversity, Zawirska (1976) suggested that T. tabaci consists of two biotypes. The ‘tabaci type’ is found on tobacco plants and is associated with the spread of TSWV. In contrast, populations of the ‘communis type’ infest a variety of host-plants (but not tobacco) and are not vectors for TSWV. These considerations, however, have received very little attention, and today T. tabaci is generally assumed to be a single, cosmopolitan and highly polyphagous species. Host-plant transfer experiments, and a survey of studies on the long-standing problem of vector status (Chatzivassiliou, 2002) support Zawirska, and reinforce the idea that T. tabaci is a heterogeneous taxon representing at least two differentiated biotypes or subspecies.

Resolution of the taxonomic status of T. tabaci populations has been hampered by a lack of population genetic studies. The development of molecular genetic techniques during the last two decades, particularly analysis of mitochondrial DNA (mtDNA), has substantially contributed to an understanding of natural genetic diversity and speciation issues (Moritz et al, 1987; Avise, 1994). These approaches are especially helpful in such puzzling groups as thrips, which show a mosaic of diverse ecological traits superimposed onto a conserved morphology. A growing appreciation of cryptic genetic differentiation in insects suggests three alternatives: T. tabaci might be (a) a single polyphagous species, (b) a complex of host races with partial genetic differentiation but ongoing gene flow, or (c) a complex of morphologically cryptic species no longer joined via gene flow.

This study investigates the evolutionary relationships of T. tabaci populations based on sequence variation of the mitochondrial cytochrome oxidase I (COI) gene. The goals are: (1) To examine (for the first time) the genetic population structure of T. tabaci over a broad range. Such comparisons are essential; without them, previous and more restricted results cannot be placed within a more general framework and must remain isolated and anecdotal. (2) To evaluate the congruence of genetic findings with morphology (upon which current taxonomy is based) and with ecological characteristics such as host-plant preference and vectoring efficiency, in order to corroborate previous laboratory findings.

Materials and methods

Data collection

T. tabaci were collected on leek and tobacco plants from 22 sites in Switzerland, Greece and Bulgaria (Figure 1). Specimens were first identified morphologically, and only thrips unambiguously identified as T. tabaci were subjected to genetic analyses. In addition, morphological species status was confirmed by an external specialist (S Nakahara, USDA) for specimens assigned to the three major clades (T, L1 and L2) identified in Figures 2, 3 and 4.

Figure 1

Map showing the sampling sites for T. tabaci: 1, Biel; 2, Murten; 3, Rütihof; 4, Wädenswil; 5, Zürich; 6, Trungen; 7, Florina; 8, Ionia; 9, Arethoussa; 10, Doumbia; 11, Karditsa; 12, Volos; 13, Gorgopi; 14, Pella; 15, Velkio; 16, Xanthi; 17, Bulgaria.

Figure 2

Maximum-likelihood tree based on the TVM+I+G evolutionary model. For clarity, branch length distances, and ML (italics) and MP bootstrap values are given only for major clades. Haplotype designation for T. tabaci (‘Tt’) is as per Table 1, followed by the number of haplotypes detected on tobacco plants ‘T’ or leek ‘L’, respectively. The tree was rooted using the COI sequence of L. migratoria.

Figure 3

The estimated cladogram with the 95% plausible set of haplotype connections for T. tabaci. Numbers within circles are the haplotype designations as per Table 1. Hash marks and numbers in parentheses denote single-base substitutions, and dashed lines indicate connections between divergent clades. Circle size is proportional to sample size and circle color indicates the three major clades also identified in Figures 2 and 4 (ie T=tobacco; L1, L2=leek).

Figure 4

Ultrametric tree based on the NPRS method. Black bars indicate major transitions during the evolution of the Thysanoptera: 1, suborder Tubulifera; 2, suborder Terebrantia; 3, leek as the main host plant; 4, tobacco as the main host plant. The scale bar below the tree shows the time scale resulting from a calibration of the molecular clock (fossil record of Thysanoptera) based on the circled node. The star indicates the node of the phylogenetic split of T. tabaci into tobacco- and leek-associated lineages.

Total genomic DNA was extracted from single thrips using a slightly modified version of the protocol of Kawasaki (1990), and a fragment of the mitochondrial COI gene was amplified by standard PCR using the primers C1-J-1751 and C1-N-2191 (Simon et al, 1994). DNA sequences were generated using the same two primers on an ABI 3100 automated sequencer (Applied Biosystems, Foster City, CA, USA). All DNA was sequenced in both directions and aligned with the multiple sequence editor CLUSTAL X (Thompson et al, 1997).

Data analysis

Genetic variability and pairwise (traditional) FST statistics were calculated using ARLEQUIN version 2.0 (Schneider et al, 2000) to compare among-group divergence with that reported in studies of other species.
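As a rough illustration of what a haplotype-frequency-based fixation index measures, the following Python sketch computes a Nei-style FST, (HT − HS)/HT, for two samples. It is not ARLEQUIN's estimator; the function names and the haplotype counts are hypothetical and serve only to show the principle.

```python
from collections import Counter

def gene_diversity(freqs):
    """Probability that two randomly drawn haplotypes differ (1 - sum p^2)."""
    return 1.0 - sum(p * p for p in freqs.values())

def fst_two_samples(counts_a, counts_b):
    """Nei-style F_ST = (H_T - H_S) / H_T from haplotype counts of two samples."""
    freqs = []
    for counts in (counts_a, counts_b):
        n = sum(counts.values())
        freqs.append({h: c / n for h, c in counts.items()})
    haplotypes = set(freqs[0]) | set(freqs[1])
    # Unweighted mean frequency of each haplotype across the two samples
    pooled = {h: (freqs[0].get(h, 0.0) + freqs[1].get(h, 0.0)) / 2
              for h in haplotypes}
    h_s = (gene_diversity(freqs[0]) + gene_diversity(freqs[1])) / 2
    h_t = gene_diversity(pooled)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

# Hypothetical counts: two samples that share no haplotypes
tobacco = Counter({"hapA": 10})
leek = Counter({"hapB": 7, "hapC": 3})
print(round(fst_two_samples(tobacco, leek), 3))
```

Under this definition the index approaches 1 only when the samples share no haplotypes and each is nearly fixed for its own haplotype, which is why high values are read as evidence of long-standing isolation.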

We applied maximum-likelihood (ML) and maximum parsimony (MP) analyses using the computer program PAUP* 4.0b10 (Swofford, 2002) to assess phylogenetic relationships. MODELTEST 3.06 (Posada and Crandall, 1998) was used to determine the substitution model that best fits the data set for the ML analysis. The hierarchical likelihood-ratio test (LRT) implemented in MODELTEST selected the TVM+I+G model (proportion of invariable sites=0.39; gamma distribution shape=0.49). The MP analysis, with all characters weighted equally, was performed under the heuristic search option (50 replicate searches with random addition of taxa). A bootstrap analysis with 500 pseudoreplicates was performed to assess the statistical support for the resulting trees, using the fast stepwise addition option for MP. Two specimens each of T. palmi and T. angusticeps served as intrageneric comparisons because they are considered closely related to T. tabaci (Brunner et al, 2002). GenBank sequences from Oncothrips sterni (AF386719) and O. rodwayi (AF386693) served as representatives for the Tubulifera, the other suborder of the Thysanoptera. Locusta migratoria (NC_001712) was used as an outgroup.
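The resampling step behind a bootstrap analysis can be sketched in Python as follows. This is only the generic column-resampling idea, not PAUP*'s implementation: each pseudoreplicate would then be passed to a tree search (not shown), and the support for a clade is the fraction of replicate trees that recover it. The toy alignment and function name are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudoreplicate)."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

# Hypothetical toy alignment; real input would be the aligned COI fragment
toy = {"hap1": "ACGTAC", "hap2": "ACGTTC", "hap3": "ATGTTC"}
rng = random.Random(1)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]
print(len(replicates), replicates[0])
```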

Gene genealogies within T. tabaci were also estimated using TCS (Clement et al, 2000), a program that implements the estimation of gene genealogies from DNA sequences as described by Templeton et al (1992). This cladogram estimation method (also known as statistical parsimony) displays the number of base-pair differences between haplotypes and provides the 95% parsimoniously plausible branch connections between haplotypes.
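The raw input to such a network is the number of base-pair differences between each pair of aligned haplotypes. A minimal Python sketch of that counting step is given below; it does not reproduce the 95% connection-limit calculation performed by TCS, and the haplotype names and sequences are hypothetical.

```python
from itertools import combinations

def n_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

# Hypothetical haplotypes (names and sequences are illustrative only)
haplotypes = {"Tt1": "ACGTACGT", "Tt2": "ACGTACGA", "Tt3": "ATGTTCGA"}
for (name_a, seq_a), (name_b, seq_b) in combinations(haplotypes.items(), 2):
    print(name_a, name_b, n_differences(seq_a, seq_b))
```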

An LRT (Huelsenbeck and Crandall, 1997) was performed with and without a molecular clock enforced. To date major cladogenetic events, ultrametric trees were constructed using the nonparametric rate smoothing (NPRS) method (Sanderson, 1997) as implemented in TreeEdit 1.0 (Rambaut and Charleston, 2001). To estimate approximate divergence times between major clades, we applied a molecular clock using the estimated age of extant Thysanoptera (Mayhew, 2002). The age was estimated by first compiling the age of the oldest fossil definitely attributed to the Thysanoptera, from Ross and Jarzembowski (1993). Second, these taxon age estimates were modified by making a further logical step based on phylogenetic relationships. Sister taxa (ie the two suborders Tubulifera and Terebrantia) are, by definition, the same age. Therefore, if the estimated ages of two sister taxa based on their oldest fossils differed, both were assigned the age of the older of the pair. This assumes that any inconsistency in the age of the earliest fossils arises from the incompleteness of the fossil record rather than through paraphyly. Finally, adding Oncothrips as a genetically distant comparison (eg Brunner et al, 2002) is conservative because it may underestimate the age of T. tabaci.
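Once a tree is ultrametric, converting relative node heights into ages is a simple rescaling against the calibrated node. The Python sketch below illustrates this step under stated assumptions: the node names and relative heights are hypothetical, and only the calibration age of 148.9 million years (reported in the Results) comes from this study.

```python
def scale_to_calibration(relative_heights, calibration_node, calibration_age_my):
    """Convert relative node heights of an ultrametric tree into ages (My)."""
    factor = calibration_age_my / relative_heights[calibration_node]
    return {node: height * factor for node, height in relative_heights.items()}

# Hypothetical relative heights (tips = 0, calibrated node = 1.0)
heights = {"Thysanoptera": 1.00, "nodeX": 0.50, "nodeY": 0.25}
ages = scale_to_calibration(heights, "Thysanoptera", 148.9)
print(ages)
```

Because every age is proportional to the single calibration point, any error in the fossil-based date propagates linearly to all other nodes.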

Results

Phylogeny of T. tabaci

COI sequencing yielded a 433 bp long fragment for 107 individuals of T. tabaci. A total of 56 bp (12.9%) were polymorphic, and these defined 17 haplotypes (sequences deposited in GenBank with Accession nos. AY196831–AY196849). All variation was in the form of silent, single base-pair substitutions, except for position 46. Here, guanine (G) is replaced by adenine (A), resulting in an amino-acid replacement from glycine to serine (Table 1).
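The glycine-to-serine replacement can be checked against the invertebrate mitochondrial genetic code (NCBI translation table 5), in which all four AGN codons specify serine. The Python sketch below assumes that the G at position 46 is the first base of a glycine codon; the reading frame is not stated in the text, so this is an assumption, and the helper name is hypothetical.

```python
# Subset of the invertebrate mitochondrial genetic code (NCBI translation
# table 5), restricted to the codons needed here.
CODE_SUBSET = {
    "GGA": "Gly", "GGC": "Gly", "GGG": "Gly", "GGT": "Gly",
    "AGA": "Ser", "AGC": "Ser", "AGG": "Ser", "AGT": "Ser",
}

def substitute(codon, position, base):
    """Return the codon with `base` placed at 0-based `position`."""
    return codon[:position] + base + codon[position + 1:]

# Assumption: the substituted G is the first base of a glycine codon.
for codon in ("GGA", "GGC", "GGG", "GGT"):
    mutant = substitute(codon, 0, "A")
    print(codon, CODE_SUBSET[codon], "->", mutant, CODE_SUBSET[mutant])
```

Under that assumption, a G to A change at the first codon position turns any glycine codon into a serine codon, whichever base occupies the third position.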

Table 1 Variable positions in the 433 bp segment of the COI gene defining 17 different haplotypes and their frequency distribution across 22 sampling sites of Thrips tabaci

The ML tree revealed two major subdivisions that primarily arrayed T. tabaci haplotypes according to host plants: clade ‘T’ was exclusively composed of haplotypes from 32 T. tabaci individuals collected on tobacco from Greece, and the other clades L1+L2 contained haplotypes from 75 individuals collected on leek with the exception of four individuals that were collected on tobacco (Figure 2). A second division was observed within the ‘leek’ clade: clade L1 was exclusively composed of individuals collected in Greece, while clade L2 was composed of all individuals collected in Switzerland (sites 1–6) plus individuals from locations 7 (Greece) and 17 (Bulgaria). The maximum parsimony tree (not shown) revealed, with stronger bootstrap values for the major clades, the same clustering as in the ML tree. The dominant feature of phylogenetic splits into three major clades (T, L1 and L2) is also reflected in the 95% parsimoniously plausible haplotype network (Figure 3).

Rates of evolution

An LRT comparing the ML tree with (−ln L=1997.81) and without (−ln L=1980.58) the molecular clock enforced rejected overall constancy of the rates of evolution in the Thysanoptera (δ=34.46, df=21, P<0.05); the clock-constrained model is nested within the unconstrained one and therefore has the larger −ln L. In the absence of rate constancy, we used the NPRS method to construct an ultrametric tree (Figure 4) based on the ML tree, which was used for further analyses. Calibration of a molecular clock with the estimated date of divergence of extant Thysanoptera (148.9 million years) allowed us to date roughly the main cladogenetic events that occurred in T. tabaci history. The initial divergence into the leek- and tobacco-associated clades (L1, L2 and T) occurred around the Oligocene–Miocene boundary (around 28 million years ago). Subsequent diversification of the leek-associated lineages into L1 and L2 occurred during the Miocene (around 21 million years ago; Figure 4).
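The test statistic can be reproduced from the two −ln L values reported above. The Python sketch below computes δ as twice the difference in log-likelihood and evaluates it against a chi-square distribution with 21 degrees of freedom, using a standard closed-form series for the upper tail; the function name is hypothetical and the code is an illustration, not the procedure used in PAUP*.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (finite series)."""
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series of x^(r-1/2) / (2r-1)!!
    total, term = 0.0, math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total)

neg_lnl = (1980.58, 1997.81)  # the two -ln L values reported in the text
# The clock-constrained model is nested in the unconstrained one, so the
# larger -ln L is taken to belong to the constrained model.
delta = 2 * (max(neg_lnl) - min(neg_lnl))
p_value = chi2_sf(delta, df=21)
print(round(delta, 2), p_value < 0.05)  # prints: 34.46 True
```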

Discussion

This study was designed primarily to test the hypothesis that T. tabaci represents a host-plant associated taxonomic complex. Our analyses clearly indicate that genetic differentiation is significant among (sympatric and allopatric) populations of T. tabaci collected from different host plants. Clustering analysis and haplotype networks strongly suggest three distinct, well-supported major lineages in T. tabaci. The many fixed nucleotide differences support both an ancient origin and a long-term isolation of these lineages (Table 1).

Genetic findings and current taxonomy

Traditionally, taxa are distinguished using morphological characters. However, not all species lend themselves to this approach, most often because of insufficient phenotypic variation. Thrips are notorious for eliciting taxonomic problems due to their minute size, a scarcity of solid morphological characters, the fact that most species are associated with more than one host, and the finding that different species often coexist on the same plant. Crespi et al (1998), for example, examined Australian gall-forming thrips species using sequences of the COI gene. They found that each species apparently represented a pair of sibling species, previously indistinguishable due to ‘long-term morphological stasis’. Thus, the repeated appearance of stereotypic morphologies (and ecotypes) does not necessarily indicate sister relationships, perhaps as a consequence of the unique asymmetrical mouthparts that allow only for a piercing–sucking type of feeding and, hence, strongly limit the possibilities of ecological and morphological diversification in this order (ie ‘phylogenetic constraints’ sensu Douglas and Matthews, 1992; Douglas and Brunner, 2002). In fact, Zawirska (1976) noted that a character on the abdomen of the second larval stage of T. tabaci differentiates the two biotypes (ie ‘tabaci’ vs ‘communis’), whereas adults are morphologically indistinguishable. Unfortunately, we were unable to confirm (or refute) this observation before the submission of the manuscript.

Molecular evidence presented in this study strongly suggests that T. tabaci forms three distinct, well-supported lineages consistent with a tobacco group and two leek groups. Given that tobacco and leek groups are genetically distinct, are they host races (ie lineages with partial reproductive isolation as a consequence of adaptation to different hosts)? Or are they instead cryptic species between which gene flow has ceased? Clearly, the answer to these questions depends on one's definition of species. However, the thrips populations studied here would qualify as (sub)species under the view of genetic distinctiveness in sympatry because the tobacco- and the two leek-associated lineages remain distinct both genetically and ecologically. Under the biological species concept, on the other hand, complete reproductive isolation is required before two groups can be accorded species status. Although estimates of FST alone cannot be used to conclude whether all gene flow between two groups has ceased (Whitlock and McCauley, 1999), it is useful to put them into the context of similar estimates for other groups. For example, estimates for host races of the gall-moth Gnorimoschema gallaesolidaginis (Nason et al, 2002) and the pea aphid Acyrthosiphon pisum (Via, 1999) are 0.16 and 0.21, respectively. In contrast, estimates obtained for sympatric species of Gryllus crickets (Harrison, 1979) and Enchenopa treehoppers (Guttman and Weigt, 1989) are 0.92 and 0.91, respectively. Differentiation between T. tabaci lineages clearly exceeds typical host-race estimates, and falls within the range for sympatric species: FST=0.824 between clades L1 and L2, 0.946 between clades L1 and T and 0.954 between clades L2 and T. Finally, nucleotide sequence divergences between leek and tobacco haplotypes are comparable to those between host-specific, morphologically indistinguishable sibling species of Australian gall-forming thrips (range 8–16%; Crespi et al, 1998), but are substantially lower than those detected among morphologically distinguishable thrips species (range 16–27.5%; Brunner et al, 2002).

Population structure and host-associated differentiation

Whereas geographic isolation and genetic drift contribute to pronounced intraspecific phylogeographic structure, gene flow retards the genetic divergence of populations (Avise et al, 1987). The latter may be massive enough to reverse adaptive differentiation, unless the integrity of populations is maintained by reproductive isolation. In other known phytophagous host-race pairs (eg Eurosta, Abrahamson and Weis, 1997; Rhagoletis, Feder et al, 1998) common suites of factors such as strong host preference and/or mating on the host plant often contribute to the maintenance of host association and assortative mating. If host fidelity is perfect, then reproductive isolation is complete. If some migration between hosts occurs, then reproductive isolation can only evolve if there is premating selection against migrants or if there is some form of postmating selection against hybrid progeny (Liou and Price, 1994).

How reliable is host-plant choice in T. tabaci? Host-plant preference in T. tabaci seems to be so pronounced that the two differentiated lineages have been found in neighboring fields (sampling sites 9b and 9c in Figure 1), but strictly on their respective host-plant type (Figure 2). This strongly suggests that there is an active choice for the right plant. However, genetic haplotype assignment revealed that host preference was not perfect, as four adult females were collected on the ‘wrong’ host plant (Figure 2). One explanation is that these host-mismatched adult thrips accidentally migrated to the reciprocal host. Although thrips are weak flyers, their fringed wings enable them to remain airborne long enough to travel between neighboring fields, and to be blown by the wind over far greater distances as a component of ‘air plankton’ (Lewis, 1997). Few empirical data are available to determine the extent and cause of reproductive isolation between host-associated insect populations. One notable exception is the apple maggot fly, Rhagoletis pomonella. Here, host choice studies (apple and hawthorn, respectively) suggested that about 6% of adult R. pomonella on a given host may have migrated from the other host (Feder et al, 1994). Although this host mismatch is almost twice that found in leek vs tobacco populations of T. tabaci, selection on phenology mediated by different rates of fruit development and rot may be enough to eliminate any gene flow from this migration in R. pomonella (Feder et al, 1997). A second explanation is that barriers to gene flow other than host choice, such as selection against migrants or hybrid sterility, must also exist. Indeed, reciprocal laboratory host-plant experiments (Chatzivassiliou et al, 2002; Chatzivassiliou, 2002) suggest strong host-specific (physiological) adaptations. While T. tabaci populations from both host plants apparently thrived on leek, those originally collected from leek failed to survive on tobacco.

Implications for pest management strategy

Thrips continue to increase as major pests of agricultural and greenhouse crops. Although T. tabaci is damaging in its own right, its potential for harm is far greater because it can serve as a vector of TSWV. Virus diseases can be very difficult to control, and their control usually hinges on the control of the vector. Control of thrips with insecticides, however, is difficult. Eggs within leaf tissue and pupae in the soil/leaf litter are protected from most sprays. Similarly, larvae and adults are difficult to contact with spray applications because they are highly agile and protected within developing buds and flowers. Despite these basic biological facts, many growers still utilize insecticides as their primary method of control. The situation has been exacerbated as heavy pesticide usage has encouraged the rapid development of insecticide resistance in many species. As early as the mid-1950s, at least five insecticides tested were already becoming ineffective against T. tabaci, and new products rarely retain their efficacy for more than 4 years (Richardson and Wene, 1956). Furthermore, stringent quality requirements allow only minimum/zero pesticide residues and little or no damage on marketable produce. To combat these relatively recent problems, much effort has been devoted to developing integrated pest management (IPM) to control pests, including thrips. IPM is a complex approach to pest control with biological control techniques playing a crucial part. Hence, it is clear that an understanding of the biology, ecology, population structure, etc, of the pest (as herein presented for T. tabaci) is a fundamental first step to make sound decisions. For example, the identification of genetically and ecologically distinct T. tabaci lineages could substantially reduce the application of pesticides. While the occurrence of the vector lineage (ie clade ‘T’; the ‘tabaci-type’ sensu Zawirska) might still need the rapid application of specific pesticides to limit the spread of TSWV, this might be reduced in favor of biological control techniques such as predatory mites or bugs in the case of the nonvector lineages ‘L1’ and ‘L2’ (the ‘communis type’). Additional experiments with different plants remain to be conducted, but eventually our knowledge of host-plant preference/avoidance will allow growers to slow the development of T. tabaci (and other thrips) populations and even break the life cycle by careful selection of plants and cultivars.

To conclude, genetic studies of apparently generalist phytophagous insects often reveal complexes of genetically differentiated host races or cryptic species. The molecular results presented in this study, and previous observations and experiments on the ability to transmit TSWV and host preference, provide independent and very strong evidence that T. tabaci represents a complex of at least three taxa. We were able to demonstrate unexpectedly strong genetic differentiation among T. tabaci populations that suggests an ancient origin for the three major phylogenetic lineages. Thus, our findings clearly refute the general belief that T. tabaci is a single cosmopolitan and polyphagous species. On the contrary, by the standards of genetic and ecological differentiation in other species groups, the recognition of host-associated and distinct T. tabaci (sub)species must be considered.