Introduction

Chagas’ disease (American trypanosomiasis) is produced by infection with Trypanosoma cruzi (Chagas), which is transmitted by hematophagous insects of the subfamily Triatominae (Hemiptera: Reduviidae). The disease is a serious public health problem in Latin America, with about 12 million persons infected and around 90 million living in endemic areas (Schmunis, 1999). The subfamily Triatominae includes more than 130 species of insects, most of which can be either naturally or experimentally infected with T. cruzi. However, only a few of these species are considered important from an epidemiological standpoint. Among them, Triatoma infestans (Klug) is widely distributed in South America between latitudes 10°S and 46°S, where it has been recognized as the main vector of Chagas’ disease because it is one of the species with a high degree of adaptation to the domestic environment and, therefore, one of the most closely associated with humans.

In Argentina, the National Chagas Control Program established in 1962 pursued the elimination of domestic and peridomestic populations of T. infestans by insecticide spraying. From 1962 to 2000 almost all the localities within the endemic area received insecticide treatment, but in most of them entomological surveillance was not maintained after 20 years of the programme's implementation (Segura et al., 1999). As a consequence, resurgence of vector transmission of the disease was observed in the endemic region, where new acute cases have been detected since 2000 (Segura, 2007). The long-term effectiveness of the control campaigns is greatly dependent upon the vector population structure. The genetic analysis of these populations is important in this respect and may provide information on the evolutionary history and migration patterns of this species, which is of central importance for the development of control strategies.

Mitochondrial DNA (mtDNA) genes have been used in a number of population genetic analyses and have been recognized as particularly useful for phylogeographical studies in many species of insects (Roderick, 1996); however, the maternally inherited markers analyzed to date in T. infestans either exhibited low levels of variation (Monteiro et al., 1999; García et al., 2003; Segura et al., 2009) or have not been very useful for phylogeographical inferences of the Chagas’ disease vector in Argentina (Piccinali et al., 2009). Among the new DNA-based methods, microsatellites or simple sequence repeat markers have permitted greater resolution of genetic variation in populations (Bruford and Wayne, 1993; Schlötterer and Pemberton, 1994). In a previous work, 93 microsatellite loci were isolated from partial genomic libraries of T. infestans, 30 were amplified, and 10 of these polymorphic microsatellite loci, for which different allele types could be clearly resolved, were selected for genotyping (García et al., 2004). With these nuclear markers, we evaluated the possible effect of eradication campaigns on the genetic structure of the vector populations, comparing levels of genetic variability in natural populations of T. infestans from areas with different elapsed periods since the last insecticide treatment and from areas that never received treatment. Surprisingly, most T. infestans populations from insecticide-treated localities seemed to have retained a substantial proportion of genetic diversity. As microgeographic genetic structure analyses supported the existence of subdivision in the populations of T. infestans, a population bottleneck would result in independent genetic-drift effects that could randomly preserve different combinations of alleles in each sub-population. These events, followed by rapid population growth, could have preserved high levels of genetic diversity.
Our studies supported the hypothesis of vector population recovery from survivors of the insecticide-treated areas, highlighting the value of population genetic analyses in assessing the effectiveness of Chagas’ disease vector control programmes (Pérez de Rosas et al., 2007, 2008).

In this study, we propose a multilocus approach with 10 neutral microsatellite loci (García et al., 2004), which have proved to be a valuable molecular tool (Richer et al., 2007; Pizarro et al., 2008; Pérez de Rosas et al., 2007, 2008), to infer the phylogeography and migration patterns of different T. infestans populations covering almost the entire species range in Argentina.

Materials and methods

Specimens

A total of 27 populations of Triatoma infestans were sampled in Argentina. Sampling sites are shown in Table 1; 24 of them were reported previously (Pérez de Rosas et al., 2007, 2008) and three (Salvador Mazza (19, Salta Province), and El Nochero and Morajú (17 and 18, Santa Fe Province)) were examined in this study. The specimens obtained in each locality were considered to belong to one population. The insects were captured in each locality from several houses or peridomiciliary sites, except in Santa Rosa (16, La Pampa province), where they were obtained from an attic invaded by pigeons. All sites were treated with insecticide after insect collection. The specimens were processed between 10 and 72 h after collection and their legs were preserved at −20 °C. In all, 10 microsatellite loci were analyzed.

Table 1 Sampling site and sample size of Triatoma infestans from Argentina

DNA extraction and amplification

The DNA was extracted from frozen preserved legs of each bug using a simplified protocol of a phenol-chloroform extraction procedure reported by Werman et al. (1990). The 10 primer pairs used for PCR were previously designed and tested for proper amplification of microsatellite DNA from whole-genomic DNA of T. infestans (García et al., 2004). The forward primer from each primer set was 5′-fluorescent labeled with one of three dyes, 6-FAM, HEX or NED (Applied Biosystems, Foster City, CA, USA). PCR amplifications were carried out in a Hybaid thermal cycler (Omnigene, Hampton Hill, Middlesex, UK) in 25 μl of a solution containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, each dNTP at 200 μM, each primer at 1 μM, genomic DNA (10–50 ng), and 1 U of Amplitaq Gold (Applied Biosystems, Branchburg, NJ, USA). Thermal profiles consisted of an initial denaturation step at 94 °C for 5 min, followed by 30 cycles of 20 s at 94 °C (denaturation), 30 s at 55 °C (annealing), and 40 s at 72 °C (extension), with a final extension step of 15 min at 72 °C. Reaction products were visualized after electrophoresis on a 3% agarose gel (MetaPhor, Rockland, ME, USA). The size of the PCR products was estimated relative to an internal standard (GeneScan-500 ROX) in polyacrylamide gels using an ABI PRISM 377 automated DNA sequencer and the GeneScan 3.1 and Genotyper 2.5 software (all from Perkin-Elmer Applied Biosystems, Foster City, CA, USA).

Analyses

The genotypes were assigned with the same criteria for all individuals comparing previous and new data sets. Linkage disequilibrium was tested between all pairs of loci and for all locations using the program FSTAT version 2.9.3 (Goudet, 2001). Genotypic disequilibrium was not apparent for any pair of loci according to a global test for each of the 45 different pairs of loci across all populations based on 900 permutations; therefore, we proceeded under the assumption of statistical independence between loci.
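The figure of 45 locus pairs follows directly from the 10 loci analyzed. The short Python sketch below enumerates the pairs and also shows that a nominal 5% level divided over 45 tests equals 1/900, which is one plausible reading of why 900 permutations were used; that interpretation, and the placeholder locus names, are assumptions and not statements from the original analysis.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical placeholder names; the real locus names are in García et al. (2004).
loci = [f"locus_{i}" for i in range(1, 11)]

# Each unordered pair of distinct loci is tested once for genotypic disequilibrium.
locus_pairs = list(combinations(loci, 2))
n_pairs = len(locus_pairs)

# Nominal 5% level shared equally among all pairwise tests (exact arithmetic).
nominal_alpha = Fraction(5, 100)
per_test_alpha = nominal_alpha / n_pairs

# Smallest P-value that a permutation test with 900 permutations can resolve.
smallest_resolvable_p = Fraction(1, 900)
```

With exact fractions, `per_test_alpha` and `smallest_resolvable_p` can be compared without floating-point rounding.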

The degree of differentiation between and across all sites was determined using Weir and Cockerham's (1984) estimator (θ) of Wright's FST. Global and pairwise comparisons of θ were calculated by the program FSTAT version 2.9.3 (Goudet, 2001). The significance of all θ values was tested using permutation. Bonferroni corrections (Rice, 1989) for multiple comparisons were applied when necessary. Standard errors of θ were calculated by jackknifing over populations and loci, and a 95% confidence interval was generated by bootstrapping over loci. The significance of the global θ estimate was further evaluated with an exact G-test after 1000 randomizations of alleles among sites (Goudet et al., 1996). Isolation by distance was examined by testing the correlation between θ/(1−θ) and the natural logarithm of pairwise geographical distances between populations (Rousset, 1997) at different geographical scales. The significance of the correlation was examined by a Mantel test with permutations as implemented in FSTAT version 2.9.3 (Goudet, 2001).
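The isolation-by-distance test described above can be illustrated with a minimal Python sketch of a Mantel permutation test on θ/(1−θ) against ln(distance). This is not the FSTAT implementation: the function names, the one-sided P-value, the number of permutations and the four-population toy matrices are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling populations by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(theta, dist_km, n_perm=2000, seed=0):
    """Return (r, one-sided P) for theta/(1-theta) versus ln(distance)."""
    n = len(theta)
    gen = [[theta[i][j] / (1 - theta[i][j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    geo = [[math.log(dist_km[i][j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    identity = list(range(n))
    geo_vec = upper_triangle(geo, identity)
    r_obs = pearson(upper_triangle(gen, identity), geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute population labels of the genetic matrix
        if pearson(upper_triangle(gen, perm), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Made-up pairwise theta values and distances (km) for four hypothetical sites.
theta = [[0.00, 0.02, 0.10, 0.20],
         [0.02, 0.00, 0.08, 0.18],
         [0.10, 0.08, 0.00, 0.05],
         [0.20, 0.18, 0.05, 0.00]]
dist_km = [[0, 50, 400, 900],
           [50, 0, 350, 850],
           [400, 350, 0, 500],
           [900, 850, 500, 0]]

r, p = mantel(theta, dist_km)
```

Permuting rows and columns together, instead of shuffling the individual pairwise values, preserves the dependence among entries that share a population, which is the reason a Mantel test is used in place of an ordinary correlation test.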

As population genetics data are known to be multivariate, because of the multilocus data set, univariate statistics such as FST may not be sufficient to fully describe them. On the basis of this presumption, evidence for genetic structure, revealed by discontinuities among populations, was assessed by a factorial correspondence analysis performed using GENETIX version 4.03 (Belkhir et al., 2000). This technique was used to project populations into multivariate space and weigh and partition the influences of specific alleles on the global pattern of genetic variance. The method uses a similar strategy of a multivariate model adapted to the design of population genetics data to infer patterns of genetic differentiation, which is not completely independent from differentiation parameters such as FST (Guinand, 1996). This model is derived from Multiple Correspondence Analysis to study distribution profiles for a number of discrete variables (loci), which could have different states or modalities (alleles). The aim of the method is to find composite axes, which are a combination of the variables and optimize the differences between the analyzed populations.

A Bayesian approach implemented in the program STRUCTURE version 2 (Pritchard et al., 2000) was used to infer spatial population structure. The program STRUCTURE uses a Markov chain Monte Carlo approach to infer the number of populations (K) in a data set without prior information on the sampling locations. Considering the significant levels of genetic differentiation among T. infestans populations inferred previously (Pérez de Rosas et al., 2007, 2008), we assumed the no-admixture model, which is appropriate for analyzing discrete populations and may be more successful in detecting subtle structure (Pritchard et al., 2000). In addition, uncorrelated allele frequencies within populations were assumed, given that Pritchard et al. (2000) suggested that this model is a convenient approximation for populations that are not too closely related and can produce accurate clustering. To select an appropriate K value for data analysis, a series of eight independent runs was conducted for each value of K between 1 and 10. Each run used a burn-in period of 100 000 iterations followed by a run length of 1 000 000 iterations. Ideally, a comparison of the replicates and the probability of the data (called LnP(D) in STRUCTURE output) for each value of K allows the estimation of the point of inflection and the most likely number of clusters (Evanno et al., 2005).
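The search for a point of inflection in LnP(D) can be illustrated with a minimal Python sketch of the ΔK statistic of Evanno et al. (2005). The sketch assumes that the second-order difference is taken on the mean LnP(D) across replicate runs and divided by the standard deviation at that K, which is one common way to apply the statistic; the LnP(D) values are invented for illustration and are not results of this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both K-1 and K+1 available.

    lnp_by_k maps K to the list of LnP(D) values from replicate runs.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    out = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            sd = stdev(lnp_by_k[k])
            if sd == 0:
                continue  # deltaK is undefined when replicates are identical
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            out[k] = second_diff / sd
    return out


# Invented LnP(D) values: three replicate runs for each K from 1 to 5.
toy_lnp = {
    1: [-5000, -5002, -4998],
    2: [-4600, -4603, -4597],
    3: [-4200, -4201, -4199],
    4: [-4190, -4180, -4200],
    5: [-4185, -4160, -4210],
}

dk = delta_k(toy_lnp)
best_k = max(dk, key=dk.get)
```

In this toy example LnP(D) stops improving sharply after K=3 while the spread among replicates grows, so ΔK peaks there; the same reasoning underlies the use of replicate variance alongside the mean likelihood when choosing K.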

An assignment analysis was performed to identify possible first-generation migrants as implemented by the program GENECLASS2 (Piry et al., 2004). The program uses multilocus genotypes to identify putative immigrants within each population and the most likely source of these immigrants. The marginal probability of a given individual multilocus genotype was compared with the distribution of marginal probabilities of randomly generated multilocus genotypes (10 000 replicates); when the P-value was <0.01, the individual was considered not to belong to the population. The first-generation migrant test was carried out using the Bayesian individual assignment method of Rannala and Mountain (1997) and the Monte Carlo resampling method of Paetkau et al. (2004).
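The core of a Bayesian assignment calculation can be sketched in a few lines of Python. The sketch follows the genotype probability usually attributed to Rannala and Mountain (1997), with a Dirichlet prior of weight 1/k per allele at a locus with k alleles, and simply compares log-likelihoods between two populations. It is not the GENECLASS2 implementation and omits the Monte Carlo resampling step; the locus names, allele sizes and counts are hypothetical.

```python
import math


def genotype_probability(genotype, allele_counts, k):
    """Probability of a diploid genotype given sampled allele counts.

    allele_counts: {allele: count} in the reference population sample.
    k: number of alleles known at this locus (prior weight 1/k per allele).
    """
    a, b = genotype
    n = sum(allele_counts.values())
    wa = allele_counts.get(a, 0) + 1 / k
    wb = allele_counts.get(b, 0) + 1 / k
    denom = (n + 1) * (n + 2)
    if a == b:
        return wa * (wa + 1) / denom
    return 2 * wa * wb / denom


def log_likelihood(individual, population, alleles_per_locus):
    """Sum of per-locus log probabilities, assuming independent loci."""
    return sum(
        math.log(genotype_probability(individual[locus], population[locus],
                                      len(alleles_per_locus[locus])))
        for locus in individual
    )


# Hypothetical data: two loci, allele sizes in base pairs, counts of gene copies.
alleles_per_locus = {"locus_1": [180, 182, 184], "locus_2": [201, 205]}
resident = {"locus_1": {180: 30, 182: 8, 184: 2},
            "locus_2": {201: 35, 205: 5}}
candidate_source = {"locus_1": {180: 3, 182: 10, 184: 27},
                    "locus_2": {201: 6, 205: 34}}
individual = {"locus_1": (184, 184), "locus_2": (205, 205)}

ll_resident = log_likelihood(individual, resident, alleles_per_locus)
ll_source = log_likelihood(individual, candidate_source, alleles_per_locus)
```

An individual whose genotype is far more likely under another population than under the one where it was captured is a candidate first-generation migrant; the resampling step described above is what turns such a likelihood into a P-value.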

Results

A multilocus microsatellite data set of 836 individuals from 27 populations of Triatoma infestans, from across its range of distribution in Argentina, was analyzed (see microsatellite genotype data set in the supplementary Table 1). Evidence of genetic differentiation among the 27 populations of T. infestans was examined using overall and pairwise comparisons of multilocus FST (using the estimator θ); θ values ranged from 0.012 between the populations of Palo Santo and La Esperanza (12 and 14, both from Formosa province) to 0.500 between the populations of Caucete (4, San Juan province) and El Zapallo (15, Corrientes province) (Table 2). The overall θ value of 0.146 is significantly different from zero (P=0.001), as are all pairwise θ values (P<0.05 and P<0.01, FSTAT permutation procedure), except between the populations of Palo Santo (12) and La Esperanza (14). There was a significant association between geographical distance and genetic differentiation among sites, suggesting isolation by distance (Mantel r=0.405, P=0.0005, Figure 1a). However, considering different geographical scales, no significant correlation between genetic and geographical pairwise distance was found among population pairs situated 0–150 km apart (Mantel r=0.163, P=0.51, Figure 1b).

Table 2 Pairwise matrix of genetic differentiation between populations (θ, estimator of FST)
Figure 1

Isolation by distance among sampling sites. Pairwise estimates of θ/(1−θ) are plotted against the corresponding natural logarithm of the geographical distances between sites (Rousset, 1997). (a) All sampling site pairs considered (Mantel r=0.405, P=0.0005). (b) Sampling site pairs at distances ranging from 0–150 km (Mantel r=0.163, P=0.51).

The results of the factorial correspondence analysis are given in Figure 2, which presents the projection of the populations on the plane defined by the first two axes of the analysis. The distribution of the populations along the first axis separated most of the populations belonging to localities from the west of Argentina (provinces of La Rioja (1), San Juan (4), Catamarca (21, 22, 23, 25, 26, 27) and the west of Córdoba province (2, 3)) from those belonging to the provinces of Santa Fe (17, 18), Salta (19), Santiago del Estero (5–8), Formosa (11–14), Chaco (9, 10), Corrientes (15), La Pampa (16) and two localities of Catamarca province (20, 24). The position of populations along the first correspondence axis reflected their geographical position from east to west along a longitudinal transect, as indicated by the significant correlation between the axis values and the longitudinal geographical coordinates of the different localities (r=0.68, P=0.0005). The negative part of the first axis clustered 17 populations distributed in a wide geographical range of Argentina; they were closely situated in the plane, except the populations from El Nochero (17, Santa Fe province) and Salvador Mazza (19, Salta province). On the other hand, the majority of the western populations placed at the positive end of the first axis were distributed more separately in the plane, suggesting a higher degree of genetic differentiation among the populations of this region, with the exception of two populations from Catamarca province, Saujil (23, Tinogasta) and Medanitos (26), which were found superimposed. On the positive side of the first axis, the populations were separated along the second correspondence axis; El Jardín (1, La Rioja province), Caucete (4, San Juan province) and Chancaní and Sauce Arriba (2 and 3, Córdoba province) were distributed over the positive part of the second axis and the populations belonging to Catamarca province were on the negative part of this axis.

Figure 2

Positions of 27 populations of Triatoma infestans in multivariate space defined by factorial correspondence analysis axes 1 and 2. Numerical code for populations corresponds to Table 1.

Using the program STRUCTURE version 2 (Pritchard et al., 2000), seven different genetic clusters could be identified (Figure 3). Groupings obtained with K>7 were not repeatable between different runs, although the estimated probability of the data slightly improved with higher K (see supplementary Figure 1). Moreover, for K>7 the variation of the probability among replicates increased, suggesting instability, and empty groups were frequently obtained, a phenomenon that occurs when K is higher than the number of ‘real’ groups. Thus, the point of inflection was clearly at K=7. As suggested by the Bayesian approach, there are six distinct, almost homogeneous clusters (A, B, C, D, E and F) distributed across specific geographical locations. Although 11 populations correspond to the six different, nearly homogeneous clusters (more than 80% of ancestry shared among individuals), 16 populations were more heterogeneous, presenting a mix of different clusters. Most of the first 11 populations form three of the almost homogeneous clusters (A, B and C) detected in the provinces of San Juan (4), La Rioja (1) and Catamarca (23–27). In Catamarca, the samples belonging to the localities of Medanitos (26), Fiambalá (24) and Saujil (23, Tinogasta) form cluster A; the samples from San José Banda (20), Santa Rosa (21) and Saujil (22, Pomán) share with this cluster around 37, 26 and 61% of the ancestry, respectively. In the same province, the populations belonging to Copacabana (25) and Huillapima (27) form a separate cluster (B). On the other hand, the populations from Caucete (4, San Juan province) and El Jardín (1, La Rioja province) form a third almost homogeneous cluster (C); the two populations analyzed from the west of Córdoba province (Chancaní, 2; Sauce Arriba, 3) share a high level of ancestry (71 and 74%, respectively) with this cluster.
The populations belonging to the other three almost homogeneous clusters are geographically distant: Salvador Mazza (19, Salta province) forms cluster D; Siete Árboles (10, Chaco province) and El Zapallo (15, Corrientes province) form cluster E; and Santa Rosa (16, La Pampa province) forms cluster F. The seventh cluster (G) was found to contribute in different proportions to populations that presented a higher degree of heterogeneity, principally those from localities belonging to the provinces of Formosa (11–14), Chaco (9) and Santiago del Estero (5–8).

Figure 3

Pie chart of combined genetic ancestries of all individuals sampled in each population, as obtained by STRUCTURE (Pritchard et al., 2000). The seven different colors correspond to seven different genetic clusters, which are indicated with A, B, C, D, E, F and G. The capital of Argentina is shown in italics as a point of reference. Numerical code for populations corresponds to Table 1. A full color version of this figure is available at the Heredity journal online.

Allelic richness (RA, number of alleles corrected for sample size) and FST values (a measure of the variance of gene frequencies) were compared between the group of heterogeneous populations and the group of populations that correspond to the almost homogeneous clusters. The results revealed a significantly higher level of allelic richness in the first group (RA=5.04 and RA=3.81, respectively; P=0.01) and of genetic differentiation in the second group (θ=0.084 and θ=0.225, respectively; P=0.001).
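The text states only that allelic richness is the number of alleles corrected for sample size. A common way to make that correction is rarefaction, in which the expected number of distinct alleles is computed for a subsample of fixed size; the Python sketch below illustrates that approach under the assumption that it resembles the correction used here. The function name and the allele counts are hypothetical.

```python
from math import comb


def allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g gene copies.

    allele_counts: list with the number of copies of each allele in the sample.
    g: subsample size (gene copies), typically set by the smallest sample.
    """
    n = sum(allele_counts)
    if not 0 < g <= n:
        raise ValueError("g must be between 1 and the total number of gene copies")
    # For each allele, 1 minus the probability that the subsample misses it entirely.
    return sum(1 - comb(n - c, g) / comb(n, g) for c in allele_counts)


# Hypothetical allele counts at one locus in two samples of different size.
large_sample = [40, 30, 20, 6, 4]   # 100 gene copies, 5 alleles
small_sample = [12, 6, 2]           # 20 gene copies, 3 alleles

g = sum(small_sample)               # rarefy both samples to 20 gene copies
ra_large = allelic_richness(large_sample, g)
ra_small = allelic_richness(small_sample, g)
```

Rarefying both samples to the same size removes the advantage that a larger sample has in detecting rare alleles, so the two richness values become directly comparable.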

Results of the first-generation migrant test are shown in Table 3. The test identifies immigrants within each population and the most likely source of these immigrants. A total of 41 putative immigrants were identified originating from 17 of the localities and migrating to 24 of the localities. The localities within Santiago del Estero and Formosa were the places of origin of the majority of the individuals identified as immigrants in distant localities in Argentina, whereas other localities, for example those from Catamarca province, principally exchanged migrants among themselves, as did the populations from La Rioja and Córdoba provinces.

Table 3 Results of first-generation migrant test for 27 populations of Triatoma infestans using Geneclass 2.0 (Piry et al., 2004)

Discussion

Genetic drift and limited gene flow seem to have generated a substantial degree of genetic differentiation among populations of Triatoma infestans (θ=0.146, P=0.001), as suggested by previous studies (Pérez de Rosas et al., 2007, 2008; Pizarro et al., 2008). This is not surprising, as the limited dispersal described for T. infestans (Schofield, 1988) would tend to increase genetic differentiation through the process of genetic drift. Moreover, significant isolation by distance among sampling sites, with nearby sites apparently exchanging more genes than distant ones, was also confirmed (Figure 1a). However, for population pairs separated by less than 150 km, no significant association between geographical distance and genetic differentiation among sites was detected (Figure 1b). The results for sites at this smaller geographical scale, as well as for some sites at a larger geographical scale (Figure 1a) that deviated from a pattern of isolation by distance, suggest that allele frequencies in each of these sites drift independently of the geographical distances separating them; factors other than geographical distance are probably involved in generating the structure of populations. The majority of T. infestans populations studied belong to areas with different elapsed periods since the last insecticide treatment and only three are from areas that never received treatment (Pérez de Rosas et al., 2007, 2008). It is possible that local differentiation by genetic drift, enhanced by the reduction of population size by insecticide treatment, could create significant structuring independent of distance. In this regard, the high degree of divergence between some geographically close populations may be due to the short time since the last insecticide intervention in at least one of the locations, which would limit the time available for gene flow to diminish differentiation.
Therefore, as a consequence of insecticide treatment, it seems that genetic drift may have enhanced differentiation among populations (Pérez de Rosas et al., 2007).

The first-generation migrant assignment test suggests putative immigrants in 24 of the 27 populations (Table 3). However, it is important to distinguish between current and historical levels of gene flow. In this regard, the estimates of gene flow and migration rates via Wright's FST statistic (a measure of the variance of gene frequencies between populations) represent evolutionary averages, rather than contemporaneous patterns of dispersal (Bossart and Prowell, 1998; Palsboll, 1999), whereas assignment methods have the potential to provide direct estimates of real-time dispersal through the detection of individual immigrants (Paetkau et al., 2004). Considering that historical events can leave imprints on the distribution of genetic variability (Ibrahim et al., 1996) that can last for many generations (Nei et al., 1975), and the important role of genetic drift generating genetic differentiation among populations, it seems that the degree of current dispersion is not enough to produce a homogeneous metapopulation.

Pritchard et al. (2000) highlight the strengths of the Bayesian approach in evolutionary studies of genetic relationships among populations, compared with summary statistics such as FST. Both approaches suggest that populations of T. infestans are highly structured, and this is reinforced by the results of the factorial correspondence analysis. Three different nearly homogeneous clusters (A, B and C) were detected in the west of Argentina, which comprise seven T. infestans populations from the provinces of Catamarca (23–27), San Juan (4) and La Rioja (1), as well as two populations from the west of Córdoba province (2, 3) that share a high level of ancestry with cluster C (Figure 3). These three groups of populations share practically no ancestry with one another, suggesting a lower level of gene flow and consequently a higher population differentiation by drift in this region. In addition, these populations, which belong to localities geographically close to the Andean highlands, share almost no ancestry with the rest of the populations analyzed. Panzera et al. (2004) identified two allopatric chromosomal groups in T. infestans populations, named Andean and non-Andean. These authors detected differences in heterochromatin that have been considered the main cause of the variation in DNA content between the two groups, with 30% more heterochromatin in the Andean insects. On the basis of the observation that Andean samples came from geographical regions generally above 1800 m, whereas non-Andean populations were mainly from localities below 500 m, these authors suggested that heterochromatin and DNA variation reflect adaptive genomic changes that contribute to the ability of T. infestans to survive and reproduce in environments at different altitudes; populations with larger genomes would be better adapted to highlands (Andean regions), whereas populations with smaller genomes would do better in lowlands (non-Andean regions).
Moreover, phylogenetic inferences based on nuclear rDNA (ITS-1 and ITS-2) sequences supported the previously suggested origin of T. infestans in the Bolivian highlands (Schofield, 1988; Giordano et al., 2005; Noireau et al., 2005) and two different dispersal lines, one throughout Andean regions of Bolivia and Perú, and the second in non-Andean lowlands of Chile, Paraguay, Argentina, Uruguay and Brazil (Bargues et al., 2006). Considering these findings, it is probable that the populations from the western area of Argentina mentioned above, most of which are located between 1000 and 1700 m, could have been established from the dispersal line of T. infestans that would have arrived in Argentina through the Andes. Presumably, the presence in this area of populations that share some degree of ancestry with eastern populations (for example, the two populations from Córdoba province) suggests a process of secondary contact between populations from the different dispersal lines.

On the other hand, most of the other populations analyzed probably derived from the dispersal line of T. infestans in non-Andean lowlands associated with human migratory movements. These populations, principally those from localities belonging to the provinces of Formosa (11–14), Chaco (9), Santiago del Estero (5–8) and Santa Fe (17, 18), which presented a lower degree of genetic differentiation, showed higher heterogeneity (a mix of different clusters), sharing different percentages of ancestry (Figure 3). Internal human migratory movements linked to regional economies, possibly associated with passive dispersal, would help to maintain higher genetic exchange between these populations of T. infestans; one example is the migratory movement of people from provinces such as Santiago del Estero driven by the cotton crop in Chaco, Formosa and the north of Santa Fe. Consistent with these observations, the results of the first-generation migrant test suggested that the localities belonging to Santiago del Estero province and those from Formosa province were the principal places of origin of the majority of the individuals identified as immigrants in different localities of Argentina (Table 3).

The almost homogeneous cluster detected in Santa Rosa (16, La Pampa province), unlike the other six, did not contribute >18% to any of the other populations (Figure 3). As the insects were captured in each place from several houses or peridomiciliary sites, except the sample from Santa Rosa, which was obtained from an attic invaded by pigeons, the results obtained in this locality are compatible with a founder event. On the other hand, the populations from three localities of Santiago del Estero province (5, Sabagasta; 6, Vaca Human; 7, Taco Totorayo), El Nochero (17, Santa Fe province) and Santa Rosa (21, Catamarca province) share a high degree of ancestry with the geographically distant population from Salvador Mazza (19, Salta province). An analysis of antennal phenotypes in populations from Argentina, Paraguay and Bolivia carried out by Hernández et al. (2008) showed the existence of two groups: (a) the group of populations analyzed from the Andean valleys of Bolivia, and (b) the group that comprised the populations from the Gran Chaco and the intermediate valleys. In agreement with our results, within the second group a close relationship among samples from Salvador Mazza, Santiago del Estero and Catamarca provinces (the study did not include samples from Santa Fe province) was detected using the antennal phenotype data (Hernández et al., 2008).

In summary, it seems probable that the evolutionary history of T. infestans was marked by two independent events of colonization and secondary contacts. Comparisons of the populations analyzed in this study with populations of the Andean and non-Andean groups determined in previous studies (Panzera et al., 2004; Bargues et al., 2006) should be performed to test the hypothesis of two independent migration events of colonization in Argentina. This study, using microsatellite markers, provides a new approach for evaluating the validity of the different hypotheses concerning the evolutionary history of this species.