Introduction

The population genetics of parasites has traditionally been a comparatively neglected field (Grant, 1994), and it is only in the last decade that it has received any serious attention (Anderson et al, 1998; Gasser and Chilton, 2001; Hu et al, 2004). This means that some quite basic predictions about the structure of parasite populations remain largely untested. One of these predictions concerns the evolutionary stability of population sizes in parasites of human-associated host species compared to those of naturally occurring host species. It has been suggested that parasites of human-associated organisms are more likely to show rapid recent demographic expansions than are other types of parasites (Blouin et al, 1995; Donnelly et al, 2001; Mes, 2003). Such demographic expansions may include increases in the numbers of individuals in the species and/or increases in the geographic range of the species, for example.

By human-associated species we mean humans themselves, domestic pets, farm animals, agricultural and silvicultural plants, and possibly other species that are now more common in human-disturbed (eg urban and agricultural) environments than elsewhere. The prediction of population expansions is based on the idea that certain aspects of the parasite biology often closely track those of their hosts (Cox, 1993; Anderson et al, 1998; Poulin, 1998), as part of the coevolutionary process, and the hosts in this case have usually undergone population expansions relatively recently as a result of their own association with the rapid demographic expansion of the human population over the past 10 000 years (Cavalli-Sforza et al, 1994). For example, a number of pathogens have apparently undergone recent expansions in concert with that of humans, including the causal agents of plague, malaria, AIDS, tuberculosis and Japanese encephalitis (Conway and Baum, 2002). Furthermore, many of the human-associated host species are also subject to large-scale spatial movements, particularly for farm animals (compared to the usually more restricted dispersal of wild hosts), and this will also contribute to a population expansion in the associated parasite species when they come into contact with new host individuals (Blouin et al, 1995; Anderson et al, 1998; McCoy et al, 2003; Criscione and Blouin, 2004). Note that the hypothesis makes no specific prediction about how frequent population expansions will be among parasites of non-human-associated hosts, but merely predicts that they will be more frequent among parasites of human-associated hosts. There are likely to be many other factors that can potentially create demographic expansions in parasite populations, and the process of testing the hypothesis is likely to provide deeper insights into the population genetics of naturally occurring species as well.

It is important to test this prediction, and to assess its generality, because it has enormous practical as well as theoretical implications. From the theoretical point of view, the inter-relationship between parasites and their hosts is one of the most important aspects of parasite biology, and it thus dominates most scientific studies of parasites; indeed, this distinguishes such studies from those of the hosts themselves, as the latter usually ignore the parasites entirely. From the practical viewpoint, the epidemiology of medically and veterinarily important diseases is clearly affected by the demography of the associated parasites. Furthermore, large-scale efforts are under way to control parasites of most human-associated species, notably livestock, companion animals and agricultural plants, as well as humans, particularly via past development of drugs and the more recent interest in vaccines. The success or failure of these projects will clearly be determined to a large extent by how successfully they deal with any changes occurring in parasite population sizes (eg the current spread of resistance to anthelminthic drugs). However, the possibility of population expansions has rarely been quantitatively tested for parasite species (eg Hughes and Verra, 2001).

Here, we attempt to test directly the generality of this prediction using as many species of parasitic nematodes (phylum Nematoda) as possible, by analysing the data currently available in the literature and the public-access genetic databases.

Materials and methods

The best of the current methods for directly testing population expansions require nucleotide sequence data (although good methods have recently been developed also for microsatellite data; Beaumont, 1999). Unfortunately, this requirement excludes from consideration much of the population-genetics data for nematodes, which have historically been based on allozyme, RAPD or RFLP analyses (with occasional use of microsatellites and AFLP). Also, the test methods assume that there has been no recombination in the recent evolutionary history of the population, and so we chose to restrict the data set to mitochondrial gene sequences, since there is little evidence for recombination in nematode mitochondria (Mes, 2003). This has the added advantages of still including the most variable genes (Blouin, 2000b), making all of the population estimates directly comparable (ie there are no confounding effects of inheritance patterns or ploidy), and making the sequence alignment simple because there is little length variation of these genes. Finally, the test methods require data for at least four individuals from the population and with at least one polymorphic site. This requirement excludes data based on consensus sequences from pooled isolates, cDNA libraries and laboratory-maintained strains. Within these constraints, we performed literature searches as well as searches of nucleotide databases, in order to locate suitable data sets. We have assumed that all of the genetic markers used are neutral or nearly neutral, rather than subject to some form of selection – this is hard to test, but our results using synonymous sites (for the protein-coding genes) are not different from those using all nucleotide sites, for example.
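The two inclusion criteria stated above (at least four individuals and at least one polymorphic site) can be sketched as a simple filter. This is an illustration only, not the procedure used to assemble the data; the function name `is_usable` and the toy alignments are hypothetical, and the sketch assumes pre-aligned sequences of equal length without gaps or ambiguity codes.

```python
def is_usable(sequences):
    """Return True if an aligned sample has >= 4 individuals and >= 1 polymorphic site.

    Assumption: sequences are already aligned, of equal length, and contain
    no gaps or ambiguity codes (these would need separate handling).
    """
    if len(sequences) < 4:
        return False
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # A site is polymorphic if more than one state is seen in its column.
    return any(len(set(column)) > 1 for column in zip(*sequences))


# Hypothetical toy samples, for illustration only.
sample_ok = ["ATTA", "ATTA", "ATGA", "ATTA"]   # 4 individuals, 1 polymorphic site
sample_invariant = ["ATTA"] * 5                # no polymorphic site
sample_small = ["ATTA", "ATGA", "ATTA"]        # only 3 individuals

print(is_usable(sample_ok), is_usable(sample_invariant), is_usable(sample_small))
```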

For each data set, the degree of demographic expansion was first assessed using maximum likelihood estimates of the population growth rate based on coalescent simulations. Calculations were performed using the Fluctuate v1.4 program (Kuhner et al, 1998), which estimates the exponential growth rate (g) of the population within an explicitly genealogical framework. Since the implementation of these calculations is sensitive to the parameters specified for the coalescent model (Beaumont, 1999; Mes, 2003; Criscione and Blouin, 2004), the nucleotide frequencies, transition:transversion ratio and proportion of invariable sites were all estimated quantitatively, as these have very nonstandard values for nematode mitochondria (Blouin, 2000a) – there is a large AT bias, a high transition:transversion ratio and most of the sites are invariable. These three sets of estimates were produced simultaneously via maximum likelihood using the Tree-Puzzle v5.0 program (Strimmer and von Haeseler, 1996) – the exact likelihood function was used with parameters estimated by quartet sampling plus the neighbor-joining tree, based on the HKY nucleotide-substitution model. In Fluctuate, the Tree-Puzzle estimated values were then prespecified, and the Watterson estimate was used as the starting value of theta and zero as the starting value of g, along with a random starting tree. The Metropolis-Hastings sampler used 10 short chains with 1000 steps of increment 20, followed by three long chains with 20 000 steps of increment 20. If this search strategy was insufficient to produce a stable result, then we tried (in order): an alternative random-number seed, doubling the number of short chains and doubling the number of long chains.
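For readers unfamiliar with the Watterson estimate used as the starting value of theta, a minimal sketch of the standard formula (the number of segregating sites divided by the harmonic sum over the sample size) is given below. This is not Fluctuate's code; the function names and the toy alignment are hypothetical, and aligned sequences without gaps are assumed.

```python
def segregating_sites(sequences):
    """Count alignment columns in which more than one state is observed."""
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


def watterson_theta(sequences, per_site=True):
    """Watterson estimate: S / a_n, where a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    a_n = sum(1.0 / i for i in range(1, n))
    theta = segregating_sites(sequences) / a_n
    # Optionally scale by alignment length to give a per-site value.
    return theta / len(sequences[0]) if per_site else theta


# Hypothetical toy alignment (4 individuals, 8 sites, 2 segregating sites).
toy = ["ATTATTTA", "ATTATTTA", "ATGATTTA", "ATTATCTA"]
theta_locus = watterson_theta(toy, per_site=False)   # 2 / (1 + 1/2 + 1/3)
theta_site = watterson_theta(toy)                    # the same value divided by 8
print(theta_locus, theta_site)
```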

As an independent assessment of population expansion for each data set, we used the statistic from the Fs test of selective neutrality (Fu, 1997). This has been shown to be one of the most powerful of the tests based on summary statistics (Ramos-Onsins and Rozas, 2002), being based simply on the haplotype distribution within the population. Calculations were performed using the DnaSP v4.0 program (Rozas et al, 2003), and statistical significance was assessed using 10 000 coalescent simulations based on segregating sites, with all of the parameters set to the observed values.
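As an illustration of how the Fs statistic is built from the haplotype distribution, the sketch below follows the standard definition: S' is the probability, under the Ewens sampling formula with theta set to the mean number of pairwise differences, of observing at least as many haplotypes as were found, and Fs = ln(S'/(1 − S')). It is not the DnaSP implementation and does not reproduce the coalescent-simulation significance test; the function names and toy alignment are hypothetical, and the code is intended only for small samples.

```python
import math
from itertools import combinations


def mean_pairwise_differences(sequences):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs) / len(pairs)


def unsigned_stirling_first(n):
    """Row n of the unsigned Stirling numbers of the first kind, c(n, 0..n)."""
    row = [1]  # c(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            # Recurrence: c(m, k) = c(m-1, k-1) + (m-1) * c(m-1, k)
            new[k] = row[k - 1] + (m - 1) * (row[k] if k < m else 0)
        row = new
    return row


def fu_fs(n, k_obs, theta):
    """Fs = ln(S' / (1 - S')), with S' = P(K >= k_obs | n, theta) under Ewens."""
    if not 2 <= k_obs <= n or theta <= 0:
        raise ValueError("need 2 <= k_obs <= n and theta > 0")
    c = unsigned_stirling_first(n)
    rising = math.prod(theta + i for i in range(n))  # theta(theta+1)...(theta+n-1)
    s_prime = sum(c[k] * theta ** k for k in range(k_obs, n + 1)) / rising
    return math.log(s_prime / (1.0 - s_prime))


# Hypothetical toy alignment: 4 individuals, 3 haplotypes, mean pairwise difference 1.0.
toy = ["ATTATTTA", "ATTATTTA", "ATGATTTA", "ATTATCTA"]
fs = fu_fs(len(toy), len(set(toy)), mean_pairwise_differences(toy))
print(fs)
```

Strongly negative values indicate an excess of haplotypes relative to the neutral expectation, which is the signal interpreted here as population expansion.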

For both sets of analyses, the calculations were performed separately for each gene sequence for each combination of parasite species and host species. The calculations were also performed separately for suitable subsets of the data (ie those with at least four individuals and at least one polymorphic site), such as geographically separated subpopulations. For Fluctuate, the substitution parameters were estimated independently for each host/parasite combination; however, for each subpopulation within a host, the values estimated for the whole population were used.

The relationship between whether or not there has been a recent population expansion in the parasites and whether or not their hosts are human-associated was tested with a log-likelihood 2 × 2 contingency test with Williams correction (Sokal and Rohlf, 1994). Decisions as to whether the particular host populations sampled were human-associated or not were based on the information provided in the original publications – for example, the muskoxen sampled for Teladorsagia boreoarcticus were from natural and historically isolated populations, whereas muskoxen have been deliberately introduced into other locations, and the white-tailed deer sampled for Mazamastrongylus odocoilei were human-associated – and all decisions were made before the data analyses were carried out.
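The contingency test can be sketched as follows, using the standard log-likelihood ratio (G) statistic for a 2 × 2 table with the Williams correction and a chi-square reference distribution with one degree of freedom. The function name is hypothetical, and the counts are those reported in the Results (seven and three human-associated combinations with and without expansion; three and 10 non-human-associated combinations with and without expansion).

```python
import math


def g_test_2x2_williams(a, b, c, d):
    """G-test of independence for the table [[a, b], [c, d]] with Williams correction."""
    table = [[a, b], [c, d]]
    rows = [a + b, c + d]
    cols = [a + c, b + d]
    n = a + b + c + d
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:  # a zero cell contributes nothing to G
                expected = rows[i] * cols[j] / n
                g += 2.0 * observed * math.log(observed / expected)
    # Williams correction factor for a 2 x 2 table.
    q = 1.0 + ((n / rows[0] + n / rows[1] - 1.0)
               * (n / cols[0] + n / cols[1] - 1.0)) / (6.0 * n)
    g_adj = max(g / q, 0.0)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(g_adj / 2.0))
    return g_adj, p_value


g_adj, p_value = g_test_2x2_williams(7, 3, 3, 10)
print(round(g_adj, 2), round(p_value, 3))  # G ≈ 4.90, P ≈ 0.027
```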

Results

Suitable data were located for 18 nematode species, covering 23 host/parasite combinations (Table 1). No usable data were found relating to plant-parasitic nematodes, and the data set is restricted almost entirely to parasites of vertebrates. Some other potentially suitable publications were found but could not be used, either because the sequence data are not available in public databases (Leignel and Humbert, 2001), the available sequence data cannot be attributed directly to the individual nematodes (Hawdon et al, 2001; Leignel et al, 2002; van der Veer et al, 2003) or the sequence data are invariant among the individuals (Perlman et al, 2003).

Table 1 Summary details of the data included in the analyses, indicating the nematode species sampled and its taxonomic classification, the host from which it was collected, the gene sequenced, the number of individuals sampled (n), the estimated rate of population change prior to the time of sampling based on exponential growth (g) with its standard error (SE) and the result of the test of selective neutrality (Fs) with its associated probability (P)

The analyses performed using Fluctuate appear to be more powerful than those using Fs, as all of the population expansions strongly detected using Fs were also detected using Fluctuate, but Fluctuate also detected several weaker expansions (Table 1). This result could also be a by-product of the tendency towards upward bias that can occur in the values of g (Kuhner et al, 1998). Furthermore, Fluctuate also detected some population contractions (indicated as negative g-values in Table 1), which Fs does not directly test. Our classification of the nematode species as having a recent demographic expansion or not is based on the agreement between the two analyses.

For those data sets where several loci were sampled, we combined the results using a weighted average of the values from each locus, weighted by the respective sample sizes. For the g-values, we calculated a standardized statistic for testing purposes, calculated as g divided by its standard error. We thus produced a single value of Fs and standardized g for each host/parasite combination (Figure 1).
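The combination step can be sketched as below: each g-value is first standardized by its standard error, and the per-locus values are then averaged with weights equal to the sample sizes. The per-locus numbers are hypothetical and are not taken from Table 1; the function name is likewise an assumption.

```python
def weighted_mean(values, weights):
    """Average of values, weighted by the corresponding weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical per-locus results for one host/parasite combination.
loci = [
    {"gene": "cox1", "n": 10, "g": 120.0, "se": 40.0, "fs": -2.0},
    {"gene": "nad4", "n": 30, "g": 300.0, "se": 60.0, "fs": -6.0},
]

weights = [locus["n"] for locus in loci]
standardized_g = [locus["g"] / locus["se"] for locus in loci]      # g / SE per locus
combined_g = weighted_mean(standardized_g, weights)                # (10*3 + 30*5) / 40
combined_fs = weighted_mean([locus["fs"] for locus in loci], weights)
print(combined_g, combined_fs)
```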

Figure 1 Relationship between the weighted average of the Fs values for the different loci and the weighted average of the standardized g-values (ie g/SE). Each symbol represents one host–parasite combination, with open symbols for human-associated species and filled symbols for the other species. The dashed lines represent the boundary of the values considered here to represent agreement between the Fs and g analyses regarding demographic expansion (to the lower right) and contraction (to the upper left).

Of the 23 host/parasite combinations, there are seven human-associated parasite species with expanding populations and three without, and there are three non-human-associated parasite species with expanding populations and 10 without (Figure 1). The contingency statistical test for these data yields G=4.90, P=0.027. So, we can reject the null hypothesis that there is no relationship between parasite population expansions and human-associated hosts. However, one of the four expected frequency values is <5 (ie 4.3), and so a cross-check with the more conservative Fisher exact test is appropriate. This yields P=0.040, thus confirming the result. Nevertheless, the retrospective power of the G-test is relatively low, at 0.64, and thus relatively small changes in the composition of the data would have a notable effect on the outcome – 18 host/parasite combinations would be needed in both the human-associated and nonassociated groups in order to raise the power to 0.80 (calculated using the DStplan v4.2 program; Brown et al, 2000).
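The Fisher exact cross-check can be reproduced from the four counts given above. The sketch below assumes the common two-sided convention of summing the hypergeometric probabilities of all tables that are no more probable than the observed one; the convention used in the original calculation is not stated, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(x):
        # Number of tables with x in the top-left cell, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties between equally likely tables.
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / comb(n, col1)


p_fisher = fisher_exact_2x2(7, 3, 3, 10)
print(round(p_fisher, 3))  # 0.04
```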

Discussion

A clear pattern is evident in the data currently available: nematode parasites of human-associated hosts are more likely to show evidence of rapid population expansions in their recent evolutionary history than are nematode parasites of naturally occurring hosts. This conclusion seems to apply to a wide taxonomic diversity of nematodes, and it is based on a consistent methodology applied across all of the taxa, without having to assume a common evolutionary history for the species. However, the available data restrict this conclusion largely to nematodes with vertebrate hosts, and the hypothesis that the same situation applies to nematode parasites of plants, for example, remains to be tested. It also remains to be tested for other candidate parasites, such as those of the phyla Platyhelminthes, Acari and Insecta. Nevertheless, the consistent nature of the pattern suggests that it may be true for these other parasites as well (but see below).

There are several specific patterns within the nematode species that also confirm this general result in their details. First, samples were taken for Oesophagostomum bifurcum from both human and Mona monkey hosts in northern Ghana, and the human sample shows clear evidence of a recent population expansion while the non-human sample does not. While the two sample sizes are not large, this is precisely the sort of detailed result that the theory predicts. Second, for those species with sufficient data, the geographical subpopulations usually show the same pattern as that of the population as a whole. This provides evidence that the observed pattern is a general one for those species, at least at the locations sampled, and not an artefact of the particular sample taken. This confirmation is important because the maximum likelihood coalescent method used by Fluctuate is designed to deal with panmictic (ie undivided) populations, and strong geographic subdivision can produce demographic patterns that are opposite to those of population expansion.

Cautionary notes

In contrast, it should be noted that, for those parasite species where multiple mitochondrial genes have been sampled, the results based on different genes are sometimes not congruent (Table 1). This is unfortunate, because consistent patterns across loci would provide stronger evidence for or against a demographic expansion. In particular, it proved to be impossible to combine the data for multiple genes for any species except one – Fluctuate would not produce estimates of the standard error in any of these cases. This indicates that the loci are producing very different estimates of the rate of change in the population from which they came, which is also obvious in most cases simply by comparing the g-values shown for each gene in Table 1. So, accurate estimates of the magnitude of the demographic expansions cannot be obtained even when there is consistent evidence that an expansion (or contraction) has occurred. Other recent studies of population expansions have not been confronted with this problem, because they have been based on only one locus (eg Lessa et al, 2003; Mes, 2003). If an overall estimate is needed, it might thus be more useful to produce a weighted average of the individual estimates (which is the strategy we adopted here) rather than trying to combine the data first and then produce a single estimate.

There are several other considerations that indicate that the results here should be treated with at least some caution. As with all studies of unobservable historical events for which manipulative experiments cannot be performed, there is a series of untestable assumptions, any one of which might lead to a failure to detect the predicted pattern or to false detection of a nonexistent pattern. For example, the demographic expansion might follow linear growth, logistic growth or instantaneous growth rather than the exponential growth that was tested here, and therefore go undetected (Beaumont, 1999). Alternatively, the genetic markers used might not be neutral or nearly neutral but instead be subject to some form of selection, which can produce patterns indistinguishable from those produced by demographic changes. Furthermore, it is likely that much of the data analysed here do not fit the infinite-sites mutational model on which the Fs analyses (but not the Fluctuate analyses) are based, although Mes (2003) demonstrated considerable robustness to this assumption for the nematode data that he analysed. It is for these reasons that the general congruence of the Fs and Fluctuate analyses is important, as they provide relatively independent evidence for population growth (ie they are based on different model assumptions). Moreover, the strong concordance of the observed patterns across the diverse range of taxa militates against alternative ad hoc explanations.

Furthermore, it is important to note that because the experiment performed here is basically a descriptive one, there is little control over potentially confounding factors concerning those data appearing in the analysis. The most notable confounding factor in this particular instance is the choice of mitochondrial gene(s) used in the various source studies. Most of those studies that produced evidence for demographic expansions in the parasites of human-associated hosts were based on the nad4 gene, while most of the studies that produced no evidence for demographic expansions in the parasites of non-human-associated hosts were based on the cox1 gene (Table 1). It is therefore entirely conceivable that the overall pattern observed here is nothing more than a by-product of the differential usefulness of these two genes for detecting population growth. However, while these two genes are usually considered to have different degrees of evolutionary conservation (Blouin, 2000b), there is currently no evidence that this difference affects the detection of demographic growth. So, population growth remains the simplest (ie most parsimonious) explanation for the departures from neutrality observed here, because it provides a single explanation (predicted a priori) for most of the observed patterns.

Finally, we note that failure to use a suitable nucleotide-substitution model when calculating the maximum likelihood estimates of the population growth rate based on coalescent simulations (ie the Fluctuate analysis) results in useless estimates for these mitochondrial data. In particular, use of the default program settings is inappropriate, because these settings do not yield a realistic model, and therefore the estimated values are very different from those found here – they are much larger, apparently exaggerating the propensity of the method to overestimate the population growth rate. As previously reported (eg for nematodes: Blouin et al, 1998; Blouin, 2000a; Mes, 2003), mitochondrial genes can have a large AT bias, a high transition:transversion ratio and most of the sites can be invariable. Therefore, it is essential to estimate simultaneously appropriate values for these parameters in the model independently of Fluctuate (as we did here), because Fluctuate does not do this itself – Fluctuate will then produce more conservative and more realistic demographic values. This issue can also be problematic when trying to combine the data from different genes, as the parameter estimates are different for different genes and thus any attempt to pool the data results in a poor compromise for the combined estimates. The results of studies that do not report attempts to provide realistic estimates of the model parameters should therefore be interpreted with care, especially if the analyses produce very large values of g.
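As a quick way of seeing whether a data set departs from the default model settings, the three properties mentioned above can be summarized by simple counts before any model fitting. The sketch below gives crude descriptive values only (observed AT content, pairwise transition and transversion counts, and the observed proportion of invariant columns); these are not the maximum likelihood estimates produced by Tree-Puzzle, and the function names and toy alignment are hypothetical.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def at_content(sequences):
    """Proportion of A and T among all unambiguous bases."""
    bases = [b for seq in sequences for b in seq if b in "ACGT"]
    return sum(b in "AT" for b in bases) / len(bases)


def ts_tv_counts(sequences):
    """Count transitions and transversions over all pairwise comparisons."""
    transitions = transversions = 0
    for s, t in combinations(sequences, 2):
        for x, y in zip(s, t):
            if x == y or x not in "ACGT" or y not in "ACGT":
                continue
            # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
            if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
                transitions += 1
            else:
                transversions += 1
    return transitions, transversions


def proportion_invariant(sequences):
    """Observed proportion of alignment columns showing a single state."""
    columns = list(zip(*sequences))
    return sum(len(set(col)) == 1 for col in columns) / len(columns)


# Hypothetical toy alignment, for illustration only.
toy = ["ATTATTTA", "ATTATTTA", "ATGATTTA", "ATTATCTA"]
print(at_content(toy), ts_tv_counts(toy), proportion_invariant(toy))
```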

Other factors affecting population expansions

There is little new to be said about the individual results relating to those parasites with expanding populations and human-associated hosts and those with relatively stable populations and non-human-associated hosts, since this is the pattern predicted by the theory. Instead, it is more relevant to further examine some of those nematode species that do not fit this predicted general pattern, in order to assess the relative effect of other potential influences on parasite demography.

Dictyocaulus viviparus is the lungworm species in domesticated cattle worldwide, and in the analysis performed here all of the farms sampled from Sweden show no evidence at all of recent demographic expansion (Table 1). This result is in accord with other population data for this nematode species in Sweden, as strong population structure has been confirmed using AFLP data (Höglund et al, 2004), with most of the nematode genetic variation being between farms rather than within farms. It is unclear whether this result is a general one for this species in Europe, or indeed elsewhere. However, it is in marked contrast to the situation encountered for other nematode species of domestic ruminant hosts, where most of the variation is observed to be within farms rather than between farms both in North America (Blouin et al, 1995) and Europe (Leignel and Humbert, 2001). Clearly, a more detailed comparison of lungworms with related nematodes is warranted, especially a comparison of their demographic structure on different continents. Furthermore, parasitologists will need to be careful when developing vaccines or anthelminthics, because the strong population structure means that results from one location should not be directly extrapolated to another (cf Le Jambre, 1993).

Neither of the two Ancylostoma (hookworm) species showed consistent evidence of recent changes in population size (Table 1), even though they were sampled from different locations and the other hookworm species (Necator americanus) did show such evidence. This result is particularly unexpected for the nematode of the dog host, which actually shows a small but significant population contraction (Table 1 and Figure 1), since this host is a recent introduction into Australia (where those particular samples were taken) and has therefore itself recently had a considerable population expansion on that continent. Several possible explanations for this situation present themselves. First, this result could simply be related to the possible confounding effect of using the cox1 gene, as noted above. This possibility can only be tested by examining other mitochondrial genes. Alternatively, there could be something unusual about the host–parasite relationship for dogs in Australia.

As a heuristic test of this second idea, we analysed the mitochondrial rrnS data of Skerratt et al (2002) for the ectoparasitic mite Sarcoptes scabiei (Acari) in Australia, using Fluctuate. While the sample sizes are not large, there is moderate evidence of population expansion within the parasite species as a whole (g±SE=12±3), but there is no convincing evidence for any of the three host/parasite combinations individually: humans (18±10), dogs (–9±10) and wombats (–4±17). It is likely in this case that any population expansion that exists is related as much to switches between the three host species as to expansion within any one host. We thus cannot reject the hypothesis that population expansions are unusual among dog parasites in Australia; further tests of this hypothesis may therefore be worth pursuing. One possible explanation is that the 200 years since European-domesticated dogs were introduced into Australia, and apparently much less time for the parasites (Walton et al, 2004), are not long enough to leave a genetic footprint of population expansion in the parasites – not enough is yet known about the demography of these parasites to address this possibility quantitatively.
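One rough way to read the g±SE values quoted above is to form the ratio g/SE and an approximate interval of g ± 1.96 SE. The sketch below does this for the four estimates; the normal approximation is an assumption made here for illustration only and is not the criterion used in the analyses, and the function name is hypothetical.

```python
# g and SE values as quoted in the text for Sarcoptes scabiei.
estimates = {
    "all hosts": (12.0, 3.0),
    "humans": (18.0, 10.0),
    "dogs": (-9.0, 10.0),
    "wombats": (-4.0, 17.0),
}


def summarize(g, se, z=1.96):
    """Ratio g/SE and an approximate interval, assuming a normal approximation."""
    return {"ratio": g / se, "low": g - z * se, "high": g + z * se}


summary = {host: summarize(g, se) for host, (g, se) in estimates.items()}
# An interval that excludes zero is read here as evidence of a change in size.
excludes_zero = {host: s["low"] > 0 or s["high"] < 0 for host, s in summary.items()}

for host, s in summary.items():
    print(host, round(s["ratio"], 2), round(s["low"], 1), round(s["high"], 1))
```

Under this approximation only the pooled estimate has an interval that excludes zero, which matches the verbal summary given above.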

In contrast, T. boreoarcticus is an abomasal parasite of wild muskoxen, while some of its close relatives are parasites of ruminant livestock. It might thus be predicted (from the hypothesis being tested here) that these relatives would show recent demographic expansions while T. boreoarcticus would not. In fact, all of these particular nematode species show population expansions in North America (where these samples were taken) (Table 1). This may simply indicate that the muskoxen sampled in northern Canada have been human-associated to a greater extent than previously thought. Alternatively, Hoberg et al (1999) attribute the current population structure of T. boreoarcticus in North America to muskox migration patterns since the last glaciation. If this is so, then the results of Lessa et al (2003) predict that T. boreoarcticus should, in fact, also show a demographic expansion reflecting population growth as a result of its arctic and subarctic distribution in North America. Our results are therefore likely to have been caused by two different mechanisms operating at somewhat different temporal scales: nematode population growth due to the range expansion of wild host species during the late Quaternary and nematode population growth due to the expansion of domesticated host species during the more recent European occupation of North America.

Longistriata caudabullata is an intestinal parasite of several species of wild short-tailed shrew in North America, and it is therefore not explicitly predicted to show a demographic expansion although it does so in each of the two host species sampled here (Table 1). Brant and Ortí (2003b) note that there appears to be considerable gene flow within this nematode species across both host species, indicating that it forms one panmictic population. Thus, it might be true in this case, as well, that any population expansion that exists is related as much to switches between the two host species as to expansion within any one host.

However, in this particular case, it is possible to test directly the congruence between the parasite biology and host biology that is predicted by our hypothesis. This is because Brant and Ortí (2003b) also sequenced the mitochondrial control region (or D loop) of the host individuals from which the parasites were collected. We thus analysed the data for the two host species using Fluctuate. There is considerable evidence of population expansion within each of the host species: Blarina brevicauda, g±SE=40±11; Blarina hylophaga, g±SE=45±8. Thus, for this characteristic, the parasite biology does appear to be tracking that of the hosts, providing an additional test of our hypothesis. Brant and Ortí (2003a) attribute the population expansion in B. brevicauda, at least, to range expansion following the last glaciation.

Alternatively, it is possible that most trichostrongylid nematodes have undergone recent demographic expansions irrespective of their host status – for the data analysed here, of the taxa in the superfamily Trichostrongyloidea only D. viviparus fails to show evidence of population growth (see above). Indeed, a 2 × 2 contingency statistical test comparing the frequency of population expansions in this superfamily to that of the other superfamilies yields G=12.82, P<0.001. However, this comparison is confounded by the fact that most of the human-associated taxa are in this superfamily, with most of the remainder in the superfamily Ancylostomatoidea, as well as being an a posteriori hypothesis test rather than an a priori one (ie the hypothesis arises after examining the data and so the data cannot be a valid test of that hypothesis). Furthermore, no one has yet suggested a model that would explain why demographic expansions would be associated with particular taxonomic groups, although the taxonomic distribution of life-cycle characteristics (eg presence or absence of intermediate hosts, mode of reproduction) may play a part (Viney, 1998; Blouin et al, 1999; Criscione and Blouin, 2004). Nevertheless, should this alternative idea turn out to be correct, then the data may be providing evidence that D. viviparus is currently placed in an inappropriate taxonomic group, a suggestion that has already been made based on phylogenetic analyses (Höglund et al, 2003).

Overall, it seems likely that the situation is more complicated than our simple hypothesis test suggests. In particular, other causes of recent population expansions in the hosts clearly can confound the effects of human association. Also, the existence of direct (single host) and indirect (alternate hosts, with different life-history stages in different types of host) life cycles may have large effects, particularly if one of the hosts of a multihost parasite is human-associated while the other is not. This situation will become even more complex for taxa like the platyhelminths, for example, where different parasitic species can have one, two or three alternating hosts – in the latter cases, rarely is more than one of the hosts human-associated. Testing the hypothesis for platyhelminths, incidentally, has another difficulty as there is currently a dearth of data for non-human-associated host species – almost all of the data sets currently available are for parasites of humans, livestock or salmonid fishes.

The results observed here also show some consistent indications of population contractions rather than expansions among nematodes from non-human-associated hosts, particularly among the Capillaria, Contracaecum and Heterorhabditis species (Figure 1). However, most of the rate estimates have large standard errors and so these results are not necessarily reliable (Table 1). For example, the Baikal seal has shown a population decline (Reijnders et al, 1993), and so its parasite species can be expected to show a decline also, but this is not clear from the data analysed here because of the large standard error of the estimate. Nevertheless, the possibility of population decline among nematode species clearly bears looking into.

Future prospects

Our analysis aimed to investigate the prospects for using sequence data to test general hypotheses about the evolution of parasite populations. From this perspective, the results are encouraging, and bioinformatic data-mining of the currently available information can clearly be productive. However, our study also highlights the limitations of this approach, which is basically a descriptive one in which there is little control over potentially confounding factors such as the taxonomy and life-history characteristics of those parasites appearing in the analysis (cf Blouin, 1998; Criscione and Blouin, 2004). In particular, it may not be worth pursuing this line much further for this particular hypothesis, because the scale of resolution is too poor and further ad hoc comparisons may therefore not yield any more useful data, although a similar strategy could be used for parasites in the Platyhelminthes, Acari and Insecta, for example. The indications provided by such analyses are promising, but they do not provide a definitive test of the hypothesis.

Further advances will therefore require more controlled comparisons of parasite species, directed at testing predictions that are more specific. For example, more critical tests of the hypothesis examined here could involve assessing: (i) the same parasite species in both human-associated and non-human-associated hosts (eg as carried out here for O. bifurcum); (ii) closely related pairs of parasite species in both human-associated and non-human-associated hosts (eg as carried out here for the Teladorsagia species); (iii) the same parasite species in the same host species both in areas where the host has been introduced and in areas where it has not (eg host species such as roe deer in different parts of Europe); (iv) the same parasite species in the same host species in areas where the host has been introduced or domesticated at different times (eg hosts in Europe vs those in North America or Australia); (v) different parasite species in the same host individuals; (vi) closely related parasites with different life-history characteristics (to assess alternative influences on the demographic patterns); or (vii) simultaneous sampling of both the hosts and the parasites to confirm that the population behaviour is the same in both species (eg as carried out here for L. caudabullata and its shrew hosts).