Introduction

Species radiations are a common feature of isolated islands, which provide natural laboratories for the study of speciation (Williamson, 1981). Inter-island colonization is widely viewed as a central component of island radiations (Carson and Templeton, 1984) although populations within islands may also become isolated from one another if dispersal is limited (Cowie, 1995; Shaw, 1996). Additionally, strong ecological affinities can reinforce isolation even in the absence of geographical distance (Kambysellis et al., 1995; Gillespie, 2004). Exploitation of novel resources and competition for shared resources are among the ecological causes of speciation expected to be especially important during a radiation because of the coexistence of many closely related species (Losos, 1994; Schluter, 1996).

Simulium oviceps and Simulium dussertorum are members of a larger radiation of simuliids within the Society Islands, French Polynesia (Craig et al., 2001). The Society Islands, like the Hawaiian Islands, are a hot spot archipelago: a linear array of progressively younger islands formed by a submantle plume (Duncan and McDougall, 1976; Carson and Clague, 1995). Both S. oviceps and S. dussertorum are found on the youngest island, Tahiti, while S. oviceps is also found on Moorea (Figure 1) and on an older island, Raiatea. The two species are sister taxa and are morphologically similar, both showing a reduction in labral fan size and the number of fan rays, a specialized feeding adaptation in the larvae (Craig, 1977). Before the application of molecular techniques the two were difficult to tell apart. Craig (1987) suggested, based on morphological variation, that S. oviceps might be a species complex. Indeed, S. dussertorum was initially thought to be an environmental variant of S. oviceps and was only later described as a separate species (Craig, 1997).

Figure 1

Sampling locations for S. oviceps and S. dussertorum on Moorea and Tahiti. Corresponding location names are given in Table 1. Open square, river; closed circle, cascade.

A defining characteristic of black flies in French Polynesia is that larvae have colonized a diverse array of habitats but that most species have narrow habitat requirements – with the larvae of each species found in one or a few similar habitats (Craig et al., 2001; Joy and Conn, 2001). In contrast, S. oviceps larvae have a relatively broad ecological niche: cascades and large rivers. The two habitats differ markedly from each other in terms of nutrient content and flow regime. Cascades in the Society Islands have a higher concentration of particulate matter, and a higher proportion of diatoms than rivers (Resh et al., 1990), and diatoms are a high-quality food source for black fly larvae (Thompson, 1987). Unlike large rivers, water velocity can be highly variable among and even within individual cascades in the Society Islands (Craig et al., 2001). Because of its seemingly broad habitat requirements, S. oviceps presents a unique opportunity to study the evolution of habitat specialization in the subgenus Inseliellum.

The objectives of this study were to reconstruct the molecular phylogeny of S. oviceps and S. dussertorum populations, to measure genetic and morphological variation across their ranges, and to relate these differences to ecological and geographical isolation in order to gain insight into possible mechanisms underlying the Society Island black fly radiation. In addition, we discuss the feasibility of applying nested clade analysis (NCA) to a system with a known but highly complex geological history.

Materials and methods

Type specimens and sampling localities

The Simulium (Inseliellum) oviceps Edwards 1933 holotype is deposited at the Entomology Department, Natural History Museum (Cromwell Road, London, SW7 5BD, England). The Simulium (Inseliellum) dussertorum Craig, 1997 holotype is deposited at the Department of Entomology, Bernice P Bishop Museum (1525 Bernice Street, Honolulu, Hawai'i, 96817–2704). A total of 339 S. oviceps and 94 S. dussertorum larvae were collected from 26 sites (15 rivers and 11 cascades) on Tahiti and Moorea (Figure 1; Table 1). S. oviceps was found on both islands while S. dussertorum was found only on Tahiti. The S. dussertorum type locality is Grottes de Maara (Craig, 1997). As only a single cytochrome oxidase I (COI) haplotype was collected from this location (haplotype A), we designated the clade containing this haplotype as S. dussertorum. Simulium bogusium and Simulium proctorae were chosen as outgroups because they are closely related to the two focal species (Joy and Conn, 2001) and are endemic to Raiatea, an older island.

Table 1 Collection sites, PCRD haplotypes and sample sizes for PCRD and sequence-based analyses

DNA extraction and amplification

Genomic DNA was isolated from frozen (−80°C) larvae following the protocol of Collins et al. (1987). Initial COI sequences were obtained with primers UEA5 and UEA10 (Lunt et al., 1996) and were used to design Simulium-specific primers. These primers amplified a 920 bp fragment from the 5′-end of the COI gene. PCR parameters consisted of initial denaturing at 94°C (3 min), five cycles at 94°C (1 min), 53°C (1 min) and 72°C (1.5 min), followed by 30 cycles at 94°C (1 min), and a ramp from 50 to 72°C (1°C/6 s). Double-stranded PCR products were purified with polyethylene glycol. Reactions for automated sequencing were prepared using the ABI Prism Dye Terminator DNA Sequencing Kit.

Sequence analyses

Sequences were compiled, edited and initially aligned using Sequencher 4.0 (Gene Codes Corp., Ann Arbor, MI, USA). Aligned sequences were imported into PAUP*, v4.0b10 (Swofford, 2003). Phylogenetic relationships were reconstructed using maximum parsimony (MP) and maximum likelihood (ML) after redundant sequences were removed. We used MP to generate initial topologies for estimating ML model parameters and to evaluate models. One of 36 most-parsimonious trees (heuristic search, 20 random stepwise addition replicates, tree bisection-reconnection (TBR) branch swapping) was arbitrarily chosen to assess the fit of 56 models using likelihood ratio tests as implemented in MODELTEST 3.06 (Posada and Crandall, 1998). MP bootstrap support was estimated from 500 replicates (stepwise addition, TBR branch swapping). The ML tree was estimated using successive approximation (stepwise addition, TBR branch swapping), and 100 bootstrap replicates.

We estimated the time to most recent common ancestor (TMRCA) for S. oviceps, S. dussertorum, and subsets within S. oviceps from sequence data, using the Bayesian coalescent framework in BEAST v1.4 (Drummond and Rambaut, 2006). The TrN+I+Γ model (Tamura and Nei, 1993) was chosen based on likelihood ratio tests (Posada and Crandall, 1998). We used a mutation rate of 2.0 × 10^−8 nucleotide substitutions/site/year, based on estimates for COI in dipterans, specifically Hawaiian Drosophila (DeSalle et al., 1987), Drosophila melanogaster and Drosophila simulans (Satta et al., 1987), and in several other arthropod species (Brower, 1994). For each TMRCA estimate we ran two independent Markov chain Monte Carlo (MCMC) chains of 20 000 000 steps each, preceded by a burn-in of 2 000 000 steps. Convergence of the chains was checked using the program Tracer (Rambaut and Drummond, 2004), and results from the two chains were found to be in agreement and were combined.
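To give a sense of scale for the clock rate, the following minimal sketch converts between elapsed time and expected sequence change. It is illustrative only: it assumes a strict clock with the stated rate applied along a single lineage, the 1 MY input is an arbitrary example rather than a result, and the function names are hypothetical and not part of BEAST.

```python
# Illustrative conversion between time and expected sequence change.
# Assumption: a strict molecular clock, with the rate applied per lineage.

RATE = 2.0e-8      # substitutions/site/year (the rate stated in the text)
FRAGMENT_BP = 920  # length of the amplified COI fragment, in base pairs


def substitutions_per_site(years, rate=RATE):
    """Expected substitutions per site along one lineage over `years`."""
    return rate * years


def years_from_substitutions(subs_per_site, rate=RATE):
    """Time in years needed to accumulate `subs_per_site` on one lineage."""
    return subs_per_site / rate


if __name__ == "__main__":
    # Arbitrary example: a lineage that is one million years deep.
    depth = substitutions_per_site(1.0e6)
    print("substitutions/site:", depth)
    print("expected changes over the fragment:", depth * FRAGMENT_BP)
```

Under these assumptions a lineage one million years deep is expected to carry about 0.02 substitutions per site, or roughly 18 changes across the 920 bp fragment.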

PCR-digest screening and NCA

In addition to sequencing, we performed PCR digest (PCRD) on the 920 bp COI fragment. We screened 43 restriction enzymes using Sequencher 4.0 (Gene Codes Corp.), and found that a combination of three – HindIII, DdeI and AccI – produced unique fragment patterns for 14 groups also identified by sequencing. DNA from up to 20 individuals (average=16) from each of 26 collection sites was cut with these enzymes and new fragment patterns, when encountered, were sequenced, allowing us to identify four more restriction haplotypes. To verify that identical fragment patterns from different localities were the product of the same restriction sites, we sequenced at least one individual per haplotype per site (n=149). In doing so, we identified three additional haplotypes. We subsequently re-screened all localities with HpaII, which in combination with the other three restriction enzymes produced unique restriction patterns for the three new haplotypes. In all, four restriction enzymes allowed us to identify 21 restriction haplotypes (Supplementary Table 1). Because our method was not exhaustive, it is possible that we missed some haplotypes; however, restriction haplotypes largely reflected the pattern of sequence variation while allowing us to greatly increase our sample size.
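The in silico screening step can be sketched as follows. This is a minimal illustration, not the Sequencher procedure: the function names are hypothetical, the test sequences are made up, and the recognition sites are the standard ones for these four enzymes (HindIII A^AGCTT, DdeI C^TNAG, AccI GT^MKAC, HpaII C^CGG), written here as regular expressions. All four sites are palindromic, so only one strand needs to be scanned.

```python
import re

# Recognition site (as a regular expression) and cut offset from the start of
# the site. IUPAC codes are expanded: N = any base, M = A or C, K = G or T.
ENZYMES = {
    "HindIII": ("AAGCTT", 1),     # A^AGCTT
    "DdeI": ("CT[ACGT]AG", 1),    # C^TNAG
    "AccI": ("GT[AC][GT]AC", 2),  # GT^MKAC
    "HpaII": ("CCGG", 1),         # C^CGG
}


def cut_positions(seq, enzyme):
    """Return 0-based cut positions of `enzyme` in `seq`, in ascending order."""
    pattern, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites are also reported.
    return [m.start() + offset
            for m in re.finditer(f"(?={pattern})", seq.upper())]


def fragment_lengths(seq, enzyme):
    """Return fragment lengths (5' to 3') after a complete digest."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def restriction_pattern(seq, enzymes=("HindIII", "DdeI", "AccI", "HpaII")):
    """Map each enzyme to its fragment sizes, largest first, as read on a gel."""
    return {e: tuple(sorted(fragment_lengths(seq, e), reverse=True))
            for e in enzymes}


if __name__ == "__main__":
    toy = "GGGAAGCTTCCCCGGTTT"  # made-up sequence, not real COI data
    print(restriction_pattern(toy))
```

Two sequences that yield the same dictionary from `restriction_pattern` would be scored as the same restriction haplotype, which is why sequencing at least one individual per haplotype per site is a useful check.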

We constructed a PCRD haplotype network based on the information in Supplementary Table 1. Statistical parsimony was used to determine the number of connections considered parsimonious at the 95% confidence level (Templeton et al., 1992). The null hypothesis of no geographical association was tested by treating location as a categorical variable for nested exact permutational tests using GeoDis (Posada et al., 2000). Geographical distance was incorporated as clade distance (Dc) and nested clade distance (Dn) that measure a clade's geographical range, and its distribution relative to closely related clades, respectively (Templeton et al., 1995). Observed Dc and Dn values were compared to a distribution of randomly permuted values, which simulate the null hypothesis of no geographical association (Posada et al., 2000). In cases where the null hypothesis was rejected three possibilities were considered: restricted gene flow, past fragmentation and range expansion. To distinguish among patterns, we used the inference key found at http://zoology.byu.edu/crandall_lab/geodis.htm, which is a modification of the original key (Templeton et al., 1995). Outgroup weights were also calculated for each PCRD haplotype using a heuristic procedure that takes into account haplotype connectedness (tip vs interior position) and frequency in the sample (Castelloe and Templeton, 1994).
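The two distance statistics and the permutation test can be sketched as below. This is a simplified illustration and not the GeoDis implementation: it assumes planar coordinates in kilometres instead of great-circle distances, it permutes clade labels among individuals within one nesting clade, and all names and coordinates are hypothetical.

```python
import math
import random


def center(points):
    """Geographical center: the mean of the individuals' coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def mean_distance(points, c):
    """Average straight-line distance of `points` from the point `c`."""
    return sum(math.dist(p, c) for p in points) / len(points)


def clade_distances(samples, clade):
    """Return (Dc, Dn) for `clade`.

    `samples` holds one (clade_label, (x_km, y_km)) entry per individual,
    all belonging to the same nesting clade.
    """
    own = [xy for label, xy in samples if label == clade]
    everyone = [xy for _, xy in samples]
    dc = mean_distance(own, center(own))       # spread around own center
    dn = mean_distance(own, center(everyone))  # offset from nesting clade
    return dc, dn


def p_small_dc(samples, clade, n_perm=999, seed=0):
    """Permutation P-value for an unusually small observed Dc."""
    rng = random.Random(seed)
    labels = [label for label, _ in samples]
    coords = [xy for _, xy in samples]
    observed, _ = clade_distances(samples, clade)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any clade-by-location association
        dc, _ = clade_distances(list(zip(labels, coords)), clade)
        if dc <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    data = [("tip", (0, 0)), ("tip", (0, 0)),
            ("interior", (4, 0)), ("interior", (8, 0))]
    print(clade_distances(data, "tip"), p_small_dc(data, "tip"))
```

A small Dc combined with a large Dn, as for the `tip` clade in the toy data, describes a clade that is geographically restricted and displaced from its relatives; it is this kind of contrast that the inference key interprets.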

Population structure

Population structure was assessed by partitioning genetic diversity into hierarchical components using analysis of molecular variance (AMOVA) (Excoffier et al., 1992). Populations were grouped according to larval habitat (river or cascade), and location with reference to putative geographic barriers to gene flow – the open ocean separating Moorea and Tahiti, and a narrow isthmus connecting Tahiti-nui and Tahiti-iti (Figure 1).

Morphological variation

We measured head width, body length, length of the longest labral fan ray, and number of fan rays for 323 S. oviceps and 65 S. dussertorum final instar larvae. DNA from 87% of measured larvae (n=338) was successfully amplified and cut with restriction enzymes. We used analysis of variance to test for significant differences in larval morphology based on habitat (river vs cascade) and phylogeny (species and clades within S. oviceps).
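For readers unfamiliar with the test, a one-way analysis of variance can be sketched in a few lines. The example below is a generic illustration with made-up numbers, not the analysis software or data used here, and the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individuals around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    river = [1.0, 2.0, 3.0]    # made-up measurements
    cascade = [2.0, 3.0, 4.0]  # made-up measurements
    print(one_way_anova([river, cascade]))
```

The two degrees of freedom returned are the number of groups minus one and the number of observations minus the number of groups, so a comparison of two habitats always has one numerator degree of freedom.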

Results

COI sequence phylogeny

The TrN substitution model (Tamura and Nei, 1993) with a proportion of invariable sites and a γ-distribution of rate variation among sites (TrN+I+Γ model) was chosen based on likelihood ratio tests (Posada and Crandall, 1998). The ML and MP (not shown) trees differed only in the arrangement of poorly supported branches within S. dussertorum. Neither ML nor MP supports reciprocal monophyly between S. oviceps and S. dussertorum (Figure 2). Within S. oviceps, there is strong support for one clade, composed of S. oviceps larvae collected from rivers (henceforth river ecotype, Figure 2) with the exception of haplotypes E1 and E2, collected from cascades on Moorea (Table 1). On the other hand, there is no support for reciprocal monophyly between this clade and cascade-dwelling S. oviceps (henceforth cascade ecotype, Figure 2).

Figure 2

ML tree estimated from COI sequences under the TrN+I+Γ model. The two outgroup species S. bogusium and S. proctorae are endemic to Raiatea. Sequence haplotype names are given at the tips of branches. ML followed by MP bootstrap proportions are shown above the branches (500 replicates). Branches with bootstrap values less than 50% are collapsed. COI, cytochrome oxidase I; ML, maximum likelihood; MP, maximum parsimony.

The marginal posterior distributions of TMRCA for S. oviceps, S. dussertorum and the two ecotypes are shown in Figure 3. The TMRCAs for S. oviceps (mean, 1.77 MY; 95% highest posterior density interval (HPD), 1.47–2.10 MY) and the cascade ecotype (mean, 1.70 MY; 95% HPD, 1.42–1.99 MY) showed considerable overlap and were the oldest (Figure 3). The river ecotype appears to have diverged more recently (mean, 0.89 MY; 95% HPD, 0.66–1.14 MY), with S. dussertorum intermediate between the others but overlapping extensively with the river ecotype (mean, 1.05 MY; 95% HPD, 0.79–1.30 MY).
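A highest posterior density interval is the narrowest interval that contains the stated share of the posterior samples. The sketch below shows one common way to compute it from a list of MCMC samples; it is a generic illustration with made-up numbers, not the routine used by BEAST or Tracer, and the function name is hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest (low, high) interval holding `mass` of the samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples the interval must contain. Rounding first guards
    # against floating-point noise in mass * n.
    k = max(1, math.ceil(round(mass * n, 9)))

    best = None
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    for i in range(n - k + 1):
        low, high = ordered[i], ordered[i + k - 1]
        if best is None or high - low < best[1] - best[0]:
            best = (low, high)
    return best


if __name__ == "__main__":
    draws = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # made-up, right-skewed sample
    print(hpd_interval(draws, mass=0.9))
```

For a skewed posterior the HPD differs from an equal-tailed interval because it drops the long tail first, as the toy sample with a single extreme value shows.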

Figure 3

Bayesian estimates of TMRCA for S. oviceps, S. dussertorum, S. oviceps river ecotype and S. oviceps cascade ecotype. ESS were 4284, 3454, 4085 and 3180 for S. oviceps, S. dussertorum and the river and cascade ecotypes, respectively. ESS, effective sample sizes; TMRCA, time to most recent common ancestor.

PCRD network and NCA

Restriction haplotypes corresponded well with sequence haplotypes, and divided S. oviceps into two groups, again reflecting larval habitat (Figure 4). Larvae were not distributed evenly among groups, with the S. oviceps river ecotype making up the majority of the collected larvae; almost one-third (n=161) were haplotype C. Seven haplotypes were sampled only once. Two networks were supported by statistical parsimony, one consisting only of haplotype M. The two S. oviceps ecotypes and S. dussertorum were connected through haplotype R. Haplotype C had the highest outgroup weight (0.772; Supplementary Table 1) within the S. oviceps river ecotype and haplotype G had the highest outgroup weight (0.512) within the cascade ecotype, while within S. dussertorum, haplotype N had the highest outgroup weight (0.556).

Figure 4

Minimum spanning network for S. oviceps and S. dussertorum based on the restriction cut sites of a 920 bp COI fragment. Each branch represents the presence (+) or absence (−) of a restriction enzyme cut site with location given in bps. Black dots indicate intermediate haplotypes that were not sampled. Letters identify PCRD haplotypes listed in Supplementary Table 1. White circle, S. oviceps river ecotype; cross-hatched circle, S. oviceps cascade ecotype; square, S. dussertorum. Symbol size is proportional to frequency of haplotype in the sample. COI, cytochrome oxidase I; PCRD, PCR digest.

Significant nonrandom associations between clades and geographical location were found for 3 out of 11 nested clades. In two other instances, geographical sampling was insufficient to make biological inferences. Within the S. oviceps river clade, haplotypes C and D had broadly overlapping ranges on Tahiti, while haplotype C was absent from Moorea (Figure 5a). An inference of contiguous range expansion from Tahiti to Moorea is supported by the distribution of these two haplotypes (Supplementary Figure 2). At the next highest nesting level, haplotype E (clades 1–4 in Supplementary Figure 2) is found only in a few cascades on Moorea at the periphery of the nesting clade, which includes C, D and E (Figure 5b). The large discrepancy in range sizes and the location of haplotype E lead to an inference of past fragmentation (Supplementary Figure 2).

Figure 5

An overlay of nested clades on geography for clades in which the null hypothesis of no geographical structure was rejected. (a) S. oviceps river ecotype, range expansion; (b) S. oviceps river ecotype, past fragmentation; and (c) S. dussertorum, range expansion.

In S. dussertorum, the ranges of haplotypes A, B, and N overlap but differ in size (Figure 5c). The interior haplotype N was found only within the central crater of Tahiti, near the geographic center of the clade. Haplotype B was restricted to the Trois Cascades site. In contrast, the range of the tip haplotype A extended from within the central crater to the western and northern island periphery. These findings suggest range expansion outward from the central crater (Supplementary Figure 2).

Population structure

Most of the genetic variation within S. oviceps can be explained by differences in larval habitat, while geographic barriers explained none of the variation (Table 2). In contrast, geographic barriers accounted for a large and significant percentage of the DNA variation in the S. oviceps cascade ecotype, although the amount of variation within populations was also substantial. For the river ecotype, genetic variation was split fairly evenly between larval habitat and the within-populations component (Table 2), the former reflecting the fact that although E haplotypes fall within the river ecotype genetically, they were collected from cascades. In contrast with the overall pattern for S. oviceps but in agreement with the S. oviceps cascade ecotype, the presence of geographical barriers accounted for more genetic variation in S. dussertorum than did larval habitat (Table 2).

Table 2 AMOVA for S. oviceps and S. dussertorum populations

Morphological variation

Mean head width and body length were significantly greater for larvae collected from cascades (head width: F(1,336)=138.57, P<0.0001; body length: F(1,334)=95.94, P<0.0001). Larvae from cascades also had significantly smaller labral fans than those from rivers (length of longest ray: F(1,325)=52.98, P<0.0001; number of rays: F(1,302)=86.79, P<0.0001).

Phylogenetic units (species and ecotype) also differed significantly from each other in overall size, with S. oviceps cascade ecotype larvae being the largest, followed by S. dussertorum and finally S. oviceps river ecotype larvae (head width: F(2,270)=59.57, P<0.0001; body length: F(2,268)=65.06, P<0.0001); differences among all three groups were significant (Tukey–Kramer honestly significant difference (HSD) post hoc test). This may simply reflect the effect of habitat. On the other hand, fan morphology differed significantly between species but not between S. oviceps ecotypes (length of longest ray: F(2,264)=81.62, P<0.0001; number of rays: F(2,243)=173.66, P<0.0001; Tukey–Kramer HSD post hoc test).

Discussion

Our data illustrate the potential importance of larval ecology in the diversification of black flies in the Society Islands. We observed a pattern of genetic variation within S. oviceps not previously revealed by morphological analyses, and corresponding to larval aquatic habitat. Specifically, we found support for a genetically distinct river ecotype within S. oviceps. The two S. oviceps ecotypes are geographically sympatric, but are effectively allopatric because they occupy different stream types. The potential role of larval habitat in restricting gene flow is reinforced by the well-known vagility of black fly adults (Crosskey, 1990), which makes it surprising that, in the absence of obvious barriers, subpopulation structure would exist on relatively small islands in close proximity to each other. Although S. dussertorum larvae are also found in rivers and cascades, ecological differences failed to explain genetic variation in this species, possibly because so few larvae were collected from rivers. An association between larval habitat shifts and cladogenesis has previously been proposed for Society Island black flies, based on the observation that species within the same clade tend to be found in similar stream types (Joy and Conn, 2001).

Despite strong support for a river ecotype clade, there is no such support for a monophyletic cascade ecotype within S. oviceps. As a result we are not able to distinguish between incipient and cryptic species status for the two S. oviceps ecotypes. Topological uncertainty also exists regarding the reciprocal monophyly of S. oviceps and S. dussertorum. It is well known that incomplete lineage sorting can obscure evolutionary relationships among recently diverged taxa (Maddison, 1997). Craig (1987) speculated that S. dussertorum might be a variant within S. oviceps, and the morphological data presented here were equivocal. Labral fan size, which has been shown to be phenotypically plastic in response to current velocity (Zhang and Malmqvist, 1997), differed significantly between S. oviceps and S. dussertorum even within the same habitat. In contrast, fan size was conserved in S. oviceps across habitats. On the other hand, S. dussertorum was intermediate between the two S. oviceps ecotypes with respect to head width and body length.

S. oviceps and S. dussertorum are members of the oviceps group (Craig et al., 2001), a monophyletic clade consisting of at least nine species for which, with the exception of S. oviceps, cascades are the exclusive larval habitat (Craig et al., 2001; Joy and Conn, 2001). This strongly suggests that in the oviceps group, cascades are the ancestral stream type and that rivers were subsequently exploited; our TMRCA estimates support this view. Large rivers on Tahiti and Moorea have relatively recent origins, their drainage basins having formed when the central calderas collapsed (Craig et al., 2001). The Papenoo River in the central caldera on Tahiti is estimated to have formed <0.87 MYA (Craig et al., 2001), which overlaps with the river ecotype age estimate (mean, 0.89 MY; 95% HPD, 0.66–1.14 MY).

The larvae of several other Society Island black fly species occupy rivers, although none have reduced feeding fans similar to S. oviceps. Schröder (1985) showed that S. oviceps in rivers had a high percentage of detritus particles in their gut while associated Simulium tahitiense larvae had a high percentage of unicellular algae in theirs, indicating that the two do not compete extensively for food. During a species radiation the invasion of novel habitats can provide new ecological opportunities and release from competition (Schluter, 1996). Here, we find S. oviceps exploiting resources in rivers by means of specialized feeding behavior (scraping) and associated morphology (reduced feeding fans) that presumably evolved in cascades. Thus, fine-scale partitioning of feeding niches may have facilitated the reinvasion of rivers.

Cascade populations exhibited higher levels of genetic sub-division than river populations. AMOVA indicated isolation between Tahiti-nui and Tahiti-iti for cascade but not river populations. The land bridge connecting the two is narrow and may restrict migration, but this does not explain the absence of sub-division among river populations. Nor can geographic distance between populations explain different levels of isolation for river and cascade populations. One possible explanation is that cascades may be intrinsically more isolated from each other than rivers due to their smaller size and patchy distribution. Of equal importance, small cascades in the Society Islands are often ephemeral, experiencing only intermittent, seasonal, rainfall-dependent flow (D Craig, personal communication). These conditions could lead to frequent local extinctions, thus augmenting isolation among cascade populations. Alternatively, given the younger geological age of rivers compared with cascades (Craig, 2003), and our estimated TMRCA for the river ecotype, river populations may have had less time to diverge from each other.

The complex geological history of the Society Islands may have confounded estimates of relative haplotype age as well as NCA inferences. As hot spot islands age they erode and subside such that older islands are typically smaller than younger ones. Owing to the link between geological and hydrological profiles, the overall size of the catchment, as well as the diversity of stream environments, is expected to initially increase and later diminish as volcanic islands age and eventually subside (Haynes, 1990; Craig, 2003). Thus, we would expect local lineage extinction rates to be related to island age. The inability of NCA to detect localized haplotype extinction has been shown to lead to incorrect inferences (Masta et al., 2003). Given this and other difficulties in interpreting NCA results outlined by Knowles and Maddison (2002), we have treated NCA inferences not supported by other results as tentative. NCA and AMOVA both detected population structure with respect to the anomalous haplotype E. These larvae, collected from a cascade on Moorea, are genetically within the S. oviceps river ecotype. On the other hand, the NCA inference of range expansion from Tahiti to Moorea for the S. oviceps river ecotype was not supported by other data. Additionally, S. oviceps haplotypes M and E are rare and endemic to cascades on Moorea, and outgroup weights for both are low due to their rarity and peripheral location in the PCRD network (Figure 4), suggesting recent origins. However, both had basal placements in the sequence phylogeny (Figure 2). The smaller size and greater age of Moorea (Figure 1) would predict a greater loss of suitable stream habitats over time compared with Tahiti. This may in turn have distorted the haplotype network, such that older haplotypes on Moorea (possibly E and M) are currently less abundant than younger haplotypes on Tahiti.

Phenotypic plasticity in growth rate and labral fan morphology in response to food availability and current velocity has been widely documented for black fly larvae (Zhang and Malmqvist, 1997; Lucas and Hunter, 1999; Zhang, 2005). Genetic assimilation (Waddington, 1961), in which moderate levels of phenotypic plasticity promote establishment in novel habitats and subsequent selection for extreme phenotypes leads to genetic differentiation and perhaps speciation, has been proposed as an important mechanism for black fly diversification (Zhang, 2005). We suggest that genetic assimilation may figure prominently in the radiation of black flies in the Society Islands by enhancing their ability to successfully colonize novel niches. Our finding that S. oviceps larvae from rivers are genetically distinct supports this view.