Introduction

The structure of natural populations can be viewed from two standpoints: the demographic structure, which is affected by processes such as birth, death and dispersal; and the genetic structure, which is influenced by processes such as mutation, drift, selection and gene flow (Slatkin, 1994). Two factors from these different standpoints that are tightly linked are dispersal (and migration) and gene flow, which refer, respectively, to the movement of individuals and gametes among populations. The geographical scale of dispersal among subpopulations determines the size of self-recruiting units, knowledge of which is critical for predicting demographic processes with population dynamic models. The scale of gene flow among subpopulations determines the size of integral evolutionary units (Slatkin, 1987). Several population genetic models have been developed to describe patterns of genetic variation expected under various dispersal scenarios. For example, limited dispersal throughout a continuous distribution was the focus of Wright's (1943) concepts of ‘neighbourhood size’ and ‘isolation-by-distance’. When dispersal is among discrete demes, ‘stepping-stone’ models reflect more accurately the underlying dispersal processes, although isolation-by-distance is approximated when the number of demes is large (Kimura & Weiss, 1964). Throughout this paper, for brevity, dispersal is defined as the movement of individuals (including larvae, adults and juveniles), except where otherwise indicated.

In the absence of selection, limited gene flow will lead to genetic differentiation among populations as a result of genetic drift. The rate at which this occurs is largely dependent on population size, requiring approximately Ne generations to reach equilibrium, where Ne is the effective population size (Crow & Aoki, 1986). Thus, the demographic structure of populations that are large or have undergone recent range expansions may be more visible to assays of genetic markers that have a smaller effective population size. Theoretically, mitochondrial DNA, by virtue of its uniparental inheritance and haploid nature, has a fourfold smaller effective population size than nuclear genes (Birky et al., 1989). Furthermore, a higher mutation rate provides more lineages for genetic drift and gene flow to sort (Moritz et al., 1987). These attributes make mitochondrial DNA a powerful locus for detecting patterns of demographic structure in natural systems and may be particularly useful in those systems characterized by large effective population sizes, where maximum discriminatory power is essential. Mainly because of high potential for, and absence of barriers to, dispersal, marine organisms may often support large effective population sizes.
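The fourfold difference follows from counting transmitted gene copies. As a sketch only, assume N breeding adults with an equal sex ratio (so that the number of breeding females is N_f = N/2), strictly maternal inheritance of mtDNA, and equal variance in reproductive success between the sexes; none of these assumptions is stated explicitly in the text:

```latex
\[
\frac{\text{nuclear gene copies}}{\text{mtDNA gene copies}}
  = \frac{2N}{N_f}
  = \frac{2N}{N/2}
  = 4
\]
```

Departures from these assumptions, such as a skewed sex ratio, would move the ratio away from four.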

Several studies have attempted to correlate life history dispersal characteristics with geographical patterns of genetic variation in marine systems, with mixed success. Hellberg (1996) was able to detect differences in population genetic structure in two solitary corals that differ in larval dispersal potential. Waples (1987) found a correlation between estimated dispersal ability and genetic differentiation among 10 species of shore fish. Similarly, Doherty et al. (1995) reported a relationship between genetic differentiation and planktonic larval duration in coral reef fishes on the Great Barrier Reef. In contrast, Shulman & Bermingham (1995) could not detect any correlation between life history parameters and phylogeographical structure in seven species of Caribbean coral reef fishes. Their analysis and sampling scheme were resolved at an evolutionary scale (small numbers of individuals per locality) rather than a demographic scale (large numbers of individuals per locality), which possibly contributed to their contrasting results. Intrinsic factors, such as life history traits, are more likely to correlate with demographic level population genetic structure than to be entrenched in phylogeographical structure, because phylogeographical structure requires temporally consistent allopatry. Such environmental constancy is seldom offered in coral reef environments because of sea level fluctuations. On a broader scale, Ward et al. (1994) detected differences in genetic variation among general life histories (marine, freshwater and anadromous) in bony fishes.

Lates calcarifer is a catadromous teleost, which inhabits shallow waters in estuaries, rivers and bays from east Africa through the South-East Asian archipelago to Australia's east coast. Individuals spend much of their lives in freshwater regions of rivers but undergo annual spawning migration into bays and estuaries. Larvae, although pelagic, develop rapidly and exhibit active migration into coastal swamps and wetlands (Russell & Garrett, 1988). Direct dispersal estimates (mark–recapture studies) from adults and juveniles suggest that migration is restricted to the parental catchment and neighbouring rivers (Russell & Garrett, 1988). Several population genetic models are available to explain the patterns of gene flow and population structure expected in L. calcarifer. Wright's (1943) island model, in which the likelihood of exchange of migrants is equal among all demes, may be violated for L. calcarifer as populations occur primarily in one spatial dimension — the coastline. A one-dimensional stepping-stone model in which gene flow occurs chiefly among neighbouring demes (Kimura & Weiss, 1964) may be more appropriate.

The link between dispersal and gene flow needs further investigation in species in which direct measures of dispersal exist, in order to validate the power of population genetic models in predicting demographically informative dispersal patterns (Slatkin, 1994). Here, the pattern of mitochondrial DNA variation is investigated in a species for which both direct dispersal estimates and data on nuclear gene (allozyme) variation are already available for comparison. Specific hypotheses for L. calcarifer are that: (i) the genetic patterns of mtDNA variation will be best explained by isolation-by-distance; (ii) population subdivision will be weaker within oceanic regions than among them because of extrinsic barriers to dispersal; and (iii) these patterns will be concordant with, yet more marked than, those found in allozyme variation, because mtDNA has a smaller effective population size than nuclear genes.

Materials and methods

Sampling design

To sample the spatial distribution of mtDNA variation in L. calcarifer over its far-eastern range, a two-level hierarchy with balanced sampling effort in each collection was used (Fig. 1). Regions were selected to investigate potential oceanographic barriers to gene flow: Torres Strait and the Gulf of Carpentaria. Random samples of 30 individuals are sufficient to detect, with 95 per cent confidence, at least one copy of any haplotype that occurs at 10 per cent frequency (Schwager et al., 1990). Haplotype frequencies were assumed not to vary significantly over the sampling period, an assumption with some empirical support from allozyme data (Keenan, 1994).
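The sample-size statement can be checked with a short calculation. The sketch below assumes independent random draws from a large population, so that the chance of missing a haplotype of frequency p in n individuals is (1 − p)^n; the function names are illustrative and are not taken from Schwager et al. (1990).

```python
import math


def detection_probability(frequency, sample_size):
    """Probability that a random sample holds at least one copy of a haplotype."""
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must lie strictly between 0 and 1")
    # Chance of missing the haplotype in every one of the independent draws.
    miss_all = (1.0 - frequency) ** sample_size
    return 1.0 - miss_all


def minimum_sample_size(frequency, confidence):
    """Smallest sample size that detects the haplotype with the given confidence."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))
```

For a haplotype at 10 per cent frequency and 30 individuals, the detection probability under these assumptions is 1 − 0.9^30, or about 0.958, which is consistent with the 95 per cent confidence quoted above.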

Fig. 1

Sampling design and sample sizes for Lates calcarifer collections.

Laboratory methods

Total genomic DNA was isolated with sequential phenol–chloroform extraction from ground muscle tissue. Two oligonucleotide primers were used to amplify a 290-bp fragment of region 1 of the mitochondrial control region. The heavy strand primer (MT16498H; 5′-CCT GAA GTA GGA ACC AGA TG-3′; Meyer et al., 1990) is located in a conserved central domain of the control region. The light strand primer (BRCR1L; 5′-TTT ATG CTA ACC AAT AAG T-3′) was designed from homologies among L. calcarifer sequences amplified using the LN20 primer of Bernatchez & Danzmann (1993), which lies in the proline tRNA gene flanking the 5′ end of the control region.

Temperature gradient gel electrophoresis combined with heteroduplex analysis with a reference conspecific (TGGE-HA) was used to screen amplified fragments for variation. Representatives of each haplotype were sequenced with an automated sequencing machine (Applied Biosystems 373). A random sample of 10 haplotypes that were detected more than once was sequenced several times to ensure that any two copies of a putative haplotype shared identical nucleotide sequences. Where possible, five individuals bearing the same haplotype were sequenced; if fewer than five individuals were reported, all individuals were sequenced. The identity of many haplotypes with similar banding patterns was verified by running them for a second time on the same gel. More details of the technique are outlined in Chenoweth & Hughes (1997).

Statistical analyses

Mitochondrial DNA variation within localities and regions was summarized with haplotype and nucleotide diversity statistics (Nei, 1987). Standard errors for nucleotide diversity statistics were calculated using the algorithm of Nei & Jin (1989). Spatial patterns of mtDNA variation were characterized using the multidimensional scaling routine implemented in the computer package PATN (Belbin, 1995). Spatial heterogeneity in haplotype frequencies among localities was tested using the Monte Carlo randomization algorithm of Roff & Bentzen (1989). One thousand randomizations were carried out in each pairwise test.
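The two diversity statistics can be sketched as follows. This is a minimal illustration, not the authors' implementation: haplotype diversity uses the usual sample-size correction n/(n − 1), and nucleotide diversity is taken as the mean uncorrected proportion of differing sites over all pairs of sampled sequences. The use of an uncorrected distance is an assumption made for brevity, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    sum_squared = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_squared)


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def nucleotide_diversity(sequences):
    """Mean pairwise distance over all pairs of sampled sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    pairs = list(combinations(sequences, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

Each function takes one entry per sampled individual, so a haplotype carried by several fish appears several times in the input.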

AMOVA (Excoffier et al., 1992), an analysis of variance procedure that partitions molecular variance according to sampling design, was used to investigate population subdivision. When sequence divergences among haplotypes are omitted, AMOVA yields a statistic (ΦST) analogous to Weir & Cockerham's (1984) unbiased FST estimator, θ. AMOVA analyses were also performed using the Tamura & Nei (1993) gamma distance (gamma parameter=0.5) between haplotypes as a measure of evolutionary divergence, which accounts for mutation rate heterogeneity among nucleotide sites. Tests for significance at the regional level assume integrity of locality samples and randomize their placement within regions; tests of locality-level statistics relax the population groupings and assign individuals randomly to new populations to construct null distributions. All Φ-statistics were tested for significance using 1000 random permutations. Significance levels of all pairwise tests were corrected for multiple comparisons with the sequential Bonferroni procedure of Hochberg (1988).
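For a single level of subdivision, the variance partition and its permutation test can be sketched as below. The code assumes the standard one-way AMOVA layout (sums of squared deviations computed from a matrix of squared distances among individuals) and is not the AMOVA program itself; `phi_st`, `permutation_test` and `haplotype_distances` are illustrative names. The haplotype-based analysis corresponds to a distance of 1 between different haplotypes and 0 otherwise.

```python
import random


def haplotype_distances(haplotypes):
    """Squared-distance matrix that treats all haplotypes as equidistant."""
    return [[0.0 if a == b else 1.0 for b in haplotypes] for a in haplotypes]


def phi_st(d2, labels):
    """One-level AMOVA: proportion of molecular variance among populations."""
    n_total = len(labels)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    n_groups = len(groups)
    # Sums of squared deviations from the full (symmetric) distance matrix.
    ssd_total = sum(sum(row) for row in d2) / (2.0 * n_total)
    ssd_within = 0.0
    for members in groups.values():
        block = sum(d2[i][j] for i in members for j in members)
        ssd_within += block / (2.0 * len(members))
    msd_among = (ssd_total - ssd_within) / (n_groups - 1)
    msd_within = ssd_within / (n_total - n_groups)
    # Average sample size, corrected for unequal group sizes.
    sum_sq_sizes = sum(len(members) ** 2 for members in groups.values())
    n0 = (n_total - sum_sq_sizes / n_total) / (n_groups - 1)
    var_among = (msd_among - msd_within) / n0
    return var_among / (var_among + msd_within)


def permutation_test(d2, labels, n_perm=1000, seed=0):
    """Null distribution built by assigning individuals randomly to populations."""
    rng = random.Random(seed)
    observed = phi_st(d2, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if phi_st(d2, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A sequence-based analysis would pass a matrix of evolutionary distances (for example Tamura & Nei gamma distances computed elsewhere) in place of the 0/1 matrix. The estimator can return small negative values when there is no structure.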

Tests for isolation-by-distance were conducted using the phenetic approach proposed by Slatkin (1993), which uses an ordinary least-squares regression of female gene flow on distance (minimum coastline distance) between all pairs of sites. Estimates of Nemf (effective female gene flow between pairs of populations) were calculated using the relationship:

$$N_e m_f = \frac{1}{2}\left(\frac{1}{\Phi_{ST}} - 1\right)$$

which is corrected for mtDNA data. The statistical significance of associations between geographical distance and gene flow was tested using Mantel's (1967) test. A reconstruction of the phylogenetic relationships among haplotypes was performed using the phenetic neighbour-joining algorithm of Saitou & Nei (1987). Tamura & Nei (1993) gamma distances (gamma parameter=0.5) were used to estimate sequence divergence among haplotypes.
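The isolation-by-distance procedure can be sketched in three steps: convert each pairwise ΦST to a gene flow estimate, regress log gene flow on log distance, and test the association by permuting site labels. The sketch assumes the haploid, maternally inherited form Nemf = (1/ΦST − 1)/2 and a one-tailed Mantel test in the negative direction; both choices, and all function names, are assumptions made for illustration.

```python
import math
import random


def female_gene_flow(phi_st):
    """Nemf from phi_ST, assuming the haploid form (1/phi_ST - 1) / 2."""
    if not 0.0 < phi_st < 1.0:
        raise ValueError("phi_ST must lie strictly between 0 and 1")
    return 0.5 * (1.0 / phi_st - 1.0)


def upper_triangle(matrix, order=None):
    """Flatten entries above the diagonal, optionally after relabelling sites."""
    n = len(matrix)
    if order is None:
        order = list(range(n))
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _moments(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return mean_x, mean_y, sxy, sxx, syy


def ols_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mean_x, mean_y, sxy, sxx, _ = _moments(x, y)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    _, _, sxy, sxx, syy = _moments(x, y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(log_distance, log_flow, n_perm=999, seed=0):
    """One-tailed Mantel test for a negative matrix correlation."""
    rng = random.Random(seed)
    x = upper_triangle(log_distance)
    observed = pearson_r(x, upper_triangle(log_flow))
    order = list(range(len(log_flow)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel sites in one matrix only
        if pearson_r(x, upper_triangle(log_flow, order)) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Both matrices passed to `mantel_test` are symmetric site-by-site matrices of log10 values; diagonal entries are never read. Permuting whole sites, rather than individual matrix cells, preserves the dependence among pairwise comparisons that share a site.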

Results

Distribution and genealogy of mtDNA lineages

From a total of 270 individuals assayed, 63 putative haplotypes were resolved. Many haplotypes were shared within regions, whereas among regions no haplotypes were shared (Table 1). Both regional nucleotide and haplotype diversity decreased from west to east, with diversity being higher in the Western Arafura and Gulf of Carpentaria than in the Coral Sea. Coral Sea localities were characterized by consistently smaller numbers of haplotypes and lower diversity indices. The full data set of haplotype frequencies is available from the authors on request.

Table 1 Mitochondrial DNA control region diversity within localities and regions sampled for Lates calcarifer

Sequence divergence among haplotypes ranged from 0.4 per cent to 13.2 per cent with a mean of 5.8 per cent. The neighbour-joining tree of all 63 haplotypes revealed no striking pattern of geographical structuring of clades at either locality or regional scales (Fig. 2). This apparent lack of phylogeographical structuring occurs despite complete nonsharing of haplotypes among regions. The nucleotide sequence for the most common haplotype, haplotype 1, appears in GenBank under accession number U85194.

Fig. 2

Unrooted neighbour-joining tree for Lates calcarifer showing reconstructed phylogenetic relationships among 63 mtDNA control region haplotypes. Shaded boxes indicate regions, numbers indicate localities at which haplotypes were detected (see Fig. 1 for key to locality numbers). Scale bar represents 1 per cent sequence divergence.

Population subdivision

If the spatial distribution of haplotypes and endemism indices (Table 1) are considered, varying degrees of population subdivision are evident. Haplotypes are often shared among localities but never among regions. AMOVA results partitioned significantly larger portions of molecular variation among localities and regions than would be expected by chance (Table 2). Although both haplotype-based and sequence-based analyses partitioned molecular variation in a similar pattern, there were some noteworthy discrepancies between the two approaches. The haplotype-based analysis, which treated all haplotypes as evolutionarily equidistant, partitioned more variation within regions (8.6 per cent) than the sequence-based analysis (3.5 per cent), whereas the sequence-based analysis indicated more genetic subdivision among regions (30.3 per cent) than the haplotype-based analysis (10.4 per cent).

Table 2 Population subdivision analysis for Lates calcarifer and inferred levels of female gene flow (Nemf) within and among regions

The patterns outlined above are also evident in the multidimensional scaling plots (Fig. 3). A tighter clustering of regional localities is evident in the sequence-based plot than in the haplotype-based plot. Coral Sea localities were consistently less clustered than other regional localities, an interesting result as these localities were geographically further apart than localities within the Arafura Sea and the Gulf of Carpentaria.

Fig. 3

Multidimensional scaling plots of two different measures of genetic differentiation among Lates calcarifer samples plotted in two dimensions: Coral Sea samples are squares, Gulf of Carpentaria samples are circles and Western Arafura Sea samples are triangles. (a) Haplotype-based ΦST among localities (stress=0.9). (b) Sequence-based ΦST among localities (stress=0.1).

Pairwise chi-squared analyses of haplotype frequencies among localities discriminated all localities from each other (P<0.05). Pairwise haplotype-based AMOVA tests indicated significant differentiation among more localities than the sequence-based analysis (Table 3). Sequence-based comparisons could not differentiate localities within the Western Arafura or Gulf of Carpentaria regions.
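The randomization test behind these pairwise comparisons can be sketched as follows. The sketch follows the general idea of Roff & Bentzen (1989), in which the chi-squared statistic of the observed haplotype-by-locality table is compared with statistics from tables generated by shuffling individuals among localities; the tie tolerance and the (hits + 1)/(n + 1) convention for the P-value are assumptions of this illustration rather than details taken from that paper.

```python
import random
from collections import Counter


def chi_squared(haplotypes, localities):
    """Chi-squared statistic of the haplotype-by-locality contingency table."""
    n = len(haplotypes)
    row_totals = Counter(haplotypes)
    column_totals = Counter(localities)
    cells = Counter(zip(haplotypes, localities))
    statistic = 0.0
    for haplotype, row_total in row_totals.items():
        for locality, column_total in column_totals.items():
            expected = row_total * column_total / n
            observed = cells[(haplotype, locality)]
            statistic += (observed - expected) ** 2 / expected
    return statistic


def monte_carlo_p_value(haplotypes, localities, n_rand=1000, seed=0):
    """Proportion of randomized tables at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = chi_squared(haplotypes, localities)
    shuffled = list(localities)
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(shuffled)  # keeps row and column totals fixed
        if chi_squared(haplotypes, shuffled) >= observed - 1e-9:
            hits += 1
    return observed, (hits + 1) / (n_rand + 1)
```

Because significance comes from the randomized tables rather than from the chi-squared distribution, the test remains usable when many haplotypes are rare and expected cell counts are small.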

Table 3 Pairwise population subdivision analyses among all localities sampled for Lates calcarifer

Isolation-by-distance

A significant negative relationship between log Nemf and log geographical distance was taken as evidence for isolation-by-distance (Slatkin, 1993). In the regression using a sequence-based ΦST to infer gene flow, geographical distance among localities explained 79 per cent of the variation in gene flow with a slope of −1.22 (Mantel's matrix r=−0.887, P<0.001; Fig. 4). The relationship calculated using a haplotype-based ΦST had a significant Mantel's correlation (r=−0.475, P<0.01), but displayed a substantially weaker pattern of isolation-by-distance. In the regression, distance explained only 23 per cent of the variation in inferred gene flow with a slope of −0.33 (Fig. 4).

Fig. 4

Log–log regressions for Lates calcarifer of maternal gene flow Nemf [calculated from ΦST either omitting (a) or including (b) sequence divergence among haplotypes] on geographical distance (minimum coastline distance).

Discussion

We predicted that patterns of dispersal implied by L. calcarifer's life history and estimated via direct studies would be mirrored by its pattern of mitochondrial DNA variation. That is, population genetic analyses would reflect the limited dispersal potential of the species. Further, we predicted that patterns of genetic structure would be more apparent in mtDNA than in nuclear markers, because of its theoretically smaller effective population size. The results are substantially concordant with these predictions, as is discussed below.

Life history and direct studies

Lates calcarifer shares similar patterns of genetic structure with the perciform Macquaria novemaculeata (Chenoweth & Hughes, 1997). Macquaria novemaculeata has a nearly identical catadromous life history to L. calcarifer: adults are confined to estuaries and rivers and spawn near their mouths. The two species are similarly differentiated in mtDNA control region haplotypes over comparable spatial scales along Australia's east coast (700 km spatial scale: L. calcarifer ΦST=0.098, M. novemaculeata ΦST=0.146). It has been argued on the basis of allozyme data that M. novemaculeata exhibits stronger geographical structure than L. calcarifer as a result of its demersal eggs and subtle preference for spawning at lower salinities, which occur further upstream than L. calcarifer spawning (Chenoweth & Hughes, 1997). Life history characteristics may play a significant role in shaping the evolutionary dynamics of both species. Similar conclusions have been reached by other researchers for shore fishes (Waples, 1987) and coral reef fishes (Doherty et al., 1995).

In a direct estimation of L. calcarifer movements in Papua New Guinea, 98 per cent of all individuals marked in a catchment were recaptured in the same location (Moore & Reynolds, 1982). Moreover, movements of tagged adults and juveniles along Australia's east coast are more common among neighbouring catchments than among distant ones (Russell & Garrett, 1988; W. Sawynok, pers. comm.). The above data, although not giving an estimate of prerecruitment larval dispersal, suggest that, in the absence of significant barriers to dispersal, populations should be more genetically differentiated among distant localities than among proximal ones.

Such patterns are evident in our analysis. Population subdivision tests, regardless of analysis type, suggest more subdivision among regions than within them. Furthermore, localities that are more sparsely spaced within a region (e.g. the Coral Sea) are more genetically differentiated than their more proximal counterparts in other regions (Fig. 3). Regressions of maternal gene flow on geographical distance indicated a strong negative association of Nemf with geographical distance. Such a relationship is consistent with genetic differentiation generated via isolation-by-distance and is expected under limited dispersal at equilibrium (Wright, 1943). The slope of the regression in the sequence-based analysis was close to the theoretical prediction of −1.0 from a one-dimensional stepping-stone model (Slatkin & Maddison, 1990), which seems appropriate as the L. calcarifer habitat occurs chiefly in one dimension: the coastline.

The haplotype-based analysis does not show the striking pattern evident in the sequence-based isolation-by-distance analysis. Also, haplotype-based analyses appear to be more discriminatory in pairwise comparisons of localities than sequence-based analyses (Table 3). The former analysis is strongly affected by high within-locality haplotype diversity generated by large numbers of rare haplotypes. For example, the nonzero value of Nemf among regions may appear unreasonable considering that no haplotypes are shared between regions. As ΦST (and FST) is simply a measure of the proportion of haplotypic variation attributed to differences among populations, an analysis of samples that do not share any haplotypes but still contain a large amount of variation within them will return a ΦST value of less than one and thus gene flow estimates greater than zero. Sequence-based analyses weight the presence of haplotypes depending on net divergence from all other haplotypes in the sample, and thus an effect of pooling closely related haplotypes may result. Discrepancies between haplotype and sequence statistics have also been noted by others. Lavery et al. (1996) detected more population subdivision in the coconut crab using haplotype statistics; however, Excoffier et al. (1992) found the reverse analysing a human mtDNA data set.
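The point about nonshared haplotypes can be made concrete with a small numerical sketch. For simplicity it uses Nei's diversity-based FST = (HT − HS)/HT with equally weighted localities, not the AMOVA estimator used in this study, and it assumes the haploid relationship Nemf = (1/FST − 1)/2; the function names are illustrative.

```python
def gene_diversity(frequencies):
    """One minus the sum of squared haplotype frequencies."""
    return 1.0 - sum(p * p for p in frequencies)


def fst_from_frequencies(populations):
    """Nei's FST from per-population haplotype frequency dictionaries."""
    n_pops = len(populations)
    haplotypes = set().union(*populations)
    # Mean within-population diversity.
    h_s = sum(gene_diversity(pop.values()) for pop in populations) / n_pops
    # Diversity of the pooled (equally weighted) frequencies.
    pooled = [sum(pop.get(h, 0.0) for pop in populations) / n_pops
              for h in haplotypes]
    h_t = gene_diversity(pooled)
    return (h_t - h_s) / h_t


def gene_flow_from_fst(fst):
    """Assumed haploid relationship: Nemf = (1/FST - 1) / 2."""
    return 0.5 * (1.0 / fst - 1.0)
```

With two localities that each hold ten equally frequent private haplotypes, HS = 0.90 and HT = 0.95, so FST = 1/19 (about 0.053) and the inferred Nemf is 9, even though no haplotype is shared. High within-locality diversity alone therefore caps the statistic well below one.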

The detection of isolation-by-distance invites the tentative conclusion that L. calcarifer populations are at equilibrium (Slatkin, 1993). Thus, sufficient time may have elapsed since past population expansion or contraction events for limited dispersal to generate this pattern. The discrepancy between analysis types is also interesting here. Populations appear to be closer to equilibrium for sequence statistics than for haplotype statistics. The theoretical basis for differences in the rate of approach to equilibrium between these two methods is unclear. Further theoretical and empirical attention to the characteristics of sequence- and haplotype-based analyses including potential differential biases would be valuable.

That 63 per cent of all haplotypes were detected at only one locality also suggests that dispersal is limited. As a balanced sampling design was used, an unbiased comparison of localities can be made. However, a highly variable segment of the mitochondrial genome was assayed, which makes the effects of a high mutation rate and limited dispersal difficult to distinguish. The likelihood of a haplotype being translocated among demes by gene flow in a given generation is dependent upon its frequency. For L. calcarifer, many private haplotypes occur at low frequencies (mean frequency of all private haplotypes=0.073) and may not yet have been translocated by contemporary gene flow. Thus, gene flow inferred purely from private haplotypes may be an underestimate when highly variable markers are assayed.

The isolation-by-distance analysis suggests no particular barriers to gene flow beyond the common restriction of dispersal to adjacent estuaries. However, this may be somewhat misleading, as nonsharing of haplotypes between regions suggests very limited gene flow and that the regional patterns were not simply an artefact of a stratified sampling design but had a degree of biological relevance for L. calcarifer. Indeed, several other marine organisms may exchange a limited number of genes among the regions sampled for L. calcarifer. Marine turtles display limited mitochondrial gene flow between the Gulf of Carpentaria and the Coral Sea (Norman et al., 1994). Populations of the pearl oyster, Pinctada maxima, are genetically differentiated between Cape York (on the boundary of the Coral Sea and the Gulf of Carpentaria regions) and populations in the Arafura Sea (Johnson & Joll, 1993). Some penaeid prawns also display significant differences in allozyme frequencies across northern Australia (Benzie et al., 1992). No reports could be found of marked L. calcarifer individuals moving the considerable distance among these regions, but one individual was recorded as moving from the Fly river in Papua New Guinea into the Gulf of Carpentaria (W. Sawynok, pers. comm.).

Either contemporary or historical barriers to dispersal may be responsible for these regional patterns. Oceanic currents may exhibit limited mixing between these regions. However, the existence of historical barriers to dispersal seems a more attractive explanation in view of the fact that many of the above-mentioned species, including L. calcarifer, are presently continuously distributed across regional boundaries. Sea levels in the late Pleistocene were extremely variable in this region and facilitated closure and reopening of the Torres Strait between Papua New Guinea and Cape York Peninsula on many occasions (Chappell & Shackleton, 1986). The last reopening occurred only 6000 years BP or approximately 1200 L. calcarifer generations ago. Such fluctuations in sea level greatly influenced the size, shape and connectivity of marine and estuarine habitat available through time. Thus, perhaps the large genetic differences are remnants of isolation past, the persistence of which has been upheld by the limited dispersal potential of L. calcarifer.

For L. calcarifer, if complete isolation among regions had been constant on an evolutionary timescale, it might have been expected that there would be monophyletic mtDNA lineages within each region. The absence of this pattern indicates that dispersal among regions may not always have been as limited as the data indicate. A more detailed analysis of L. calcarifer population history is the subject of a forthcoming paper and will not be discussed further here.

Despite the minor analytical discrepancies, the present study demonstrates a common pattern in all types of analysis: isolation-by-distance and varying degrees of population subdivision. The most striking pattern is the large genetic differentiation among regions characterized by completely unique haplotypes.

Concordance with allozyme variation

Keenan (1994) conducted an extensive analysis of L. calcarifer allozyme variation in 6000 individuals over 5500 km of coastline. Several features of Keenan's data are of particular relevance. First, regional patterns detected in the present study are mirrored in the allozyme data. For example, some alleles change frequency sharply along the coastline. The sharp changes correspond strongly with the geographical boundaries sampled in the present study.

Secondly, clines in allele frequencies show some evidence of isolation-by-distance but at the within-region scale. This highlights the nested patterns of genetic structure that can exist in natural populations, the detection of which is tightly linked to sampling scale and statistical power. Lavery et al. (1995) found different population genetic models applied at different spatial scales in the coconut crab, Birgus latro. Within oceans, isolation-by-distance via stepping-stone gene flow was detected. Between oceans, an island model fitted the observed patterns more closely. An a posteriori regression of gene flow (sequence-based) on geographical distance excluding all between-region comparisons returned a negative slope of −0.92 (data not shown). It is unfortunate that it was not possible to sample more localities within regions to support a within-region isolation-by-distance interpretation statistically.

Thirdly, in the present study, all localities can be discriminated significantly on the basis of at least one pairwise population subdivision test. Some of our localities fall within a population boundary proposed by Keenan (1994). For example, Keenan's study could not discriminate between Mitchell, Flinders and Leichhardt localities in the Gulf of Carpentaria despite large sample sizes (average n=200) and high statistical power. All these localities were distinguished on the basis of randomized chi-squared tests and several by haplotype-based population subdivision analyses. The mtDNA data may reflect more closely the scale of demographic subdivision in L. calcarifer. Such discriminatory power of mtDNA is most probably linked to its smaller effective population size and high mutation rate, against which the effects of genetic drift and gene flow are more visible.

Conclusion

Our demonstration of concordance between patterns of gene flow and dispersal in L. calcarifer has implications for the biological accuracy of population genetic models. That genetic analyses could detect population structure at demographically relevant scales highlights their value for resolving such structure in species whose habitat may prohibit direct estimation. Caution is emphasized as the effects of historical events can persist in genetic data for long periods of time. This temporal ‘lag’ between historical and contemporary patterns greatly contributes to the error in inferences of demographic structure made from genetic patterns. Such error may be partially reduced via genetic assays of cytoplasmic markers, such as mtDNA, because of a small theoretical effective population size.