Introduction

The analysis of population genetic structure using mitochondrial DNA (mtDNA) restriction fragment length polymorphisms (RFLP) with universal primers is a popular alternative to microsatellite analysis in species where species-specific microsatellite primers are not available. The interpretation of such data, however, is often compromised by uncertainties concerning the choice of mtDNA region and the effects of extremely high variability observed in many pelagic fish species. Since different regions of mtDNA evolve at different rates, specific mtDNA genes have been targeted for phylogeny reconstruction (Hillis et al., 1996), species identification and assays of intraspecific variation (Chow et al., 1993; Cronin et al., 1993; Bembo et al., 1995). The D-loop in particular has been used for population studies, and although its rate of evolution is reputedly two to five times higher than that of mitochondrial protein-coding genes (Meyer, 1993), low levels of variability have been observed in many fish (Park et al., 1993; Hansen & Loeschke, 1996). Genes coding for subunits of NADH dehydrogenase (ND genes), on the other hand, have been used successfully in salmonids (Cronin et al., 1993; Park et al., 1993; Hall & Nawrocki, 1995) and clupeids (Bembo et al., 1995; Hauser et al., 1995). Indeed, clupeids have so far shown extraordinarily high levels of genetic variability in the ND5/6 genes (Bembo et al., 1995; Hauser et al., 1995). Surprisingly, however, even such highly polymorphic mtDNA data appear to reveal population differentiation less often than microsatellites with similar levels of variability (Hauser & Ward, 1998). The reasons for this apparent difference in resolving power between these two marker systems remain uncertain (Hauser & Ward, 1998).

Atlantic herring, Clupea harengus, has been the subject of much fundamental work on marine fish population structure (Sinclair & Solemdal, 1988), not only because of its economic and cultural importance, but also because of its exceptional heterogeneity in life-history, morphology and behaviour (McQuinn, 1997). For example, five oceanic and six shelf stocks were distinguished in the north-eastern Atlantic (Parrish & Saville, 1965) on the basis of tag returns (Wheeler & Winters, 1984), differences in spawning times (Haegele & Schweigert, 1985), morphometrics (Rosenberg & Palmen, 1982) and rates of growth, recruitment and mortality (Cushing & Bridger, 1966; Burd, 1985). However, molecular genetic studies, mainly based on allozyme electrophoresis, provided little support for genetic differentiation in North Atlantic herring (Andersson et al., 1981; Grant, 1984; Ryman et al., 1984; King et al., 1987). Only one study detected small, but significant differentiation among North Atlantic stocks (Jørstad et al., 1991). Some fjord populations in Norway were genetically highly differentiated, although the taxonomic status of herring in these fjords is uncertain and allozyme data suggest that these populations are genetically more similar to Pacific than to Atlantic herring (Jørstad et al., 1994). Similarly, herring in the Barents Sea show different allele frequencies at both allozymes and microsatellites, and are thus most likely a separate population from herring in the North Atlantic (Shaw et al., 1999). Findings of genetic differences between spring and autumn spawners of herring in the western Atlantic (Kornfield et al., 1982) were questioned (Smith & Jamieson, 1986), and studies using mtDNA analysis revealed no major genetic differentiation between putative stocks (Kornfield & Bogdanowicz, 1987; Dahle & Erikson, 1990). Some evidence for genetic differentiation has, however, been found for Georges Bank herring (Stephenson & Kornfield, 1990). 
Because of these ambiguous results, herring is often cited as an example where molecular tools did little to help resolve the stock structure of an exploited species (Carvalho & Hauser, 1994).

Here, we used mtDNA RFLP analysis of two regions of ND genes, ND3/4 and ND5/6, to investigate the genetic population structure of north-east Atlantic herring populations. The aim was to compare results from two different regions as well as to clarify the stock structure and recruitment patterns in herring. Results were compared with previously published allozyme (Turan et al., 1998) and microsatellite data (Shaw et al., 1999) of the same samples.

Materials and methods

Samples

Herring were caught throughout the north-east Atlantic Ocean (Fig. 1, Table 1), and kept frozen at −20°C until transportation on dry ice. On arrival, samples of muscle tissue were removed quickly and stored in 90% ethanol. All samples were analysed for ND5/6, but due to time constraints only five samples could be screened for ND3/4.

Fig. 1 Map of the sampling locations for Atlantic herring: Icelandic summer-spawners (1994) (IC1), Icelandic summer-spawners (1995) (IC2), Norwegian spring-spawners (Barents Sea) (BS), Norwegian spring-spawners (central Norwegian Sea) (NSS), Balsfjord herring (BF), Trondheimsfjord herring (TF), Baltic herring (BA), northern North Sea (Buchan herring) (NSN), southern North Sea (Downs herring) (NSD), Celtic Sea (Dunmore) (CS). Further sample details are given in Table 1. Full circles refer to samples analysed for ND3/4 and ND5/6, whereas samples indicated by empty circles were only analysed for ND5/6.

Table 1 Origin, length, age and sex of Clupea harengus samples used in this study

Laboratory procedures

DNA was extracted according to a standard protocol (Taggart et al., 1992). Universal vertebrate primer sequences (modified from Cronin et al., 1993) were used to amplify two regions of mtDNA genes encoding subunits of NADH dehydrogenase: a 2.5 kb region (ND5/6 genes) and a 2.4 kb region (ND3/4 genes). The primer sequences were:

ND5/6: A: 5′-AAT AGT TTA TCC AGT TGG TCT TAG-3′, B: 5′-TTA CAA CGA TGG TTT TTC ATA GTC A-3′;

ND3/4: A: 5′-TAA (C/T)TA GTA CAG (C/T)TG ACT TCC AA-3′, B: 5′-TTT TGG TTC CTA AGA CCA A(C/T)G GAT-3′.

PCR was carried out using previously published reaction mixtures and temperature cycles (Turan et al., 1998). The PCR product was restricted with one of six endonucleases recognizing four-base sequences: AluI, CfoI, HaeIII, HinfI, MspI and RsaI. The fragments of the restricted DNA samples were separated on 6% polyacrylamide gels, together with a pGem marker (Promega). A modified silver nitrate staining protocol (Tegelström, 1987) was used to visualize the DNA fragments.

Data analysis

Fragment sizes were estimated from their mobilities relative to a standard pGem DNA-marker (Promega) using the BIOGENE gel documentation package (Vilber-Lormat, France) and DNA-FRAG version 3.03. Fragment data were used to estimate genetic distances between haplotypes (Nei & Li, 1979; Nei, 1987), which were then compared for the same set of individuals between the two regions using a Mantel test. Haplotype dendrograms were constructed, and closely related haplotypes were arbitrarily 'binned' into haplotype groups for further analyses (Stepien, 1999). Patterns of diversity were described by haplotype diversity (Nei's unbiased estimate; Nei, 1987), the frequency of common and unique haplotypes and the Shannon Index (Begon et al., 1986). Nucleotide diversities and divergences (Nei & Tajima, 1981; Nei, 1987) were calculated using the computer package REAP (McElroy et al., 1992). Overall FST values were estimated using an analysis of molecular variance (AMOVA; Excoffier et al., 1992) without consideration of molecular distances among haplotypes. The same software was used for a hierarchical AMOVA on the ND5/6 samples, with the two samples from Iceland (IC1, IC2) and the Norwegian samples (NSS, BF, TF) as groups; the other samples were kept separate. The significance of genetic differentiation overall and between pairs of samples was tested using Monte Carlo χ2 tests (MC χ2; Roff & Bentzen, 1989; REAP, McElroy et al., 1992) with 10 000 randomizations. These tests were carried out on the original haplotypes as well as on haplotype groups. To assess the effects of haplotype binning, haplotypes were randomized 1000 times between haplotype groups (using POPTOOLS for EXCEL by Greg Hood, CSIRO, Australia, available at http://www.dwe.csiro.au/vbc/poptools) and the Monte Carlo χ2 tests were repeated. Patterns of genetic differentiation were displayed by multidimensional scaling of Reynolds et al.'s (1983) genetic distances.
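
The fragment-based distance between two haplotypes can be illustrated with a short sketch. It uses the fragment-sharing index F = 2n_xy/(n_x + n_y) and the iterative conversion to substitutions per site in the form commonly attributed to Nei & Li (1979) and Nei (1987). Whether the software used here applied exactly this iteration is an assumption, and the function names and fragment sizes are invented for illustration.

```python
import math

def shared_fraction(frags_x, frags_y):
    """Fragment-sharing index F = 2*n_xy / (n_x + n_y).

    Fragments are matched by size; two fragments of identical size within
    one haplotype would be collapsed by this simplified version.
    """
    x, y = set(frags_x), set(frags_y)
    return 2 * len(x & y) / (len(x) + len(y))

def fragment_distance(f, r=4, tol=1e-10, max_iter=100):
    """Substitutions per site from F, for enzymes with r-base recognition sites.

    Iterates G = [F * (3 - 2*G)]**(1/4) from the starting value F**(1/4),
    then returns d = -(2/r) * ln(G).
    """
    if f <= 0:
        return math.inf  # no shared fragments: distance cannot be estimated
    g = f ** 0.25
    for _ in range(max_iter):
        g_new = (f * (3 - 2 * g)) ** 0.25
        converged = abs(g_new - g) < tol
        g = g_new
        if converged:
            break
    return -(2 / r) * math.log(g)

# Hypothetical fragment sizes (bp) for two haplotypes cut by one enzyme.
hap_a = [100, 200, 300, 400]
hap_b = [100, 200, 300, 500]
f = shared_fraction(hap_a, hap_b)
print(f, fragment_distance(f))
```

In practice, fragments from all six enzymes would be pooled per haplotype before F is computed.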

Results

Comparison of mtDNA regions

The average size of the ND3/4 region was 2410 base pairs (bp), of which 53 fragments or 212 bases (four bases per fragment) were surveyed by the restriction analysis. The ND5/6 fragment was longer (2514 bp), though fewer fragments were observed (47 fragments, i.e. 188 bp).
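
As a check on the figures above, the proportion of each region covered by recognition sites can be recomputed from the reported fragment counts. This is a minimal sketch that takes the stated convention of four bases per fragment at face value; the variable and function names are illustrative only.

```python
# Region lengths and fragment counts as reported in the text.
regions = {
    "ND3/4": {"length_bp": 2410, "fragments": 53},
    "ND5/6": {"length_bp": 2514, "fragments": 47},
}
BASES_PER_SITE = 4  # four-base recognition sequence per fragment

def surveyed(region):
    """Return (bases surveyed, fraction of the region surveyed)."""
    bases = region["fragments"] * BASES_PER_SITE
    return bases, bases / region["length_bp"]

for name, region in regions.items():
    bases, fraction = surveyed(region)
    print(f"{name}: {bases} bases surveyed, {fraction:.1%} of the region")

total_bases = sum(surveyed(r)[0] for r in regions.values())
total_length = sum(r["length_bp"] for r in regions.values())
print(f"combined: {total_bases} of {total_length} bp, {total_bases / total_length:.1%}")
```

On these figures roughly 8.8% of ND3/4 and 7.5% of ND5/6 were surveyed, or about 8.1% of the combined 4924 bp.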

Both regions were extremely polymorphic with similar haplotype diversities (Table 2). However, more haplotypes were found for ND5/6, and the number of unique haplotypes (found only once in the data set) was higher. Furthermore, there were considerably more common haplotypes in ND3/4. The Shannon diversity index, which is more sensitive to rare haplotypes, was higher in ND5/6 than in ND3/4. Thus, ND5/6 was more variable, with many rare or unique haplotypes and few common haplotypes. Nucleotide diversity, on the other hand, was low: 0.0053 (±0.0004) in the ND3/4 region and 0.0081 (±0.0018) in the ND5/6 region.
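
The contrast between the two diversity measures can be made concrete with a small sketch. The haplotype counts below are invented and are not the herring data; they only show how two samples with nearly the same haplotype diversity can differ strongly in the Shannon index when one of them contains many singletons.

```python
import math

def haplotype_diversity(counts):
    """Nei's unbiased estimate: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over observed haplotypes."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

# Hypothetical samples of 50 individuals each.
few_common = [10, 10, 10, 10, 10]   # five common haplotypes, no singletons
many_rare = [20] + [1] * 30         # one common haplotype plus 30 singletons

for label, counts in (("few common", few_common), ("many rare", many_rare)):
    print(label, round(haplotype_diversity(counts), 3),
          round(shannon_index(counts), 3))
```

With these invented counts the haplotype diversities are about 0.82 and 0.84, whereas the Shannon index rises from about 1.6 to about 2.7.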

Table 2 Vital statistics of the two mtDNA regions of Atlantic herring

Nucleotide divergences between haplotypes were not correlated between the two regions (Mantel test: P=0.350), though inclusion of a sample of Pacific herring (data not shown; see Turan, 1997) resulted in a highly significant correlation (P < 0.001).
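
For readers unfamiliar with the Mantel test, the following is a minimal sketch of a one-sided permutation version, assuming two square symmetric distance matrices for the same haplotypes or individuals. The function names, the number of permutations and the one-sided rule are choices made here for illustration, not details taken from the analysis above.

```python
import math
import random

def upper_triangle(matrix, order):
    """Upper-triangle entries of `matrix` after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """Return (r, P) for a one-sided test of positive matrix correlation.

    Rows and columns of dist_b are permuted together; P is the share of
    permutations (plus the observed arrangement) with r >= the observed r.
    """
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    a_vals = upper_triangle(dist_a, identity)
    r_obs = pearson(a_vals, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(a_vals, upper_triangle(dist_b, perm)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

Permuting rows and columns together, rather than shuffling individual distances, preserves the dependence among entries that share a haplotype.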

When the two regions were combined, average haplotype diversity was extremely high (0.97), with the majority of haplotypes occurring only once in the sample set (Table 2).

Geographic heterogeneity

Tests on ND3/4 haplotype frequencies revealed highly significant overall heterogeneity among the Atlantic herring samples (Table 2). In pairwise comparisons, the Icelandic herring sample (IC1) exhibited varying degrees of significant geographical differentiation from all other samples, especially when MC χ2 tests were used (Table 3). Significant differences in haplotype frequency were also observed between the Baltic and Celtic Sea samples and, for MC χ2 tests only, between the Trondheimsfjord (TF) and the Baltic and Celtic Sea (BA, CS) samples (P < 0.05).
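
The logic of the Monte Carlo χ2 test can be sketched as follows: haplotypes are shuffled among individuals so that sample sizes and overall haplotype counts stay fixed, and the observed χ2 is compared with the randomized values. This is a simplified illustration in the spirit of Roff & Bentzen (1989), not the REAP implementation; the function names and the example data are invented.

```python
import random
from collections import Counter

def chi_square(haplotypes, samples):
    """Pearson chi-square for a sample-by-haplotype contingency table.

    `haplotypes` and `samples` are parallel lists with one entry per individual.
    """
    n = len(haplotypes)
    row = Counter(samples)
    col = Counter(haplotypes)
    observed = Counter(zip(samples, haplotypes))
    chi2 = 0.0
    for s in row:
        for h in col:
            expected = row[s] * col[h] / n
            chi2 += (observed[(s, h)] - expected) ** 2 / expected
    return chi2

def monte_carlo_chi2(haplotypes, samples, n_rand=10000, seed=1):
    """Return (observed chi2, proportion of randomizations with chi2 >= observed)."""
    rng = random.Random(seed)
    chi2_obs = chi_square(haplotypes, samples)
    shuffled = list(haplotypes)
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(shuffled)  # reassign haplotypes to individuals at random
        if chi_square(shuffled, samples) >= chi2_obs - 1e-12:
            hits += 1
    return chi2_obs, hits / n_rand

# Hypothetical data: two samples of 20 fish with different common haplotypes.
samples = ["IC"] * 20 + ["NW"] * 20
haplotypes = ["h1"] * 15 + ["h2"] * 5 + ["h1"] * 5 + ["h2"] * 15
print(monte_carlo_chi2(haplotypes, samples, n_rand=1000))
```

Because the null distribution is generated from the data themselves, the test does not rely on the large expected cell counts required by the asymptotic χ2 distribution, which matters when many haplotypes are rare.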

Table 3 Pairwise comparisons of ND3/4 haplotype frequencies among herring samples

In contrast, no significant overall heterogeneity in haplotype frequencies was observed in the ND5/6 region (Table 2), though the MC χ2 test restricted to the samples also analysed for ND3/4 was marginally significant. Similarly, a hierarchical analysis of molecular variance (AMOVA) revealed no differentiation among samples (FST=0.003, NS), among groups (FCT=−0.002, NS) or among samples within groups (FSC=0.005, NS).

Both regions combined yielded contradictory results, with a low but significant FST, but no significant heterogeneity from Monte Carlo comparisons. Nevertheless, the Icelandic herring appeared to be genetically different from all samples except the Celtic Sea (CS) (Table 5).

Table 5 Pairwise comparisons of haplotype frequencies of ND3/4 and ND5/6 herring data combined

Effect of haplotype binning

The binning of haplotypes generally reduced haplotype diversity only slightly, but decreased the number of unique haplotypes and the Shannon Index considerably. Estimates of population differentiation (FST) in the two regions were similar after binning, though significance levels appeared to be lower (Table 2). A hierarchical analysis on binned ND5/6 data of all samples, with North Sea (NSN, NSD), Norwegian spring spawners (NSS, BF, TF) and Icelandic herring (IC1, IC2) as groups, revealed highly significant heterogeneity among groups (FST=0.015, FCT=0.024***, FSC=−0.009), and subsequent tests on samples pooled within groups strongly indicated that Icelandic samples were significantly different from all others except the Celtic Sea (CS) (Table 4, Fig. 2).

Table 4 Pairwise comparisons of combined haplotype frequencies of ND5/6 data with pooled herring samples (IC: IC1, IC2; NW: TF, BF, NSS; NS: NSN, NSD)

Fig. 2 Multidimensional scaling plot of Reynolds et al.'s (1983) distances using binned ND5/6 haplotype data of Atlantic herring. See Table 1 for sample designations. (Stress=0.103, r2=0.922).

For the two regions combined, the effects of haplotype binning were striking, with an increase of FST from 0.005 to 0.018 (Table 2). Pairwise comparisons suggested that the Icelandic (IC) herring were different from other samples, and that there was also differentiation between the Celtic Sea (CS) and the Barents Sea (BS), as well as the Baltic Sea (BA) (Table 5, Fig. 3).

Fig. 3 Multidimensional scaling plot of Reynolds et al.'s (1983) distances using binned haplotype data of ND3/4 and ND5/6 combined for Atlantic herring. See Table 1 for sample designations. (Stress=0.086, r2=0.930).

Randomization of haplotypes among haplotype groups and subsequent MC χ2 tests showed that for each individual region random binning gave similar probabilities of population differentiation to the binning based on the haplotype tree (probability of higher χ2 by chance: ND3/4, P=0.55; ND5/6, P=0.237). However, for both regions together, the level of population structure obtained from binning observed haplotypes significantly exceeded that of the randomized trees (MC χ2, P=0.01), indicating that the haplotype tree adequately reflected true genetic distances.
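
The randomization of haplotypes among haplotype groups can be sketched in the same way. The version below is a simplification: it compares the χ2 of the tree-based binning directly with the χ2 values obtained when haplotypes are assigned to bins at random (bin sizes kept fixed), rather than repeating a full Monte Carlo χ2 test for every random binning. All names and data are hypothetical.

```python
import random
from collections import Counter

def chi_square(labels, samples):
    """Pearson chi-square for a sample-by-label contingency table."""
    n = len(labels)
    row, col = Counter(samples), Counter(labels)
    observed = Counter(zip(samples, labels))
    return sum((observed[(s, l)] - row[s] * col[l] / n) ** 2 / (row[s] * col[l] / n)
               for s in row for l in col)

def random_binning_test(haplotypes, samples, bin_of, n_rand=1000, seed=2):
    """Return (chi2 of tree-based bins, share of random binnings with chi2 >= that)."""
    rng = random.Random(seed)
    chi2_tree = chi_square([bin_of[h] for h in haplotypes], samples)
    names = sorted(bin_of)
    bins = [bin_of[h] for h in names]
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(bins)  # reassign haplotypes to bins, keeping bin sizes
        random_map = dict(zip(names, bins))
        if chi_square([random_map[h] for h in haplotypes], samples) >= chi2_tree - 1e-12:
            hits += 1
    return chi2_tree, hits / n_rand

# Hypothetical example: related haplotypes a1..a4 dominate one sample and
# b1..b4 the other, so binning by relatedness recovers the structure.
bin_of = {f"a{i}": "X" for i in range(1, 5)}
bin_of.update({f"b{i}": "Y" for i in range(1, 5)})
haplotypes = [f"a{i}" for i in range(1, 5)] * 5 + [f"b{i}" for i in range(1, 5)] * 5
samples = ["S1"] * 20 + ["S2"] * 20
print(random_binning_test(haplotypes, samples, bin_of))
```

A small proportion indicates that grouping by relatedness captures more structure than arbitrary grouping, which is the pattern reported above for the combined regions only.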

Discussion

Patterns of variability and effects on the detection of heterogeneity

In common with many other marine pelagic fish species (Hauser & Ward, 1998; Grant & Waples, 2000), Atlantic herring revealed very high haplotype diversity and low nucleotide diversity, demonstrating the existence of many closely related haplotypes. Such patterns are typical for large populations which have recently expanded rapidly from a small population, so allowing the retention of new mutations without sufficient time for the accumulation of large differentiation among haplotypes (Stepien, 1999; Grant & Waples, 2000). In Atlantic herring, such patterns are probably due to the population expansion and colonization of temperate regions in the North Atlantic after the last Pleistocene glaciation.

In practical terms, high haplotype diversity with small distances between haplotypes complicates the analysis of data considerably, as methods based on haplotype frequencies are inappropriate in cases where nearly all individuals carry unique haplotypes. Several methods have been suggested to deal with such data (Stepien, 1999), for example (i) exclusion of rare haplotypes, (ii) consideration of only a selected part of the variation (e.g. transversions or non-synonymous substitutions) and (iii) pooling of related haplotypes. In the present study, exclusion of rare haplotypes was ineffective (data not shown), and RFLP data do not allow the distinction of different types of base substitution. Thus, pooling (or 'binning') of related haplotypes was used to analyse the data.

Comparisons between the two mtDNA regions studied here and previously published microsatellite data (Shaw et al., 1999) suggest a relationship between the allele frequency distribution of marker loci and their power in the detection of population differentiation. Heterozygosity or haplotype diversity may not be the most useful measure to describe such patterns of variability: although the haplotype diversity was very similar in the two mtDNA regions surveyed, closer examination of the data revealed considerable differences in the total number of haplotypes and the number of common haplotypes. These differences may be best summarized in the Shannon index (Begon et al., 1986), which is more sensitive to rare haplotypes/alleles, and which is considerably higher in ND5/6 than in ND3/4. Such rare haplotypes do not affect the outcome of Monte Carlo tests (pers. obs.), but effectively reduce the informative sample size for population comparisons and thus decrease the power of statistical tests. Thus, the higher heterogeneity observed at ND3/4 than at ND5/6 may be a result of the greater number of unique haplotypes in the latter. Indeed, the Shannon index of ND3/4 is very similar to that of microsatellites previously used in herring taken from a similar geographical range, which revealed considerable geographical heterogeneity (Table 2, Shaw et al., 1999). Although the number of loci considered in this comparison of variability patterns is small (two mtDNA regions and four microsatellites), the comparison suggests that the difference in sensitivity between microsatellites and mtDNA often observed in pelagic fish (Hauser & Ward, 1998) is at least in part due to different distributions in allele/haplotype frequencies.

Binning of haplotypes

The reduction of variability by binning related haplotypes had inconsistent effects on the detection of geographical heterogeneity: whereas FST values for ND5/6 remained small and non-significant, the FST for ND3/4 decreased considerably and lost its significance.

In contrast, when haplotypes for both regions were binned, FST increased from 0.005 to 0.018 and was highly significant. The main reasons for these different effects may be both the original haplotype diversity and the quality of distance data between individual haplotypes. As only a small proportion of the mtDNA was surveyed for each region separately, the fragment data used in this study may have recovered true distances between haplotypes only incompletely, resulting in the combination of unrelated haplotypes and an effective randomization of data. This was also supported by the permutation tests, which showed that randomization of haplotypes among the branches of the tree resulted in levels of differentiation similar to those observed. In contrast, both regions combined surveyed a larger proportion of the mtDNA over a longer stretch of the mtDNA genome, and distances between haplotypes may thus be much more meaningful. Correspondingly, randomization of haplotypes among haplotype groups greatly reduced the level of differentiation detected. The difference in the quality of distance data highlights the importance of reducing the overall sampling variance by improving sampling of nucleotides (Lynch & Crease, 1990). Unfortunately, sequence data were not available in the present study, as sequencing of all haplotypes in both regions (total sequence length 4924 bp) was not feasible. In future studies on species with high levels of mtDNA variability it may be advisable to improve the reconstruction of phylogenetic relationships among haplotypes, either by collecting extensive sequence data or by surveying large segments of the mtDNA using many restriction enzymes.

Genetic stock structure of north-east Atlantic herring

The present study revealed low but significant differentiation among samples of herring collected across the north-east Atlantic. Such differentiation contradicts previous reports of extensive mtDNA homogeneity in North Atlantic herring (Kornfield & Bogdanowicz, 1987; Dahle & Erikson, 1990; Jørstad et al., 1994; Turan et al., 1998). The reasons for this discrepancy may be differences in molecular methods (specific regions vs. whole mtDNA) and analyses (haplotype binning avoiding problems with hypervariability). In any case, the study provided evidence for (i) genetic differentiation of the Icelandic summer spawners from Norwegian spring spawners, (ii) some genetic differences between Baltic and Celtic Sea herring and (iii) a lack of differentiation of the Norwegian fjord herring and of Barents Sea herring from other samples.

Both mtDNA regions provided evidence for genetic differentiation of Icelandic herring from the rest of the north-eastern Atlantic (Tables 3, 4 and 5). The results thus represent the first genetic evidence for the existence of distinct herring stocks in Iceland and Norway, though distinct spawning areas (Parrish & Saville, 1965) and phenotypic differences (Johansen, 1926; Fridriksson, 1944; Fridriksson & Aasen, 1952) have suggested this possibility previously. There was also evidence for some differentiation between herring from Iceland and the North Sea from tests on pooled samples (Table 4), though these results need confirmation in a more extensive survey.

Evidence for genetic differentiation was also found between herring from the Baltic and the Celtic Sea, both from ND3/4 (Table 3) and combined data (Table 5), with FST values among the highest found in this study. The Baltic and the Celtic Sea have long been recognized as harbouring distinct herring stocks (Aro, 1988; Parrish & Saville, 1965), and our data confirm that there is limited exchange of herring across the North Sea. As herring in the North Sea itself were not surveyed for ND3/4 variability, we cannot comment on their status as an independent population, though our study indicates the potential for distinct spawning populations in the area.

The present study did not reveal the considerable genetic differentiation of Trondheimsfjord herring observed with allozymes (Jørstad et al., 1994; Turan et al., 1998). Similarly, a microsatellite study failed to confirm allozyme differentiation in Balsfjord, possibly because the sample may have consisted entirely or in part of oceanic Norwegian spring spawners, rather than resident Balsfjord herring (Shaw et al., 1999). The careful collection of spawning fish is therefore of vital importance in future studies.

Even more surprising was the lack of differentiation of the Barents Sea herring from other stocks. The same sample of fish revealed much higher differentiation from the other samples at both allozyme (Turan, 1997) and microsatellite loci (Shaw et al., 1999), corresponding to earlier findings of morphological differences between northern and southern fish (Debarros & Holst, 1995; Stenevik et al., 1996). This discrepancy may be caused by different demographic dynamics of nuclear loci compared to maternally inherited mtDNA, and selective effects at the allozyme loci. In any case, previous allozyme and microsatellite studies clearly indicate that Barents Sea herring should be treated as a separate stock from other north-east Atlantic groups.

Concluding remarks

In summary, the results presented here demonstrate the utility of mtDNA RFLPs in the identification of populations of pelagic fish, if their haplotype frequency distributions are sufficiently even. Heterozygosity is thus only one measure that should be considered in the choice of molecular markers; other parameters such as the Shannon index, the number of common haplotypes and allele frequency distributions may be equally important. While the present study was restricted to two mtDNA regions, similar investigations could be carried out on microsatellite data. If similar correlations between allele frequencies and resolving power of loci are found, the isolation and choice of microsatellite markers for stock structure analysis could be targeted more closely at suitable markers.

The data presented here also show that allele or haplotype binning may be a useful approach where variability is too high to allow powerful statistical tests (Stepien, 1999). The accurate reconstruction of genetic distances is here of utmost importance, and every effort should be made to ensure an appropriate survey of the locus in question.

The results also demonstrate the value of using several different marker systems on the same samples. For example, although the mtDNA analysis presented here did not reveal significant differentiation of Barents Sea herring from the rest of the north-eastern Atlantic, microsatellites (Shaw et al., 1999) and allozymes (Turan, 1997) clearly showed such genetic differences. On the other hand, the differentiation of Icelandic herring was much clearer in mtDNA than in the other two marker systems. Such differences are most likely due to differences in mutation rates and demographic dynamics of the three marker systems. In any case, their simultaneous application to the same sample sets maximizes the chances of detecting genetic differences among conspecific populations.