Introduction

The present distribution of genetic variation in larch in Eurasia may, in large part, reflect the way the species spread and interacted over the postglacial recolonization period. Unlike northern Europe which was entirely covered by ice at the time of the last glacial maximum (18 kya), northern Siberia and the Russian far east were only locally glaciated (Bennett, 1997). This was recently confirmed indirectly by molecular studies in Arctic lemmings (Dicrostonyx); using mitochondrial DNA, Fedorov et al. (in press) revealed a complex phylogeographical pattern reflecting the past subdivisions of the present range by ice sheets. They also showed that the formerly glaciated Verchoyansky Mountains, on the eastern shore of the Lena river, acted as a strong barrier to gene flow. In comparison, postglacial recolonization of Siberia by plant species is still poorly understood. Glacial refuges and routes of recolonization have been partly documented for a few forest species: Abies sibirica Ledeb., Picea obovata Ledeb., Pinus sibirica Du Tour. and Pinus sylvestris L. All of these species were absent from the northern part of the current range during the last glaciation. In contrast, four Larix fossils from northern Siberia were radiocarbon-dated to 13, 15, 16 and 21 kya, respectively (Kremenetski, 1994), thus suggesting the presence of Larix species during the last glaciation, or at least shortly before or after its maximum. On the other hand, palaeobotanical data indicate a radical transformation of plant cover during the late Pleistocene and Holocene in Siberia (Neishtadt, 1957; Bojarskaja & Malajeva, 1967; Belova, 1985).

Introgression has played an important part in plant evolution as, for instance, in the recolonization of Europe by oaks after the last glaciation (Dumolin-Lapégue, 1998). Larch species can easily be hybridized, and morphological and genetic studies indicate that hybridization is widespread in natural populations. The natural hybrid complex between L. sibirica Ledeb. and L. gmelinii Rupr. occurs along a long belt running from the Taymir Peninsula down to Mongolia (Kruklis & Milyutin, 1977). Another possible case of larch introgressive hybridization can be found in the Russian far east (Dylis, 1961; Bobrov, 1972), where hybrid origins are suspected for many of the larch species described previously (L. amurensis Dylis, L. maritima Sukacz., L. komarovii Kolesn., L. middendorfii Kolesn., L. ochotensis Kolesn. and L. lubarskii Sukacz.). Because morphological differences between larch taxa are small, the taxonomy of Larix cannot be resolved on the basis of morphological traits only and there is currently no common view on the taxonomic status of some geographical varieties, and hence on the total number of species (Bobrov, 1972). Unsurprisingly, there is little congruence among phylogenetic trees based on morphological traits (Sukaczew, 1924; Kolesnikov, 1946; Dylis, 1947, 1961; Lepage & Basinger, 1995) and phylogenies based on molecular data (Kisanuki et al., 1995; Qian et al., 1995). Introgression and the population genetic structure of Eurasian larch species remain poorly known (but see Semerikov & Matveev, 1995; Shigapov et al., 1998, on L. sibirica; Maier, 1992 on L. decidua Mill.).

In the present study, allozyme diversity within and between Eurasian Larix species of the section Pauciseriales will be examined. More specifically, we will (i) compare genetic distances within and between species at allozyme loci with the differentiation pattern based on morphological traits, and (ii) relate genetic divergence within and between species to what is known of Larix fossil history and the extent of glaciation in eastern Siberia. If Larix species survived at high latitudes during the last glaciation, as suggested by the fossil record, no clear north–south cline in genetic variation is expected, and northern populations might contain as much genetic variation as southern ones. Furthermore, large glaciated areas seem to have existed (in the Verchoyansky and Putorana mountains in the Taymir Peninsula) and acted as barriers to longitudinal migration (Fedorov et al., 1998).

Materials and methods

The species investigated, the number of populations per species and their locations are given in Table 1 and Fig. 1. The identification of the species was based on characteristic morphological traits and previously published information on the species distributions (Kolesnikov, 1946; Bobrov, 1972; Dylis, 1947, 1961). For most populations, needle tissues were analysed. When only seeds were available, seedlings were grown under artificial light for 2–4 weeks and the whole seedling used for analysis. In some populations, seeds from half-sib families were available, whereas in others the number of trees from which the seeds originated was unknown.

Table 1 Location and description of the Larix populations used in the present study

Fig. 1 Distribution map of Larix taxa in northern Eurasia, also indicating the locations of the 32 investigated populations. The numbers refer to the localities given in Table 1.

Allozymes

Eleven enzyme systems representing 15 protein loci were examined: glutamic-oxaloacetic transaminase (GOT, EC 2.6.1.1, two loci), glutamate dehydrogenase (GDH, EC 1.4.1.2), isocitrate dehydrogenase (IDH, EC 1.1.1.42), diaphorase (DIA, EC 1.8.1.4), phosphoglucoisomerase (PGI, EC 5.3.1.9, two loci), 6-phosphogluconate dehydrogenase (6PGD, EC 1.1.1.44), shikimate dehydrogenase (SKDH, EC 1.1.1.25), glucose-6-phosphate dehydrogenase (G6PDH, EC 1.1.1.49), phosphoglucomutase (PGM, EC 5.4.2.2, two loci), superoxide dismutase (SOD, EC 1.15.1.1, two loci) and fluorescent esterase (EST, EC 3.1.1.1).

Protein extraction and polyacrylamide gel electrophoresis were carried out according to Shurkhal et al. (1992) and Semerikov & Matveev (1995). For each sample, 150 mg of needles and 150 mg of insoluble PVP were ground with liquid nitrogen in a mortar and mixed with extraction buffer. The extraction buffer was composed of 1 M sucrose, 5.7 mM L-ascorbic acid, 8.3 mM DL-cysteine, 0.02 M dithiothreitol and 1.5 mM aminocaproic acid dissolved in electrode buffer diluted 1:1.7; 1 mL Tween-80 was added to 100 mL of this solution. After 2–14 h at 4°C, the mixture of extraction buffer, ground needles and PVP was filtered through nylon filters. The filtrate was centrifuged after adding a small amount of CCl4 to improve the separation of the supernatant from small tissue particles. Extracts from seedlings were prepared by grinding needles in 0.15 mL extraction buffer, followed by centrifugation with CCl4. Electrophoresis was conducted in 7.0% polyacrylamide gels using the Tris-EDTA-borate system. The electrode buffer (pH 8.0) contained 116 mM Tris, 3.5 mM EDTA and 161 mM boric acid. The gel buffer (pH 8.6) contained 118 mM Tris, 3.5 mM EDTA and 118 mM boric acid. Histochemical staining was carried out using standard methods (Harris & Hopkinson, 1976).

Data analysis

Data were analysed using GENEPOP (version 2; Raymond & Rousset, 1995b), FSTAT (version 1.2; Goudet, 1995), PHYLIP (Felsenstein, 1993) and NTSYS-pc (Rohlf, 1988). One individual per half-sib family was used in analyses of Hardy–Weinberg and linkage disequilibrium. All individuals were used in cluster analysis and ordination (similar results were obtained when one individual per half-sib family was used).

Hardy–Weinberg expectations

The fit of genotypic distributions to Hardy–Weinberg expectations was tested by the exact test proposed by Haldane (1954). The overall significance for each locus was estimated by Fisher's combined probability test (Fisher, 1954). According to this test, if n independent P-values, $P_1, \dots, P_n$, are obtained separately under the null hypothesis, then

$$-2\sum_{i=1}^{n}\ln P_i$$

is distributed according to a χ² distribution with 2n d.f., where n is the number of P-values combined (Sokal & Rohlf, 1995). FIS values, where FIS is the correlation between two uniting gametes within a subpopulation, were estimated according to Weir & Cockerham (1984). Heterozygote deficiencies or excesses were tested using an exact test (Rousset & Raymond, 1995).
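As an illustration only, Fisher's combined probability test can be sketched in a few lines of Python. This is not the GENEPOP implementation; the function names are hypothetical, and the sketch assumes that the P-values being combined are independent. Because the statistic always has an even number of degrees of freedom (twice the number of P-values), the upper tail of the χ² distribution has a closed form and no external library is needed.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square variate with an even number of d.f.

    Uses the closed form P(X >= x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # current term (x/2)^k / k!, starting at k = 0
    total = 1.0  # running sum of the terms
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Return (statistic, d.f., combined P-value) for independent P-values."""
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * len(p_values)
    return stat, df, chi2_upper_tail_even_df(stat, df)
```

With a single P-value the combined probability equals the input, which gives a quick sanity check. A statistic of 126.6 on 62 d.f., as reported for the Est locus in the Results, gives a probability well below 0.0001 under this sketch.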

Linkage disequilibrium

For each population, the nonrandom association between pairs of loci, or linkage disequilibrium, was tested using Fisher's exact test on contingency tables. Contingency tables were created for all pairs of loci in each population and an unbiased estimate of the exact probability was obtained using a Markov Chain Monte Carlo method (Raymond & Rousset, 1995a). Each test is unaffected by a potential departure from Hardy–Weinberg proportions, because each contingency table considers the genotypic composition, not the allelic composition. For each pair of loci, a global measure was obtained by averaging across populations and a global test was obtained through Fisher's combined test.

Population differentiation

Genetic differentiation between populations or groups of populations was tested for each locus separately using Fisher's exact test on contingency tables. As for linkage disequilibrium, a Markov Chain Monte Carlo method was used to obtain an unbiased estimate of the exact probability (Raymond & Rousset, 1995a). Wright's F-statistics, FIS, FIT and FST, were estimated according to Weir & Cockerham (1984) and a 95% confidence interval was estimated by bootstrapping over loci. FIS and FIT are the correlations between two uniting gametes relative to the subpopulation and relative to the total population, respectively, and FST is the correlation between two gametes drawn at random from each subpopulation and measures the degree of genetic differentiation of subpopulations (Nei, 1987). Only statistically independent loci which did not depart from Hardy–Weinberg proportions were retained. Finally, genetic distances between populations were estimated with Nei's genetic distances (PHYLIP; Felsenstein, 1993). Ordination of populations was performed by Multidimensional Scaling (procedure MDSCALE in NTSYS-pc). A Mantel test (Sokal & Rohlf, 1995) was used to test for a relationship between genetic and geographical distances. The geographical distance between populations was estimated using simple trigonometric formulae under the assumption that the Earth is a sphere with radius 6360 km.
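The geographical distance calculation can be illustrated with a minimal Python sketch. The text specifies only a spherical Earth with a radius of 6360 km; the use of the haversine formula, the function name and the coordinate convention (decimal degrees) are assumptions made here for illustration, and the original computation may have used a different but equivalent spherical formula.

```python
import math

EARTH_RADIUS_KM = 6360.0  # sphere radius stated in the text


def great_circle_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance (km) between two points given in decimal degrees.

    Haversine formula on a sphere; an assumed stand-in for the unspecified
    'simple trigonometric formulae' of the text.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2.0 * radius * math.asin(min(1.0, math.sqrt(a)))
```

A pairwise matrix of such distances, computed from the population coordinates, is the kind of geographical matrix that a Mantel test would compare with the matrix of genetic distances.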

Migration and isolation by distance

The number of effective migrants (Nm) between populations was estimated using the relationship Nm=(1/FST−1)/4 that holds for an island model (Wright, 1969) and by the private alleles method (Slatkin, 1985). Finally, isolation by distance was analysed according to Slatkin (1993).
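The island-model relationship between FST and Nm can be written as a short Python sketch. The function names are hypothetical, and the sketch covers only the FST-based estimate, not the private alleles method.

```python
def nm_from_fst(fst):
    """Effective number of migrants per generation, Nm = (1/FST - 1)/4.

    Holds under Wright's island model; FST must lie in (0, 1].
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


def fst_from_nm(nm):
    """Inverse relationship, FST = 1/(4Nm + 1), for Nm >= 0."""
    if nm < 0.0:
        raise ValueError("Nm must be non-negative")
    return 1.0 / (4.0 * nm + 1.0)
```

Inverting the relationship shows that Nm values of 2.91 and 11.65, as reported in the Results, correspond to FST values of approximately 0.079 and 0.021, respectively. Because Nm varies as the reciprocal of FST, small changes in a low FST translate into large changes in the estimated number of migrants.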

Results

Polymorphism

Mean heterozygosity (Hexp) ranged from 8.8% (L. olgensis; Olga Bay) to 19.5% (Table 2; the allozyme frequency table is available from the first author). There were very few private alleles, the only case being found in L. decidua, where Skdh-82 and Skdh-71 were present with average frequencies of 60% and 2%, respectively, whereas they were absent in other populations. It should be noted that in seedling tissues the products of these alleles displayed much lower activity (‘seminull alleles’) than other alleles. A null allele could only be recorded in the homozygous state, as only diploid individuals were examined. Such a homozygote was observed only once, in the progeny of a tree from population 11 (Yada); the seedling extract showed no activity for the analysed Skdh locus and was considered a null-allele homozygote. Additional analysis of megagametophytes from this tree confirmed that it was a null-allele heterozygote.

Table 2 Average sample size per locus, average number of alleles (A), proportion of polymorphic loci (P), and observed and expected heterozygosities (Hobs and Hexp) in 32 populations of Larix examined at 15 allozyme loci (standard errors in parentheses)

Statistical independence among loci

Overall, 1199 pairs of loci were analysed. Among them, only 50 pairs (4.2%) showed statistically significant linkage disequilibrium at the 5% level, which is fewer than the approximately 60 significant tests expected by chance alone. Thus, globally, independence between loci cannot be rejected.

Hardy–Weinberg equilibrium

A total of 255 exact tests of Hardy–Weinberg proportions were carried out with the Markov Chain Monte Carlo method for the 13 polymorphic loci and 32 populations. Statistically significant departures (P<0.05) were detected in 19 cases: the loci Got-B, Pgm-B and Sod-A were involved in one case each, G6pdh, 6Pgd and Skdh in two cases each, Pgm-A in three cases, and Est in seven cases. Most (18 out of 19) of these departures were heterozygote deficits. Overall, only the Est locus significantly departed from Hardy–Weinberg equilibrium (χ² = 126.6, 62 d.f., P<0.0001 for Fisher's method). Among the populations, only the Magadan population (L. ochotensis) and the population from Chadan (L. sibirica) had significant deviations over all loci.

Within-species differentiation

The levels of interpopulation differentiation estimated as FST are given in Table 3. The number of effective migrants Nm, estimated according to the equation Nm=(1/FST−1)/4, was 2.91 for L. sibirica and 11.65 for L. gmelinii. The same parameter computed using Slatkin's ‘private alleles’ method was 5.57 for L. sibirica and 9.87 for L. gmelinii. There was no clear isolation by distance among L. sibirica populations (log(M)=−0.185 log(distance)+1.068, where M=(1/FST−1)/4).

Table 3 Wright's F-statistics at all loci in Larix sibirica (populations 1–12, 30, 31, 32) and L. gmelinii (populations 13–18)

The L. sibirica populations split into two groups (Fig. 2). The western group consists of the populations from the Urals and western Siberia. The eastern one includes populations located in eastern Siberia and in the north of western Siberia. The frequencies of alleles Got-A-112, Dia-85, 6Pgd-109, Sod-119 and Est-108 give the strongest differentiation between these two groups. Some of the L. sibirica populations (populations 4, 10, 7, 31, 32), characterized by intermediate frequencies of these alleles, occur between these two groups on the scatter-plot. There is a clear dependence between Nei's genetic distance between any two populations and their longitudinal distance (P=0.0094, Mantel test), but no significant relationship with latitude (P=0.6482, Mantel test; Fig. 2). Moreover, the higher the latitude, the narrower the zone of intermediate populations. For instance, in the northern part of the Siberian larch range, population 3, which is classified as ‘western’, and population 11, which is classified as ‘eastern’, are only 260 km apart. In contrast, in southern Siberia, the allele frequencies of populations ranging from the Altai Mountains up to Tuva (populations 7, 31, 30) are intermediate between those characterizing western and eastern populations.

Fig. 2 Multidimensional Scaling Analysis, using Nei's genetic distance (Nei, 1987) matrix, based on 15 allozyme loci in 32 Larix populations. The population numbers correspond to those given in Table 1. The groupings (ellipses) correspond to different taxa.

Between-species differentiation

Larix amurensis, L. cajanderi, L. olgensis and L. gmelinii are very similar genetically (Table 4; Figs. 2 and 3). The frequency of allele Pgi-B-79 in population 29 (Magadan) of L. ochotensis was about 50%, whereas it was less than 13% in all other populations. As illustrated by the UPGMA dendrogram (Fig. 3) and ordination of the populations using multidimensional scaling (Fig. 2), Nei's genetic distances and classical taxonomy give congruent groupings. The genetic difference between L. ochotensis (Magadan population) on the one hand, and L. olgensis, L. gmelinii and L. cajanderi on the other hand, is mainly caused by frequency differences of allele Pgi-B-79. Finally, genetic differences were more pronounced between L. kamtschatica and L. kaempferi (Table 4), than between L. kamtschatica and L. cajanderi, L. ochotensis and L. amurensis.

Table 4 Mean genetic distances (D; Nei, 1987) within and among the Larix taxa

Fig. 3 UPGMA dendrogram of 32 Larix populations based on 15 allozyme loci and Nei's genetic distance, D (Nei, 1987).

Discussion

Three main conclusions can be drawn from our study. First, the classification of species according to Nei's genetic distances is congruent with classical taxonomy. For instance, allozymes reveal the difference between eastern and western (L. sukaczewii Dyl.) populations of L. sibirica. Secondly, whereas there is a marked dependence of the genetic distance between any two populations of L. sibirica on the longitudinal distance separating them, there is no significant relationship with latitude. Thirdly, there is a decrease in genetic variation in marginal populations.

Genetic variation at the margin

As in Douglas-fir (Li & Adams, 1989) and Jeffrey pine (Furnier & Adams, 1986), marginal populations of larch were genetically depauperate. For example, the populations at Olga Bay and at the Polar tree line (populations 11 and 19) had low heterozygosity. These populations are generally very small and isolated and the narrowness of the ecological niche occupied by larch at Olga Bay probably results from competition from the Manchurian broad-leaved flora and historical events (climatic fluctuation, anthropogenic factors).

Population differentiation

The FST values obtained in our study were of the same order of magnitude as those observed in other conifer species (Picea abies Karst., 5.2% (Lagercrantz & Ryman, 1990); Picea mariana (Mill.) B.S.P., 5.9% (Yeh et al., 1986); Pinus sibirica du Tour, 1.6% (Krutovsky et al., 1989); Pinus sylvestris L., 2.5% (Semerikov et al., 1993)). In L. occidentalis, estimates vary from 8.6% (Fins & Seeb, 1986) to 5% (Cheliak et al., 1988; overall range). The majority of Larix species of section Pauciseriales exhibit little genetic differentiation, as do other conifers which occupy continuous and recently recolonized areas. Substantial gene flow and history (colonization from a single refuge) are possible causes of the observed lack of genetic differentiation. Based on morphological variability, Dylis (1947) concluded that L. sibirica includes two species, a western one, which he named L. sukaczewii, and an eastern one, L. sibirica, with a border lying approximately along the Ob and Irtish Rivers (see Fig. 1). Allozymes reveal that there are two genetically differentiated groups in the western and eastern parts of the L. sibirica range, but allozyme data do not correspond exactly to Dylis's hypothesis because, in the southern part of the range, allele frequencies change gradually and eastern and western populations cannot clearly be distinguished. This population differentiation pattern may be the result of the recolonization of northern Eurasia by L. sibirica during the late Pleistocene and Holocene. There are strong suggestions that forests remained only in restricted refuges during the last glacial maximum. Some of the refugia were preserved in southern Siberia, south of the Urals and in northern Kazakhstan. Thus, at the end of the glaciation, forest species extended into the north of the Urals and northern Siberia, preserving the allozyme allele frequencies of the refugia.
The genetic similarity of larches from the Yamal and Taz Peninsulas (populations 8, 9 and 11) with those from the Upper Yenisey (populations 30 and 31) could, for instance, be connected with the spread of the forests along the Yenisey and Taz rivers. Fossil records also suggest that, unlike other tree species, larch glacial refugia might never have completely disappeared from higher latitudes during the last glaciation (Kremenetski, 1994), and the present genetic pattern might reflect postglacial expansion from these surviving populations. However, the results reported here are more consistent with the latter because a more pronounced longitudinal, rather than latitudinal gradient was observed. Furthermore, had the southern populations been glacial refugia one might expect to find a markedly higher number of alleles in these populations, which was not the case. On the other hand, differentiation between southern and northern populations is not pronounced and neither of the two alternatives can be unambiguously rejected. This issue may be clarified by the use of DNA markers, for instance cpDNA or mtDNA sequences, that give information on allele genealogies.

Hybridization and species differentiation

In the Russian far east only two larch species, L. gmelinii and L. olgensis, can be easily identified. Both of them have characteristic morphological traits and occupy well-defined areas. Outside these areas, morphologically variable populations occur, in which features of L. gmelinii, L. olgensis and even L. kaempferi can be found. Characteristic features of mature cones (flat, marginally jagged and nonpubescent cone scales) of some far-eastern larch species (L. cajanderi, L. ochotensis, L. amurensis and L. kamtschatica) are close to those of L. gmelinii, but other traits (large cones, numerous and sometimes convex or marginally recurved cone scales) resemble more closely those found in either L. olgensis or L. kaempferi. Therefore a hypothesis of introgressive hybridization of larches in the far east is quite plausible (Dylis, 1961; Bobrov, 1972). The introgression zone between L. gmelinii and L. sibirica probably dates from the Holocene or late Pleistocene, at least to the north of Lake Baikal. This is supported by the narrowness of this zone and the co-occurrence in this region of the marginal areas of many plant species, e.g. Chosenia arbutifolia (Pall.) A. Skvorts., Pinus pumila R.G.L., Betula middendorfii Trautv. et Mey, B. exilis Sukacz. and Populus suaveolens Fisch. (Sokolov et al., 1977). In contrast, the introgressive zone between L. gmelinii and L. olgensis is wide, and possibly originated in the early Pleistocene when L. gmelinii first occurred (Bobrov, 1972).

Larix olgensis appears to be genetically more similar to eastern populations of L. sibirica (D=0.020) than to L. gmelinii (D=0.028), suggesting that the L. olgensis – L. sibirica divergence took place more recently than that of L. olgensis and L. gmelinii, although the difference is not very large. However, taking into consideration the similarity of L. olgensis and L. sibirica for the main taxonomic morphological traits (Bobrov, 1972), this inference may be regarded as quite plausible. The oldest larch fossil records are from the early Pleistocene in north-eastern Siberia (Dylis, 1961). It is generally supposed that after the emergence of L. gmelinii, it expanded to the south and south-west and forced out both L. sibirica and L. olgensis, as they were less adapted to the new, more continental, climatic conditions of the Pleistocene (Dylis, 1961). As a result, the common L. sibirica – L. olgensis range was disrupted. The genetic closeness of these two species might therefore be a consequence of the relatively short period of time that has elapsed since this event, although additional investigations are needed to clarify this.

The larger genetic differences between L. kamtschatica and L. kaempferi (Table 4) compared to those between L. kamtschatica and L. cajanderi, L. ochotensis and L. amurensis suggest that L. kamtschatica originated from continental populations rather than from L. kaempferi. A clearer picture emerges from this allozyme study than from the cpDNA study carried out by Qian et al. (1995). These authors suggest that this might be a consequence of the importance of hybridization, as cpDNA, which is paternally inherited in larch, may cross species barriers more readily than nuclear DNA. Whether this is actually the case remains to be shown.