Introduction

There are five species of the genus Santalum endemic to Australia. Of these, Santalum spicatum (Western Australian Sandalwood) is found in Western Australia and South Australia, while S. lanceolatum (Plumbush), S. acuminatum (Quandong) and S. murrayanum (Bitter Quandong) are distributed throughout Australia, and S. obtusifolium is confined to the eastern coast (Hewson and George, 1984). The Indian Sandalwood S. album also occurs in Australia in the northernmost parts of the Northern Territory. Among the Australian endemic species, only S. spicatum has commercial significance as it produces a sesquiterpene oil similar to that of S. album (Shea et al, 1998). S. spicatum has been harvested commercially since 1845, and its harvest was a primary industry in the early years of European settlement in Western Australia (Loneragan, 1990).

S. spicatum occurs across a wide range of environmental conditions as it has a broad distribution throughout the southern semiarid (approximately 300–600 mm rainfall), and northern arid (approximately 150–300 mm rainfall) areas of Western Australia. Morphological variation has been observed across the range suggesting the presence of two ecotypes within the species (Fox and Brand, 1993). In the higher rainfall areas, in the south-west of the distribution, the species shows larger leaves and nuts, has higher chlorophyll content and forms more of a tree habit than in the arid areas in the north and east of the distribution (Fox and Brand, 1993). Trees in the arid areas also show higher concentrations of oil in the wood (Loneragan, 1990). A genetic study using nuclear restriction fragment length polymorphism (RFLP) markers showed some evidence of differentiation of ecotypes, but it was not significant (Byrne et al, 2003). However, the study did show significant differences in the genetic influences within the two regions, with the northern region showing stable history with equilibrium between drift and gene flow, and the southern region showing no drift–gene flow equilibrium and variances consistent with fragmented, isolated populations under greater influence of drift than gene flow. This pattern of differentiation between the regions may reflect the influences of historic processes.

Analysis of genetic variation in the chloroplast genome allows the elucidation of historical factors influencing genetic variation (Schaal et al, 1998). The reduced effective population size of haploid genomes combined with the restricted nature of gene flow through seed dispersal makes maternally inherited organelle markers more likely to record the effects of population history in present-day genetic patterns than nuclear markers (Ennos et al, 1999). Although contemporary gene flow can also influence the distribution of organelle variation, the pattern of distribution often allows differentiation of the effects of historical factors from those of recent origin (Templeton et al, 1990; Schaal et al, 1998). There are some limitations in the use of cpDNA to investigate phylogeographic patterns in plants due to the often low level of variation within species and the slow rate of evolution compared to animal mitochondrial DNA. However, relatively high levels of cpDNA variation have been found in Australian plants (eg Butcher et al, 1995; Byrne and Macdonald, 2000; Byrne et al, 2002) in comparison to other species (Soltis et al, 1992). In addition, the south-west of Western Australia is an ancient landscape that has experienced geological stability with no major glaciation or extinction events, and where the pattern of variation is influenced by historic processes. In particular, climatic instability during the Pleistocene resulting in cyclic expansion and contraction of the mesic and arid zones (Hopper, 1979; Hopper et al, 1996) has led to naturally fragmented population systems with a high level of endemism and both relictual and derived species. Hence, cpDNA analysis has proven useful in determining phylogenetic patterns in plant species in Western Australia, and studies of both widespread species and restricted species with disjunct distributions have revealed the presence of historical lineages that have been differentiated through isolation (Byrne et al, 1999, 2001b, 2002). The presence of two ecotypes in S. spicatum influenced by different genetic processes may represent similar patterns of historic separation in this species.

This study investigates the evolutionary and phylogeographic patterns within S. spicatum using RFLP analysis of the chloroplast genome. The mode of inheritance of chloroplasts has not been determined in Santalum but is maternal in most angiosperms (Harris and Ingram, 1991; Birky, 1995).

Materials and methods

Plant collections

Leaf samples were collected from 23 populations throughout the main range of S. spicatum (Figure 1), with 10 individuals sampled from each population. Leaf samples were also collected from 10 individuals in each of two populations of S. acuminatum as outgroups (Figure 1). DNA was extracted from the leaves of the 250 individuals as in Byrne and Moran (1994), with the addition of 0.1 M sodium sulphite to the extraction buffers (Byrne et al, 2001a). DNA quality was good, but the yield was low, probably due to high levels of sesquiterpene oils. Initially, 3 μg DNA from five individuals per population was digested with six restriction enzymes (BclI, BglII, EcoRI, EcoRV, HindIII, XbaI) and hybridised with heterologous probes covering the majority of the chloroplast genome. Six petunia cpDNA probes were used (P1, P3, P4, P6, P8 and P10; details given in Sytsma and Gottlieb, 1986), plus one tobacco cpDNA probe, pTBa1 (Shinozaki et al, 1986; Sugiura et al, 1986). Restriction digestion and hybridisation were as described in Byrne and Moran (1994), and probe plasmids were linearised and then labelled with 32P using the random priming method. After analysis of the data for the first 125 individuals, DNA of the remaining five individuals per population was digested with two of the enzymes (EcoRI and EcoRV), and the probe–enzyme combinations that had detected mutations in the first 125 individuals were assayed for the second 125 individuals.

Figure 1

Location of sampled populations of S. spicatum (•) and S. acuminatum (▴). Dotted line indicates separation of northern and southern regions.

Data analysis

Banding patterns obtained were interpreted in terms of restriction site or length mutations, and assessed as presence or absence of mutations (not presence or absence of bands). Fragment patterns for consecutive chloroplast probes were compared to ensure that each mutation was correctly interpreted and counted only once. Where a length mutation was detected by more than one restriction enzyme it was counted as only one mutation. Nucleotide diversity was calculated for restriction site mutations using HAPLO (Lynch and Crease, 1990), and partitioned within and between populations. Haplotype diversity was calculated by considering haplotypes as alleles at one locus using Nei's (1977) gene diversity measures. A parsimony analysis of haplotype relationships characterised by the presence or absence of each mutation was carried out using PAUP (Swofford, 1991). Bootstrap analysis used 1000 replications and heuristic search with TBR branch swapping and MULPARS on.
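The haplotype diversity partition described above can be illustrated with a minimal sketch. It assumes the simple form of gene diversity, H = 1 - Σp², with populations weighted equally and no small-sample correction; the exact estimator used in the study is not stated beyond the citation of Nei (1977). The function names and the example haplotype counts are hypothetical and are not data from this study.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Gene diversity H = 1 - sum(p_i^2), treating haplotypes as alleles at one locus."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def diversity_partition(populations):
    """Return (HS, HT, GST) for a dict mapping population name -> list of haplotype labels.

    Assumption: populations are weighted equally and no sample-size
    correction is applied.
    """
    k = len(populations)
    # HS: mean of the within-population diversities
    hs = sum(gene_diversity(haps) for haps in populations.values()) / k
    # HT: diversity computed from haplotype frequencies averaged over populations
    labels = {lab for haps in populations.values() for lab in haps}
    mean_freq = {
        lab: sum(haps.count(lab) / len(haps) for haps in populations.values()) / k
        for lab in labels
    }
    ht = 1.0 - sum(p ** 2 for p in mean_freq.values())
    # GST: proportion of total diversity found between populations
    gst = (ht - hs) / ht if ht > 0 else 0.0
    return hs, ht, gst


# Hypothetical example: two fixed populations and one polymorphic population
example = {
    "pop1": ["A"] * 10,
    "pop2": ["B"] * 10,
    "pop3": ["A"] * 5 + ["B"] * 5,
}
hs, ht, gst = diversity_partition(example)
print(f"HS={hs:.3f} HT={ht:.3f} GST={gst:.3f}")
```

In this toy case most of the diversity lies between populations, which is the situation expected for a maternally inherited marker with limited seed dispersal.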

To test for the association between phylogenetic position of haplotypes and their geographic distribution, a nested clade analysis (Templeton et al, 1995) was carried out. The nested cladogram was drawn from the PAUP cladogram and parsimony of all connections determined using the program ParsProb (http://bioag.byu.edu/zoology/Crandall_lab/programs.htm) based on algorithms described in Templeton et al (1992). Geographic associations of haplotypes were determined using the program GeoDis (Posada et al, 2000). Interpretation of the nested clade analysis followed the inference key given in Templeton et al (1995).

Results

Variation in cpDNA

This study analysed restriction sites in the chloroplast genome and revealed polymorphism with all enzymes used. Within S. spicatum, 14 mutations were detected; of these, five were restriction site mutations and nine were length mutations (Table 1). Half of the mutations occurred in the small single-copy region and half occurred in the large single-copy region. The first assay of five individuals per population detected the majority of the mutations. The assay of an additional five individuals per population detected one additional mutation (number 14), which was restricted to three individuals from the Billabong population.

Table 1 Restriction fragment length polymorphisms detected in the chloroplast genome of S. spicatum

The 14 mutations were distributed over 11 haplotypes. One haplotype (B) was more common than all others, occurring in 45% of the individuals sampled. The remaining haplotypes occurred with frequencies ranging from 2.6% to 13%. The most common haplotype (B) was present in all southern populations, except Nyabing, and the north-eastern population of Coolgardie. The remaining haplotypes occurred in one to three populations. Intrapopulation variation was present in two populations: Kokerbin in the south had two haplotypes present and Billabong in the north had three haplotypes.

There were 11 mutations that differentiated S. spicatum from the two populations of the related species S. acuminatum, three site mutations and eight length mutations (Table 2). There were also eight mutations detected between the two populations of S. acuminatum, two restriction site mutations and six length mutations. Of these eight mutations, five were shared between the Goongarrie (north-eastern) population of S. acuminatum and all individuals of S. spicatum. The other three mutations were specific to either the Goongarrie population or the Arthur River (southern) population of S. acuminatum.

Table 2 Restriction fragment length variation detected between the chloroplast genomes of S. spicatum and S. acuminatum

Nucleotide and haplotype diversity

Nucleotide diversity, the average number of nucleotide differences per site between two sequences (Nei, 1978), can be determined for restriction site but not length mutations. Nucleotide diversity, averaged over all pairs of individuals in S. spicatum was 0.074%. The mean diversity within populations was 0.0001% and the mean diversity between populations was 0.00096%. The proportion of nucleotide diversity between populations, NST, was 99%. The mean diversity between lineages (see below) was 0.116%. A measure of haplotype diversity can be determined by treating the whole chloroplast genome as a single locus with each haplotype as an allele. Total haplotype diversity (HT) was 0.749 and haplotype diversity within populations (HS) was 0.045. The proportion of haplotype diversity between populations, GST, was 94%. At the regional level, haplotype diversity (HT) in the northern region was 0.836 and diversity between populations, GST, was 95%, while in the southern region haplotype diversity was lower, HT=0.244, and only 81% of this diversity occurred between populations.
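As a quick arithmetic check on the values above, the relationship GST = (HT - HS) / HT can be applied to the reported species-level figures. The sketch below is illustrative only. The within-population values for the two regions are not reported in the text, so the `implied_hs` helper (a hypothetical name) back-calculates them from HT and GST; those outputs are approximations subject to rounding of the published percentages.

```python
def gst_from_diversities(ht, hs):
    """Proportion of haplotype diversity between populations: (HT - HS) / HT."""
    return (ht - hs) / ht


def implied_hs(ht, gst):
    """Back-calculate within-population diversity from HT and GST (approximate)."""
    return ht * (1.0 - gst)


# Species-level values reported in the text
total_gst = gst_from_diversities(ht=0.749, hs=0.045)
print(f"Species-level GST: {total_gst:.2f}")  # about 0.94, matching the reported 94%

# Regional HS values are NOT reported; these are back-calculated approximations
north_hs = implied_hs(ht=0.836, gst=0.95)
south_hs = implied_hs(ht=0.244, gst=0.81)
print(f"Implied HS north: {north_hs:.3f}, south: {south_hs:.3f}")
```

On these approximate figures, within-population diversity is similar in the two regions, so the lower southern GST mainly reflects the lower total diversity (HT) in the south.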

Haplotype and population relationships

A phylogenetic parsimony analysis of haplotypes gave one tree of length 32, with a consistency index of 1.0. The cladogram showed all individuals of S. spicatum clustered into a clade clearly differentiated from, and monophyletic in relation to, the outgroup S. acuminatum (Figure 2). Within the S. spicatum clade, the tree shows a star pattern with an unresolved polytomy of six branches representing three clades and three single terminal branches. The most differentiated clade shows structuring of four related haplotypes (A–D) and is characterised by three mutations, with an additional three mutations that differentiate the haplotypes within the clade. The distribution of this clade, which contains the common haplotype (B), includes all of the southern populations plus the northern populations of Burnerbinmah and Coolgardie. Two more clades in the polytomy (with haplotypes J–K and H–I) show some structure, being characterised by one to two mutations, and having two haplotypes differentiated by one mutation. These clades occur in the north-west of the distribution. The remaining three branches show little differentiation, characterised by up to one mutation and containing one haplotype each. These branches contain the remaining northern populations. Doubling the number of individuals assayed did not result in a change to the distribution of the mutations among populations and did not change the structure of the phylogenetic tree except for a tip clade in the Billabong population resulting from the additional detection of mutation 14. Therefore, the sampling of individuals in this study has been sufficient to detect phylogenetic patterns in the chloroplast genome of this species.

Figure 2

Phylogenetic parsimony tree of haplotype relationships in S. spicatum. Numbers and letters on branches represent mutations (see Tables 1 and 2). Numbers in italics below lines at nodes represent bootstrap confidence values (%), based on 1000 replications. Numbers in brackets indicate number of individuals in each population with that haplotype.

A nested clade analysis identified significant geographic structuring in the phylogeny of S. spicatum (P<0.001; Table 3, Figure 3). All clades that could be analysed showed significant structuring; analysis could not be carried out for Clade 1-5 as there was no population variation in the clade. All other one-step and two-step clades identified restricted gene flow through isolation-by-distance as the main influence on the geographic structuring within these clades, except for Clade 1-4 where an inference of long-distance colonisation was made (Table 4). At the highest nesting level, past fragmentation was identified as the main influence on the geographic structuring between the two-step clades. These two clades are geographically separated, with Clade 2-1 containing haplotypes characterising the southern populations and Clade 2-2 containing haplotypes characterising the northern populations, except for the Burnerbinmah population, which contains haplotypes nested in Clade 2-1. The two clades are also separated by a higher than average number of mutational steps. Clade 2-1 is equivalent to the differentiated clade containing haplotypes A to D in the parsimony analysis and Clade 2-2 is equivalent to the remaining branches in the parsimony analysis (Figure 2).

Table 3 Levels of clade, nested clade and interior-tip distance in clades with significant geographic association in S. spicatum

Figure 3

Nested cladogram for haplotypes in S. spicatum. Sampled haplotypes denoted by letters, missing haplotypes denoted by 0.

Table 4 Phylogeographic inferences from nested clade analysis of S. spicatum

Discussion

The level of diversity in the chloroplast genome of S. spicatum is high compared to other woody perennials but not as high as has been identified in other Australian tree species, including the main host of S. spicatum, Acacia acuminata (Byrne et al, 2002). Significant structuring of the diversity occurs with two main clades that are geographically separated, one centred on the southern region and one centred in the north of the distribution. This differentiation of two lineages is consistent with the identification of differences between the regions in the influence of genetic processes on the nuclear genome (Byrne et al, 2003). It is also consistent with the identification of two ecotypes in these regions, based on morphological variation. Nucleotide divergence can be used to estimate the time of separation between lineages, although these estimates should be treated as broad indicators due to the assumptions involved (Rieseberg et al, 1991). For the chloroplast genome, it is estimated that 0.1% divergence represents 1 million years of separation (Zurawski et al, 1984). Using this estimate, the time of divergence between the two lineages is around one million years ago, in the middle of the Pleistocene era. Geographical structuring due to historical isolation has also been observed in a common host of S. spicatum, A. acuminata (Byrne et al, 2002) with a similar time frame for the divergence between lineages (approximately 800 000 years BP). The identification of similar phylogeographic patterns over similar time frames suggests that they have occurred through the influences of broad biogeographic processes. The isolation and differentiation of lineages observed within S. spicatum, and other species that have been investigated (Byrne et al, 1999, 2001b, 2002), is consistent with the hypothesis that cyclic contraction and expansion of the arid region in the north-east, and the mesic region in the south-west, during the Pleistocene era led to fragmentation and isolation in the intermediate area between the arid and mesic zones (Hopper, 1979; Hopper et al, 1996). Comparative phylogeographic studies in other parts of the world have also demonstrated broad biogeographic influences, including common postglacial colonisation routes in Europe (Ferris et al, 1993, 1998; Demesure et al, 1996; Dumolin-Lapègue et al, 1997; King and Ferris, 1998) and northern and southern glacial refugia in the Pacific North West of America (Soltis et al, 1997).
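The divergence-time estimate in the preceding paragraph follows from a simple rate calculation, sketched here for illustration. It assumes a constant rate of 0.1% divergence per million years, as cited from Zurawski et al (1984), and uses the mean between-lineage diversity of 0.116% reported in the Results; the function and constant names are hypothetical. As noted in the text, the result should be read as a broad indicator only.

```python
# Assumed calibration from the text: 0.1% cpDNA divergence per million years
RATE_PERCENT_PER_MYR = 0.1


def divergence_time_myr(divergence_percent, rate=RATE_PERCENT_PER_MYR):
    """Time since separation (million years) under a constant-rate assumption."""
    return divergence_percent / rate


# Mean diversity between the two lineages reported in the Results: 0.116%
t = divergence_time_myr(0.116)
print(f"Estimated separation: about {t:.2f} million years")
```

The result, a little over one million years, agrees with the "around one million years ago" stated above.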

There were differences in the level of differentiation within the two lineages in S. spicatum. The southern lineage had high similarity within the lineage with low haplotype diversity and the majority of the populations having one of the four haplotypes present in the lineage. In comparison, the northern lineage had greater diversity within the lineage, with greater differentiation between populations and the six haplotypes having lower frequencies and being distributed into three main areas of geographical diversity. This could be an artefact due to the geographically closer sampling of populations in the south compared to the north. However, the sampling density reflects patterns of abundance, although the distribution in the southern region has been influenced by land clearing for agriculture. There are no direct comparisons of abundance of S. spicatum in the southern and northern regions at the time of settlement. However, the tonnage of sandalwood harvested from the southern region in the early years of settlement indicates that the species abundance was greater than the current abundance in the northern region. Higher abundance and more continuous distribution would lead to greater gene flow and hence less differentiation between populations in the south. The seeds of S. spicatum are large nuts, which would result in low dispersal, although emus (Dromaius novaehollandiae) are known to eat the seeds and would be a means of seed dispersal. Havel (1993) hypothesised that the woylie (Bettongia penicillata), a small mammal, may have cached seeds, which would also increase dispersal. Emus and woylies were likely to have been more abundant in the southern region owing to more abundant food sources, which may have led to greater seed dispersal in the south than in the north. The greater similarity of the southern populations could also indicate a more recent establishment of S. spicatum in this region compared to the north. Coalescent theory predicts that ancestral haplotypes have the greatest number of mutational connections and are geographically widespread, but range expansion can lead to younger tip haplotypes also being geographically widespread (see Templeton et al, 1995). The southern clade is a tip clade and the northern clade is the interior clade containing the most likely ancestral haplotype (E). The southern clade being a geographically widespread tip clade is a pattern indicative of population range expansion in the southern region.

The genetic patterns in the chloroplast data are both concordant and discordant with the patterns identified in the nuclear genome. Genetic analysis of the nuclear genome identified an equilibrium between drift and gene flow in the northern region indicating that the region had existed under stable conditions of dispersal for a long period of time (Byrne et al, 2003). In contrast, the southern region showed a genetic pattern indicating that the populations have been more recently established, and that they have been fragmented and influenced to a greater degree by genetic drift than by gene flow. The southern region also showed greater differentiation between populations than those in the northern region. The genetic patterns in the nuclear and chloroplast genomes both suggest that the southern populations have been established more recently than the northern populations. However, the nuclear genome showed greater differentiation in southern populations (southern θ=0.108, northern θ=0.055; Byrne et al, 2003), whereas the chloroplast genome identified greater differentiation in the northern populations (northern GST=95%, southern GST=81%). This suggests that the lower differentiation in the chloroplast genome is not likely to be a result of population sampling, higher seed dispersal or greater abundance, but may be due to less time for fragmentation to influence divergence.