Introduction

Many plants, such as Restharrows (Fabaceae, Leguminosae), have ploidy level differences between species within the same complex. Whether gene flow still occurs between potential sibling species is an important question whose answer may have fundamental implications for the population structure present in those species. However, comparisons of genetic structure and relatedness across ploidy levels are complicated by the underlying assumption of disomic inheritance in most methods of analysis (for example, Wright's Fst; Wright, 1943, 1946, 1951). While the population genetics of autotetraploids have been modelled and elucidated (Glendinning, 1989; Moody et al., 1993; Ronfort et al., 1998), the theory is not readily put into practice because copy number cannot be assigned to the molecular markers studied, even using highly robust locus-specific marker systems such as microsatellites (Ellegren et al., 1995; Schlotterer, 1998; Amos, 1999; Chambers and MacAvoy, 2000). In this paper, we test the value of principal coordinate (PCO) analysis of microsatellite data for making an assessment of such genetic relationships, using British Restharrows as an example.

Sixty-seven species of Ononis were described by Sirjaev (1932) with one new species being added by Crespo and Serra (1993). The genus is widely distributed throughout Europe, North East Africa and North America. Many Ononis species also produce a waxy substance in their tap roots called Onocerin (Rowan and Dean, 1972), which has been used as one criterion for taxonomic classification. In the British Isles, there are between two and five taxa (depending upon classification), which occur as native populations. O. reclinata (2n=4x=60) is a native but very rare taxon, confined to just three populations (Wigginton, 1999). It is not included in the current analysis. The remaining taxa have a wide distribution, from coastal dunes and shingle beaches to grassy meadows.

Before this study, there had been no DNA-based evidence published on the genetic relationships between Restharrows in Britain, with the classification to date being based upon: gross morphology; multivariate analysis of morphological characters; analysis of stomatal structure and seed-coat ornamentation; comparative root anatomy; chromosome number; floral morphology; terminal leaflet morphology; flavonoid chemistry; palynology; α-onocerin and sterol content; observation of and experiments in hybridization (Table 1; Sirjaev, 1932; Morton, 1956; Morisset, 1967; Ivimey-Cook, 1968; Rowan and Dean, 1972; Stephens, 1978, 1979a, 1979b; Sañudo et al., 1979; Gupta, 1980; Cannon, 1981; Saint-Martin, 1992; Langer et al., 1995; Rodriguez-Riano et al., 1999).

Table 1 Distinguishing characteristics and previous taxonomic treatment of common Restharrow species in the British Isles

Despite this effort, the taxonomy of the main British Restharrow species remains controversial. Those found most commonly in Britain (that is, excluding O. reclinata) are variously classified into between one and four species, according to different authors (Table 1). This long-running debate has so far failed to reach a consensus, and has led to a lack of taxonomic clarity. For the current study, we will designate the taxa as: O. repens (2n=4x=60); O. maritima (2n=4x=60); O. spinosa (2n=2x=30); O. intermedia (2n=2x=30). We have chosen this approach for clarity, given the lack of taxonomic guidance available and will re-evaluate these designations in the Discussion in the light of the understanding developed in this study.

Meiotic behaviour had also been studied in an attempt to clarify relationships among diploid and tetraploid Restharrows. The chromosomes of tetraploid O. repens and O. maritima are known to produce multivalents, although the difficulty of obtaining good meiotic spreads prevented a quantitative analysis (Morisset, 1967, 1978). This would argue for an autotetraploid origin. One possibility would be the autotetraploidization of O. spinosa or O. intermedia, which would imply the potential for significant unidirectional gene flow from diploid to tetraploid Restharrows.

Masterson (1994) estimated that approximately 70% of angiosperms have polyploidy in their evolutionary history, so an improved understanding of the processes by which polyploid species arise and are maintained is crucial for understanding the evolution of angiosperms. Few molecular genetic studies spanning diploids and their corresponding polyploids have been published. Where they have been, a number of approaches have been adopted (for example, Lumaret and Barrientos, 1990; Gauthier et al., 1998; Raker and Spooner, 2002; Robertson et al., 2004). In this study, microsatellite data are analysed using PCO analysis to investigate the relationships amongst British Restharrows, to identify whether gene flow has occurred in the recent past between the predominantly diploid and tetraploid Restharrow taxa and also between the taxa with the same ploidy levels. We also evaluate whether PCO analysis coupled to microsatellite analysis could be used to overcome some of the limitations that exist with conventional disomic analysis tools, when the species of interest include polyploids.

Materials and methods

Sampling sites

Twenty-one populations were sampled from a range of locations throughout Central and Eastern Britain. Plants from each population were identified based on their morphology as O. repens, O. maritima, O. spinosa and O. intermedia. The majority of these sites were allopatric, with one clear sympatric site (Harton Down Hill).

Details of the collection sites are given in Table 2.

Table 2 Sampling sites (coordinates) of British Restharrow populations used in this study

Genotyping Restharrows

All 411 individuals were genotyped at 10 SSR loci originally isolated from O. repens. DNA preparation and PCR were performed as described previously (Kloda et al., 2004). PCR products were visualized using capillary gel electrophoresis on an ABI-3100 DNA sequencer and the data analysed using Genotyper 2.5 (Applied Biosystems, http://www.appliedbiosystems.com). To avoid introducing inconsistencies due to rounding errors, alleles were sorted using a binning macro (Amos et al., 2007).

Microsatellite analysis and testing of the diploid species O. spinosa/O. intermedia

For the diploid species, consistency with Hardy-Weinberg (HW) proportions can be tested with the exact test (Louis and Dempster, 1987; Guo and Thompson, 1992). A HW exact test for heterozygosity deficit was performed for nine populations of O. spinosa/O. intermedia at 10 loci using the programme Genepop (Raymond and Rousset, 1995). Following Bonferroni correction of significance thresholds, the adjusted P-value for the 5% nominal level was 0.000625 and for the 1% level the P-value was 0.000113.
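As a rough illustration of what a heterozygote deficit looks like at a single diploid locus, the sketch below compares observed heterozygosity with the value expected under HW proportions. It is a descriptive calculation only, not the exact test implemented in Genepop; the genotypes (allele sizes in base pairs) and the function name are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele, allele) tuples, one per individual.
    Expected heterozygosity under HW proportions is 1 - sum(p_i^2).
    """
    n = len(genotypes)
    observed = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n  # two allele copies per diploid individual
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical genotypes at one microsatellite locus (made-up data).
genotypes = [(150, 150), (150, 154), (154, 154),
             (150, 150), (154, 158), (150, 150)]

ho, he = heterozygosity(genotypes)
f_is = 1.0 - ho / he  # positive values indicate a heterozygote deficit
print(f"Ho={ho:.3f} He={he:.3f} F={f_is:.3f}")
```

A positive value of the inbreeding-type coefficient here only suggests a deficit; whether it is significant is what the exact test addresses.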

Linkage disequilibrium (LD) was tested for all pairwise combinations of loci for each population of O. spinosa and O. intermedia. The tests were carried out using the programme FSTAT (Goudet, 1995); the log-likelihood ratio G-statistic was recalculated on the randomized data, using 36 000 random assortments. The adjusted P-value for the 5% nominal level was 0.001111 and for the 1% nominal level was 0.000222. No such tests are readily available for tetraploid data, so the tests were limited to the diploid data set.
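The adjusted LD thresholds are consistent with a Bonferroni correction over the 45 pairwise combinations of 10 loci. The text does not state the divisor, so this is our inference from the reported values. A minimal sketch of the arithmetic (the function name is illustrative):

```python
from math import comb


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


n_loci = 10                # loci genotyped in the study
n_pairs = comb(n_loci, 2)  # assumed number of tests: all locus pairs

for alpha in (0.05, 0.01):
    threshold = bonferroni_threshold(alpha, n_pairs)
    print(f"nominal {alpha}: adjusted threshold {threshold:.6f}")
```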

Spatial genetic structuring among British Restharrow populations

The possibility that Restharrow populations exhibited spatial genetic structuring was investigated through the use of the pairwise genetic distance statistic, Rho (ρ). Ronfort et al. (1998) derived ρ for tetraploid populations as an alternative to Fst, which is inappropriate for tetraploids. Rho is independent of selfing rates, is also unaffected by double reduction rates and was calculated from the microsatellite data using the programme SPAGeDi (Hardy and Vekemans, 1999, 2002; Hardy et al., 2000). Correlations between the resulting matrices were tested with a Mantel test (Mantel, 1967; Sokal, 1979) using 10 000 permutations in the programme TFPGA (Miller, 1997).
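The logic of the Mantel test can be sketched as a permutation test on two distance matrices. This is a generic illustration in plain Python, not the TFPGA implementation; the population positions and genetic distances are invented, and the one-sided P-value convention (observed value counted among the permutations) is an assumption.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, n_perm=9999, seed=0):
    """One-sided Mantel test: permute population labels of `gen`."""
    n = len(gen)
    pairs = list(combinations(range(n), 2))
    geo_vec = [geo[i][j] for i, j in pairs]

    def corr(order):
        return pearson([gen[order[i]][order[j]] for i, j in pairs], geo_vec)

    observed = corr(list(range(n)))
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel rows and columns together
        if corr(order) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical example: five populations along a transect (km).
positions = [0.0, 1.0, 3.0, 6.0, 10.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.02 * d for d in row] for row in geo]  # made-up genetic distances

r, p = mantel(gen, geo)
print(f"r={r:.3f} P={p:.4f}")
```

Permuting population labels, rather than individual matrix cells, preserves the dependence among distances that share a population, which is why an ordinary correlation test is not appropriate here.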

Comparisons of diversity across populations using PCO analysis

A matrix was constructed counting each individual plant sample as a single case and each microsatellite allele as a variable, scored as present (1) or absent (0). The composite genotypes were presented as a single row. A matrix of allele frequencies was also constructed, representing the frequency of each allele for each population. The data sets were explored using Principal Coordinates Analysis (PCO; Gower, 1966), implemented through the Multivariate Statistical Package (Kovach, 2006). This ordination method makes no assumptions about the distribution of the variates or about their population genetics. Euclidian distance was chosen in preference to other distance measures, as it does not class common absence of an allele as a shared characteristic, and was therefore judged to be most appropriate in the present study, which included highly polymorphic microsatellite data spanning two ploidy levels.
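The ordination described above can be sketched as follows: compute Euclidean distances between presence/absence rows, double-centre the squared distances (Gower, 1966) and extract the leading axes. This is a minimal standard-library sketch under our own assumptions, not the Multivariate Statistical Package implementation; the sample labels and allele scores are hypothetical, and power iteration is used only to avoid external dependencies.

```python
def euclidean(u, v):
    """Euclidean distance between two presence/absence rows."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def gower_centre(dist):
    """Double-centre -0.5 * d^2 so every row and column sums to zero."""
    n = len(dist)
    a = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def pco(rows, n_axes=2, n_iter=500):
    """Return (coordinates per axis, % of variation per axis)."""
    n = len(rows)
    b = gower_centre([[euclidean(r, s) for s in rows] for r in rows])
    total = sum(b[i][i] for i in range(n))  # trace = sum of eigenvalues
    coords, percents = [], []
    for _axis in range(n_axes):
        v = [float(i + 1) for i in range(n)]  # arbitrary start vector
        length = sum(x * x for x in v) ** 0.5
        v = [x / length for x in v]
        for _ in range(n_iter):  # power iteration for the leading axis
            w = matvec(b, v)
            norm = sum(x * x for x in w) ** 0.5
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(vi * wi for vi, wi in zip(v, matvec(b, v)))
        coords.append([vi * max(lam, 0.0) ** 0.5 for vi in v])
        percents.append(100.0 * lam / total)
        # Deflate: remove the axis just found before seeking the next one.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords, percents


# Hypothetical scores for six alleles; d* carry at most two alleles and
# t* up to four, mimicking diploid and tetraploid plants.
samples = {
    "d1": [1, 1, 0, 0, 0, 0],
    "d2": [1, 0, 0, 0, 0, 0],
    "d3": [0, 1, 0, 0, 0, 0],
    "t1": [0, 0, 1, 1, 1, 0],
    "t2": [0, 0, 1, 1, 0, 1],
    "t3": [0, 1, 0, 1, 1, 1],
}
coords, percents = pco(list(samples.values()))
for name, x, y in zip(samples, coords[0], coords[1]):
    print(f"{name}: axis1={x:+.3f} axis2={y:+.3f}")
print("variation explained (%):", [round(p, 1) for p in percents])
```

The percentage reported for each axis is its eigenvalue divided by the sum of all eigenvalues, which corresponds to the "variation represented" quoted for ordination axes. In practice a library eigen-decomposition would replace the power iteration, which converges slowly when two axes explain similar amounts of variation.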

Trees were generated based on Nei's genetic distance (Nei, 1972) using Unweighted Pair Group Method with Arithmetic Mean (UPGMA) in POPGENE 1.32 (Yeh et al., 1999) from a matrix of presence and absence of each allele. As this programme treats ‘0’ as an allelic state and therefore shared absence of an allele as a common characteristic, microsatellite alleles were scored as present (1) or as missing data (·). Trees were visualized using the programme TREEVIEW version 1.6.6 (Page, 1996).
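For reference, Nei's (1972) standard genetic distance between two populations can be sketched from allele frequencies as D = -ln(Jxy / sqrt(Jx Jy)). The example below uses the textbook multi-allele form with hypothetical frequencies; it does not reproduce how POPGENE handles alleles scored as present or missing, and the function name is illustrative.

```python
from math import log, sqrt


def nei_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance.

    pop_x, pop_y: lists with one dict per locus mapping allele -> frequency.
    Averaging over loci cancels in the ratio, so plain sums are used.
    The result is undefined when the populations share no alleles.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jx += sum(p * p for p in fx.values())
        jy += sum(q * q for q in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    identity = jxy / sqrt(jx * jy)  # Nei's genetic identity, I
    return -log(identity)


# Hypothetical single-locus allele frequencies for two populations.
pop_a = [{"a": 0.5, "b": 0.5}]
pop_b = [{"a": 0.5, "c": 0.5}]

d_ab = nei_distance(pop_a, pop_b)
d_aa = nei_distance(pop_a, pop_a)
print(f"D(a,b)={d_ab:.4f} D(a,a)={d_aa:.4f}")
```

A matrix of such pairwise distances is the input that UPGMA clusters by repeatedly merging the two closest groups and averaging their distances to the rest.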

Harton Down Hill: a sympatric site

The majority of populations sampled were allopatric, potentially as a result of different habit and soil preference (Table 1). The best example of a sympatric site in this study is Harton Down Hill, where populations 12s (O. spinosa) and 12r (O. repens) were found adjacent to one another on the ‘Harton Down Hill’ housing estate, separated only by a footpath. The separation was not perfect, with a few O. spinosa plants found growing in the O. repens patch and vice versa. Within the site, the O. spinosa (12s) patch covered a greater area than O. repens (12r). The O. repens and O. spinosa plants were classified as showing all the morphological characters typical of the individual groups. The O. intermedia (13i) plants were collected from a nearby coastal cliff, approximately 100 m away. Interestingly, the O. spinosa patch had set seed heavily, whereas the O. repens flowers had withered and fallen off without producing any seeds, appearing to be sterile at the time of sampling for this study. This example is examined in more detail, through PCO analysis, for evidence of gene flow between species where there is an overlap of habitats.

Results

In total, 247 different alleles were detected across the 10 microsatellites genotyped on 411 individuals from Britain. Up to four alleles were found per locus per individual in O. repens and O. maritima. In the samples of O. spinosa and O. intermedia, in the vast majority of cases, no more than two alleles were found per locus per individual. This is consistent with the first two being tetraploid and the last two being primarily diploids.

As microsatellites allow both alleles to be unambiguously identified in a diploid species, tests for deviations from HW equilibrium were carried out using the data from O. spinosa and O. intermedia. Excess homozygosity was only observed in the population ‘Harton Down Hill’ at a single locus (locus 3). This population is examined in more detail below, as it is also unusual in other ways. Observed excess homozygosity may be a reflection of microsatellite null alleles or population subdivision. The test was repeated for excess heterozygosity, but no cases were identified. No significant LD was detected between pairs of loci.

The PCO analysis in Figure 1a suggests two broad groups: one includes O. intermedia and O. spinosa and the other includes O. repens and O. maritima. There is limited overlap between the two groups and 15% of the variation is explained by the first two axes. No further genetic differentiation was detected within the two groups. The analysis was repeated using the data from each population summarized into allele frequencies (Figure 1b).

Figure 1

Euclidian PCO analysis for O. repens, O. intermedia, O. maritima and O. spinosa. (a) Axes 1 and 2 represent 10.3 and 4.2% of the variation present, respectively, based on an analysis for all individuals. (b) Axes 1 and 2 represent 27.0 and 11.5% of the variation present, respectively, based on population allele frequencies with each data point representing a single population.

Figure 1b shows that, at the population level, O. spinosa/O. intermedia and O. repens/O. maritima form two discrete groups in terms of allelic variation with 38.5% of the variation being explained by the first two axes.

An unrooted tree based on Nei's genetic distance is shown in Figure 2. This shows a clear division between taxa, with the exception of O. spinosa (12s) and O. intermedia (13i), which were placed in the same cluster as samples of O. repens (12r). This contrasts with the PCO analysis of the allele frequency matrix (Figure 1b), where these populations were not exceptional, and might reflect some genetic interaction between these three populations, all of which were found in close proximity at the Harton Down Hill site.

Figure 2

Unrooted UPGMA tree of genetic distance in British Restharrows. The unrooted UPGMA tree was generated in the programme POPGENE from Nei's genetic distances based on microsatellites. UPGMA, Unweighted Pair Group Method with Arithmetic Mean.

Harton Down Hill populations: a sympatric site

To investigate this unusual case further, PCO analysis was performed on microsatellite data for individual plants from O. spinosa (12s), O. repens (12r) and O. intermedia (13i) (Figure 3).

Figure 3

PCO analysis for Harton Down Hill populations. The first two axes of Euclidian PCO analysis for populations O. repens (12r), O. spinosa (12s) and O. intermedia (13i) from a sympatric site. Axes 1 and 2 represent 20.6 and 7.8% of the variation present, respectively.

In the PCO analysis, O. repens (12r) formed a relatively tight cluster, while O. spinosa (12s) and O. intermedia (13i) plants are generally intermixed, despite 12r and 12s being found together, with 13i approximately 100 m away. The PCO also suggests that more genetic variation appears to exist within the O. spinosa/O. intermedia populations than within the O. repens population, although this could possibly be a reflection of relative abundance of plants.

The O. repens patch did not appear to be clonal, based on the high level of microsatellite variation found in the samples analysed. An examination of the seed collected from the O. spinosa patch confirmed disomic inheritance of markers (data not shown).

Spatial genetic structuring within diploid and tetraploid Restharrows

In accordance with the PCO analysis results, populations were grouped into diploid (O. spinosa/O. intermedia) and tetraploid (O. repens/O. maritima) taxa. A Mantel test showed a highly significant correlation between genetic and spatial distances for tetraploid (P=0.0001) but not diploid species (P=0.1095) using the ρ estimate of pairwise genetic distance (Ronfort et al., 1998; Table 3). Similar results were obtained when the test was repeated using Nei's genetic distance (data not shown).

Table 3 Geographical (ln x) and genetic (ρ) distances between populations

Discussion

Comparisons across ploidy levels

Jackson (1976) and more recently Ramsey and Schemske (1998) have reviewed the effects of polyploidy on meiosis and genetic transmission for both allopolyploids and autopolyploids. While the situation for pure allopolyploids is usually relatively simple with disomic inheritance, the situation in segmental allopolyploids or autopolyploids is more complex, ranging from mild to complete polysomic inheritance, depending on the degree of chromosomal homology present between the chromosome sets in the polyploid. A number of groups have presented theoretical treatments for dealing with polysomic inheritance in autopolyploids (for example, Glendinning, 1989; Moody et al., 1993; Ronfort et al., 1998; DeSilva et al., 2005) often based around a proposed cytological analysis of bivalent/multivalent ratios in meiosis or on marker inheritance. The development of mapping strategies for autotetraploids has also taken the study of inheritance in such species further (Hackett and Luo, 2003; Luo et al., 2004). However, all methods which aim to estimate missing genotypic data either use defined population structures or make underlying assumptions about species population structure. Kosman and Leonard (2005) recently reviewed the use of similarity coefficients generated from molecular markers and concluded that even microsatellite markers in autopolyploids essentially behaved as dominant markers, making the accurate elucidation of genetic relationships difficult. The comparison of population genetic data within polyploids, let alone across ploidy levels, remains problematic. Here, we have investigated the use of PCO analysis of microsatellite data to provide an alternative approach for such comparisons.

Jackson (1976) summarized a number of potential mechanisms for gene flow between diploid and tetraploid complexes (such as exist for Restharrows), which include a triploid bridge (either direction), unreduced diploid pollen (2n–4n) or even spontaneous double reduction of pollen (4n–2n). The potential of these mechanisms to operate has been supported experimentally (Zohary and Nur, 1959; DeWet, 1971; Vardi, 1974; Tyrl, 1975). However, although technically possible, such mechanisms require a geographical convergence between diploid and tetraploid species. We have identified a single such case for O. repens and O. spinosa (Harton Down Hill), but convergence is unlikely to be common, because of different habitat preferences.

Potential gene flow between diploid and tetraploid species has been studied through morphological, isozyme and microsatellite markers (for example, Lumaret and Barrientos, 1990; Gauthier et al., 1998; Raker and Spooner, 2002), generally taking a pragmatic approach using the Nei identity index (Nei, 1972).

Here, we present an approach based on PCO analysis, using data from the Restharrow complex, which includes tetraploids (O. repens and O. maritima) and diploids (O. spinosa and O. intermedia). This involved three stages:

1) The inheritance and reliability of microsatellites derived from the tetraploid species was tested in the diploid species (Kloda et al., 2004). We postulated that the absence of deviations from HW proportions, whether as excess homozygosity (except at one locus in one population) or as excess heterozygosity, together with the absence of LD in the heterologous species, argued that the markers were likely to exhibit similar performance in the tetraploid species of origin, where formal tests are not possible. That all microsatellites worked equally well in the diploid taxa, although they were derived from the tetraploid O. repens, also argues for limited divergence of the two genomes. Deviations from HW could arise due to the Wahlund effect (Wahlund, 1928), inbreeding, self-fertilization, assortative mating, selection or clonal reproduction.

2) We used a PCO analysis to compare relationships based on individual plants within and between O. spinosa, O. intermedia, O. repens and O. maritima (Figure 1a).

3) We used an allele frequency-based analysis to examine relationships between populations (Figure 1b). We would not expect this to be directly biased by ploidy levels per se, as allele frequencies are used in this analysis. However, there will be an under-representation of the most common O. repens/O. maritima alleles within the allele frequency data, as alleles present in more than one copy in a polyploid individual will be counted singly. Both individual and population frequency PCO analyses give very similar results, suggesting that the analysis is valid.
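The under-representation of common alleles noted in the third stage can be illustrated with a small example. The sketch below contrasts frequencies computed from full allele dosage, which is not observable from microsatellite peaks in tetraploids, with frequencies computed when each allele is counted once per individual. The genotypes are hypothetical, and the way presence counts are converted to frequencies is our assumption rather than the procedure used in the study.

```python
from collections import Counter


def dosage_frequencies(genotypes):
    """Allele frequencies when every allele copy is counted."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def presence_frequencies(genotypes):
    """Allele frequencies when each allele is counted once per individual."""
    counts = Counter(allele for g in genotypes for allele in set(g))
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


# Hypothetical tetraploid genotypes with known dosage (made-up data).
genotypes = ["AAAB", "AAAB", "AABC"]

true_freq = dosage_frequencies(genotypes)
seen_freq = presence_frequencies(genotypes)
for allele in sorted(true_freq):
    print(f"{allele}: dosage={true_freq[allele]:.3f} "
          f"presence={seen_freq[allele]:.3f}")
```

In this made-up case the most common allele falls from two-thirds of all copies to well under half when scored by presence alone, while the rarer alleles are inflated.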

The results from this three-step approach suggest that such an analysis is informative, and from this we have concluded that there is little genetic exchange in the UK between the ploidy levels. This simple approach could have general application for such comparisons until tools and techniques to unambiguously distinguish copy numbers of markers in polyploids have been developed and are readily available in plant species. Kosman and Leonard (2005) were unable to identify an ideal approach in this situation and the potential differences identified between PCO and the subsequent UPGMA analysis may be a reflection of this. The PCO does have the advantage that it is identifying the most significant components of the molecular variation, but the disadvantage that it does not utilize all of the available molecular data.

Gene flow in UK Restharrows

Previous authors have suggested several different taxonomic treatments of the common British Restharrows, including the recognition of one, two, three or four species (Table 1). This study provides the first genetic evidence that there are restrictions to gene flow between the diploid and tetraploid species (O. spinosa/O. intermedia and O. repens/O. maritima, respectively), but not between O. repens and O. maritima nor between O. spinosa and O. intermedia.

The separate clustering of populations of O. spinosa/O. intermedia and O. repens/O. maritima based on microsatellite data suggests they are now largely reproductively isolated. The cause of reproductive isolation is not known, although it might include a number of pre-zygotic or post-zygotic barriers, such as asynchronous flowering, change in pollinator preference or triploid block (Ramsey and Schemske, 1998; Husband and Sabara, 2004).

The significant result of an ‘isolation by distance’ test for tetraploid (O. repens/O. maritima) but not for diploid species (O. spinosa/O. intermedia) would also argue for a lack of interaction between these two groupings, with different factors shaping their population distribution.

Harton Down Hill: a sympatric site

The possibility of limited gene flow at the Harton Down Hill site (Figure 1b, Figure 2) is interesting, particularly as O. spinosa plants set seed in the year of sampling and O. repens did not. Possible explanations for O. repens (12r) plants failing to produce seed in 2003 could be pollinator preference for O. spinosa, a specific response to environmental conditions in 2003 or that successful fertilization of O. repens (12r) may have been prevented due to an excess of pollination by O. spinosa. This is known as minority cytotype exclusion (Levin, 1975) and has been shown to reduce the chances of establishment of tetraploid populations at low frequency in diploid populations, due to fertilization by the predominant haploid pollen on the tetraploid cytotype (Husband, 2000; Baack, 2005). O. spinosa is most common at this location, as minority cytotype exclusion would require, but the complete absence of seed in the O. repens (12r) plants makes this an unlikely cause for the complete sterility observed. While the species proximity at Harton Down Hill may be an atypical situation (with O. repens preferring light, well-drained soils and O. spinosa preferring heavy limestone soils), the lack of seed set in the O. repens patch, together with seed analysed from the O. spinosa patch demonstrating disomic inheritance of markers, is strong evidence against extensive genetic interaction between O. repens and O. spinosa, on the rare occasions when they co-exist in the same geographical location. This supports the conclusions from the PCO analysis.

Whatever the reasons for the reproductive isolation, it is clear that it is not appropriate to consider Restharrows to be a single species in Britain. A more appropriate classification would be into two species, comprising the diploid species (O. spinosa/O. intermedia) and the tetraploid species (O. repens/O. maritima). PCO analysis was able to reveal these relationships across ploidy levels and could potentially be used in other species to investigate gene flow across ploidy levels in plant complexes.