Introduction

Winter dormancy (endodormancy) is an important adaptive strategy that enables plants to persist during periods of stressful winter conditions, including freezing temperatures, desiccation and ice encasement. The ability to develop dormancy determines whether perennial plants will survive winter and early spring without damage to shoot and flower buds, and the initiation of dormancy therefore represents a critical ecological and evolutionary tradeoff between survival and growth in most perennial plants.

Endodormancy is preceded by a stage of ecodormancy, during which growth arrest is under the control of external environmental factors but growth can resume if conditions become favorable again. In most trees, ecodormancy is known to be induced by a shortening of the daily photoperiod mediated by the phytochrome system (Howe et al., 2003). Recent evidence indicates that phytochrome involvement in photoperiod perception is mediated by plant hormones such as abscisic acid (ABA) or gibberellins (Eriksson and Moritz, 2002; Welling et al., 2002; Bertrand and Castonguay, 2003). Increased ABA levels have been shown to both induce and maintain bud dormancy in woody perennial plants (Wareing, 1956; Shimizu-Sato and Mori, 2001; Li et al., 2003) and studies in both silver birch (Betula pendula, Li et al., 2003) and black cottonwood (Populus trichocarpa, Rohde et al., 2002) show that levels of ABA increase at the initiation of bud dormancy, thus suggesting a causal link between photoperiod perception, ABA regulation and the development of dormancy in perennial plants. Recent studies suggest, however, that changes in ABA sensitivity rather than variation in actual ABA levels may drive changes in dormancy status (Chen et al., 2002).

Currently, little is known about variation in ABA responsiveness in natural plant populations or about naturally occurring variation in genes involved in regulating ABA signaling. In this paper, we analyze DNA sequence variation from a candidate gene (PtABI1B) involved in mediating ABA signaling, using samples from several populations of European aspen (Populus tremula). PtABI1B encodes a protein phosphatase 2C (PP2C); PP2Cs have been shown to act as negative regulators of ABA stress signaling pathways (Leung et al., 1997; Leung and Giraudat, 1998). Quantitative trait locus (QTL) mapping studies have indicated that one of the poplar ABI homologs, PtABI1B, maps close to QTLs for both bud flush and bud set in Populus trichocarpa (Frewen et al., 2000; Chen et al., 2002). In addition, transcripts of PtABI1B have been detected in buds at the time of bud set in the fall, and transcription of PtABI1B also increases at the time of bud flush in the spring, suggesting that this gene could be actively involved in regulating bud phenology in Populus (Chen et al., 2002).

We were initially interested in determining whether there was any evidence for adaptive genetic differentiation at PtABI1B across a latitudinal gradient representing growth seasons ranging from roughly 2–5 months. We have previously detected such adaptive differentiation at phytochromeB2, another candidate gene for dormancy initiation in Populus (Ingvarsson et al., 2006). We failed to detect a pattern consistent with adaptive population differentiation at PtABI1B, but instead uncovered a very pronounced haplotype structure. Here, we further study this unusual haplotype structure by characterizing patterns of nucleotide diversity and linkage disequilibrium (LD) across the PtABI1B locus. We discuss some of the evolutionary forces that may be acting to maintain this haplotype structure in populations of Populus tremula.

Materials and methods

PtABI1B isolation

Primers to amplify Populus homologs of the ABI1 genes from Arabidopsis thaliana were taken from Frewen et al. (2000). These primers were used to amplify a partial fragment of PtABI1B (Frewen et al., 2000) from a cDNA library constructed from leaf tissue. The resulting PCR product was cloned into the pCR2.1 vector using a TA-cloning kit from Invitrogen (Carlsbad, CA, USA) and sequenced using BigDye chemistry (Applied Biosystems Inc., Foster City, CA, USA) on an ABI377 automated sequencer at the Umeå Plant Science Centre sequencing facility. The 5′ portion of the transcript was obtained through 5′ RACE using a GeneRacer Kit from Invitrogen, and new gene-specific primers were developed from the sequenced transcript. Primers were designed to amplify the complete PtABI1B gene. The homologous sequence from P. trichocarpa was obtained by BLAST searches of the assembled genome sequence of P. trichocarpa available at http://genome.jgi-psf.org/. Although the Populus homologs of the ABI1 genes make up a small gene family (unpublished), a BLAST search of the P. trichocarpa genome sequence suggests that the PtABI1B gene is a single-copy gene.

Sequence analysis

The PtABI1B gene was amplified from a total of 22 individuals sampled from four localities in Europe. Samples were taken within a few kilometers of Besancon in eastern France (FRA, four individuals), Klagenfurt in Southern Austria (AUT, six individuals), Färjestaden in Southeastern Sweden (SWE-S, six individuals) and Umeå in northern Sweden (SWE-N, six individuals, see Ingvarsson (2005) for more details about these populations). Because P. tremula is diploid and outcrossing, these 22 individuals represent 44 independent haplotypes. PCR products were cloned into the pCR2.1 vector (Invitrogen) and at least five, and often more, clones of each fragment were sequenced to obtain both haplotypes from an individual. To ensure that we were not inadvertently studying a pair of recently duplicated paralogs, we studied a few individuals in greater detail. In these individuals, a large number of clones (>20) were sequenced and we also performed restriction digests to check for restriction patterns inconsistent with a single, diploid locus. In no case did we find more than two haplotypes among the sequenced clones, and our restriction analyses did not find any evidence suggesting a recent duplication of PtABI1B. We therefore conclude that PtABI1B is also a single-copy gene in P. tremula.

Sequences were verified manually and contigs were assembled using Sequencher v 4.0. Multiple sequence alignments were made using Clustal W (Thompson et al., 1994) and adjusted manually using BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). All sequences described in this paper have been deposited in the EMBL database (AM690392–AM690435).

Estimates of nucleotide polymorphism, LD and statistical tests of neutrality were obtained using a computer program written in C++ based on the publicly available C++ class library libsequence (Thornton, 2003). Significance of statistical tests of neutrality was evaluated by determining the null distribution for each test statistic under neutrality, by simulating 10^5 neutral genealogies using Richard Hudson's ms program (available at http://home.uchicago.edu/~rhudson1/source/mksamples.html). Briefly, simulations were conditioned on the original sample configuration, sequence lengths and the number of segregating sites observed. Population structure is known to influence both power and rejection rates of statistical tests of neutrality (Przeworski, 2002) and simulations were therefore run with samples taken from a subdivided population. The migration parameter, M=4 Nm, was set so that the expected value of FST=1−πwithin/πtotal (Charlesworth, 1998) matched that observed in the sample. Population structure in the simulations was assumed to follow Wright's Island Model with a total number of populations equal to 20. Finally, we also included recombination in our coalescent simulations, using the per-base pair recombination rate (ρ) estimated with the method of Hudson (1987). Statistical tests of neutrality generally gain power as the recombination rate increases (Wall, 1999). We therefore ran our coalescent simulations using two recombination rates, ρ or ρ/2. They gave the same results and we therefore only report the results from the simulations using the more conservative recombination rate (ρ/2).
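The simulation setup described above can be sketched as a call to ms. The following is a minimal illustration, not the command actually used in this study: the helper name ms_island_args is hypothetical, the value of M is a placeholder, and the flags (-s, -I, -r) are used as we understand them from the ms documentation, so they should be checked against the installed version before use.

```python
def ms_island_args(sample_sizes, n_demes, n_reps, seg_sites, M,
                   rho_per_site, n_sites):
    """Build an argument list for ms under a symmetric island model.

    sample_sizes : haplotypes drawn from each sampled deme, e.g. [8, 12, 12, 12]
    n_demes      : total number of demes; unsampled demes get sample size 0
    M            : 4Nm, the scaled migration rate (placeholder value in the example)
    rho_per_site : per-site population recombination rate (4Nr per bp)
    """
    if len(sample_sizes) > n_demes:
        raise ValueError("more sampled demes than demes in the model")
    config = list(sample_sizes) + [0] * (n_demes - len(sample_sizes))
    nsam = sum(config)
    # ms takes the recombination parameter for the whole locus,
    # i.e. the per-site rate times the number of intervals between sites.
    rho_locus = rho_per_site * (n_sites - 1)
    args = ["ms", str(nsam), str(n_reps), "-s", str(seg_sites)]
    args += ["-I", str(n_demes)] + [str(n) for n in config] + [str(M)]
    args += ["-r", f"{rho_locus:.4f}", str(n_sites)]
    return args


# Assumed inputs: 44 haplotypes in four sampled demes out of 20, 32 segregating
# sites, 2669 aligned sites, a halved per-site rho, and a hypothetical M of 10.
args = ms_island_args([8, 12, 12, 12], 20, 100000, 32, 10.0, 0.00145, 2669)
print(" ".join(args))
```

Building the argument list rather than a shell string makes it straightforward to pass to a process launcher and to vary M until the simulated FST matches the observed value.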

We also used a sliding-window approach to study how patterns of polymorphism varied across the PtABI1B gene region. To determine the significance of summary statistics calculated in individual window segments, we used coalescent simulation of an infinite island model with the same number of alleles that were present in the four populations sampled (8, 12, 12 and 12, for a total of 44). Simulations were run conditional on the number of segregating sites observed in each window segment and the migration parameter, M=4 Nm, was set so that the expected FST matched the estimate of FST across the entire PtABI1B gene.
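The FST estimate used to calibrate M, FST=1−πwithin/πtotal, can be illustrated with a short sketch. This is an assumption-laden example rather than the libsequence-based code used here: πwithin is taken as the unweighted mean of within-population diversities, sequences are assumed to be aligned strings of equal length without gaps, and all function names are hypothetical.

```python
from itertools import combinations


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def fst(populations):
    """FST = 1 - pi_within / pi_total.

    populations : list of populations, each a list of at least two aligned
                  haplotype strings.
    """
    # Assumption: within-population diversity is an unweighted mean.
    pi_within = sum(mean_pairwise_differences(p) for p in populations)
    pi_within /= len(populations)
    pooled = [s for p in populations for s in p]
    pi_total = mean_pairwise_differences(pooled)
    return 1.0 - pi_within / pi_total


# Toy data: two completely differentiated populations, and two identical ones.
fixed = fst([["AA", "AA"], ["TT", "TT"]])
shared = fst([["AA", "TT"], ["AA", "TT"]])
```

With this estimator, FST can be slightly negative when populations are no more differentiated than expected by chance, as in the second toy example.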

LD extends only a few hundred base pairs in P. tremula (Ingvarsson, 2005), and signals of balancing selection are expected to affect only small genomic regions (Charlesworth, 2006). We therefore used a fixed window size of 200 bp in our sliding window analyses; smaller window sizes resulted in zero polymorphism in many windows, and with larger windows one runs the risk of averaging over too many sites with independent evolutionary histories, thereby reducing the power to detect deviations from neutral expectations. The sliding window analysis is clearly exploratory and should be viewed as a method for identifying regions worth studying in greater detail. We therefore do not correct for multiple testing in the analysis.
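A sliding-window calculation of this kind can be sketched as follows. The example is a simplified illustration, not the program used in this study: it assumes gap-free aligned strings, uses the 200 bp window described above together with a 20 bp step, and the function names are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for aligned sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * len(seqs[0]))


def sliding_pi(seqs, window=200, step=20):
    """Return (window start, pi) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


# Toy alignment: the second half differs at every site between the two sequences.
toy = ["A" * 10, "A" * 5 + "T" * 5]
windows = sliding_pi(toy, window=5, step=5)
```

The same loop can be reused for other per-window statistics, such as divergence or Tajima's D, by swapping the function applied to each chunk.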

Results

Nucleotide polymorphism and divergence

We obtained the complete coding sequence and all intervening introns of PtABI1B from 44 haplotypes. The total aligned region, including indels, covered 2669 bp. We found a total of 32 segregating sites and, overall, sequence diversity was low (π=3.2 × 10^−3 and θ=3.0 × 10^−3) compared to other genes in P. tremula (Ingvarsson, 2005; Ingvarsson et al., 2006; Table 1). Synonymous site diversity was also two- to fivefold lower compared to previous estimates from P. tremula (range of πS 11.1–30.3 × 10^−3, Ingvarsson, 2005). Interestingly, polymorphism at noncoding sites, in introns and the 5′-untranslated region, is approximately threefold greater than diversity at synonymous sites (πnc=3.59 × 10^−3). Jukes–Cantor corrected divergence from the outgroup species Populus trichocarpa is 0.0152, and this is also two to five times lower than other P. tremula genes, where divergence from P. trichocarpa averages 0.0435 (Ingvarsson, 2005). The low levels of silent polymorphism and divergence therefore suggest that PtABI1B could be located in a genomic region with an unusually low mutation rate.
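For readers less familiar with the two diversity estimators reported above, the following sketch shows how π and Watterson's θ are computed from an alignment. It is a simplified illustration under stated assumptions (gap-free aligned strings, all sites counted, hypothetical function names) and is not the libsequence-based program used for the estimates in Table 1.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that carry more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def harmonic(n):
    """a_n = sum of 1/i for i = 1 .. n-1, the scaling term in Watterson's estimator."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(s, n, length):
    """Per-site theta from S segregating sites in n sequences of a given length."""
    return s / (harmonic(n) * length)


def pi_per_site(seqs):
    """Mean pairwise differences per site."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * len(seqs[0]))


toy = ["AAAA", "AAAT", "AATT", "AATT"]
theta_toy = watterson_theta(segregating_sites(toy), len(toy), len(toy[0]))
pi_toy = pi_per_site(toy)

# Plugging in the counts reported above: 32 sites, 44 haplotypes, 2669 bp.
theta_from_counts = watterson_theta(32, 44, 2669)
```

Using the full aligned length of 2669 bp, the reported counts give a θ of roughly 2.8 × 10^−3, close to but not identical with the value in the text. Exact agreement should not be expected from this sketch, because it does not account for how alignment gaps were handled.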

Table 1 Estimates of nucleotide variation and statistical test of neutrality at PtABI1B

The most striking feature of the polymorphism data at PtABI1B was the marked excess of polymorphism at nonsynonymous sites. In total, 18 of 20 polymorphic sites in the coding region are nonsynonymous (Table 2). Also, nucleotide diversity at nonsynonymous sites was πa=3.5 × 10^−3 compared to πs=1.10 × 10^−3 at synonymous sites, yielding a πa/πs ratio of 3.18 (Table 1). A McDonald–Kreitman (MK) test, which contrasts the rates of divergence with the levels of standing polymorphism within a species at synonymous and nonsynonymous sites, was performed using P. trichocarpa as an outgroup (McDonald and Kreitman, 1991). The MK test shows a significant deviation from neutral expectations at PtABI1B (P=0.036). Such a deviation from neutral expectations could be caused either by an excess of nonsynonymous mutations or by a deficit of synonymous mutations. If polymorphism and divergence at synonymous sites are contrasted with polymorphism and divergence at noncoding sites in the PtABI1B region, there is no evidence for any departure from neutrality (P=0.441). On the other hand, a similar test contrasting nonsynonymous and noncoding sites approaches significance (P=0.052), and a contrast between nonsynonymous and all silent sites in the PtABI1B region (synonymous+noncoding) was significant (P=0.042), suggesting that the significant MK test at PtABI1B is caused by an excess of nonsynonymous polymorphisms segregating within P. tremula rather than a deficit of polymorphisms at synonymous sites.
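The logic of an MK-type contrast can be sketched as a test of independence on a 2 × 2 table of polymorphic versus fixed differences for two site classes. The example below uses a two-sided Fisher's exact test, which is an assumption on our part, since the text does not state which test of independence was applied. The polymorphism counts (18 nonsynonymous, 2 synonymous) come from the text; the fixed-difference counts are hypothetical placeholders and do not reproduce Table 2.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: nonsynonymous, synonymous. Columns: polymorphic, fixed.
poly_nonsyn, poly_syn = 18, 2      # from the text
fixed_nonsyn, fixed_syn = 5, 6     # hypothetical values, for illustration only
p_value = fisher_exact_two_sided(poly_nonsyn, fixed_nonsyn, poly_syn, fixed_syn)
```

The same function applies to the other contrasts described above (synonymous versus noncoding, nonsynonymous versus noncoding) by substituting the corresponding rows of counts.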

Table 2 Polymorphism and divergence at nonsynonymous, synonymous and noncoding sites

Haplotype structure and LD

Two nonsynonymous mutations in exon 3, separated by 15 bp, are linked in repulsion (Figure 1). These two sites (E321K, nucleotide site 1221 and N325K, nucleotide site 1235) are also in LD with several other synonymous and nonsynonymous variants across the entire coding sequence and define two distinct clades at PtABI1B (Figure 1, Figure 2). There are also five nonsynonymous mutations (including the E321K and N325K mutations), a 170 bp deletion in intron 4 (IND1) and four additional mutations in intron 4 that are fixed between the two clades. All but one of the 44 sequenced alleles carry either of these two variants. In addition, one sequence appears to be a recombinant between the two major clades (Figure 1). Although the two clades contain roughly similar levels of sequence diversity (πHap1=0.0014 and πHap2=0.0011), patterns of polymorphism differ substantially; clade 1 has an excess of nonsynonymous polymorphisms (πa/πs=2.43), whereas clade 2 has an excess of synonymous polymorphisms (πa/πs=0.67).

Figure 1

Alignment of polymorphic sites in the PtABI1B gene region. Residue numbers are indicated above each site and each site is indicated as either a fixed difference relative to P. trichocarpa (F), nonsynonymous polymorphism (N), synonymous polymorphism (S) or polymorphisms occurring in flanking regions or introns (I). All residues identical to the top sequence are denoted by dots (.).

Figure 2

Gene genealogies for 44 alleles of PtABI1B in Populus tremula. PtABI1B-C1 and PtABI1B-C2 refer to the two major clades identified at PtABI1B. The genealogy was constructed using the neighbor-joining method with p-distances in MEGA 3.0.

A sliding window analysis shows two peaks of high polymorphism, one located in exon 2 and one in exon 3 (Figure 3). The peak in exon 3 occurs in the vicinity of the two nonsynonymous variants that define the two major clades at PtABI1B (Figure 1, Figure 2). Intraclade polymorphism in these regions is relatively low and the excess diversity seen in the total sample thus represents variants that are fixed between the two clades (Figure 3). These regions are also characterized by a frequency spectrum with an excess of intermediate frequency variants, as evidenced by large positive values of Tajima's D (Figure 3). However, when averaged across the entire gene, the frequency spectrum does not deviate from neutral expectations as Tajima's D is slightly positive, but nonsignificant (D=0.239, Table 1).
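Tajima's D, used above to summarize the frequency spectrum, compares the mean number of pairwise differences with the number of segregating sites scaled by sample size. The sketch below follows the standard formulation of the statistic; it assumes gap-free aligned strings, the function name is hypothetical, and it is not the program used to produce Table 1 or Figure 3.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences.

    Returns None when there are no segregating sites.
    """
    n = len(seqs)
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s == 0:
        return None
    pairs = list(combinations(seqs, 2))
    # k: mean number of pairwise differences (not scaled per site).
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Intermediate-frequency variants give a positive D; singletons give a negative D.
d_intermediate = tajimas_d(["AAAA", "AAAT", "AATT", "AATT"])
d_singletons = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA", "AAAT"])
```

Applied within 200 bp windows, a function like this yields the per-window values whose significance is then judged against the simulated null distribution.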

Figure 3

Sliding window plot of nucleotide polymorphism, π (top), divergence from P. trichocarpa (middle) and Tajima's D (bottom). A window of width 200 bp was moved along the sequence in 20 bp increments and statistics were calculated for each window segment. Exon–intron structure of PtABI1B is indicated by black boxes (exons) and thin lines (introns) at the bottom of the graph.

Despite the extensive LD observed across PtABI1B, there is evidence for recombination; the minimum number of recombination events inferred from the four-gamete test is 7 and the per site estimate of the recombination rate using Hudson's (1987) estimator is 2.9 × 10^−3, in line with estimates from other genes in P. tremula. However, the average pairwise correlation among sites, ZnS (Kelly, 1997), is 0.194 and coalescent simulations (including recombination) show that LD among polymorphic sites at PtABI1B is significantly higher than expected under neutrality (Table 1). Another sign of the pronounced structuring at PtABI1B is the fact that LD extends over most of the sequenced region, covering approximately 2.6 kb (Figure 1). Normally, LD decreases to negligible levels within 500 bp in P. tremula (Ingvarsson, 2005).
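The two summaries used above, the four-gamete lower bound on recombination events and ZnS, can be sketched as follows. This is an illustrative implementation under simplifying assumptions: only biallelic sites are used, sequences are gap-free aligned strings, and the function names are hypothetical. The lower bound is obtained by counting the largest set of non-overlapping intervals between pairs of sites that show all four gametes.

```python
from itertools import combinations


def biallelic_columns(seqs):
    """Alignment columns with exactly two alleles, as tuples of characters."""
    return [col for col in zip(*seqs) if len(set(col)) == 2]


def four_gamete_rm(seqs):
    """Lower bound on recombination events from the four-gamete test."""
    cols = biallelic_columns(seqs)
    # A pair of sites with all four gametes implies recombination between them.
    intervals = [(i, j) for i, j in combinations(range(len(cols)), 2)
                 if len(set(zip(cols[i], cols[j]))) == 4]
    # Greedy selection of disjoint open intervals, sorted by right end point.
    intervals.sort(key=lambda ij: ij[1])
    rm, last_end = 0, -1
    for i, j in intervals:
        if i >= last_end:
            rm += 1
            last_end = j
    return rm


def r_squared(col_a, col_b):
    """Squared allelic correlation between two biallelic sites."""
    n = len(col_a)
    a0, b0 = col_a[0], col_b[0]
    p_a = sum(x == a0 for x in col_a) / n
    p_b = sum(y == b0 for y in col_b) / n
    p_ab = sum(x == a0 and y == b0 for x, y in zip(col_a, col_b)) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def zns(seqs):
    """Average r-squared over all pairs of biallelic sites."""
    pairs = list(combinations(biallelic_columns(seqs), 2))
    return sum(r_squared(a, b) for a, b in pairs) / len(pairs)


recombined = ["AA", "AT", "TA", "TT"]   # all four gametes present
structured = ["AA", "AA", "TT", "TT"]   # two perfectly associated sites
```

In the toy data, the first sample requires at least one recombination event and shows no association between sites, whereas the second shows complete association and no evidence of recombination.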

Discussion

Both nucleotide diversity and divergence from P. trichocarpa were two to five times lower at PtABI1B (Table 1) than at other genes in P. tremula (Ingvarsson, 2005), making it likely that PtABI1B is located in a region of the genome characterized by a low mutation rate. Despite the overall low levels of polymorphism at PtABI1B, there is a marked excess of nonsynonymous polymorphisms in the coding region of PtABI1B (Table 2). The polymorphism data are organized into two very distinct allelic clades (Figure 1), which extend across much of the gene region. These two clades are defined by two nonsynonymous variants located in the PP2C domain of PtABI1B that constitutes the most important active domain of the PtABI1B protein (Rodriguez, 1998). These variants are also in LD with several other nonsynonymous variants across the entire PtABI1B gene (Figure 1) and with a 170 bp deletion in intron 4 (IND1).

A sliding window analysis shows two peaks of elevated nucleotide polymorphism at PtABI1B (Figure 3). The regions of elevated polymorphism are also associated with frequency spectra that are skewed toward an excess of mutations occurring at intermediate frequencies (Tajima's D>0, Figure 3). Population subdivision or admixture can result in spurious haplotype structuring and positive Tajima's D. However, the pattern observed at PtABI1B is not likely to have been caused by population subdivision. First, the two major clades at PtABI1B are present in roughly equal frequencies in the four populations sampled (frequencies of clade 1 in the four populations are FRA=0.70, AUT=0.70, SWE-S=0.83 and SWE-N=0.58). This is also evident from the low genetic differentiation observed at PtABI1B (FST=0.004), which is actually lower than at other genes surveyed in the same set of populations, arguing against population subdivision as a cause of the haplotype structure (Ingvarsson, 2005; Ingvarsson et al., 2006). Admixture could also result in strong haplotype structure and LD. However, effects of recent admixture should be evident throughout the genome, yet there are no such indications and population structure is overall very low in P. tremula (Ingvarsson, 2005; Ingvarsson et al., 2006). It is possible that one of the haplotypes has introgressed through hybridization with a closely related species. P. tremula is known to hybridize with P. alba in regions where their distributions overlap (Lexer et al., 2005). However, gene flow in hybrid zones between the two species appears to be biased toward introgression from P. tremula to P. alba (Lexer et al., 2005). Whether one of the allelic clades arose through introgression from P. alba needs further investigation, preferably by sequencing the PtABI1B gene in P. alba as well. Nevertheless, although hybridization can help explain the origin of the two different clades, it still does not explain why the two haplotypes appear to be maintained in P. tremula.

LD does not decrease with distance at PtABI1B, whereas it decreases to negligible levels in <500 bp in all other genes in P. tremula that have been studied to date (Ingvarsson, 2005; Ingvarsson et al., 2006). These observations make it likely that some form of natural selection is the cause of the strong haplotype structure at PtABI1B in P. tremula. Selective sweeps can transiently increase levels of LD, although this should also lead to a reduction in levels of polymorphism in surrounding areas (Nielsen, 2005; Charlesworth, 2006). Even a partial sweep of one of the clades is thus inconsistent with the data, as synonymous site diversity is roughly equal in the two clades, making it unlikely that one of them has recently increased in frequency (Charlesworth, 2006). Balancing selection, on the other hand, can maintain polymorphisms for extended periods of time, resulting in the accumulation of mutations at sites in close proximity to the balanced polymorphism, yielding a frequency spectrum that is skewed toward alleles at intermediate frequencies (Charlesworth, 2006). Innan and Tajima (1999) showed that if balancing selection is maintaining two distinct alleles at a locus at roughly constant frequencies, the sum of pairwise differences within the two classes is roughly constant and equal to θ, regardless of the strength and pattern of selection. The sum of silent site diversity within the two clades at PtABI1B equals πsum=0.0026, which is not significantly different from the silent site diversity observed in the total sample (π=0.0029). A model of constant balancing selection thus appears to provide a possible explanation for the PtABI1B data.
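The comparison between the summed within-clade diversity and the diversity of the total sample can be sketched as below. This is only an illustration of the bookkeeping, with hypothetical function names and toy data; it assumes gap-free aligned strings, treats all sites alike rather than restricting to silent sites, and does not include the significance test referred to above.

```python
from itertools import combinations


def pi_per_site(seqs):
    """Mean pairwise differences per site; 0.0 if fewer than two sequences."""
    if len(seqs) < 2:
        return 0.0
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * len(seqs[0]))


def within_class_pi_sum(seqs, labels):
    """Sum of per-site diversity within each allelic class."""
    classes = {}
    for seq, label in zip(seqs, labels):
        classes.setdefault(label, []).append(seq)
    return sum(pi_per_site(group) for group in classes.values())


# Toy data: two classes distinguished by the first two sites.
toy = ["AAAA", "AAAT", "TTAA", "TTAT"]
clades = [1, 1, 2, 2]
pi_sum = within_class_pi_sum(toy, clades)
pi_total = pi_per_site(toy)
```

In practice the clade labels would be assigned from the variants that define the two clades, and the two quantities would be computed on silent sites only.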

A closer examination of the high πa/πs ratio at PtABI1B shows that it is partially explained by four nonsynonymous mutations that are fixed between the two major clades. In addition to these nonsynonymous mutations, several synonymous mutations and the deletion in intron 4 are also fixed between the two clades. The peculiar haplotype structure seen at PtABI1B thus has all the hallmarks of a balanced polymorphism, including distinct allelic clades, high LD and an excess of sites at intermediate frequencies.

The extent to which a genomic region surrounding a balanced polymorphism is affected critically depends on the local recombination rate. Nordborg and Innan (2003) showed that the likelihood of observing the signature of balancing selection is a function of both the population mutation rate, θ=4 Nμ, and population recombination rate, ρ=4 Nr. The mutation rate determines the amount of neutral diversity in a population, whereas the recombination rate determines the degree of associations among sites. In highly selfing species, the recombination to mutation ratio will be low because the effective recombination rate is low in highly inbred genomic backgrounds (Nordborg, 2000). The signature of balancing selection is therefore expected to extend over large distances in inbred species (Nordborg et al., 1996; Charlesworth, 2006). In an outcrossing species, such as P. tremula, the recombination to polymorphism ratio is expected to be high and all but the most tightly linked sites will be uncoupled from a balanced polymorphism and will effectively behave as neutral sites. The power to detect balancing selection is therefore low in most outcrossing species (Nordborg et al., 1996; Charlesworth, 2006). It is thus not surprising that, besides the notable exceptions of self-incompatibility loci (Charlesworth et al., 2003), relatively few cases of balancing selection have been documented in outcrossing plants (Charlesworth, 2006). Recombination does not appear to be unusually low in the PtABI1B region; sequence-based estimates of the recombination rate at PtABI1B (using the method of Hudson (1987)) yield a per site recombination rate of ρ=2.39 × 10^−3/bp, which is in line with estimates from other genes in P. tremula (Ingvarsson, 2005). Therefore, even if the putative signal of balancing selection can be detected across most of the 2.6 kb region of the PtABI1B gene, the likelihood of detecting this unusual haplotype structure without actually targeting the core PP2C region of PtABI1B appears to be low. This is also evidenced by the fact that while there is an excess of sites at intermediate frequencies in the core PP2C regions, there is no such excess when averaged across the entire gene (Table 1). This likely occurs because the evolutionary histories of sites in the extremes of the PtABI1B gene are decoupled from that of the sites in the core PP2C region.

Although several signs suggest that the PtABI1B haplotype structure is being maintained by selection, we can only speculate about the functional basis for the polymorphism. Data from homologous genes in Arabidopsis thaliana suggest that sequence variants in these genes can result in either increased or reduced responsiveness to ABA (Merlot et al., 2001). In A. thaliana, both mutants from which the ABI1 and ABI2 genes were initially described carry the same Gly to Asp substitution in the PP2C domain of the ABI protein (Leung et al., 1997). These variants are both partially dominant and result in reduced ABA sensitivity of the mutant plants (Leung et al., 1997). The homologous amino acid is conserved in our sequences from P. tremula, but it is interesting to note that the two amino-acid polymorphisms that make up the two distinct haplotypes at PtABI1B are located only 100 bp downstream from the site of the A. thaliana mutations. The physical location of the two variants (E321K and N325K), in the core PP2C region, suggests that they could be involved in altering sensitivity to ABA. However, more work is needed to work out the functional significance of these variants and to understand possible targets of balancing selection.

Charlesworth (2006) argued that short-term balancing selection is probably far more likely than the more well-studied cases of long-term balancing selection, such as those associated with self-incompatibility loci or certain disease resistance polymorphisms. The results presented here highlight the difficulties inherent in locating possible candidates for ‘weak’ balancing selection in outcrossing species. Unless selection is strong and/or the region is associated with a reduced recombination rate, the genomic signal of balancing selection will be weak and such regions will be hard to identify through various tests of selection (Charlesworth, 2006). Although weak balancing selection may be hard to detect at the genome level, such sites could have a significant impact on quantitative variation seen in natural populations, including variation in many fitness-related traits. How much of the standing genetic variation in quantitative traits is due to weak balancing selection has remained controversial (Johnson and Barton, 2005). However, with the accumulation of genomic data from a large number of species, we are now in a position where it is possible to start addressing questions on the genetic basis of quantitative variability in natural populations.