Introduction

Polyploidisation, the duplication of whole genomes, has long been known as a major mechanism of abrupt speciation in plants, and current research shows it to have important consequences for subsequent genome and transcriptome evolution (Soltis et al., 2004b; Comai, 2005; Hegarty and Hiscock, 2008; Leitch and Leitch, 2008). It can be difficult to untangle the effects of polyploidisation per se from the effects of hybridisation, which often accompanies it, and/or from the effects of evolutionary changes in subsequent generations. Doing so requires examining every stage from progenitor diploids through young polyploids to ancient polyploids, but no single natural system currently presents such a series. Instead, we must use different species of natural polyploids of different ages to investigate the temporal dynamics of polyploid evolution, often using artificial crosses to investigate the effects of hybridisation, and synthetic polyploids to investigate the instantaneous effects of genome duplication.

Such studies have shown that, at the level of the genome, polyploids tend towards a more diploid-like state, through downsizing and loss of duplicated genes (homoeologues) (Wolfe, 2001; Leitch and Bennett, 2004; Chen et al., 2007; Grover et al., 2007). Fragments of DNA seem to have been lost through homoeologous recombination in synthetic allopolyploids (Liu et al., 1998; Ozkan and Feldman, 2001; Shaked et al., 2001; Kashkush et al., 2002; Ozkan et al., 2002; Gaeta et al., 2007). ‘Genomic shock’ caused by whole-genome duplication may lead to gene silencing (McClintock, 1984; Comai et al., 2003), and changes to the transcriptome have been found in synthetic allopolyploids (Wang et al., 2004; Hegarty and Hiscock, 2008). However, we still know relatively little about natural genomic and transcriptomic changes during the early generations after polyploid formation, mainly because of the scarcity of young natural polyploid species for which genetic resources are available (Buggs, 2008).

The polyploid plant species Tragopogon miscellus provides an excellent evolutionary model (Soltis et al., 2004b) in which to investigate evolution in the early generations post-polyploidisation. It arose naturally <80 years ago through allopolyploidisation, with the diploid species T. dubius and T. pratensis as parents (Ownbey, 1950; Soltis et al., 2004a). These parents are phylogenetically divergent (Mavrodiev et al., 2005; Buggs et al., 2008); therefore, homoeologues in T. miscellus can be distinguished by sequence differences (for example, Tate et al., 2006). The species formed repeatedly in different localities from separate populations of the diploid progenitors (reviewed in Soltis et al., 1995, 2004a), giving replicated independent natural allopolyploid lines. In addition, we now have multiple, reciprocal synthetic allopolyploid lines of the species (Tate et al., in press). These resources allow us to examine the instantaneous effects of allopolyploidisation, as well as its effects over 40 generations (80 years) in this biennial species.

Here, we use the T. miscellus system to examine the rapid evolution of 10 sets of homoeologous genes in multiple individuals from five independent natural lineages and 44 first-generation synthetic allopolyploids. Distinguishing between homoeologues using single-nucleotide polymorphisms (SNPs), we ask whether loss or silencing of these genes has occurred immediately on polyploidisation (that is, within the first generation), more gradually but within 40 generations, or not at all within so short a time frame. We also examine the possibility that there are concerted mechanisms for gene loss, by determining whether there are repeated patterns of loss or silencing among independent origins.

Materials and methods

Plant materials

Seeds from five natural populations of T. miscellus (Table 1), together with the seed of the diploid parent species, T. dubius and T. pratensis, were germinated and grown under controlled conditions in a greenhouse at the University of Florida (Gainesville, FL, USA). The five natural populations of T. miscellus all formed independently and included lineages of reciprocal origin: the short-liguled form (with T. pratensis as the maternal parent) from Moscow, ID, and Garfield, Oakesdale and Spangle, WA, USA, and the long-liguled form (with T. dubius as the maternal parent) from Pullman, WA, USA (Soltis and Soltis collection numbers for all populations are given in Table 1). Samples of T. dubius and T. pratensis were obtained from the same locations (except Pullman, where T. pratensis is not currently found (Novak et al., 1991)) and analysed with their progeny to ensure that the parents of the allopolyploid lineages were not heterozygous for the markers surveyed. Field-collected seeds were used from Oakesdale, Spangle and Garfield populations; seeds used from Moscow and Pullman populations were derived from plants grown in the greenhouse (at Washington State University, Pullman, WA, USA) from field-collected seeds and allowed to self-fertilise for one generation.

Table 1 Natural populations analysed

In addition, synthetic allopolyploids, formed from reciprocal hybridisation between T. dubius and T. pratensis followed by colchicine treatment (Tate et al., in press), were examined. Flowering was induced by cold treating the raw synthetic allopolyploids (S0 generation) for 3 months in a growth chamber at 10 °C under short-day-length conditions (8 h of light). After 3 months, the plants were placed back in the greenhouse and allowed to flower and self-fertilise. Capitula were enclosed in glassine envelopes (Paper Mart, Los Angeles, CA, USA) before the heads opened and maintained thus until mature achenes formed. Seeds from these initial synthetic allopolyploids (S1 generation) were germinated and grown in the same manner as the earlier generation. Leaf tissue for this study was sampled from 44 S1 plants, representing progeny from six initial crosses (see Supplementary Material, Table S1).

DNA and RNA extraction

Leaf tissue for the natural diploid and allopolyploid populations and synthetic allopolyploids was collected from seedlings 4 weeks after germination and flash frozen in liquid nitrogen (Tate et al., 2006). DNA was extracted from leaf tissue using a modified cetyl trimethylammonium bromide protocol (Doyle and Doyle, 1987), after homogenising tissue in a bead mill (BioSpec, Bartlesville, OK, USA). RNA was extracted using the RNeasy kit from Qiagen (Valencia, CA, USA) and treated with DNase, using the DNA-free kit from Ambion (Austin, TX, USA), to eliminate residual genomic DNA. First-strand cDNA synthesis was carried out on 500 ng RNA using SuperScript II reverse transcriptase (Invitrogen, Carlsbad, CA, USA) and random hexamer primers.

Homoeologue identification

Arabidopsis thaliana genes (Paterson et al., 2006) were tblastn-aligned to 2081 Tragopogon, 227 016 Lactuca and 146 701 Helianthus expressed sequence tags (ESTs), with a minimum criterion of E<0.001. After filtering for ⩽50% identity, 807 Lactuca, 0 Tragopogon and 262 Helianthus ESTs matched at the translated protein level. The best Lactuca EST was then determined for 78 A. thaliana genes and blastn-aligned to A. thaliana genomic DNA (minimum E<1 × 10⁻¹⁰). From this search, 22 A. thaliana loci were aligned to these ESTs, each of which spanned one to three introns (based on the A. thaliana primary sequence).

Primer design and PCR

Primers were designed both directly from the Arabidopsis genomic sequence using Primer3 (Rozen and Skaletsky, 1999) and from an alignment of Arabidopsis genomic and cDNA and Lactuca ESTs using Primaclade (Gadberry et al., 2005) (Table 2). Primers were positioned in exons and separated by at least one intron, so that genomic and cDNA amplicons would be of different sizes and distinguishable on agarose gels. These primers successfully amplified 10 genes from Tragopogon genomic DNA and nine genes from cDNA. In the case of gene S13, for which primers did not work with cDNA, several additional sets of primers were designed, yet they failed to amplify in either polyploids or diploids, perhaps because the gene was not expressed in the rosette leaf tissue at the time of collection.

Table 2 Homoeologous loci examined in T. miscellus with genomic and cDNA CAPS analysis

Genomic and cDNA fragments were amplified in 25 μl volume with 50–100 ng template, 20% final volume Promega 5 × sequencing buffer (Promega, Madison, WI, USA), 10% final volume 5 M betaine, 1 mM MgCl2, 0.4 mM dNTPs, 0.2 μM of each primer and 0.4 units of Promega Taq polymerase. Most of the primers were designed with annealing temperatures close to 60 °C; a representative set of thermocycling conditions is as follows: 94 °C for 2 min, followed by two cycles of 94 °C for 30 s, 60 °C for 30 s and 72 °C for 1 min, followed by three further sets of two cycles in which the annealing temperature was dropped to 58, 56 and 54 °C for each set, followed by 27 cycles with an annealing temperature of 52 °C. Gene fragments were amplified from the two diploid parents, T. dubius and T. pratensis, and run out on agarose gels. The PCR product was excised from the agarose gel, cleaned using the Wizard SV gel clean-up kit (Promega) and cloned using the Topo-TA cloning kit (Invitrogen). Two to four clones were sequenced using M13 forward and reverse primers on an ABI 3730 capillary sequencer, and fragments were assembled into contigs using Sequencher version 4.8 (Gene Codes, Ann Arbor, MI, USA). Sequences of the genes successfully amplified from T. dubius and T. pratensis genomic DNA were submitted to GenBank (accession numbers FJ708502-FJ708521).
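The representative touchdown programme described above can be written out as an explicit list of cycles. The following is a minimal sketch only: the function name and tuple layout are illustrative, and it assumes that the 30 s denaturation, 30 s annealing and 1 min extension times stated for the first two cycles apply to all later cycles as well, which the text does not state explicitly.

```python
# Sketch of the representative touchdown PCR programme.
# Assumption: step durations given for the first two cycles are reused
# for every later cycle; only the annealing temperature changes.

def touchdown_program():
    """Return (initial_denaturation, cycles).

    Each cycle is a tuple of three (temperature_C, seconds) steps:
    denaturation, annealing and extension.
    """
    initial = (94, 120)  # 94 C for 2 min
    # (annealing temperature in C, number of cycles at that temperature)
    schedule = [(60, 2), (58, 2), (56, 2), (54, 2), (52, 27)]
    cycles = []
    for anneal_temp, n_cycles in schedule:
        for _ in range(n_cycles):
            cycles.append(((94, 30), (anneal_temp, 30), (72, 60)))
    return initial, cycles


initial, cycles = touchdown_program()
total_cycles = len(cycles)
anneal_temps = [cycle[1][0] for cycle in cycles]
print(total_cycles, sorted(set(anneal_temps), reverse=True))
```

Written this way, the programme amounts to eight touchdown cycles (two each at 60, 58, 56 and 54 °C) followed by 27 cycles at 52 °C.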

Sequence alignment, SNP detection and restriction site detection

Sequences were aligned using MacClade (Maddison and Maddison, 2004) and the SNPs identified. Each SNP was investigated, using DNA Strider 1.1 (Douglas, 1995), to determine whether it was part of a restriction site that would differentiate the two diploid parental species. In most cases, a restriction site difference was identified in at least one exon for each gene, and these were then used to construct CAPS markers (Konieczny and Ausubel, 1993; Tate et al., 2006). In an allopolyploid, there are two copies of each gene, with each homoeologue possessing two alleles (if an allopolyploid has been formed by genome duplication of an F1 hybrid, the two alleles of each homoeologue will be identical). Both alleles of a given homoeologue must be lost for the loss to be visualised by CAPS analysis. Therefore, there may be some gene losses that went undetected in this study (for example, one of two copies of a gene).
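The search for SNPs that fall within a diagnostic restriction site can be illustrated with a short sketch. This is not the procedure carried out in DNA Strider; it is a simplified stand-in that assumes two gap-free parental sequences of equal length and a small, hand-picked set of palindromic recognition sites (so only one strand needs scanning). The two sequences are made-up examples, not Tragopogon data.

```python
# Illustrative enzyme set: three real enzymes with palindromic sites.
ENZYMES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def site_positions(seq, site):
    """Return the 0-based start of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def diagnostic_enzymes(seq_a, seq_b, enzymes=ENZYMES):
    """Return enzymes whose site positions differ between two sequences.

    Assumes the sequences are already aligned without gaps, so that
    positions are directly comparable.
    """
    result = {}
    for name, site in enzymes.items():
        pos_a = site_positions(seq_a, site)
        pos_b = site_positions(seq_b, site)
        if pos_a != pos_b:
            result[name] = (pos_a, pos_b)
    return result


# Hypothetical parental fragments differing by one SNP (C/T at index 14).
parent_a = "ATGCCGTAGGAATTCAGGCTTACG"
parent_b = "ATGCCGTAGGAATTTAGGCTTACG"
diagnostic = diagnostic_enzymes(parent_a, parent_b)
print(diagnostic)
```

In this toy case the SNP destroys the EcoRI site in the second sequence, so an EcoRI digest would cut only the amplicon from the first parent.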

In three cases, there was insufficient SNP variation between parental alleles in the exon sequences for selection of diagnostic restriction enzymes. In one of these cases, an exon SNP was exploited by designing primers that would amplify only one or the other of the SNP alleles (Ye et al., 2001). In the case of two genes (S4 and S7), there was insufficient SNP variation between parental alleles in the exon sequences for the use of diagnostic restriction enzymes or allele-specific primers, and so we used diagnostic restriction sites within intron sequences, which meant that those CAPS markers did not work for cDNA.

Visualisation of CAPS and homoeologue-specific PCR markers

CAPS markers were digested in a reaction volume of 10 μl, with 1–2 μl of PCR template, 1 μl enzyme buffer and 2.5–10.0 units of enzyme (New England Biolabs, Ipswich, MA, USA). CAPS were visualised on 3–4% agarose gels (for example, see Figure 1). Where differences between the restriction fragments or the lengths of the allele-specific primer products were <25 bp, 4–4.5% MetaPhor agarose (Lonza, Allendale, NJ, USA) gels were used to aid in their separation. Where homoeologue losses were found using diagnostic restriction sites, PCR products were sequenced and compared with sequences of the diploid parents and T. miscellus individuals not showing homoeologue loss. This follow-up ensured that the detected losses were not caused by point mutations at the restriction sites. In the case of homoeologue silencing events, this was not necessary as the integrity of the diagnostic restriction sites had been shown in the genomic DNA.
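The logic of reading a CAPS gel can be sketched as follows. The amplicon length, cut positions and function names are hypothetical, and the genotype call is a simplification that assumes the two parental band patterns share no fragment sizes. The 25 bp threshold is taken from the text; treating it as a pairwise comparison of fragment sizes is our interpretation.

```python
def digest(length, cut_positions):
    """Fragment lengths (bp) after cutting a linear amplicon."""
    cuts = [0] + sorted(cut_positions) + [length]
    return [b - a for a, b in zip(cuts, cuts[1:])]


def needs_high_resolution_gel(frags_a, frags_b, threshold=25):
    """True if two different-sized fragments differ by < threshold bp."""
    return any(0 < abs(x - y) < threshold for x in frags_a for y in frags_b)


def call_homoeologues(observed, pattern_a, pattern_b):
    """Call which parental patterns are fully present among the bands."""
    bands = set(observed)
    has_a = set(pattern_a) <= bands
    has_b = set(pattern_b) <= bands
    if has_a and has_b:
        return "both"
    if has_a:
        return "A only"
    if has_b:
        return "B only"
    return "uninterpretable"


# Hypothetical 400 bp amplicon: parent A is cut once, parent B is uncut.
pattern_a = digest(400, [150])   # [150, 250]
pattern_b = digest(400, [])      # [400]
call_all_bands = call_homoeologues([150, 250, 400], pattern_a, pattern_b)
call_uncut_only = call_homoeologues([400], pattern_a, pattern_b)
print(call_all_bands, call_uncut_only)
```

A plant showing only the uncut band would be scored as lacking the A homoeologue, which is the kind of result that was then checked by sequencing to rule out a point mutation at the restriction site.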

Figure 1

Example of an analysed CAPS marker. The stained 4% MetaPhor agarose gel shows homoeologue loss in two Tragopogon miscellus individuals (lanes 13 and 16) from the Oakesdale population for gene D1 (an LRR protein kinase). The DNA fragments are restriction-digested PCR products of this gene from individual plants. The first eight lanes of the gel show Tragopogon pratensis individuals (Soltis and Soltis collection number 2672, individuals 1, 2, 3, 6, 7, 8, 9 and 10, respectively); lanes 9–16 show T. miscellus individuals (Soltis and Soltis collection number 2671, individuals 2, 3, 4, 5, 7, 8, 10 and 11, respectively); lanes 17–23 show T. dubius individuals (Soltis and Soltis collection number 2670, individuals 1–7, respectively); lane 24 shows a negative PCR control; lane 25 shows HyperLadder IV (Bioline, Taunton, MA, USA) markers from 100–500 bp.

Results

Homoeologue loss in natural and synthetic allopolyploids of T. miscellus

We developed CAPS and/or homoeologue-specific PCR primer analyses in T. miscellus for nine homoeologue pairs that seem to be orthologous to genes that are found as singleton genes in A. thaliana and for one homoeologue pair that seemed orthologous to a duplicate pair in A. thaliana. These 10 homoeologue pairs, together with putative function and the details of the methods used to analyse them, are shown in Table 2. These methods were successful using cDNA for 7 of the 10 genes (see Table 2). We could not analyse expression for one gene that did not amplify from leaf cDNA and two genes that did not have suitable SNP variation in exon regions.

We characterised a total of 1140 loci (10 homoeologous pairs in 57 plants) in genomic DNA from five natural populations of T. miscellus (see Table 2), finding 18 cases of homoeologue loss (1.6%; see Table 3, Figure 1). All cases of loss were confirmed by sequencing: T. miscellus individuals showing homoeologue loss gave sequences identical to those of one parental diploid, whereas individuals with both homoeologues showed double peaks at SNPs and illegibility of sequence downstream of indels (see Supplementary Material, Figure S1). Thus, we showed that 3.2% of homoeologue pairs in our sample of T. miscellus were reduced to singleton status within 80 years of polyploidisation. Of these cases, 7 involved loss of the T. pratensis homoeologue and 11 involved loss of the T. dubius homoeologue. As some of the homoeologue losses within the same population were of the same gene and homoeologue, these may have descended from a single loss event in a common ancestor. Therefore, the minimum number of independent loss events that could explain our data is 13, assuming an independent origin for each population, which is supported by recent microsatellite data suggesting that individual populations may only rarely contain plants from lineages with separate origins (Symonds et al., in preparation).

Table 3 Loss and silencing of homoeologues of 10 genes in natural populations of Tragopogon miscellus

The homoeologue losses were not distributed evenly across the 10 genes analysed (Table 3). Four of the genes showed no homoeologue losses, and 5 of the 18 losses occurred at a single locus (S3). The frequency of homoeologue losses varied across the five populations sampled. Eight of the losses were in the Moscow population (0.44 per individual analysed), five in the Oakesdale population (0.63 per individual), three in the Garfield population (0.38 per individual), one in the Spangle population (0.13 per individual) and one in the Pullman population (0.06 per individual). In one case (Moscow 2604-20), a single plant showed homoeologue loss in more than one gene, with the T. dubius copy lost in both cases (Table 4).

Table 4 Individual plants showing loss or silencing of homoeologues

In 44 synthetic first-generation (S1) T. miscellus plants, we found no evidence of homoeologue loss for the 10 genes investigated.

Expression of homoeologous loci in natural and synthetic allopolyploids of T. miscellus

We characterised the expression of 474 loci (seven homoeologous pairs in 35 plants; 16 loci were not characterised as eight PCR reactions failed to work) using cDNA from natural populations of T. miscellus, finding 16 cases of homoeologue silencing (3.4%; see Table 3). Thus, 6.8% of homoeologue pairs in our sample of T. miscellus did not express one of the homoeologues that was present in the genome. This silencing is in addition to losses of homoeologues, which occurred in 3.2% of homoeologous pairs examined (see above). Eleven of the homoeologues silenced were from T. dubius, and five were from T. pratensis. If silencing of genes occurred in a heritable manner (for example, Kloc et al., 2008), the minimum number of heritable silencing events that must have occurred is 13 (Table 3).
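The per-locus and per-pair percentages reported for loss and silencing follow directly from the counts given in the text, as the sketch below shows. The function name is ours; the only assumption is that each homoeologous pair contributes two loci per plant, which is how the totals of 1140 and 474 loci arise.

```python
def loss_rates(events, gene_pairs, plants, uncharacterised_loci=0):
    """Return (loci, percent of loci affected, percent of pairs affected).

    Each homoeologous pair contributes two loci per plant.
    """
    loci = gene_pairs * 2 * plants - uncharacterised_loci
    pairs = loci / 2
    pct_loci = round(100 * events / loci, 1)
    pct_pairs = round(100 * events / pairs, 1)
    return loci, pct_loci, pct_pairs


# Genomic DNA: 18 losses, 10 pairs in 57 plants.
loss = loss_rates(18, gene_pairs=10, plants=57)
# cDNA: 16 silencing cases, 7 pairs in 35 plants, 16 loci not characterised.
silencing = loss_rates(16, gene_pairs=7, plants=35, uncharacterised_loci=16)
print(loss, silencing)
```

Because each event removes or silences one member of a pair, the per-pair percentage is simply twice the per-locus percentage.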

The frequency of homoeologue silencing differed among the seven genes analysed, with the number of silencing events per gene varying between zero and five (Table 3). Homoeologue silencing differed in its frequency across the five natural populations of T. miscellus analysed. One of the silencing events was in the Moscow population (0.1 per individual analysed), two in the Pullman population (0.2 per individual), three in the Oakesdale population (0.5 per individual), four in the Spangle population (0.8 per individual) and six in the Garfield population (1.2 per individual). Two individuals showed homoeologue silencing in more than one gene: a plant from the Spangle population had two homoeologues silenced, one from T. dubius and one from T. pratensis, and one plant from the Garfield population had six homoeologues silenced, all from T. dubius (Table 4).

In 44 synthetic first-generation (S1) T. miscellus plants, we found no evidence for silencing of homoeologues in the seven genes investigated.

Discussion

This analysis of homoeologue evolution in T. miscellus shows the following: (1) no genes studied showed instantaneous loss or silencing of one homoeologue in re-synthesised first-generation T. miscellus; (2) loss and silencing of some homoeologues begin in natural populations within 40 generations (that is, 80 years) after whole-genome duplication in the natural allopolyploid T. miscellus; (3) none of the losses or silencing events we found was fixed within populations; (4) some loss or silencing events occurred independently in more than one population, but not with the consistency to suggest a concerted mechanism; (5) silencing of homoeologues at the loci surveyed is slightly more frequent than homoeologue loss overall.

Timing of homoeologue loss

In a survey of 10 loci in 57 plants from five natural populations of T. miscellus, we found 3.2% of the homoeologue pairs studied to return to singleton status in at least one individual. In a survey of 10 different loci in 20 individuals from two natural populations (Moscow and Pullman), Tate et al. (2006) found the rate of homoeologue loss to be higher: 10% of the homoeologue pairs surveyed. Together, these two studies concur in showing that loss of homoeologues is not uncommon within 40 generations of allopolyploidisation in natural T. miscellus.

Losses of homoeologues found here are not an immediate consequence of hybridisation and whole-genome duplication. If they were, we would have found them in the S1-generation synthetic allopolyploids, and we would expect them to be fixed in the natural populations. Rather, they seem to have arisen gradually over time and to be varying in frequency in the natural populations, with some losses currently at a higher frequency than others. We found no homoeologue losses in first-generation synthetic allopolyploids, and Tate et al. (2006) did not find losses in synthetic, diploid F1 hybrids between T. dubius and T. pratensis.

These patterns contrast with those we might expect on the basis of other empirical and theoretical studies. On the one hand, synthetic wheat allopolyploids show rapid gene loss occurring in the F1 or first synthetic (S1) allopolyploid generation (Ozkan et al., 2001; Shaked et al., 2001; Kashkush et al., 2002; Levy and Feldman, 2004). On the other hand, point mutation models predict a slower rate of homoeologue loss (Lynch and Force, 2000). Perhaps polyploidy in T. miscellus sets the stage for non-instantaneous, yet rapid, mutational mechanisms that are ongoing through many generations. The most obvious of these is homoeologous recombination, in which fragments of chromosomes can be lost, though we cannot exclude the possibility of gene conversion, as has been found for rRNA genes in the species (Kovarik et al., 2005; Matyasek et al., 2007). Homoeologous recombination seems to have caused loss of chromosome fragments in re-synthesised Brassica allopolyploids (Song et al., 1995; Gaeta et al., 2007). Ownbey (1950) observed multivalent formation in early generations of natural T. miscellus, and the patterns of isozyme variation in T. miscellus are consistent with homoeologous recombination (Soltis et al., 1995). More recently, Lim et al. (2008) and Tate et al. (in press) reported multivalent formation in both natural and synthetic Tragopogon allopolyploids, along with unisomy, trisomy and reciprocal translocations in natural Tragopogon allopolyploids. Thus, it seems that in T. miscellus homoeologue loss is an ongoing consequence of allopolyploidy, rather than an instantaneous event following allopolyploidisation.

We did not find a marked preferential loss of T. dubius versus T. pratensis homoeologues in this study, but if we combine our results with those of Tate et al. (2006), the total number of T. dubius homoeologue losses is 27, and the total number of T. pratensis homoeologue losses is 11. This bias towards the loss of T. dubius homoeologues is close to statistical significance (paired t-test: d.f.=19, t=2.027, P=0.057). A similar bias towards loss of T. dubius genetic material has been found for rDNA copy number (Kovarik et al., 2005). This could be because of cytoplasmic–nuclear interactions. All the T. miscellus populations that we studied, except Pullman (in which we found only one loss), have T. pratensis as the maternal parent and there may be selection to keep the cytoplasmic and nuclear genomes as similar in their ancestry as possible. A bias in the loss of homoeologues has been found in synthetic polyploids of Brassica (Song et al., 1995). In T. mirus, which has T. dubius as the paternal parent and T. porrifolius as the maternal parent, a bias towards loss of T. dubius homoeologues was also found using CAPS markers (Koh et al., in preparation).
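The per-locus counts behind the paired t-test are not given in the text, so the test itself cannot be rerun here. What can be checked is the conversion of the reported statistic (t=2.027, d.f.=19) into a two-tailed P value. The sketch below does this with the standard library only, integrating the Student's t density numerically with Simpson's rule; the function names and the number of integration steps are our own choices.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value, using Simpson's rule on the interval [0, |t|].

    steps must be even for Simpson's rule.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = total * h / 3  # area under the density between 0 and |t|
    return 1 - 2 * central_area


p_value = two_tailed_p(2.027, 19)
print(round(p_value, 3))
```

The result should agree closely with the reported P=0.057, that is, just above the conventional 0.05 threshold.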

Timing of homoeologue silencing

We found no evidence for silencing in the seven homoeologue pairs studied in the first-generation synthetic (S1) allopolyploids. In natural T. miscellus, we found silencing (loss of expression) of homoeologues to be more frequent than loss of homoeologues at the seven loci examined, and more stochastic. Six of the 16 cases of homoeologue silencing were found in a single individual (2106; see Table 4). This contrasts with the 10 loci in Tate et al. (2006), in which loss was more frequent than silencing. If the expression of both homoeologues at certain loci reduced plant fitness, then we might expect to see a correlation between loci where loss occurs, and loci where silencing occurs, but this was not found. It is possible that some homoeologue silencing may be specific to the rosette leaf tissue analysed here, and therefore tissue-specific expression could be occurring (Adams et al., 2003, 2004; Adams and Wendel, 2005). Further work is needed to compare the expression of homoeologues in other tissues.

In natural cotton polyploids, a microarray study of 1383 genes found biased expression in 70% of homoeologue pairs, 24% of which seem to have biased expression immediately on diploid hybridisation of the parental species (Flagel et al., 2008). In synthetic Senecio cambrensis allopolyploids, anonymous microarrays with ∼6000 cDNA clones showed that gene expression changes are extensive in diploid hybrids of the parental species, but are ameliorated by polyploidisation (Hegarty et al., 2005, 2006). In synthetic allopolyploids between A. thaliana and A. arenosa, a microarray study with 26 090 annotated genes found that ∼5% showed expression changes (Madlung et al., 2005; Wang et al., 2006).

The current study seems to contrast with these studies in finding no changes in gene expression in the S1 generation. This may be for the following reasons: (1) The methods used here, which examine gene expression by the presence or absence of a cDNA band in an agarose gel, are conservative in that they are not sensitive to slight down- or upregulation of genes. Although we can detect homoeologue silencing, small differences in the level of expression between homoeologues are likely to have gone undetected. (2) T. miscellus is an excellent evolutionary model, but not yet a genetic ‘model organism’ (Soltis et al., 2004b; Buggs, 2008). Our sample of genes is much smaller than those of studies conducted using microarrays, so other genes not studied here may show changes in expression. (3) Allopolyploidisation may differ between species in the mode and tempo of its effects on the genome and transcriptome. One cause of such differences could be the degree of relatedness between the progenitor species that hybridise to give rise to an allopolyploid: loss of genes through homoeologous recombination may be more likely to occur when parental species are more closely related, whereas distantly related parents may have chromosomes that are differentiated enough not to recombine (Darlington, 1937; Clausen et al., 1945; Buggs et al., 2008). On the other hand, hybridisation between distantly related species might only produce viable offspring if genomic and transcriptomic re-adjustments take place immediately (for example, Ozkan et al., 2001). The degree to which genes are silenced in parental species may be another cause of varied consequences of allopolyploidisation among taxa, as epigenetic silencing may be disrupted by allopolyploidisation (McClintock, 1984; Comai, 2000; Comai et al., 2003).

Conclusion

This study and Tate et al. (2006) provide the first evaluation of homoeologue loss and silencing in recent natural polyploids, in comparison with F1 hybrids and first-generation synthetic allopolyploids of the same species. We show the dynamic nature of the early generations after allopolyploidisation, which, in T. miscellus, seem to have greater consequences for homoeologue silencing and loss than hybridisation and genome doubling do. New pyrosequencing technologies will enable the development of far more extensive genomic resources in Tragopogon, allowing us to follow up these intriguing results with a survey of many hundreds of homoeologue sets.