Natural selection is a fundamental concept underlying Darwinism and modern evolutionary theory. The molecular mechanisms of evolutionary change driven by natural selection have been investigated by many evolutionary biologists since the beginning of the molecular biology era. However, it is difficult to pinpoint the mechanism of adaptive evolution when studying wild populations unless selection and adaptive evolution are ongoing or happened only recently. Artificial selection experiments provide a solution to this problem, as the selection process can be controlled. Humans have been applying artificial selection in the breeding of domesticated crops and animals since the onset of agricultural civilization, which has attracted the attention of many evolutionary biologists, including Darwin, as evidenced by his book “The Variation of Animals and Plants under Domestication”1. However, most domestication happened a long time ago in human history (archaeological evidence of plant domestication has been dated to as early as 11,050 BC2), and an often complex breeding history makes it difficult to investigate the molecular mechanisms underlying domestication. Artificial selection experiments circumvent this problem by focusing on the very early onset of evolutionary change. Previously, short-term artificial selection experiments have been carried out in animals including fruit flies3,4,5, nematodes6 and fish7, but less often in plants8,9.

Flower scent is an important chemical trait in flowering plants, helping them to attract pollinators to enable sexual reproduction. Because floral scent is a signal to pollinators, it can affect plant ecological speciation by influencing both reproductive success and reproductive isolation10,11. However, the molecular bases of scent evolution remain largely unknown, partly because heavily studied plant models such as Arabidopsis thaliana and rice produce only weak scent and do not rely on pollinators for sexual reproduction. Most of the few studies on the molecular bases of scent evolution were restricted to candidate genes in non-model plants10,12,13,14,15,16,17,18. Brassica rapa is a very good model for studying floral scent, as it features strong scent emission and most populations are outcrossing and rely on insect pollinators for sexual reproduction. Fast cycling Brassica rapa (Wisconsin fast plants) is a strain with a short life cycle and thus an ideal model for artificial selection studies. Such an experiment has recently been carried out to assess the heritability of flower scent, with four different scent compounds as selection targets19. In that study, three generations of artificial selection led to significant scent divergence between “high” and “low” selection line plants. As the reference genome and many functional genomics tools are available for Brassica rapa, this system allows a system-level exploration of the molecular mechanisms of scent evolution, instead of targeting individual genes. We focused our study on the high and low lines with the target compound phenylacetaldehyde (PAA), as it is one of the key pollinator-attractive volatile compounds in the flower scent of Brassica rapa20.
We carried out transcriptome analysis on flowers of Brassica rapa fast plant lines selected for high and low PAA emission to address the question of the molecular bases of scent evolution in artificially selected plants at the transcriptome level.

Material and Methods

Study plants

Seeds of fast cycling Brassica rapa plants used in this study were collected from plants that had been selected for three generations with phenylacetaldehyde as the target compound, according to procedures described in ref. 19. Fifty seeds from ten individuals (5 seeds per mother plant) of third-generation plants for both “high” and “low” selection lines were sown in standardized soil (Humuswerke Gebr. Patzer GmbH & Co. KG) in a phytotron under 24 hours of fluorescent light at 22 °C and 60% relative humidity, and were watered twice a day. One week after sowing, the seedlings were transferred into individual pots (7 cm × 7 cm × 8 cm) and kept under the same growth conditions until the experiments were finished.

Floral scent collection and analyses

Headspace collection of floral volatile organic compounds (VOCs) was carried out from 30 high line and 30 low line plants with a push-pull system when there were around ten open flowers on each plant. The entire inflorescence was enclosed in a cylindrical glass vessel previously silanized with Sigmacote (Sigma Aldrich) to minimize VOC adsorption to the glass. Two Teflon plates with a small opening in the middle to hold the stem were used to close the open end. The glass vessel had two holes, one through which charcoal-filtered air was pumped into the vessel (“push”) and one through which the scent-containing air was pulled out with a vacuum pump (“pull”). The scent was collected at the “pull” port with glass tubes filled with adsorbent (35 mg Tenax TA 60/80, Supelco, Sigma Aldrich). The push and pull flow rates were both set to 100 ml∙min−1. Floral scent was collected for three hours; for each batch, one air control sample was collected with the same settings but without a plant inside the glass vessel. After scent sampling, the glass tubes with Tenax were removed from the system and immediately wrapped in Teflon tape. The samples were analyzed immediately or stored at −20 °C until analysis.

Analysis of floral scent was done using a gas chromatograph with a mass selective detector (GC-MSD; Agilent 6890 N, Agilent Technologies) fitted with a thermal desorption system (Gerstel TDS/TDU, Gerstel). Each glass tube was loaded and injected into the GC using a Gerstel MultiPurpose Sampler MPS. For thermal desorption, the temperature was programmed to start at 30 °C (held for one minute) and increase to 240 °C at 60 °C min−1 (held for one minute). The volatiles eluting from the TDS were collected and enriched at −150 °C in a cool injection system (CIS 4, Gerstel). For injection, the CIS was heated to 150 °C at 16 °C s−1, then further from 150 °C to 250 °C at 12 °C s−1. The GC was equipped with an HP-5 capillary column (Agilent, 15 m length, 0.25 mm diameter, 0.25 μm film thickness), and helium was used as the carrier gas with a constant flow of 2 ml min−1. The temperature of the GC oven was first set to 50 °C (held for one minute) and then increased to 250 °C at 10 °C min−1. An Agilent 5975 Series MSD mass spectrometer was used to identify and quantify compounds. Chromatograms were analyzed with the ChemStation Enhanced Data Analysis program (Version E.01.00). The mass spectra obtained from the samples were matched with those of a reference collection (the National Institute of Standards and Technology (NIST) mass spectral library) for initial identification; then, retention times and mass spectra of all compounds included in the quantitative analyses were compared to those of synthetic reference standards. Peak areas of target ions were subsequently converted to compound quantities in nanograms by applying calibration curves established for each compound. Quantitation of 16 floral scent compounds was done in the four individuals with the highest emission of phenylacetaldehyde (PAA) among all “high” line plants, and the four individuals with the lowest emission of PAA among all “low” line plants.
As the program cannot always identify the peak correctly, manual integration was done when necessary.
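
The conversion from peak area to compound quantity can be illustrated with a minimal sketch. It assumes a linear calibration curve fitted by ordinary least squares, which the text does not specify; the standard amounts and peak areas are hypothetical placeholders rather than values from this study, and the function names are chosen here for illustration only.

```python
from statistics import mean


def fit_calibration(amounts_ng, peak_areas):
    """Fit peak_area = slope * amount + intercept by ordinary least squares."""
    x_bar, y_bar = mean(amounts_ng), mean(peak_areas)
    sxx = sum((x - x_bar) ** 2 for x in amounts_ng)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(amounts_ng, peak_areas))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def area_to_ng(peak_area, slope, intercept):
    """Invert the calibration line; negative estimates are clipped to zero."""
    return max((peak_area - intercept) / slope, 0.0)


# Hypothetical calibration standards for one compound (not study data).
standard_ng = [1.0, 5.0, 10.0, 50.0]
standard_area = [2.0e4, 1.0e5, 2.0e5, 1.0e6]

slope, intercept = fit_calibration(standard_ng, standard_area)
sample_ng = area_to_ng(4.0e5, slope, intercept)
print(f"estimated amount: {sample_ng:.2f} ng")
```

In practice one calibration line would be fitted per compound, since detector response differs between compounds.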

RNA extraction and library construction

After scent collection, the four individuals with the highest PAA emission and the four individuals with the lowest PAA emission were used for RNAseq analyses. The flowers were collected, flash frozen in liquid nitrogen and stored at −80 °C until extraction. Frozen flowers were first homogenized with a bead beater and then used for RNA and DNA extraction. Total RNA was extracted using Trizol (Invitrogen) following the manufacturer’s instructions. The integrity of the RNA was then assessed using RNA nano chips on a 2100 Bioanalyzer (Agilent Technologies, Inc). Library construction and sequencing of the total RNA were done at the Functional Genomics Center Zurich. The libraries were constructed with a polyA enrichment protocol using the TruSeq Stranded mRNA library prep kit (Illumina), and the eight libraries were then pooled and sequenced in one lane on a HiSeq 2000 as 100 bp single-end reads. In total, we obtained 212,759,422 reads from the eight libraries (on average 26,594,928 reads per sample). The raw reads were submitted to the NCBI under BioProject accession PRJNA347876.

Transcriptome data analyses

The TopHat-Cufflinks protocol21 was used to quantify gene expression levels and to find genes with different expression levels between high and low lines. First, quality control of the raw data was done using FastQC with default settings. Then, filtered reads were mapped to the reference genome (GCF_000309985.1_Brapa_1.0 from NCBI) with Bowtie22 as implemented in TopHat. Finally, we used Cuffdiff from the Cufflinks package to find genes with different expression levels between high and low lines.
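
The mapping and differential expression steps can be sketched as command lines assembled in Python. This is an illustration under stated assumptions, not the exact invocation used in this study: the file names, index prefix, output directories and thread count are hypothetical, and the `fr-firststrand` library type is assumed here because it is the usual setting for TruSeq Stranded libraries.

```python
import shlex


def tophat_command(sample, fastq, index_prefix, gtf, threads=4):
    """Build a TopHat call that maps one single-end library to the genome."""
    return [
        "tophat",
        "-p", str(threads),
        "--library-type", "fr-firststrand",  # assumption: stranded TruSeq reads
        "-G", gtf,                           # reference annotation
        "-o", f"tophat_{sample}",            # hypothetical output directory
        index_prefix,
        fastq,
    ]


def cuffdiff_command(gtf, high_bams, low_bams, threads=4):
    """Build a Cuffdiff call comparing high line and low line replicates."""
    return [
        "cuffdiff",
        "-p", str(threads),
        "--library-type", "fr-firststrand",
        "-L", "high,low",                    # condition labels
        "-o", "cuffdiff_out",                # hypothetical output directory
        gtf,
        ",".join(high_bams),                 # replicates of condition 1
        ",".join(low_bams),                  # replicates of condition 2
    ]


# Hypothetical sample names; TopHat writes accepted_hits.bam per output dir.
high = [f"H{i}" for i in range(1, 5)]
low = [f"L{i}" for i in range(1, 5)]

commands = [tophat_command(s, f"{s}.fastq.gz", "Brapa_1.0", "Brapa_1.0.gtf")
            for s in high + low]
commands.append(cuffdiff_command(
    "Brapa_1.0.gtf",
    [f"tophat_{s}/accepted_hits.bam" for s in high],
    [f"tophat_{s}/accepted_hits.bam" for s in low],
))

for cmd in commands:
    print(shlex.join(cmd))
```

The commands are only printed, so the sketch can be inspected without the tools installed; each list could be passed to `subprocess.run` to execute it.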

To find out whether the general expression profiles of high line and low line plants differ, we carried out a gene ontology (GO) enrichment test on genes with significantly different expression levels between high and low lines. In addition, we carried out a gene set enrichment test for pathway analysis as implemented in the R package GAGE (Generally Applicable Gene-set Enrichment)23.
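
The idea behind a GO over-representation test can be sketched as follows. The text does not state which statistical test was used, so the one-sided hypergeometric test below is an assumption (it is a common choice for this purpose), and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """One-sided hypergeometric P value for over-representation.

    Returns P(X >= n_overlap), where X is the number of genes carrying the
    GO term among n_selected genes drawn without replacement from
    n_background genes, n_annotated of which carry the term.
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 carry the GO term,
# 4 genes are differentially expressed, 3 of those carry the term.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"P = {p_value:.4f}")
```

When many GO terms are tested, the resulting P values would additionally need a multiple-testing correction.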

Results and Discussion

Volatiles with increased emission and transcriptome analysis

Our data show that besides phenylacetaldehyde, several other VOCs were emitted in significantly higher amounts in the “high line” than in the “low line” plants (Table 1). These VOCs include α-farnesene, 2-aminobenzaldehyde, indole, benzyl nitrile, methyl salicylate and methyl anthranilate. In addition, 2-phenylethanol showed considerable emission in high line plants but no emission in low line plants; nevertheless, there was no significant difference between high and low line plants for this compound, likely because of the low sample size and the variation among high line plants. Many of these volatiles are known to attract pollinators11,24,25 and herbivores26,27,28; α-farnesene is also known to repel ants29. Evolutionary change in non-selected VOCs can be explained by pleiotropy or by close linkage between individual “scent genes” (linkage disequilibrium). These mechanisms were previously suggested to contribute to the evolution of scent bouquets19. The transcriptome data of the high and low selection line plants offer the possibility to unravel the molecular mechanisms of these phenomena. PAA synthesis was previously found to be catalyzed by PAA synthase (PAAS) with phenylalanine as substrate in Petunia30. Using a BLAST search, we identified five homologs of PAAS in the Brassica rapa genome, annotated as tyrosine decarboxylase. Three of these genes showed significant up-regulation in the high lines, while the other two showed no significant difference between high and low lines (Table 2). Thus, these three genes are probably the functional genes encoding PAAS in Brassica rapa. In addition to the PAAS genes themselves, several genes in the shikimate pathway, which synthesizes phenylalanine as the substrate of PAA synthesis, also showed increased expression in high line plants. Specifically, genes in four of the six reactions from shikimate to phenylalanine showed increased expression (Fig. 1).
In contrast, only three reactions upstream of shikimate showed decreased expression, corresponding to two genes: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase and bifunctional 3-dehydroquinate dehydratase (DHD)–shikimate dehydrogenase (SDH). DAHP synthase expression was shown to be induced upon wounding in Solanaceae31 and upon methyl jasmonate treatment in Arabidopsis32. However, we know little about the reason for the down-regulation of these genes; feedback from the accumulation of phenylalanine and other intermediate products in the plants may play a role in this response.

Table 1 Results of the floral scent analysis of low- and high line Brassica rapa plants.
Table 2 Candidate scent genes and their expression profile in high and low line plants.
Figure 1

Log2 fold changes of gene expression level mapped onto the KEGG pathway module “phenylalanine, tyrosine and tryptophan biosynthesis” by R package “Pathview”.

Phenylalanine is the precursor of phenylacetaldehyde in Petunia30. Most genes in the shikimate pathway and phenylalanine related pathway showed increased expression in high line plants. Red stars were used to label the reactions where significant expression changes were found in RNAseq data (q < 0.05).

Benzyl nitrile is another compound for which phenylalanine is the precursor, and it was found to be co-upregulated with PAA. It has been shown in Arabidopsis that cytochrome P450 CYP79A2 catalyzes the conversion of L-phenylalanine to phenylacetaldoxime33. Phenylacetaldoxime can then be converted to benzyl nitrile by phenylacetaldoxime dehydratase. However, the latter enzyme has so far only been found in a Bacillus strain34. Although benzyl nitrile can also originate from the hydrolysis of glucosinolates, this reaction only occurs when myrosinase is activated by mechanical damage or herbivore attack, which was not the case in our study. In the Brassica rapa genome, we identified two homologs of CYP79A2, the gene catalyzing the first step from phenylalanine to phenylacetaldoxime (Table 2). One of these genes showed significantly increased expression in the high lines, while the other showed very low expression in both high and low lines (Table 2). Thus, the first gene is probably the functional gene in the tissue assayed, catalyzing the initial step of benzyl nitrile synthesis from phenylalanine.

The compound with the most similar pattern of co-upregulation with PAA in the high line plants is 2-phenylethanol. In the low line plants, 2-phenylethanol was not detected at all, similar to the selection target PAA, while a considerable amount was found in high line plants. A 2-phenylacetaldehyde reductase was identified in tomato that catalyzes the conversion of PAA to 2-phenylethanol (= phenylethyl alcohol)35. When searching the Brassica rapa genome using the protein sequence of the 2-phenylacetaldehyde reductase gene as the query, we found that genes annotated as members of the cinnamoyl-CoA reductase 1 and 2 gene families were most similar to our query gene (see Supplementary Information). We checked the expression changes of all 13 members of the cinnamoyl-CoA reductase 1 and all 7 members of the cinnamoyl-CoA reductase 2 gene families, and only one gene (gene ID: 103846465) showed significant up-regulation in high line plants (Table 2). The gene 103846465 is a member of the cinnamoyl-CoA reductase 2 gene family and could possibly encode a functional 2-phenylacetaldehyde reductase. However, a previous study in tomato also suggested that the synthesis of the substrate PAA is the limiting step in the synthesis of phenylethyl alcohol, because over-expression of 2-phenylacetaldehyde reductase in tomato does not necessarily lead to increased emission of phenylethyl alcohol35. The increased emission of phenylethyl alcohol in our high line plants is therefore also likely caused by increased substrate availability rather than by increased expression of the gene catalyzing the final step in its synthesis.

Many other VOCs related to the shikimate pathway were also found to be emitted at higher levels in the high line plants, including indole and methyl anthranilate (Table 1). Indole is synthesized from chorismate, which is an intermediate product of phenylalanine synthesis in the shikimate pathway. Genes catalyzing four of the eight reactions from shikimate to indole showed increased expression in our data (Fig. 1). Methyl anthranilate, which is probably synthesized via methylation of anthranilate, also showed higher emission rates in high lines. Anthranilate is in turn an intermediate product in the pathway from shikimate to indole.

The evolutionary changes in the compounds discussed above are pleiotropic effects of selection on PAA. Interestingly, selection on PAA led to up-regulation both of genes directly involved in PAA synthesis and of genes upstream in the shikimate pathway, likely enhancing the availability of phenylalanine, the substrate of PAA synthase. Substrate availability is a well-known mechanism for the regulation of scent emission36,37 and is probably one mechanism causing pleiotropic responses in floral scent compounds in our study. Another possible mechanism contributing to pleiotropy is the up-regulation of one or more transcription factors in high PAA lines, regulating the transcription level of multiple target genes. However, because experimental data on the target promoters of transcription factors in Brassica rapa are currently lacking, we are not able to test this hypothesis with our dataset. To find out whether linkage disequilibrium plays a role in the evolution of the scent bouquet in high PAA line plants, we manually checked the genomic locations of all 82 genes in the PAA pathway (see Supplementary Table S1). However, we could not find any pair of neighboring genes (within 10 kbp distance) showing co-regulation in the PAA network, except for tandem duplicates in the same gene family. This suggests that the co-upregulated genes involved in the synthesis of aromatic compounds are not located close to each other along the chromosomes. Linkage is usually associated with spatial proximity and was suggested to evolve in traits comprising pollination syndromes in Petunia38. However, we cannot exclude the possibility that genes distant in the linear genome are brought close to each other by the three-dimensional organization of the genome.
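
The check for neighboring co-regulated genes was done manually; the following minimal sketch shows how such a check could be automated. The gene identifiers, chromosome names and coordinates are hypothetical, and the definition of distance (the gap between gene bodies on the same chromosome) is an assumption, since the text only specifies the 10 kbp threshold.

```python
from itertools import combinations


def neighbouring_pairs(genes, max_gap=10_000):
    """Return gene pairs on the same chromosome at most max_gap bp apart.

    genes is a list of (gene_id, chromosome, start, end) tuples. The gap is
    measured between gene bodies and is negative for overlapping genes.
    """
    pairs = []
    for a, b in combinations(genes, 2):
        if a[1] != b[1]:
            continue  # different chromosomes cannot be neighbours
        gap = max(a[2], b[2]) - min(a[3], b[3])
        if gap <= max_gap:
            pairs.append((a[0], b[0]))
    return pairs


# Hypothetical pathway genes (not the 82 genes of Supplementary Table S1).
pathway_genes = [
    ("geneA", "A01", 1_000, 3_000),
    ("geneB", "A01", 9_000, 12_000),
    ("geneC", "A01", 500_000, 502_000),
    ("geneD", "A02", 1_500, 2_500),
]
# Hypothetical set of genes with significant expression change.
significant = {"geneA", "geneC", "geneD"}

close_pairs = neighbouring_pairs(pathway_genes)
co_regulated = [pair for pair in close_pairs
                if pair[0] in significant and pair[1] in significant]
print(close_pairs, co_regulated)
```

A pair counts as a candidate for linkage only if it is both physically close and co-regulated, which is why the two filters are applied in sequence.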

Increased transcription of ribosomal protein genes in high lines

In addition to scent genes and related pathways, our GO enrichment analysis of the genes with significantly different expression levels between high and low lines found that “structural constituent of ribosome” (GO:0003735, P = 1.4e-16), “translation” (GO:0006412, P = 1.3e-09) and “ribosome” (GO:0005840, P = 1.6e-14) were the most significantly over-represented GO terms in the tests of molecular function (MF), biological process (BP) and cellular component (CC), respectively. These three over-represented GO terms in their respective categories (MF, BP and CC) actually refer to the same molecular entity. All three showed that ribosomal protein genes as a group were significantly over-represented among the genes with increased expression in high lines compared to low lines. In addition, our pathway-level gene set enrichment test found that the KEGG pathway “ko03010 Ribosome” was the only significantly enriched pathway among genes with different expression levels (P = 2.7e-6, q = 3.5e-4, Fig. 2). This result is consistent with the GO enrichment test and confirms that ribosomal protein genes are the most prominent group of genes up-regulated in high PAA line compared to low line plants.

Figure 2

Log2 fold changes of gene expression level mapped onto the KEGG pathway module “ribosome” by R package “Pathview”.

Most genes coding ribosomal proteins showed increased expression in high line plants. Red stars were used to label the reactions where significant expression changes were found in RNAseq data (q < 0.05).

Ribosomes, as the protein translation machinery, are conserved in all organisms and play an important role in translating mRNAs into functional proteins. The hypothetical consequence of up-regulation of ribosomal protein genes is a higher throughput of protein translation with the same level of mRNA input. As ribosomal protein genes have been used as reference genes in many studies, little attention has been paid to the biological meaning of their up-regulation. Only when Thorrez et al. challenged the validity of ribosomal protein genes as reference genes were tissues with actively proliferating or secreting cells found to have elevated expression of almost all ribosomal protein genes39. Higher efficiency in protein translation as a result of more ribosomal proteins being available may be necessary for those tissues to maintain their function. Similarly, higher ribosomal protein gene expression may add another layer of gene regulation and might contribute to (short-term) scent evolution at the protein level, in addition to the up-regulation of many genes involved in scent metabolism at the transcriptome level. In addition to translation, many redundant ribosomal protein genes in plants were found to exert extra-ribosomal functions during development, such as in stress responses and other physiological processes40,41,42. The ribosomal proteins up-regulated in high lines might therefore also play a role other than translation in the evolution of high PAA line plants.

To avoid bias due to the alignment and statistical methods, we re-analyzed the transcriptome data with a different protocol (Subread-edgeR) and again mapped the differentially expressed genes onto the KEGG pathways. The results from the alternative protocol (see Supplementary Figs S1 and S2 and Table S2) were highly consistent with those of the first protocol, indicating that our conclusions are independent of the computational method used.

In conclusion, we show that multiple molecular factors may contribute to floral scent evolution, at least in the short term, namely over three generations of artificial selection. Expression patterns correlated with VOC emission were found in the transcriptome data: increased expression of genes related to the biosynthesis of scent compounds and their biosynthetic precursors, as well as increased activity of the ribosomal translation machinery, were the possible main mechanisms of increased scent emission in our study. In future investigations, the role of SNPs or indels selected in the high lines and of methylation changes during scent evolution deserves more attention. It would also be interesting to contrast short-term with long-term evolutionary change to see whether quick adaptive responses to environmental fluctuation in plants are mediated by different molecular mechanisms than long-term evolutionary changes, for example during speciation.

Additional Information

How to cite this article: Cai, J. et al. The molecular bases of floral scent evolution under artificial selection: insights from a transcriptome analysis in Brassica rapa. Sci. Rep. 6, 36966; doi: 10.1038/srep36966 (2016).
