Widely targeted metabolome and transcriptome landscapes of Allium fistulosum–A. cepa chromosome addition lines revealed a flavonoid hot spot on chromosome 5A

Here, we report a comprehensive analysis of the widely targeted metabolome and transcriptome profiles of Allium fistulosum L. (FF) with the single extra chromosome of shallot [A. cepa L. Aggregatum group (AA)] to clarify the novel gene functions in flavonoid biosynthesis. An exhaustive metabolome analysis was performed using the selected reaction monitoring mode of liquid chromatography–tandem quadrupole mass spectrometry, revealing a specific accumulation of quercetin, anthocyanin and flavone glucosides in AA and FF5A. The addition of chromosome 5A from the shallot to A. fistulosum induced flavonoid accumulation in the recipient species, which was associated with the upregulation of several genes including the dihydroflavonol 4-reductase, chalcone synthase, flavanone 3-hydroxylase, UDP-glucose flavonoid-3-O-glucosyltransferase, anthocyanin 5-aromatic acyltransferase-like, pleiotropic drug resistance-like ATP binding cassette transporter, and MYB14 transcriptional factor. Additionally, an open access Allium Transcript Database (Allium TDB, http://alliumtdb.kazusa.or.jp) was generated by using RNA-Seq data from different genetic stocks including the A. fistulosum–A. cepa monosomic addition lines. The functional genomic approach presented here provides an innovative means of targeting the gene responsible for flavonoid biosynthesis in A. cepa. The understanding of flavonoid compounds and biosynthesis-related genes would facilitate the development of noble Allium varieties with unique chemical constituents and, subsequently, improved plant stress tolerance and human health benefits.

However, the unbalanced selections by breeders and farmers have resulted in the loss of many useful agronomic traits in Allium crops 1,7 . Novel alleles with desirable attributes can be introduced by crossing Allium crops with wild or landrace progenitors to develop new Allium varieties.
In Southeast Asian countries, shallots (2n = 16), which are genetically closer to the bulb onion, have been recognized as a potential genetic reserve for Allium crop improvement because of their adaptability to a wide range of environmental stresses 3,8,9 . The integrated transcriptome and phytochemical analyses of monosomic addition lines (MALs, 2n = 17) of the Japanese bunching onion (FF, 2n = 16) with a single extra chromosome from the shallot (AA) enabled us to identify several transcripts on the FF genome with extra chromosome 2A from the shallot (FF2A), that involved in Alliospiroside A biosynthesis, a major saponin compound contributed to shallot resistance against the Fusarium pathogen 3 . Our early studies 10,11 also showed a considerable accumulation of carbohydrates and amino acids in several MALs during the winter and summer seasons, respectively. These reports gave insight into the significant effect of shallot extra chromosomes on the metabolic profile of A. fistulosum despite the fact that the causal genes regulating this process remain elusive.
Flavonoids are widely distributed in several plant species with diverse biological functions, including protection against abiotic/biotic stresses, male fertility, plant signaling, auxin transport, and color development to attract pollinators, as well as diverse human health-promoting properties [12][13][14] . Bulb onions and shallots are known to contain a large amount of flavonols, especially quercetin, quercetin-4′-glucoside, and quercetin-3,4′-diglucoside, as well as anthocyanins in comparison with A. fistulosum 9,15,16 . These flavonoid compounds are usually found at high concentrations in the outer skin of onion and shallot bulbs, where they impart yellow-red color development 17 . S-alk(en)yl-L-cysteine sulfoxides (ACSOs) are also one of the most characteristic ingredients in Allium species, playing a key role in both chemotaxonomic values and biological activities 18 . (+)-S-methyl-L-cysteine sulfoxide (Methiin, MeCSO), which is associated with a hot pungent taste and strong sulfur odor, is ubiquitous in the genus Allium, while (+)-S-(2-Propenyl)-L-cysteine sulfoxide (Alliin, AlCSO) and (+)-S-(trans-1-Propenyl)-L-cysteine sulfoxide (Isoalliin, PrenCSO) are characteristic of garlic and onions, respectively 1,19 .
The plant metabolome can offer a snapshot of the biochemical and physiological status of a plant cell at a given developmental stage under a particular set of environmental conditions, and thus it is closely related to the plant phenotypes [20][21][22] . In addition, plant metabolomic explains many of biological changes caused by genetic or environmental perturbations; thus it can be used to evaluate the effect of a foreign gene(s) and understand the relationships between gene function and metabolic phenotypes 23 . With the recent development of next-generation sequencing (NGS) technologies to outline gene expression profiles in the context of plant-environment interaction, integrated metabolome and transcriptome analyses are regarded as a powerful method to understand the molecular machinery underlying various biological process 9,24 . In this context, we performed integrated transcriptome and widely targeted metabolome analyses of A. fistulosum (FF), the shallot (AA), and MALs (FF1A, FF2A, FF3A, FF4A, FF5A, FF6A, FF7A, and FF8A) to investigate the effect of adding a single chromosome from the shallot to the A. fistulosum metabolome with a particular focus on flavonoids as a metabolic phenotype characteristic for the shallot.

Discussion
Onion bulb color is the result of the interaction of C, I, G, L, and R loci, among which one functional allele for the C locus and recessive alleles for the I locus are required for onion bulb color development, whereas other loci have modifying effects 2,25 . Onion bulb color ranged between white and red because of the variation in the production of different flavonoid compounds 2,26,27 . Flavonoid compounds can be divided into different flavonoid classes, with an estimated 10,000 different members, depending on the modification of the C-ring 28 . The flavonoid pathway is one of the well-characterized secondary metabolite pathways in model plants, and the transcriptional factors (TFs) are the key point for the regulation of flavonoid biosynthesis 2 . However, the several stages of flavonoid biosynthesis, especially in vivo, substrate specificity of glucosyltransferase, and acyltransferase in several Allium species, including shallots, require more elucidation. Additionally, little is known about flavonoid TFs in shallots. In this context, an integrated transcriptome and widely targeted metabolome analysis of MALs was carried out to investigate the genetic effect of the shallot (AA) genome on the A. fistulosum (FF) metabolome and transcriptome, with particular focus on the flavonoid biosynthetic pathway.
Bulb extracts were subjected to widely targeted metabolome analysis using LC-QqQ-MS, which resulted in the detection of 123 metabolite intensities (Supplementary Table S5). The trend in metabolic profiles of FF, AA, and MALs was compared using PCA analysis (Figs. 1A,B). PCA plots indicate a significant difference in the metabolic profiles between AA and FF (Figs. 1A-C). Amino acids, nucleic acids, and organic acids were characteristic for FF and FF2A, whereas di-and trisaccharides, including melibiose, sucrose, and 1F-fructosylsucrose, and several quercetin glucosides were characteristic for AA and other MALs (Figs. 1A-C). The accumulation of polysaccharides in shallots was recently reported in a comparative study between different shallot landraces and bulb onions 29 . The authors suggested that the accumulation of polysaccharides in the shallots, in comparison with bulb onions, contributed to the shallots' adaptation to climatic conditions in subtropical regions 29 . Likewise, our recent study using comparative metabolome profiling of a shallot double haploid (DHA), onion double haploid (DHC), and F 1 hybrid demonstrated a significant accumulation of carbohydrate, flavonoid, and phospholipid in the DHA and F 1 hybrid in comparison with DHC, reflecting the adaptability of the two lines to abiotic stress conditions 9 . The heatmap hierarchical clustering of the metabolite fold change using the FF genotype as a control provides a further clear discrimination between AA and MALs (Fig. 2). AA and FF5A were clustered together with a significant accumulation of quercetin glycosides (quercetin-4′-glucoside, quercetin 3-galactoside, and quercetin-3,4′-diglucoside), flavone glycoside (luteolin-4′-O-glucoside) as well as anthocyanin glycosides (petunidin 3-O-glucoside) in comparison with other MALs (Fig. 2 and Supplementary Fig. S2). This result clearly demonstrates the genetic effect of a single addition chromosome 5A from the shallot on the A. fistulosum flavonoid pool. The addition of chromosome 5A from shallots induced the accumulation of flavonoids in FF, which is an important agronomic trait for both plant stress tolerance and human health. Previous reports demonstrated that shallots exhibited higher flavonoid levels than the bulb onion and Japanese bunching onion, which could be a metabolic phenotype characteristic that enables shallots to sustain the abiotic and biotic stress conditions in subtropical regions 9,16,29,30 . Sixteen flavonol compounds have been reported previously in onions and shallots, including the aglycone and glycosylated products of quercetin, kaempferol, and isorhamnetin 17,31,32 . Quercetin glucosides are the major flavonols present in onions and shallots and are mainly present in the form of quercetin, quercetin-4′-glucoside, and quercetin-3, 4′-diglucoside 17,29,33 . However, the presence of flavone and anthocyanin glycosides, such as luteolin-4′-O-glucoside and petunidin 3-O-glucoside, respectively has not been reported in shallots.
The accumulation of flavonoids in FF5A could result from the localization of flavonoid structural or regulatory genes on the shallot chromosome 5A. Therefore, we carried out transcriptome analyses using AA and MALs, with FF as a control (Fig. 5). The transcriptome analyses revealed that 13 unigenes were upregulated in both AA and FF5A, whereas eight unigenes were downregulated (Fig. 5B). In addition, the transcriptome analysis showed a positive correlation between AA and FF5A (r = 0.54), similar to the metabolomic data (Figs 4C and 5D), indicating a similar tendency in the metabolomic and transcriptomic profiles between AA and FF5A. The transcripts encoded by the flavonoid biosynthesis (CHS, F3H, and DFR), glycosylation (UGT78D2 and UGT73D3), acylation (A5AAT-L/5AT), transporter (PDR-L ABC), and regulatory (MYB14) genes exhibited higher expression levels in both AA and FF5A than in FF (Figs. 5A,C and Supplementary Table S9). The expression of CHS, which encodes the enzyme for the first step in flavonoid biosynthesis, has been reported to be regulated by the C locus in onions 14 . The CHS enzyme plays an important role in providing chalcone scaffolds for the production of intermediate and final products of the flavonoid pathway (Fig. 7). DFR is a main enzyme in the anthocyanin biosynthesis that controls the stereospecific reduction of the 4-keto group of dihydroflavonols to the corresponding leucoanthocyanidins (Fig. 7), and a mutation within the R locus, which contains the DFR gene, is responsible for the variation between red and yellow onion cultivars 34 .
Although the central pathway for flavonoid biosynthesis is conserved in plants, a group of transferase enzymes, such as glycosyltransferase, acyltransferase, and methyltransferase, modify the flavonoid skeleton (Fig. 7), leading to the different flavonoid compounds within different plant species 35 . The glycosylation of flavonoids increases their solubility and stability in plants and affects their physiological activities and subcellular localization; therefore, the glycosylation represents an extremely important regulation point in the flavonoid homeostasis [35][36][37] . In the present study, several glycosyltransferase-related genes were upregulated in the shallot and/or FF5A, including UGT72E3, UGT73B1, UGT73B5, UGT73C5, UGT78D2, UGT78D3, and UGT85A5 (Supplementary Table S9). Among these, UGT78D2 and UGT78D3 were upregulated in both AA and FF5A (Figs. 5A,C). UGT78D2 and UGT78D3 enzymes have been previously identified in Arabidopsis as family 1 glycosyltransferase, which utilizes UDP-sugars as sugar donors for flavonoid glycosylation 34 . UGT78D2 catalyzes the glycosylation of both flavonols and anthocyanins at the C-3 position, and the Arabidopsis ugt78d2 mutant exhibited a low anthocyanin level and lacks 3-O-glucosylated quercetin as compared with wild-type plants 34,38 . Our transcripts (CL3639.Contig2 and CL3639.Contig3) encoded by the UGT78D2 gene exhibited high sequence similarity with the onion UFGT2 gene (Fig. 6A). A comparative expression of UFGT2 in different onion cultivars demonstrated that UFGT2 was www.nature.com/scientificreports www.nature.com/scientificreports/ highly expressed in the red onion cultivar, whereas a low expression level was detected in yellow and white onion cultivars 39 . UGT78D3 is a flavonol arabinosyltransferase enzyme that uses UDP-arabinose as a sugar donor and quercetin as a sugar acceptor, and the Arabidopsis ugt78d3 mutant exhibited a lower accumulation of flavonol www.nature.com/scientificreports www.nature.com/scientificreports/ 3-O-arabinosides than did wild-type plants 40 . However, the correlation analysis demonstrated that UGT78D3 was weakly correlated with the flavonoid biosynthesis genes in comparison with UGT78D2 40 . In general, our data and previous results indicated that 3-O-glycosylations by UGT78D2 and UGT78D3 may play an important role in the flavonoid accumulation of the shallot (AA) through chromosome 5A.
Similar to glycosylation, the acylation of oxygen and nitrogen containing anthocyanins to produce esters and amides, respectively, by acyltransferase enzymes influences the stability of flavonoids at a neutral pH 41 . In addition, the flavonoid acylation plays a key role in flavonoid storage and transport into the vacuole 42 . In the present study, both AA and FF5A exhibited upregulation in the transcript (Unigene15761) encoded by the A5AAT-L/5AT gene (Figs. 5A,C). The phylogenetic analysis indicated that our transcriptome sequence of A5AAT-L/5AT exhibited high sequence similarity with the onion AT1 and A. roylei A5AAT-L genes (Fig. 6B). A5AAT-L/5AT belongs to the BAHD acyltransferase family, which catalyzes the transfer of the coumaroyl or caffeoyl moiety to the 5-glucose of anthocyanidin 3,5-diglucosides in different plant species 41 . However, the molecular characterization of A5AAT/5AT in onions and shallots remains elusive. The aromatic acylation of anthocyanin in the grapevine (Vitis vinifera) shifts its color toward dark reddish blue or purple, and the acylated anthocyanins comprise more than 60% of the total anthocyanin contents in various V. vinifera cultivars 43 . Acylation is not only important for flavonoid color stability, but it also is essential for flavonoid transport in the grapevine. For example, antho-multidrug and toxic extrusion [antho-MATE1 (AM1)] and AM3 proteins mediate specifically acylated anthocyanin transport but could not transport cyanidin 3-O-glucoside or malvidin 3-O-glucoside, suggesting that the acyl conjugation was essential for the uptake 42 . Similar to many secondary metabolites, flavonoids are synthesized in the cytosol and transported into the vacuole to prevent cellular damage, where MATE and ATP-binding cassette (ABC) proteins are involved 44,45 . PDR-type ABC transporters have been reported in secretion of the flavonoid glycoside genistein from soybean roots 46 , and a multidrug resistance-associated protein (MRP), ZmMrp3, is required for anthocyanin transport in Zea mays 47 . However, the transporter system of flavonoids in onions and shallots remains unknown. In the present study, we identified a transcript encoded by the PDR-L ABC transporter gene that exhibited upregulation in both AA and FF5A in comparison with FF (Figs. 5A,C). The characterization of its in vivo role would supply more evidence of a relationship between the transport mechanism and the flavonoid composition in onions and shallots.
Transcriptional activation complex 'MBW, ' consisting of R2R3-MYB and basic helix-loop-helix (bHLH) TFs, and a WD-repeat protein are thought to be the key regulatory points in determining the spatial and temporal accumulation of flavonoids in plants 2,27 . In Asiatic lilies, R2R3-MYB (LhMYB6) was able to activate the promoters of the CHSa and DFR genes and positively regulate anthocyanin biosynthesis only when LhMYB6 form a protein-protein heterodimer interaction with bHLH2 48  In conclusion, this study provides a comprehensive evaluation of the genetic effect of shallot chromosome 5A on the A. fistulosum metabolome. Integrated widely targeted metabolome and transcriptome analyses enabled us to identify several quercetin, flavone and anthocyanin glucosides in the bulb extract of the FF genome with an extra chromosome 5A from the shallot. The transcriptome data showed an upregulation of biosynthesis (CHS, F3H, and DFR), glycosylation (UGT78D2 and UGT78D3), acylation (A5AAt/5AT), transport (PDR-L ABC), and regulatory (MYB14) genes in both AA and FF5A relative to FF (Fig. 5A). The upregulation of these genes in AA and FF5A in comparison with FF could be the switch key for flavonoid accumulation in these genotypes. The in vivo characterization of these genes using transgenic lines will provide further validation of their role in shallot and onion flavonoid biosynthesis. Widely targeted metabolomics of FF, AA, and MALs. Bulb tissues [3 replicates × 10 genotypes (FF, AA, and eight MALs)] were freeze-dried using a rotary evaporator (BUCHI, Rotavapor R-3) coupled with a vacuum pump (BUCHI, v-700) under reduced pressure at 10 °C ± 1. The sample preparation process was performed www.nature.com/scientificreports www.nature.com/scientificreports/ automatically by a liquid handling system (Microlab Star Plus, Hamilton) as described by Sawada et al. (2017) 51 . Briefly, 4 mg dry weight of bulb tissues was accurately weighted and transferred into a 2 ml tube with a 5 mm Zirconia bead. The metabolites were extracted using a proportional volume of 4 mg ml −1 extraction solvent (80% methanol, 0.1% formic acid, 16.8 nmol L −1 lidocaine, and 105 nmol L −1 10-camphorsulfonic acid as internal standards) using a multi-bead shocker (Shake Master NEO, Bio Medical Science) at 1000 rpm for 5 min. After centrifugation, the extracts were diluted to 40 µg ml −1 using an extraction solvent. Then 250 µL of the extract was transferred to a 96-well plate, dried, redissolved in 250 µL of ultra-pure water, and filtered using Whatman ® UNIFILTER ® plates 384 (GE Healthcare). One microliter of the solution extract at a final concentration of 40 µg ml −1 was subjected to widely targeted metabolomics using LC-QqQ-MS (UPLC-TQS, Waters). A KEGG ID and name, as well as our internal standard ID, were assigned to each metabolite. For detailed information, see Supplementary Tables S1-S4.
Metabolomic data analysis. The metabolomic data matrix of 499 metabolite intensities was obtained via LC-QqQ-MS analysis (Supplementary Table S1). The metabolomic data were analyzed using the statistical software R (version 3.2.2, http://www.R-project.org/). After the missing values were set to 20, the signal intensities of each experimental group were averaged in individual metabolites. The metabolites with a S/N (defined as ratio of average signal intensity to that of the extraction solvent control) < 5 in all experimental groups were removed. In addition, the metabolites with RSD > 0.30 in all experimental groups were removed, leaving 123 metabolites for further analysis. The intensities of the 123 metabolites were divided by those of the internal standards, and the resulting data matrix was used for comparative analysis (Supplementary Table S5). Initially, all 123 metabolites were subjected to pathway analysis using the MetaboAnalysis 4.0 database (http://www.metaboanalyst.ca/). Then a multivariate analysis was performed to evaluate the general trend in the metabolomic data in the different genotypes. The significant (P < 0.05) accumulation of metabolite intensities in each genotype was analyzed via t-test using FF genotypes as a control. A correlation analysis between AA and MALs was carried out using the Pcc in R v. 3.2.2. In addition, all of the significantly (P < 0.05) accumulated metabolites in both AA and FF5A, in comparison with FF, were used to generate a metabolite network. The metabolite network was developed based on the Pcc using Correlation Calculator 1.0.1, and the network was plotted using Cytoscape 3.3.0 based on Arabidopsis database with the following parameters: base edges on the Pcc, tooltip labels none, and range for edges from −1.0 to 1.0.
Transcriptome analysis of FF, AA, and MALs. Total RNA was extracted using an RNeasy Plant Mini Kit (QIAGEN Sciences, Germantown, Maryland, USA). RNA quality was assessed using a BioSpec-nano (Shimadzu, Kyoto, Japan) and Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). The cDNA library was constructed using TruSeq ™ RNA Sample Preparation Kit (Illumina, San Diego, CA, USA) in accordance with the manufacturer's instructions. The cDNA library was prepared in accordance with Illumina's protocol, and paired-end sequencing was performed using Illumina's HiSeq 2500 platform. The raw quality reads including adapter sequence and unknown nucleotides more than 5% and low-quality nucleotides (QV < 10) more than 20% in length were respectively excluded. The clean reads were assembled into contigs by Trinity r20121005 with the parameters -seqType fq-min_contig_length 100_group_pairs_distance 250-path_reinforcement_distance 85-min_kmer_cov 2 52 . In each sample, the contigs were clustered by a TGI Clustering Tool (TGICL) v2.1 53 with the parameters -l 40 -c 10-v 20 and further assembled into unigenes by Phrap 23.0 54 with the parameters -repeatstringency 0.95 -minmatch 35 -minscore 35. We searched 56,161 unigene sequences against the International Rice Genome Sequencing Project (IRGSP 1.0; http://rgp.dna.affrc.go.jp/IRGSP/), the Arabidopsis Information Resource (TAIR10; http://www.arabidopsis.org), and the NCBI's non-redundant (nr) protein database (http://www.ncbi.nlm.nih.gov) using the BLASTX 55 program with an E-value cutoff of 1E-10. Based on the number of reads mapped onto the unigene sequences, the reads per kilobase per million mapped reads (RPKM) value of each gene was calculated using an in-house Perl script. We tested for differences between the normalized means of AA and MALs compared with FF as a control. Genes were classified as differentially expressed if they exhibited ≥2-fold changes and a false discovery rate (FDR) < 0.05 using EdgR. All of the transcriptome data, including gene expression, sequence data, and gene annotation, were deposited in our open access Allium TDB.
Phylogenetic analysis of flavonoid glycosylation, acylation, and TF genes. The phylogenetic tree was developed by using the nucleotide sequences of glycosylation, acylation, and TF genes from FF, AA, and FF5A located in our Allium TDB, and their orthologs were obtained from the NCBI. The sequence alignments were constructed with a ClustalW package in BioEdit software. Then the phylogenetic tree was developed using the neighbor-joining method with bootstrap values (1000 replicates) using the MEGA 6.0 program.