Elucidation of the core betalain biosynthesis pathway in Amaranthus tricolor

Amaranthus tricolor L., a vegetable Amaranthus species, is an economically important crop containing large amounts of betalains. Betalains are natural antioxidants and can be classified into betacyanins and betaxanthins, with red and yellow colors, respectively. A. tricolor cultivars with varying betalain contents, leading to striking red to green coloration, have been commercially produced. However, the molecular differences underlying betalain biosynthesis in various cultivars of A. tricolor remain largely unknown. In this study, A. tricolor cultivars with different colors were chosen for comparative transcriptome analysis. The elevated expression of AmCYP76AD1 in a red-leaf cultivar of A. tricolor was proposed to play a key role in producing red betalain pigments. The functions of AmCYP76AD1, AmDODAα1, AmDODAα2, and AmcDOPA5GT were also characterized through the heterologous engineering of betalain pigments in Nicotiana benthamiana. Moreover, high and low L-DOPA 4,5-dioxygenase activities of AmDODAα1 and AmDODAα2, respectively, were confirmed through in vitro enzymatic assays. Thus, comparative transcriptome analysis combined with functional and enzymatic studies allowed the construction of a core betalain biosynthesis pathway of A. tricolor. These results not only provide novel insights into betalain biosynthesis and evolution in A. tricolor but also provide a basal framework for examining genes related to betalain biosynthesis among different species of Amaranthaceae.

Figure 1. Identification of AmCYP76AD1 as a key element required for betalain pigment production in Amaranthus tricolor. (a) The leaf-color phenotypes of the red-leaf cultivar (AMR) and green-leaf cultivar (AMG) of three-week-old A. tricolor. (b) Extraction of chlorophyll pigments (hydrophobic layer) and betalain pigments (hydrophilic layer) from three-week-old leaves of AMR and AMG (left panel). Absorbance spectra of the extracted betalain pigments from AMR and AMG (right panel). The absorbance at 538 nm for betacyanins is indicated with a red dashed line, and the absorbance at 476 nm for betaxanthins is indicated with a yellow dashed line. (c) Liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis of three-week-old leaves of AMR and AMG. Shown are extracted ion chromatograms (XICs) of masses corresponding to tyrosine (m/z = 182), L-DOPA (m/z = 198), betalamic acid (m/z = 212), betanidin (m/z = 389), and betanin (m/z = 551). Time, retention time (min). (d) Expression levels of genes related to the betalain biosynthesis pathway in AMR and AMG analyzed by qRT-PCR. Statistically significant differences were determined using Student's t-test (*P < 0.01 for AMR vs. AMG). (e) Putative core betalain biosynthesis pathway in A. tricolor. Am, Amaranthus tricolor; CYP76AD1, cytochrome P450 76AD1; DODA,

color (hereafter referred to as AMR and AMG, respectively) (Fig. 1a-c, Supplementary Fig. S1a, b), specific primer pairs were designed based on available sequence information from the NCBI database or previously published studies to selectively examine the transcript levels of genes related to the betalain biosynthesis pathway by qRT-PCR (Supplementary Table S1). Among them, the AmADH, AmCYP76AD1, AmDODA, AmcDOPA5GT, AmB5GT, and AmUGT79B30-like 4 genes, which encode putative enzymes, were thought to be directly involved in betalain biosynthesis 30.
AmMYB1, encoding an R2R3-type transcription factor, was proposed to participate in the regulation of betalain biosynthesis 10,30. The AmPPO and AmCATPO genes, encoding tyrosinases, were considered for their roles in the conversion of L-DOPA to cyclo-DOPA 1,30. AmTyDC, encoding a tyrosine decarboxylase, was proposed to participate in the degradation of betalain 1,30. In 3-week-old A. tricolor, AmCYP76AD1 and AmPPO showed higher expression levels in AMR than in AMG (Fig. 1d). Notably, only AmCYP76AD1 exhibited a highly differential expression pattern, showing an ~200-fold difference between AMR and AMG. In contrast, AmDODA, AmcDOPA5GT, AmB5GT, AmUGT79B30-like 4, AmMYB1, AmADH, AmCATPO, and AmTyDC did not show a significant differential expression pattern between AMR and AMG (Fig. 1d). The highly differential expression pattern of AmCYP76AD1 between AMR and AMG was also observed in 4-week-old A. tricolor (Supplementary Fig. S1c). Moreover, as a key element in the initiation of the betalain biosynthesis pathway, AmCYP76AD1 displayed higher transcript levels in the upper leaves of AMR, which contained more betalains than the lower leaves of AMR (Fig. 2a,b, Supplementary Fig. S2). Further phylogenetic reconstruction and LOGO analysis revealed that AmCYP76AD1 belongs to the CYP76ADα clade (Fig. 2c,d), whose members possess both the tyrosine hydroxylase and L-DOPA oxidase activities required for L-DOPA and cyclo-DOPA formation, respectively (Fig. 1e). These results suggest that the elevated expression of AmCYP76AD1 is necessary for betalain pigment accumulation, which leads to an obvious red-violet color in the leaves and stems of AMR, but not in those of AMG.
AmDODA exhibits a marginal level of L-DOPA 4,5-dioxygenase activity. Although candidate transcripts related to betalain biosynthesis were identified previously in A. tricolor 30,31, their functional and enzymatic activities have not yet been characterized. To functionally characterize the enzyme activities of AmCYP76AD1, AmDODA, and AmcDOPA5GT in the core pathway of betalain biosynthesis (Fig. 1e), 35S promoter-driven cDNAs encoding C-terminal YFP- or FLAG (SFP)-tagged AmCYP76AD1, AmDODA, and AmcDOPA5GT were transiently coexpressed in N. benthamiana leaves by agroinfiltration. Upon expression, only a small, barely detectable amount of betalain pigment was produced in N. benthamiana leaves (Fig. 3a). In contrast, as a positive control, high production of betalain pigments with a red-violet color was observed when the Beta vulgaris tyrosinase gene (BvCYP76AD1), the B. vulgaris L-DOPA 4,5-dioxygenase gene (BvDODAα1), and the Mirabilis jalapa cyclo-DOPA 5-O-glucosyltransferase gene (MjcDOPA5GT) were coexpressed in N. benthamiana leaves (Fig. 3a). To identify which A. tricolor gene was responsible for the negligible betalain synthesis in the transient assay, a series of coinfiltration assays was carried out by replacing the positive control genes individually with AmCYP76AD1, AmDODA, and AmcDOPA5GT. The replacement of BvCYP76AD1 and MjcDOPA5GT by AmCYP76AD1 and AmcDOPA5GT, respectively, resulted in high amounts of betalain pigment and betanin production in N. benthamiana leaves (Fig. 3b,c). However, AmDODA failed to replace the function of BvDODAα1. The coexpression of BvCYP76AD1, AmDODA, and MjcDOPA5GT produced only marginal, barely detectable levels of betalain pigments and betanin (Fig. 3b,c). Together with the comparable levels of proteins detected by western blotting (Fig. 3d, Supplementary Fig. S3), these results suggest that the L-DOPA 4,5-dioxygenase activity of AmDODA is very low compared to that of BvDODAα1.
Two DODAα homologues are present in A. tricolor. Recently, a phylogenetic study of Caryophyllales suggested that at least two DODAα genes are present in betalain-pigmented species, including Amaranthus hypochondriacus 12. To identify the DODAα homologue exhibiting a high level of L-DOPA 4,5-dioxygenase activity in A. tricolor, RNA sequencing of aerial tissues derived from AMR and AMG plants was performed on the Illumina HiSeq 4000 platform. Two transcript libraries of AMR and AMG were built from the high-quality reads through de novo assembly and functional annotation (Supplementary Tables S2, S3). The relative abundance of transcripts between AMR and AMG was illustrated in an MA plot (Fig. 4a). In addition, the relevant genes involved in the synthesis of betalain pigments were identified through in silico analysis and further highlighted in the MA plot (Fig. 4a, Supplementary Table S4). As expected, only AmCYP76AD1 was expressed at a significantly higher level in AMR than in AMG (Fig. 4a). These results suggest that AmCYP76AD1 is the key enzyme responsible for betalain pigment accumulation in AMR and that the loss of AmCYP76AD1 expression in AMG results in the green color phenotype. Additionally, two DODAα homologues, AmDODAα1 and AmDODAα2 (the latter referred to as AmDODA above), were recovered through in silico analysis (Supplementary Fig. S4, Table S4). This indicated that gene duplication has occurred at least once in the DODAα lineage of A. tricolor. A reduced phylogenetic tree of DODAα was further generated using AmDODAα1, AmDODAα2, and previously characterized DODAα homologues from B. vulgaris, Carnegiea gigantea, Chenopodium quinoa, Mesembryanthemum crystallinum, M. jalapa, Parakeelya mirabilis, and Stegnosperma halimifolium (Fig. 4b). Two clades, DODAα1 and DODAα2, were obtained, and each of them presented seven previously identified conserved residues that are functionally important for high and marginal activities of L-DOPA 4,5-dioxygenase, respectively (Fig. 4c).
Among these sequences, AmDODAα1 belongs to the DODAα1 clade and contains seven residues (DDYNDEI) associated with high L-DOPA 4,5-dioxygenase activity; AmDODAα2 (AmDODA) belongs to the DODAα2 clade and contains seven residues (YGFKNNT) associated with marginal L-DOPA 4,5-dioxygenase activity. These results suggest that AmDODAα1 may exhibit the high level of L-DOPA 4,5-dioxygenase activity required for betalain pigment production in A. tricolor.

AmDODAα1, but not AmDODAα2, exhibits a high level of L-DOPA 4,5-dioxygenase activity. As a key step in betalain biosynthesis, L-DOPA 4,5-dioxygenase can convert L-DOPA into betalamic acid, the basic structural unit of all betalains 1,32. To functionally characterize the L-DOPA 4,5-dioxygenase activity of AmDODAα1, AmDODAα1 was coexpressed with BvCYP76AD1 and MjcDOPA5GT by agroinfiltration. As a result, high production of betalain pigments and betanin was observed when comparable amounts of proteins were expressed in N. benthamiana leaves (Fig. 3b-d, Supplementary Fig. S3). These results indicate that AmDODAα1, but not AmDODAα2, exhibits a high level of L-DOPA 4,5-dioxygenase activity, similar to that of BvDODAα1.
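The seven-residue signatures reported for AmDODAα1 (DDYNDEI) and AmDODAα2 (YGFKNNT) lend themselves to a simple position-by-position comparison. The following Python sketch is illustrative only: the function names and the "ambiguous" fallback are assumptions, and the seven residues are assumed to have already been extracted from an alignment, because the alignment positions are not given in this text. Homologues from other species may carry variant residues, so a match count is at most a rough indicator and not a substitute for the functional assays.

```python
# Seven conserved residues reported in the text for the two A. tricolor homologues.
HIGH_ACTIVITY_MOTIF = "DDYNDEI"      # AmDODAα1 (DODAα1 clade)
MARGINAL_ACTIVITY_MOTIF = "YGFKNNT"  # AmDODAα2 (DODAα2 clade)


def count_matches(motif: str, reference: str) -> int:
    """Count positions at which two equal-length motifs carry the same residue."""
    return sum(a == b for a, b in zip(motif, reference))


def classify_doda_motif(motif: str) -> str:
    """Assign a seven-residue motif to the closer of the two reference signatures.

    Hypothetical helper: the motif must be supplied by the caller, already
    extracted from a multiple sequence alignment.
    """
    motif = motif.upper()
    if len(motif) != len(HIGH_ACTIVITY_MOTIF):
        raise ValueError("expected a seven-residue motif")
    high = count_matches(motif, HIGH_ACTIVITY_MOTIF)
    marginal = count_matches(motif, MARGINAL_ACTIVITY_MOTIF)
    if high > marginal:
        return "DODA-alpha1-like"
    if marginal > high:
        return "DODA-alpha2-like"
    return "ambiguous"


if __name__ == "__main__":
    for candidate in ("DDYNDEI", "YGFKNNT"):
        print(candidate, classify_doda_motif(candidate))
```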
To verify enzyme activity in vitro, AmDODAα1 and AmDODAα2 were expressed as SUMO-fused recombinant proteins in an Escherichia coli expression system (Fig. 5a). Enzymatic reactions were conducted following the method described by Sasaki et al. 32, in which crude extracts prepared from E. coli were used. After incubation for 5 min at 30 °C, a bright yellow color derived from betalamic acid was observed in the reaction mixture containing L-DOPA, ascorbic acid, and a crude extract prepared from E. coli harboring AmDODAα1 or BvDODAα1, but not AmDODAα2 (Fig. 5b). Only a very weak yellow color was observed even when the reaction mixture contained a twofold amount of crude extract prepared from E. coli harboring AmDODAα2 (Fig. 5b). As a control, a reaction mixture containing the crude extract was prepared from E. coli harboring only the vector, and no color was

Reconstruction of the core betalain biosynthesis pathway of A. tricolor in N. benthamiana.
In this study, we also attempted to use TRV-based virus-induced gene silencing (VIGS) to examine the functional activities of genes involved in betalain biosynthesis in A. tricolor. However, the transient silencing of AmCYP76AD1 in A. tricolor was particularly challenging and failed in our hands. In addition, the attempted overexpression of AmCYP76AD1 to complement the betalain pigments in the leaves of AMG was unsuccessful using an agroinfiltration system. These differences might have resulted from the different varieties and low transformation efficiency of A. tricolor 33.
To reconstruct the core betalain biosynthesis pathway of A. tricolor, AmCYP76AD1, AmDODAα1, and AmcDOPA5GT were transiently overexpressed in N. benthamiana leaves by agroinfiltration for the heterologous engineering of betalain pigments. Similar to the vector-only control, the heterologous expression of AmCYP76AD1, AmDODAα1, or AmcDOPA5GT alone was not sufficient to produce any betalain pigment in N. benthamiana (Fig. 6a). However, low production of betalain pigments was observed when AmCYP76AD1 and AmDODAα1 were coexpressed in N. benthamiana (Fig. 6a). In contrast, no betalain pigment was observed when AmCYP76AD1 and AmcDOPA5GT or AmDODAα1 and AmcDOPA5GT were coexpressed in N. benthamiana (Fig. 6a). Only the coexpression of AmCYP76AD1, AmDODAα1, and AmcDOPA5GT together was sufficient to produce high amounts of betalain pigments in N. benthamiana, which resulted in a strong red-violet color (Fig. 6a). The strong red-violet color was similar to that in the positive control in which BvCYP76AD1, BvDODAα1, and MjcDOPA5GT were coexpressed in N. benthamiana (Fig. 6a). As expected, the coexpression of AmCYP76AD1, AmDODAα2, and AmcDOPA5GT produced only marginal, barely detectable levels of betalain pigments (Fig. 6a). Consistently, high production of betanin was observed only when AmCYP76AD1, AmDODAα1, and AmcDOPA5GT were coexpressed in N. benthamiana leaves (Fig. 6b). Together with the comparable amounts of proteins detected by western blotting (Fig. 6c, Supplementary Fig. S5), our results suggest that the enzyme activities of AmCYP76AD1, AmDODAα1, and AmcDOPA5GT are sufficient to construct the core betalain biosynthesis pathway of A. tricolor.

Discussion
Molecular genetics has shed light on the betalain biosynthesis pathway and its evolutionary significance in Caryophyllales. Based on phylogenetic analysis, CYP76AD homologues can be classified into α, β, and γ clades 9. To date, only the functions of CYP76ADα and CYP76ADβ clade homologues, such as CYP76AD1 and CYP76AD6, have been reported 10. For example, the cosilencing of CYP76AD1 and CYP76AD6 represses the production of betacyanins and betaxanthins in B. vulgaris, causing a green leaf phenotype 16. In this study, a CYP76AD6-like (AmCYP76AD6) gene, belonging to the CYP76ADβ clade according to phylogenetic construction and LOGO analysis (Fig. 2c,d), was also identified in A. tricolor through transcriptome analysis (Supplementary Fig. S6). However, the expression of AmCYP76AD6 was extremely low and was difficult to detect in AMR and AMG. As a result, it is difficult to functionally connect AmCYP76AD6 with the production of betalains in A. tricolor. In addition, although PPO, a polyphenol oxidase gene, and CATPO, a catalase-phenol oxidase gene, were previously proposed to be involved in betalain biosynthesis via monophenolase activity 34,35, their transcripts did not show highly differential expression patterns between AMR and AMG (Fig. 1d, Supplementary Fig. S1c). As a result, we propose that the elevated expression of AmCYP76AD1 is necessary for the occurrence of a red-violet color phenotype in A. tricolor; in contrast, the loss of AmCYP76AD1 expression results in a green color phenotype in A. tricolor (Fig. 1a-d). The presence of the AmCYP76AD1 gene in AMG, verified by PCR using genomic DNA as a template, supports a loss of AmCYP76AD1 expression, rather than of the gene itself, in AMG (Supplementary Fig. S7). Together with the functional characterization of the enzymatic activity of AmCYP76AD1 through the heterologous engineering of betalain pigments in N. benthamiana (Figs. 3b, 6a), we conclude that AmCYP76AD1, a CYP76ADα homologue required for the initiation of the betalain biosynthesis pathway, plays a key role in betalain pigment accumulation in A. tricolor. Accordingly, AmCYP76AD1 displayed higher transcript levels in the upper leaves of AMR, which contained more betalains than the lower leaves of AMR (Fig. 2a,b, Supplementary Fig. S2).
In recent years, with the elucidation of the central committed steps of the betalain biosynthesis pathway, comparative transcriptome analyses have been intensively applied to identify genes involved in regulating betalain  12,21 . Thus, it is necessary to examine the possible involvement of annotated genes in betalain biosynthesis on the basis of experimental evidence. In this study, the AmDODAα1 and AmDODAα2 genes, which belong to the DODAα clade according to phylogenetic construction and LOGO analysis (Fig. 4b,c), were identified in A. tricolor through transcriptome analysis ( Supplementary Fig. S4, Table S4). Based on the heterologous engineering of betalain pigments in N. benthamiana and in vitro biochemical studies (Figs. 3b, 5b), we report that AmDODAα1 displayed a high level of L-DOPA 4,5-dioxygenase activity to produce betalamic acid, but such activity was barely detectable for AmDODAα2. These results indicate that at least one duplication event has occurred in the DODAα lineage of A. tricolor, and the primary function of AmDODAα2 remains to be further studied.
Betalains are composed of betacyanins and betaxanthins. In contrast to betaxanthins, which are derived from betalamic acid via spontaneous condensation with amino acids or other amines, a large number of betacyanins are composed of betanidin conjugated with glycosyl moieties 9,10. We characterized the function of AmcDOPA5GT, a cyclo-DOPA 5-O-glucosyltransferase gene, through the heterologous engineering of betalain pigments in N. benthamiana. The coexpression of AmCYP76AD1, AmDODAα1, and AmcDOPA5GT enabled the production of high levels of betalain pigments with a dark red color (Fig. 6a). In contrast, low production of betalain pigments was observed when AmCYP76AD1 and AmDODAα1 were coexpressed (Fig. 6a). Our results suggest the importance of AmcDOPA5GT in the glycosylation reaction during betalain biosynthesis in A. tricolor. In fact, the metabolic pathway of betalain biosynthesis is very complex due to multiple glycosylation steps, and different betacyanins have been identified 10,38. For example, betanin, the most common betacyanin, is produced not only by cyclo-DOPA 5-O-glucosyltransferase but also by betanidin 5-O-glucosyltransferase through the glycosylation of betanidin 39,40. In this study, AmB5GT, a betanidin 5-O-glucosyltransferase gene, was also identified through comparative transcriptome analyses (Supplementary Table S4). Although AmcDOPA5GT showed higher expression levels than AmB5GT in both AMR and AMG (Supplementary Table S4), it remains to be determined which of the two glycosylation routes is more important for the formation of betanin in A. tricolor.
Recently, betalain biosynthesis in different pitaya species, such as Hylocereus polyrhizus, Hylocereus costaricensis, Hylocereus undatus, and Hylocereus megalanthus, has been intensively studied through comparative transcriptome analysis 36,37,41,42. However, further studies remain to be conducted to provide experimental evidence and strengthen the understanding of the roles of candidate genes in betalain biosynthesis. Here, complementation assays conducted through the heterologous engineering of betalain pigments in non-betalain-producing plants provided a solution for the easy and rapid comparison of the functional activities of genes involved in the core betalain biosynthesis pathway between betalain-pigmented species of Caryophyllales. Using the coexpression of BvCYP76AD1, BvDODAα1, and MjcDOPA5GT in N. benthamiana as a positive control, the functional activities of A. tricolor genes responsible for betalain synthesis could be compared through a series of complementation assays (Fig. 3b-d). We showed that comparable amounts of betalain pigments were observed when the positive control genes were individually replaced with AmCYP76AD1, AmDODAα1, and AmcDOPA5GT in transient coexpression assays (Fig. 3b-d).
In conclusion, a comparative transcriptome analysis combined with functional and enzymatic studies was performed to reveal the core betalain biosynthesis pathway of A. tricolor. The heterologous engineering of betalain pigments through the coexpression of AmCYP76AD1, AmDODAα1, and AmcDOPA5GT in N. benthamiana enabled the production of high amounts of betalain pigments with a red-violet color similar to those in the red-leaf cultivar of A. tricolor. Although the metabolic pathway of betalain biosynthesis is very complex, the core betalain biosynthesis pathway of A. tricolor constructed here not only provides a basal framework for examining genes related to betalain biosynthesis within the species of Amaranthaceae but also sheds light on the evolution of the betalain biosynthesis pathway in Caryophyllales.

Methods
Betalain pigment extraction and measurement. For betalain pigment measurement, betalain contents were determined as described previously with some modification 43. Briefly, leaves of seedlings were collected and ground into powder in liquid nitrogen. Betalain pigments were extracted with extraction solution (methanol:chloroform:H2O [1:2:1]). After centrifugation, the upper (hydrophilic) layer was collected to measure the absorbance at 538 nm and 476 nm for betacyanins and betaxanthins, respectively. The relative betalain content was calculated with the following equation: (A538 + A476)/gram.
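The relative betalain content calculation described under "Betalain pigment extraction and measurement" can be sketched in a few lines of Python. This is a minimal illustration only: the function and parameter names are hypothetical, and the text does not state whether the mass refers to fresh or dry weight, so the sketch simply takes the sample mass in grams.

```python
def relative_betalain_content(a538: float, a476: float, sample_mass_g: float) -> float:
    """Return (A538 + A476) per gram of sample.

    a538: absorbance of the hydrophilic layer at 538 nm (betacyanins)
    a476: absorbance of the hydrophilic layer at 476 nm (betaxanthins)
    sample_mass_g: mass of the ground leaf sample in grams (assumed unit)
    """
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    return (a538 + a476) / sample_mass_g


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(relative_betalain_content(a538=0.6, a476=0.4, sample_mass_g=0.1))
```

Dividing by the sample mass makes readings from samples of different size comparable, which is why the value is described as relative and not as an absolute pigment concentration.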
Plasmid construction. All plasmid constructs were generated using standard restriction site reconstruc-  45 . For the VIGS assay, a cDNA fragment of AmCYP76AD1 was amplified and subcloned into the pTRV2 vector 46 . The primer sequences used for plasmid construction are listed in Supplementary Table S5.

Quantitative real-time polymerase chain reaction (qRT-PCR) and statistical analysis. TRIzol™ (Invitrogen)-extracted total RNA was reverse transcribed using SuperScript III First-Strand Synthesis SuperMix (Invitrogen) according to the manufacturer's instructions. Briefly, each sample was prepared from the leaves of three biologically distinct 3-week-old or 4-week-old A. tricolor plants. Then, cDNA was synthesized from 1 μg of total RNA using a mixture of random hexamers and oligo(dT)20 under the following conditions: 25 °C for 10 min, followed by 50 °C for 40 min. The cDNA was employed as a template for qRT-PCR using the KAPA SYBR Fast qPCR Kit (Kapa Biosystems). Three technical replicates were performed on a CFX96™ Real-time System (Bio-Rad) under the following conditions: 95 °C for 3 min, followed by 40 cycles of 95 °C for 10 s and 55 °C for 30 s. The expression levels of selected genes were determined by normalization to the reference gene Actin. Statistically significant differences were determined using Student's t-test in SPSS version 20.0. The primer sequences employed for qRT-PCR analyses are listed in Supplementary Table S1. PCR analyses using genomic DNA extracted from AMR and AMG as a template were performed to confirm the specificity of the primers (Supplementary Fig. S7).
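As an illustration of the normalization and comparison steps in the qRT-PCR analysis, the Python sketch below computes expression relative to a reference gene and a two-sample Student's t statistic. It rests on assumptions not stated in the text: the ΔCt formulation with a doubling of product per cycle (100% amplification efficiency), the function names, and all Ct values shown. The authors ran the test in SPSS; this sketch only reproduces the pooled-variance t statistic and its degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of a target gene relative to the reference gene (e.g. Actin).

    Assumes the delta-Ct approach with perfect doubling per cycle.
    """
    return 2.0 ** -(ct_target - ct_reference)


def student_t(sample_a: list[float], sample_b: list[float]) -> tuple[float, int]:
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    # Hypothetical (target Ct, Actin Ct) pairs for three replicates per cultivar.
    amr = [relative_expression(t, r) for t, r in [(22.0, 18.0), (22.2, 18.1), (21.9, 18.0)]]
    amg = [relative_expression(t, r) for t, r in [(29.7, 18.0), (29.9, 18.2), (29.6, 17.9)]]
    print("fold difference (AMR/AMG):", mean(amr) / mean(amg))
    print("t statistic, df:", student_t(amr, amg))
```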
Transient coexpression assay and western blotting. Plasmids for the transient expression of AmCYP76AD1-YFP, AmDODAα1-SFP, AmDODAα2-SFP, AmcDOPA5GT-SFP, BvCYP76AD1-YFP, BvDODAα1-SFP, or MjcDOPA5GT-SFP were transformed into the Agrobacterium tumefaciens strain ABI. C-terminal tagged proteins were coexpressed using a mixture of A. tumefaciens carrying the desired constructs in N. benthamiana leaves by agroinfiltration following the method described previously 47. After three days, the infiltrated leaves were photographed and ground into a powder in liquid nitrogen for total cell extract preparation. Briefly, 0.1 g of sample powder was added to 0.2 ml of 2.5× SDS sample buffer (5 mM EDTA, 5% SDS, 0.3 M Tris-HCl, pH 6.8, 20% glycerol, 1% β-mercaptoethanol, and bromophenol blue), which was then heated at 95 °C in a dry bath for 10 min. After centrifugation at 13,000 × g for 10 min, the supernatant was obtained, and total proteins were separated by SDS-PAGE. Western blotting assays were performed to monitor protein levels using specific polyclonal and monoclonal antibodies against the YFP and FLAG tags, respectively. Chemiluminescence signals generated by ECL reagents (PerkinElmer) were captured with an ImageQuant LAS 4000 mini imager (GE Healthcare). All experiments were repeated at least three times using biologically distinct samples prepared from two infiltrated leaves.

LC-MS/MS was performed using a Dionex UltiMate 3000 system (Thermo Fisher Scientific) linked with an amaZon speed-ion trap mass spectrometer (Bruker). Betalamic acid was detected on a Waters BEH shield RP18 column with two eluting solvent systems: (A) H2O with 0.1% formic acid, (B) 100% acetonitrile. The gradient elution program was set as follows: 0-3 min (100% A), 9 min (55% A and 45% B), 12-13 min (100% B). The flow rate was 0.3 ml min⁻¹, and the detector wavelength was 424 nm.
The electrospray ionization mass parameters were set as follows: 4.5 kV capillary, 500 V end plate offset voltage, 40.0 psi nebulizer pressure, 8.0 l min⁻¹ dry gas, and 230 °C dry temperature. The measurement was operated in multiple reaction monitoring (MRM) in positive ion mode. The MRM transitions were set to 182 → 165 m/z to detect tyrosine, 198 → 181 m/z to detect L-DOPA, 212 → 166 m/z to detect betalamic acid, 389 → 345 m/z to detect betanidin, and 551 → 389 m/z to detect betanin.
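The MRM transitions listed above can be held in a small lookup table, which is convenient when annotating peaks. The Python sketch below is illustrative only: the transition values come from the text, while the function name and the 0.5 m/z matching tolerance are assumptions and not instrument settings reported by the authors.

```python
# Precursor -> product ion transitions (m/z, positive ion mode) as listed in the text.
MRM_TRANSITIONS = {
    "tyrosine": (182.0, 165.0),
    "L-DOPA": (198.0, 181.0),
    "betalamic acid": (212.0, 166.0),
    "betanidin": (389.0, 345.0),
    "betanin": (551.0, 389.0),
}


def identify_analyte(precursor_mz: float, product_mz: float, tolerance: float = 0.5) -> list[str]:
    """Return the analytes whose transition matches both ions within the tolerance.

    The tolerance is a hypothetical default chosen for this sketch.
    """
    return [
        name
        for name, (precursor, product) in MRM_TRANSITIONS.items()
        if abs(precursor_mz - precursor) <= tolerance
        and abs(product_mz - product) <= tolerance
    ]


if __name__ == "__main__":
    print(identify_analyte(551.1, 389.2))
    print(identify_analyte(300.0, 100.0))
```

Note that 389 m/z appears both as the betanidin precursor and as the betanin product ion, so matching on the precursor and product together, as done here, is what keeps the two apart.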
Next-generation sequencing and MA plot. To perform next-generation sequencing, aerial tissues derived from three biologically distinct 3-week-old A. tricolor plants were collected. Total RNA was extracted using the RNeasy Plant Mini Kit (Qiagen) according to the manufacturer's instructions. RNA quality was examined via 1.2% (wt/vol) formaldehyde gel electrophoresis and with an Experion RNA analysis kit (Bio-Rad, Munich). Only high-quality RNA was used for next-generation sequencing performed on the Illumina HiSeq 4000 platform with 150 paired-end reads. For each dataset (AMR and AMG), 100 million reads were generated, and de novo assembly was performed with the Trinity tool. The assembled transcripts were annotated with BlastX in UniProt. Gene expression levels were normalized as FPKM values, and differentially expressed genes were identified according to an FDR < 0.05 and logFC > 2 or < −2 (Supplementary Tables S2, S3). An MA plot was generated based on the average concentration (logCPM) and fold-change (logFC) values to show the relative abundances of transcripts between AMR and AMG.
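The thresholds used to call differentially expressed genes (FDR < 0.05 and logFC > 2 or < −2) and the coordinates of the MA plot (logCPM against logFC) can be sketched as follows. This Python example is an assumption-laden illustration: the `Transcript` record, the labels, and the example values are hypothetical, and the text does not state which cultivar is the numerator of the fold change, so "up" simply means a positive logFC.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    name: str
    log_cpm: float  # average abundance (x-axis of the MA plot)
    log_fc: float   # log fold change between cultivars (y-axis of the MA plot)
    fdr: float      # false discovery rate


def classify(t: Transcript, fdr_cutoff: float = 0.05, log_fc_cutoff: float = 2.0) -> str:
    """Label a transcript using the FDR and logFC thresholds given in the text."""
    if t.fdr < fdr_cutoff and t.log_fc > log_fc_cutoff:
        return "up"
    if t.fdr < fdr_cutoff and t.log_fc < -log_fc_cutoff:
        return "down"
    return "not significant"


def ma_points(transcripts: list[Transcript]) -> list[tuple[float, float, str]]:
    """Return (logCPM, logFC, label) triples ready to be drawn as an MA plot."""
    return [(t.log_cpm, t.log_fc, classify(t)) for t in transcripts]


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    examples = [
        Transcript("transcript_A", 6.2, 7.5, 1e-6),
        Transcript("transcript_B", 4.8, -0.3, 0.80),
        Transcript("transcript_C", 3.1, -2.9, 0.01),
    ]
    for point in ma_points(examples):
        print(point)
```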
Phylogenetic tree reconstruction and LOGO analysis. Phylogenetic trees were reconstructed using MEGA-X software based on the protein sequence comparisons of CYP76AD and DODA homologues from different betalain-producing species. Multiple sequence alignments were performed using the MUSCLE program and were processed to generate a maximum likelihood phylogenetic tree via the Jones-Taylor-Thornton (JTT) model with bootstrapping to perform molecular evolutionary analysis. The numbers at the branch points are bootstrap values representing the percentages of replicate trees based on 1000 repeats. LOGO analyses were performed via WebLogo (http://weblogo.berkeley.edu/logo.cgi) based on selected conserved amino acids of CYP76AD and DODA homologues reported previously 9,12,21,48. The species, families, and accession numbers of CYP76AD and DODAα homologues are available in Supplementary Table S6.