Lilac (Syringa oblata) genome provides insights into its evolution and molecular mechanism of petal color change

Color change during flower opening is common; however, little is understood on the biochemical and molecular basis related. Lilac (Syringa oblata), a well-known woody ornamental plant with obvious petal color changes, is an ideal model. Here, we presented chromosome-scale genome assembly for lilac, resolved the flavonoids metabolism, and identified key genes and potential regulatory networks related to petal color change. The genome assembly is 1.05 Gb anchored onto 23 chromosomes, with a BUSCO score of 96.6%. Whole-genome duplication (WGD) event shared within Oleaceae was revealed. Metabolome quantification identified delphinidin-3-O-rutinoside (Dp3Ru) and cyanidin-3-O-rutinoside (Cy3Ru) as the major pigments; gene co-expression networks indicated WRKY an essential regulation factor at the early flowering stage, ERF more important in the color transition period (from violet to light nearly white), while the MBW complex participated in the entire process. Our results provide a foundation for functional study and molecular breeding in lilac.

A s one of the most important ornamental traits, floral color diversity has not only attracted much attention in the breeding and development of new varieties, but also play an important role in the evolution of floral traits and ecological effects 1 . Flower color change is a common phenomenon that accompanies the developmental period. The flower color of many plants is formed early and then fades during flower opening. For example, Malus hupehensis 2 and Brunfelsia acuminata 3 have red and purple flowers respectively in early flower opening stage, which gradually change to white with the opening. In the petals of Chrysanthemum morifolium 4 and Cleome hassleriana 5 , the heavy pink coloration and anthocyanin content show the highest level before flower opening, and then the color gradually changed to light pink due to the continuous decrease in anthocyanin content. In addition, with the flower development the petal color of Paeonia 'Coral Sunset' and 'Pink Hawaiian Coral' 6 fade from coral to light pink to pale yellow, presenting color group change. Petal color changes is often accompanied by both accumulation and loss of anthocyanins; therefore, a better understanding of the metabolic dynamics and molecular basis of petal color change is important for prolonging flower coloring and improving ornamental and economic value.
The accumulation of floral color has been studied more intensively and is mainly determined largely by distinct pigments, including flavonoids, carotenoids and/or alkaloids 7 . Among which, flavonoids as the third-largest group of natural products, are widely known for their important roles in plant fitness, health benefits and food quality 8 . Anthocyanin biosynthesis, as part of the flavonoids pathway, is essential for the appearance of these colors, which are expressed in plant tissues in a wide variety, from red to purple and even blue 9 . As we know, pigment accumulation is mainly determined by biosynthetic genes and their transcriptional regulation via transcription factors (TFs), which plays a major role in flower color development 10,11 . Currently, advances in genetic engineering make rare flower color development and directed breeding possible 11,12 .
Actually, there have been many studies on the biosynthesis and regulatory mechanisms of flower color, yet the widespread phenomenon of petal color change has been neglected. The genus Syringa (Oleaceae), known for the outstanding beautiful corollas, consists of more than 27 species and 2000 cultivars worldwide, of which 22 are distributed in China, including 18 endemic species 13 . As the representative species, Syringa oblata (lilac) is native to China with more than 1000 years of cultivation 13 , owing to its elegant colors, unique fragrance, beautiful corollas, large inflorescence, and early flowering. Moreover, as one of the rare trees with purple flowers, S. oblata is the most widely cultivated as ecological and ornamental tree, spanning subtropical, warm temperate, temperate, and the edge of the cold temperate zones 14 .
Since the 20th century, there have been more than 200 new cultivars bred with S. oblata as excellent hybrid parents, including Syringa × hyacinthiflora and cultivars breeding by crossing between S. oblata and S. vulgaris. It is observed that lilac petals are darkest and red-purple in the bud stage, then gradually change to violet and finally fade close to white. Differences in anthocyanins lead to flower color changes and the production of various flower colors 11 ; however, the biochemical and molecular basis for this change is still unclear. Lilac is considered as an excellent material to study the mechanism of petal color change because of its phenotypic advantages.
In lilac flowers, only two anthocyanins have been reported in the full blooming stage 15 , so a comprehensive assay and the metabolic dynamic analysis is lacking. Several studies have delved into the expression difference of some synthesis genes in the flavonoid biosynthetic pathway during flower developmental stage and cloned and verified the functions of some genes 16 , however, with the absence of genomic evidence, it is difficult to explore their regulatory mechanisms 16 . Therefore, the whole-genome sequence of S. oblata is needed to develop an in-depth understanding of the evolution and potential regulatory networks of petal color formation underlying biochemical process in successive flower development stages 12 .
Here, we present a high-quality chromosome-scale genome assembly of S. oblata, and it was determined by a combination of long-read sequencing and Hi-C scaffolding technologies. In total, 1.05 Gb and 93.75% of the assembled genome was anchored on 23 chromosomes with a scaffold N50 of 46.33 Mb. Furthermore, we predicted to contain 35,313 protein-coding genes and clarified the species WGD history. The metabolic dynamic results systematically revealed the composition and relative quantification of compounds related to flavonoid biosynthesis in lilac petals. Moreover, combining gene annotation, transcriptome and metabolome evaluations, we revealed a time-ordered co-expression network of color-changing petals, and identified the enzyme genes and potential regulatory factors controlling the key pigments associated with petal color change. This reference genome sequence, metabolic dynamic results and the generated gene regulatory network for petal color change are expected for further functional genomics investigations and molecular breeding.

Results
Genome sequencing, assembly, and quality assessment. Genome sequencing resulted in a total of 237.19 Gb of nanopore long-reads (Supplementary Fig. 1 Fig. 2). Here, we obtained an assembly ready for Hi-C scaffolding, which was 1.12 Gb in the length of contig with an N50 length of 3.96 Mb (Supplementary Table 4).
With the Hi-C scaffolding, the final assembly was generated by anchoring 443 contigs to 23 chromosome-scale super scaffolds with a total length of 1.05 Gb (Table 1; Supplementary Table 5; Supplementary Fig. 3), accounting for 93.75% (1.05/1.12Gb) of  Table 9). Among which, in the GO annotation, 6681 genes were assigned to the biological process, 10,653 genes were assigned to the cellular compound, and 6958 genes were assigned to the molecular function categories. Additionally, we identified non-coding RNAs, including 231 miRNAs, 724 transfer RNAs (tRNA), 138 ribosomal RNAs, and 3690 small nuclear RNAs (snRNA) ( Table 1; Supplementary Table 11).
We identified 3035 genes encoding TFs from 95 TF families in the lilac genome, accounting for 8.59% of protein-coding genes.  Table 12). A total of 29,856 gene families encompassing 386,920 genes were identified among the 12 species, from which 14,829 gene families (33,058 genes) were found in S. oblata (Supplementary Table 13). The sequences of the 356 conserved single-copy orthologous genes identified were used to reconstruct the phylogenetic relationship with O. sativa serving as the outgroup (Fig. 2a). The revealed phylogeny was in agreement with previous studies 20, 25 , which indicated that S. oblata had the closest relationship with Oleaceae (F. excelsior, O. fragrans, and O. europaea) and formed a clade with Solanaceae (S. lycopersicum and S. tuberosum) followed by R. delavayi, and was distant from the other species. In addition, lilac was separated from the Oleaceae common ancestor at~43.20 (38.23-48.18) million years ago (Mya) (Fig. 2a).
We also identified unique and shared gene families among these species. A total of 11,952 gene families were shared with Oleaceae, and 2982 unique gene families were found in S. oblata genome compared with O. europaea (5590), F. excelsior (3174), and O. fragrans (3625) (Fig. 2b); 400 gene families (2255 genes) were found to be specific to the assembled S. oblata genome when compared with the other 11 genomes (Supplementary Table 13). Moreover, 990 gene families (3876 genes) were found in expansion, while 1599 gene families (3513 genes) were found in contraction in the S. oblata lineage (Fig. 2b). Functional enrichment analysis showed that expanded gene families were enriched in GO terms mostly involved in 'binding', 'transport', 'metabolic process', and 'signal transduction' (Supplementary Fig. 4). In contrast, KEGG enrichment analysis revealed that the expanded genes were mainly from pathway implicated in plant defense response, development, and monoterpenoid biosynthesis ( Supplementary Fig. 5).
Gene synteny analysis and WGD events. Synonymous substitutions per site (Ks) age distribution, synteny analyses, and inferred dating times were used to unveil WGD events and to date the duplication events 26 . The divergence time of lilac and the other three Oleaceae species was approximately predicted at 43.20 Mya (Ks = 0.2) (Fig. 2a), simultaneously, they produced the same signature Ks peaks of WGDs at about 0.25 19,20,27 (Fig. 2c), higher than 0.2, suggesting that the most recent WGD event, estimated at around 54 MYA (Ks = 0.25), occurred before the divergence of Oleaceae family (Fig. 2c, Supplementary Fig. 4). Similarly, the main Ks peak value of 0.25, was smaller than the Ks peak suggesting divergence between Oleaceae and Solanaceae (approximately 1.1) (Fig. 2c), indicating that the most recent WGD event occurred after the divergence of Oleaceae and Solanaceae. Moreover, the intra-genomic synteny analyses showed that  Table 14), supporting a close evolutionary relationship.
We also observed a weaker signal of an additional ancient WGD shared by Oleaceae lineage (Fig. 2c), which has been previously reported based on genomic data 20 . Subsequently, we detected a 4:1 syntenic relationship between S. oblata and V. vinifera (Fig. 2d), and built a gene homology dot plot (Supplementary Fig. 9; Supplementary Table 14), which provided additional evidence for the identified two WGD events 19,20 after the Gamma paleohexaploidy in the stem lineage of core eudicots 28,29 . Moreover, the Ks analysis of four Oleaceae plants supported the WGD events ( Fig. 2c), which were inferred from the lilac genome (as well as from O. europaea 20 , F. excelsior 18 and O. fragrans 19 ) and were specific to Oleaceae.
Reconstruction of flower pigments biosynthesis pathway in S. oblata. Our high-quality genome assembly and transcriptome data allowed reconstruction of the flower pigment biosynthesis pathway, specifically capturing the implicated enzymatic genes in this process. We unveiled 124 genes encoding functioning enzymes in the flavonol (from colorless to pale yellow) and colored anthocyanin (from pink-red to violet) biosynthesis pathway (Fig. 3); 14 important gene families in this pathway were identified, among which, 4CL (19), F3H (15), DFR (15), and F3oGT (24) gene families had a large number of members (Fig. 3). In addition, we found that members of the PAL, F3H, F3'H|F3'5'H, DFR, F3oGT and RT gene families were replicated, especially F3H and F3oGT, which might enhance flavonoids production. Gene expression profiles showed that during the flower opening period the CHS (So16G045740 and So19G016880), F3H (So13G050140 and So15G003400), and F3'H|F3'5'H (So05G012500 and So07G050530) increased firstly and then  Delphinidin-based anthocyanins which had the most contents in lilac petals usually were the major constituents of violet and blue flowers in nature, and F3'5'H was the key synthesis enzyme 30 . We noticed that the So07G050530 expression first increased and then decreased during the flower opening period, which showed the high levels at T2 and T3 (Fig. 3). Its expression profile was consistent with blue-violet anthocyanins petunidin-3-O-rutinoside (Pt3Ru) contents (Fig. 4b), suggesting that the expression of So07G050530 may be related to Pt3Ru synthesis.
Metabolic dynamic results revealed the petal color change in S. oblata. Before blooming, there was a rapid pigment accumulation process in the lilac flower, and gradually decreased during the flower opening 16 , with color change from dark to light purple (Fig. 4a). For further study the lilac's petal color change, four flower opening period were divided according to phenotypic characteristics ( Fig. 4a, T1-T4). At the T1 stage (flower bud stage), the flower buds were red-purple and close to the calyx, no corolla tubes and lobes were showing or even barely exposed; at T2 (initial flower opening stage), corolla tubes were elongated, flower buds were enlarged, corolla lobes were about ready to open, and flower color was from red-purple to violet; at T3 (full flower opening stage), corolla lobes were stretched with almost 90°angle between the corolla lobes and tubes, and flower color was light violet; and at T4 (last flower opening stage), flowers color decayed and wilted, the corolla lobes were almost reversely curled, and the color of the flower became lighter than other three stages and tend to be in light violet nearly white (Fig. 4a).
Subsequently, we focused on the 9 anthocyanins, namely delphinidin-based (4 types, tended to be violet-blue), cyanidinbased (4 types, tended to be red-purple), and pelargonidinbased (1 type, tended to be orange-red) anthocyanins ( Fig. 4b; Supplementary Fig. 11). The first two, Dp3Ru and Cy3Ru determined lilac flower color and showed the highest contents in T1; and at T2 and T3, the contents of Dp3Ru was still high, while Cy3Ru only showed one-third of content at T1 (Fig. 4b).
In addition, the content change of pelargonidin-3-O-rutinoside (Pg3Ru) was consistent with Dp3Ru, which was much higher than the others, and cyanidin-3-O-glucoside (Cy3G) was consistent with Cy3Ru ( Supplementary Fig. 11). Different from the monotonously decreased contents of first three pigments, Data are presented as mean ±SE, n = 3 biologically independent samples (Supplementary Data 2); statistical significance was calculated by one-way ANOVA with multiple comparisons between adjacent stages, and was indicated by asterisks, in which n.s. (no significance), p < 0.05 (*), p < 0.01 (**), p < 0.001 (***), and p < 0.0001 (****). c PCA distance biplot. The PCA biplot shows both the PC scores (PC1 and PC2 here) of samples (here: three petal samples for each of the four flower opening periods, as dots) and the loadings of variables (here: the relative content of metabolites in each petal sample, as vectors) in the 2-dimension space defined by the first two PCs. We can observe the difference between petal samples at different flower opening periods, the relative contribution of metabolite variables to this difference, and the relationship of the difference. d Predicted regulatory network and the connection among metabolites, structural genes of flavonoids (Fig. 3) and TFs. The purple dots on the top represent metabolites; the pink dots in the middle represent structural genes in the flavonoid metabolic pathway (Fig. 3); at the bottom, inside the blue circle, the green dots represent the MYB genes, the red dots represent the ERF genes, the orange dots represent the WD40 genes, the pink dots represent the bHLH genes, the blue dots represent the WRKY genes; and outside of the circle, blue dots represent the remaining TFs. L1 to L8 indicate the levels identified in the time-ordered gene co-expression network. The PCC value increases from small to large, the width of the edges changes from thinner to thicker in the networks. e the heatmaps of average normalized FPKMs for each flower opening periods at each level identified in the time-ordered gene co-expression network. The three stages of flower color change are determined based on the gene expression profile and metabolome, as early stage (T1, red-purple), transition stages (T2-T3, violet), and fading stage (T4, light violet nearly white).
the Pt3Ru, a delphinidin-based anthocyanin, first showed an increasing pattern and then decreased along with flower opening, with the highest value at T2 (Fig. 4b). Interestingly, the differential accumulation of pigments demonstrated that the cyanidin-based anthocyanins caused the color change of lilac from T1 to T3, while Dp3Ru did not differ significantly among the first three flower opening stages, causing color changes from T3 to T4 (Fig. 4b).
Furthermore, the PCA based on 20 differentially expressed major metabolites, including 8 anthocyanins (marker in red and purple in Fig. 4c; Supplementary Fig. 11) and 12 flavonoids with high contents (marker in yellow in Fig. 4c; Supplementary  Fig. 12), showed that the first two principal components (PCs) cumulative contribution of the accounted for 84.6% of the variation (Fig. 4c). It is worth noting that PC1 (accounted for 67.5%) and can separate all four stages (Fig. 4c). Furthermore, the samples were arranged clearly in order of flower opening time (from left to right, Fig. 4c), with which anthocyanins, the higher contents at the early flower opening stage (Cy3Ru, Cy3G, Dp3Ru, Pg3Ru, and delphinidin (Dp)) were negatively correlated, while Pt3Ru and cyanidin (Cy) were positively correlated. As can be seen, the metabolic dynamics of these two pigments, Cy3Ru and Pt3Ru, were closely related to the color change of lilac flowers at the early and late blooming stages, respectively (Fig. 4b-c). Therefore, PCA clearly revealed the dynamic changes of the anthocyanin and other flavonoids contents with flower opening stages, which was an important basis for the dynamic integration of flower color phenotype and relevant gene expression.
Dynamic regulation of petal color change in S. oblata. The 12 samples of the four stages were clearly separated by metabolome and transcriptome PCA analysis ( Supplementary Fig. 13). Between any two flower opening stages (T1 to T4), we found a total of 20,984 differentially expressed genes (DEGs) (2507 TFs and 18,477 structural genes) with an average Fragments Per Kilobase Million (FPKM) greater than 0.5 and 80 metabolites exhibiting significant differentiation. Then, a TF gene of MYB family, So16G008570, expressing in a high level at T1 and monotonically decreased until T4, was selected as the initial node to build the time-ordered gene co-expression network (TO-GCN). According to Pearson's correlation coefficient (PCC) values, eight time-ordered expression levels (denoted as L1 to L8, Fig. 4d) were obtained, which comprised 137,523 edges and 1427 nodes. Finally, 1347 genes including 1268 significantly differentially expressed TFs genes and 79 structural genes of anthocyanin and flavonol biosynthesis pathway, and 80 metabolites (including 8 anthocyanins) made up the TO-GCN (Fig. 4d). These 8 levels matched the expression peaks of DEGs at the four flower opening stages, as shown by the red squares (high expression levels) along the diagonal in the heat-map, which formed the basis for the inference of upstream and downstream genes/metabolites regulatory relationships (Fig. 4e).
Pt3Ru was the most important pigment between the transition and the fading stages (Fig. 4b). CHS was the first key enzyme in the synthesis and metabolism pathway of anthocyanins/flavonols, whose upstream regulation mechanism was unclear yet. Here, an important sub-network (Fig. 6b) inferred from TO-GCNs, showed that CHS gene (So19G016880) with significantly high expression levels ( Supplementary Fig. 19), was directly related to Pt3Ru, simultaneously, and was the potential first-order to thirdorder upstream regulators. In the co-expression network that we have resolved, CHS gene (high expression levels in T2 and T3) might be regulated by ERF So15G039730 (high expression level in T2) as an upstream third-level regulator, GATA So01G024070 (high expression level in T2) as an intermediate secondary regulator, and WD40 So01G077230 (high expression level in T2 and T3) as the direct regulatory factor. The consistency of qRT-PCR data (Supplementary Fig. 19) and the result of DNA binding site prediction may initially confirm our inference (Fig. 6c). At the same time, two bHLH family members, So17G021180 and So09G026360, which exhibited significantly high and differential expressions during the four flower opening stages and were consistent with the expression level trend of WD40 So01G077230 (Supplementary Fig. 19), maybe potentially interacted with the WD40 So01G077230 accompanied by high PCC with 0.998 and 0.981. In addition, we also found that ERFs acted as both the direct regulators (So11G003650 and So18G001910) and the upstream regulator (So15G039730), all with significantly high expressions ( Supplementary Fig. 19). Furthermore, F3'H|F3'5'H So05G012500 and So07G050530, which contributes to the redbased cyanidin and the blue-based delphinidin pigment, were analyzed for potential regulatory networks ( Supplementary  Figs. 17-18). Genes, ERF So11G003650 and ERF So15G039730, were also identified as the important direct and upstream regulators in these networks.

Discussion
In many horticultural plants, it is common for the flower color to fade or deepen with the flower opening. Although the research on anthocyanin accumulation has been very detailed, the metabolic dynamics and molecular regulation mechanisms of petal color change are still poorly understood. Not only does lilac flowers have the obvious color change, but the elegant purple color is an important ornamental trait and widely recognized. Here, we reported on a chromosome-scale genome assembly of S. oblata in the genus Syringa of family Oleaceae. Compared   The network consists of metabolites (purple), structural genes (pink) and TFs from the outside to the inside. b Sub-network for flavonoids metabolism containing only the three families of TFs MYB (green), bHLH (pink), and WD40 (orange) at the early flowering stage. The structural genes were linked to bHLH, MYB and WD40 in red, blue, and orange lines, respectively. c Resolved one of the metabolisms of Cy3Ru and regulation for F3'H|F3'5'H gene involved. And 38 TFs (sixteen bHLH, five MYB, four WD40, nine WRKY and four ERF) identified as its direct regulators. d TFs (from Fig. 5c) heat-maps of gene expression profiles (log 10 (FPKM+1)) at the four flower opening stages and DNA binding sites of F3'H|F3'5'H gene (from Fig. 5c) in 5'UTR upstream 2Kb. The bar color change from blue to red indicates the expression from low to high. The PCC value increases from small to large, the width of the edges changes from thinner to thicker in the networks.
genome is expected to be a valuable resource for genome-wide studies for genetic dissection of key traits, molecular breeding, and evolutionary studies in lilac.
Comparative analyses revealed that lilac experienced and shared two rounds of WGDs 20 along with F. excelsior, O. fragrans, and O. europaea within the Oleaceae family. The more recent WGD event is obviously shared with four Oleaceae plants, suggesting an Oleaceae-specific whole-genome duplication, which was consistent with previous reports for other Oleaceae species [18][19][20]27 . However, the older WGD event was barely visible may be due to the accelerated deletion of the oldest syntenic pairs after the recent WGD, a classically observed phenomenon 33 additionally supported by collinearity evidence. This lilac genomic information is expected to further clarify the evolution of Oleaceae plants 10 .
Here, we performed the metabolome assay of time-ordered lilac flower samples to grasp the key flavonoid metabolites determining flower color change accurately and comprehensively. Metabolome results showed that Dp3Ru and Cy3Ru were the two major metabolites with the highest contents among the identified 118 flavonoid metabolites. Throughout the flower opening period, these two key anthocyanins showed distinctive and specific pattern of accumulation, distribution and fading. At the early stage of flower opening (T1), these two anthocyanins presented the highest relative contents, and the content of Cy3Ru was slightly higher than Dp3Ru (53.76% vs 46.24%); at the transition stage (T2-T3), the relative content of Cy3Ru was only two-thirds of T1, and on the contrary, the latter Dp3Ru was the main one (T2, 72.44% and 27.56%; T3, 70.25% and 29.75%); at the fading stage (T4), the lowest relative content was just one-sixth of T1, and Dp3Ru still was the main component with a proportion of 70.98%. Moreover, delphinidin-based anthocyanins were the major constituents of violet and blue flowers in nature, which has always been one of the most interesting topics in ornamental plant research, such as the world-famous cut flowers, roses, chrysanthemums, and lilies; such high contents of delphinidin-based anthocyanins (Dp, Dp3Ru, Pt3Ru, Dp3Ru7G) in lilac petals provide ideas for future research.
It has been revealed that pelargonidin, cyanidin and delphinidin were attached to the B ring of the molecule with one, two and three hydroxyl groups, respectively, the increase of which would shift the color from redder to bluer 9 . Our analysis revealed that anthocyanin content was closely related to the lilac petals color, which shifted from reddish-purple to violet and then to light violet nearly white along with the flower opening period. This may be closely related to self-association or/and an intramolecular co-pigmentation 34 . In addition, glycosylation, including mono-, di-, and tri-saccharides, generally improved the stability and water solubility of the pigment 35 ; with more stable diglycosides than their corresponding monoglycosides 36 ; at the same time, location, number, and size of glycosidic substitutions played important roles on the anthocyanins color expression 37 . These also explained why the content of rutinoside (diglycoside) in the lilac petals was much higher than those of glucoside (monoglycoside) and pigments without sugar groups.
In particular, it is relatively rare to find anthocyanin rutinosides in ornamental plants, but more reports often focus on the fruits of plants with higher nutritional value, such as black currant 38 , Solanum betaceum 39 , and Liriope platyphylla 40 . Black currant is rich in anthocyanins and the content of anthocyanidin 3-O-rutinosides is significantly higher than that of anthocyanidin 3-O-glucosides; among which, Cy3Ru and Dp3Ru account for about 77% of the total anthocyanin content, and Cy3G and Dp3G account for about 21%. Anthocyanins extracted from blackcurrant fruit have been reported to be important in the treatment of cancer 41 , cardiovascular disease and free radical scavenging, and are found intact in the blood when absorbed by humans as anthocyanin rutinosides 38 . In the petals of lilac, Cy3Ru and Dp3Ru account for about 98% of the total anthocyanin content, which may be of great value in future medical research and derivative products.
Based on the gene annotation and expression profile, we reconstructed the entire anthocyanin and flavonol metabolic pathways consists of metabolites (purple), structural genes (pink) and TFs from the outside to the inside. b DNA binding sites of the one direct regulatory gene and two potential upstream regulatory genes in 5'UTR upstream 2Kb. c Resolved one of the metabolisms of Pt3Ru and hierarchical regulation for CHS gene involved. From right to left, the network consists of metabolite (Pt3Ru in L5), structural gene (CHS in L4), TFs that play a direct regulatory role (L4), and upstream regulatory TFs (L3 and L2). The expression profiles at the four flower opening stages were displayed as heat-maps (log 10 (FPKM+1)). The bar color change from blue to red indicates the expression from low to high. The PCC value increases from small to large, the width of the edge changes from thinner to thicker in the networks. and found there were multiple-copy genes for key enzymes playing major roles. The followed analyses of the time-ordered gene coexpression networks, combining transcriptomic and metabonomic data, unraveled important TF families and members which may play key roles in anthocyanins regulation in lilac petals. MBWdependent (MYB-bHLH-WDR transcriptional complexes) 8,42,43 regulators were evidenced to control anthocyanin biosynthesis and meolism in our study, and WD40 seem to be more significant at the transition stage. It is worth noting that WRKY and ERF may play important roles at the early and transition periods, respectively. Several recent studies revetabaled the novel mechanisms by which WRKY TFs together with the MBW complex regulated the anthocyanin pigment pathway, and they even proposed the 'MBWW quartet' model 42 . Also, PyERF3 was found to interact with PyMYB114 and PybHLH3 to co-regulate fruits anthocyanin biosynthesis in Asian red-skinned pears 43 . Therefore, we speculated that WRKY and ERF families, which were known to play crucial roles in plant tolerance against abiotic and biotic stresses, were potential regulators in flavonoid biosynthesis pathway 42,43 , flower color change and patterning 12 . In addition, as the ethylene response factor, ERF was identified as the important direct as well as upstream regulators in our flower color change regulatory network, so we speculate that ethylene may affect petal color change, which may provide ideas for our future work.
In conclusion, the high-quality chromosome-level genome assembly presented here is expected to provide a reference for the further understanding of Oleaceae evolution and diversification. Furthermore, our transcriptome and metabolome investigations also provided an in-depth insight into lilac dynamic flower coloration. Genomic data from this study will be an essential resource in many traits for further functional study with genome editing and also for molecular marker-assisted breeding to promote genetic studies and novel cultivars breeding for Syringa.

Methods
Plant materials. A 10-year-old S. oblata individual, with red-purple inflorescence, strong fragrance, and stable flower traits was chosen from the nursery of Beijing University of Agriculture (China) for genome sequencing and de novo genome assembly (Fig. 1). Fresh leaves were collected for the genome sequencing library preparation, and flowers at four blooming stages (Fig. 4a) were collected with three replicates to determine metabolites (flavonoids), and to perform transcriptome sequencing in support of genome annotation and gene expression analysis. Plant materials were immediately frozen in liquid nitrogen and stored at −80°C for subsequent genomic DNA and RNA extractions.
Library construction and sequencing. Three kinds of libraries were constructed, including short-reads RNA-free sequencing with DNBSEQ TM , long-read Nanopore, and Hi-C, following manufacturers' instructions.
Petals' RNA was isolated using QIAGEN RNeasy plant mini kit (QIAGEN, Hilden, Germany) for short-read sequencing with DNBSEQ TM . One library with insert sizes of 300-400 bp was constructed, and then a BGISEQ Platform was used to sequence the reads, resulting in 45.65Gb clean data, providing~39.35× genome coverage.
High-quality and intact genomic DNA was extracted from fresh leaves using DNAsecure Plant Kit (Tiangen Biotech, Co. Ltd, Beijing, China). For long-reads sequencing, three Nanopore libraries with 20 kb insert size were generated and sequenced on the GridION X5 sequencer platform (Oxford Nanopore Technologies, UK) at the Beijing Genomics Institute (Wuhan, China). Hi-C library was sequenced on a BGISEQ platform.
Genome size and heterozygosity estimation. Size and heterozygosity of the lilac genome was estimated by K-mer analysis. Firstly, Jellyfish (v2.1.4) was used to quickly count the depth distribution of K-mer = 17-31, and the peak depth was clearly observed from the distribution data; then the frequency spectrum of K-mer was fitted by GenomeScope 44 (v1.0.0) to predict the genome characteristics.
De novo genome assembly and quality assessment. De novo genome assembly employed the following steps: first, we used the NECAT (https://github.com/ xiaochuanle/NECAT) to assemble the nanopore long-reads into contigs; subsequently corrected three times using Racon 45 software; followed by mapping short-reads to the reference genome using BWA-MEM 46 (v0.7.17), finally, further polished with the second round of pilon using clean DNBSEQ TM reads.
The Hi-C data were used for scaffolding the preliminary assemblies and to improve assembly continuity. The main Hi-C scaffolding workflow was as follows: clean Hi-C data was obtained by filtering Hi-C data from MboI digestion by Hicpro software; then, clean Hi-C data were mapped to the corrected contigs genome by Juicer 47 (v1.5.6); 3D-DNA pipeline 48 was used for chromosome-level scaffolding by clustering, ordering, and orienting the previous assemblies based on genomic proximity information between Hi-C read pairs; finally, with the help of manual adjustment, the HIC cross-linking signal was redrawn by HiCPlotter 49 software and the adjacent anchored scaffolds were connected forming 23 superscaffolds corresponding to 23 pseudochromosomes.
Given that the S. oblata genome had higher heterozygosity, Purge_dup 50 was used to remove heterozygous and redundant sequences from the corrected genome to obtain corrected contigs. We evaluated the completeness of the assembled genome through BUSCO v5.2.2 analysis using gene content from the embryophyta_odb10 51 , and mapping short-reads to the genome for quality assessment.
Gene structure prediction was based on a combination of evidence-based and de novo predictions. We used the Maker 57 software to carry out evidence-based assembly by evidence files, which included downloaded transcriptome data and mRNA/protein sequences of O. europaea subsp. europaea, F. excelsior, and O. fragrans. Transcriptome reads were sequenced on the BGISEQ-MGI2000RS platform and assemblies were generated (Supplementary Table 3) with Trinity (v.2.1.1) for genome annotation. De novo prediction was carried out using Augustus 58 . After repetitive sequence masking, the gene sets predicted by various methods were integrated into a non-redundant and more complete gene set. Finally, we used BUSCO (BUSCO, RRID:SCR_ 015008) to evaluate the quality of the gene set.
With the help of eggNOG-mapper (http://eggnog-mapper.embl.de), gene functional annotation was performed using KEGG 23 , GO 22 and COG 21 databases by comparing the eggNOG5.0 database. The advantages of eggNOGmapper are: (i) ability to distinguish between paralogous and orthologous genes, and achieve more accurate gene function annotation and (ii) faster database searches through HMMER (http://www.hmmer.org) (v.3.0) or DIAMOND. In addition, functional annotation of protein-coding genes was carried out by performing BLASTP 60 (E-value ≤ 1e −5 ) searches against SWISS-PROT 62 , NCBI NR protein databases, MSU Rice Genome Annotation database 15 and Arabidopsis protein databases.
Gene family, phylogenetic analysis, and divergence time estimation. OrthoFinder 63 (v2.3.3) software was used to cluster paralogous and orthologous groups from genomes of other 11 species. CAFE (Computational Analysis of gene Family Evolution) 64 (v4.2.1) was used to analyze the expansion and contraction of gene family among these species. GO classification and KEGG pathways using the R package clusterProfiler 65 . After concatenating single-copy genes to form a super matrix following alignment by MUSCLE 66 (v3.8.31), the conservative regions from the generated matrix were extracted, and the gap information in the sequence was deleted to form a new matrix with Gblocks_0.91b 67 . Then RAxML (v8) 68  (v0.8.12) package was used to determine high-confidence collinear blocks, paralogous and orthologous, in these species. Then, the Ks (the number of synonymous substitutions per synonymous site) values of the homologs within syntenic blocks were calculated by PAML to identify WGDs. WGD time was estimated with F. excelsior-S. oblata divergence time (mean: 43.20 Mya) as an age constraint. Ks peaks of F. excelsior-S. oblata syntenic orthologs allowed calculating γ = Ks/(2 × (divergence time)) 33 . The same γ value was applied to calculate the time of WGD events in lilac. We used iTAK software 72 to identify TFs in the genome of the 12 assessed species. In addition, PlantRegMap was used to identify the WD40 family by homology to A. thaliana. The 2 kb sequences upstream of the genes were used to identify TFBSs present in the promoters of genes using PlantRegMap with threshold p-value ≤ 1e −4 (http://plantregmap.gao-lab.org/binding_site_ prediction.php).
Qualitative and quantitative analyses of the raw data obtained via UPLC-MS/ MS were performed using Analyst v1.6.3 software. To determine the repeatability of metabolite extraction and detection, one quality control sample (QC) was inserted per ten samples, and then the overlapped total ion current chromatogram was analyzed. Principal Component Analysis (PCA), Fold Change (FC), and Orthogonal Partial Least Squares-Discriminant Analysis (OPLS-DA) were used to screen out differential metabolites (Supplementary Table 15), with FC > 2 or FC < 0.5, and variable importance in project (VIP based on OPLS-DA) greater 1. Then, we selected all eight anthocyanins and 12 additional flavonoids with high content in the differential metabolites for the PCA distance biplot.
Petals at four flower opening stages were collected with three replicates to perform transcriptome sequencing. For transcriptome analysis, clean reads were 304.3 million and Q20 and GC rate were 95% and 36.2%, respectively (Supplementary Table 3). DEGs (Supplementary Table 16) among the four flower blooming stages were identified using DESeq2, with adjusted p-value (FDR)<0.05 and |log 2 FC| ≥ 1.
Regulation network construction based on TO-GCNs. We used a recently developed method of reconstructing TO-GCNs 73 for joint analysis of transcriptome and metabolome data. Finally, 80 differential metabolites and 20,984 DEGs (2507 TFs and 18,477 structural genes) were incorporated. The PCC cutoff values were 0.90 between structural genes and TFs, and 0.73 between metabolites and genes. TO-GCNs network was visualized using Cytoscape 74 .
Quantitative real-time PCR analysis. The key genes analyzed were verified by quantitative real-time PCR with the expression level of lilac actin gene as an internal reference. QRT-PCR was performed using Green qPCR SuperMix (TransStart) on a CFX Connect Detection System (Bio-Rad). Using the 2-ΔΔCt method to calculate relative expression levels, and each analysis included three replicates. The specific primers for qRT-PCR were listed in Supplementary  Table 17.