Author Correction: Transcriptome and digital gene expression analysis unravels the novel mechanism of early flowering in Angelica sinensis

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

www.nature.com/scientificreports
flowering AS were built. By comparing changes in gene expression between these groups, we can understand the molecular mechanism of early flowering in AS more deeply.
Transcriptome library preparation and sequencing of AS. RNA was extracted from the flower buds of early flowering AS and from the apical meristem of vegetative growth AS for transcriptome analysis. RNA with an RIN ≥ 8 and a 260/280 nm absorption ratio ≥ 1.8 was used to construct the transcriptome library. After extraction, mRNA was purified from total RNA by binding it to magnetic beads and was then broken into short fragments. The cleaved RNA fragments were used as templates to synthesize first-strand cDNA, after which DNA polymerase I and RNase H were added to synthesize second-strand cDNA.

Material and methods
Next, suitable fragments were used as templates for PCR amplification, which yielded the cDNA library for sequencing.
De novo assembly and unigene annotation. Clean reads were obtained from the raw data by filtering out poly-N reads and low-quality reads (quality value ≤ 10, or more than 5% unknown nucleotides). Unigenes were then generated by de novo assembly of the clean reads using the Trinity method 9,10. To understand the function of the unigenes, they were searched against public databases, including NCBI Nr and Nt, Swiss-Prot, GO, COG and KEGG, with an E-value ≤ 10⁻⁵.
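The read-filtering rule above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function name `is_clean_read` is hypothetical, and the text does not say what fraction of low-quality bases makes a read "low quality", so `max_low_q_fraction=0.5` is an assumed placeholder.

```python
def is_clean_read(seq, quals, max_n_fraction=0.05, low_q=10, max_low_q_fraction=0.5):
    """Return True if a read passes both filters described in the text.

    seq   -- read sequence as a string
    quals -- per-base Phred quality scores, one per base
    max_low_q_fraction is an ASSUMED value; the source does not state it.
    """
    if not seq or len(seq) != len(quals):
        return False
    # Fraction of unknown nucleotides; reads with more than 5% N are rejected.
    n_fraction = seq.upper().count("N") / len(seq)
    # Fraction of bases whose quality value is at or below the cutoff (<= 10).
    low_fraction = sum(q <= low_q for q in quals) / len(quals)
    return n_fraction <= max_n_fraction and low_fraction <= max_low_q_fraction


if __name__ == "__main__":
    print(is_clean_read("ACGT" * 5, [30] * 20))
    print(is_clean_read("NN" + "ACGT" * 4 + "AC", [30] * 20))
```

Both thresholds are keyword arguments so that the assumed value can be replaced once the real cutoff is known.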
Digital gene expression library preparation and sequencing. DGE libraries for the three groups of AS samples were prepared in parallel using an Illumina Gene Expression Sample Preparation Kit. The groups were ZC (apical meristem of vegetative growth AS), ZT1 (early-stage flower buds of early flowering AS) and ZT2 (late-stage flower buds of early flowering AS). During the AS early flowering time window, the plants in the field were observed every day. ZT1 flower buds were collected within three days and were normally less than 5 mm long; ZT2 flower buds were collected within one week and were normally less than 1 cm long. Each experimental group consisted of three biological replicates (no technical replicates).
Identification of differentially expressed genes. The clean reads from the apical meristem of vegetative growth AS and the flower buds of early flowering AS were mapped to the transcriptome library described above. Reads per kilobase per million mapped reads (RPKM) was used to measure gene expression levels. Genes that satisfied two conditions, a false discovery rate (FDR) ≤ 0.001 and an absolute log2Ratio ≥ 1, were defined as significantly differentially expressed. The differentially expressed genes (DEGs) were then compared with the AS transcriptome library described above.
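The two quantities used in this step can be illustrated with a short sketch. It assumes the standard RPKM definition, RPKM = 10⁹ × C / (N × L), where C is the number of reads mapped to a gene, N the total number of mapped reads and L the gene length in bases; the function names and the example numbers are hypothetical, and the FDR is taken as an input rather than computed.

```python
import math


def rpkm(gene_reads, total_mapped_reads, gene_length_bp):
    """Reads per kilobase per million mapped reads (standard definition)."""
    if total_mapped_reads <= 0 or gene_length_bp <= 0:
        raise ValueError("total_mapped_reads and gene_length_bp must be positive")
    return gene_reads * 1e9 / (total_mapped_reads * gene_length_bp)


def is_deg(rpkm_control, rpkm_treated, fdr, fdr_cutoff=0.001, min_abs_log2=1.0):
    """Apply the two thresholds from the text: FDR <= 0.001 and |log2Ratio| >= 1."""
    if rpkm_control <= 0 or rpkm_treated <= 0:
        raise ValueError("RPKM values must be positive to take a log2 ratio")
    log2_ratio = math.log2(rpkm_treated / rpkm_control)
    return fdr <= fdr_cutoff and abs(log2_ratio) >= min_abs_log2


if __name__ == "__main__":
    # Hypothetical gene: 100 reads, 1,000 bp long, 1,000,000 mapped reads in the library.
    print(rpkm(100, 1_000_000, 1000))    # 100.0
    print(is_deg(10.0, 20.0, fdr=0.0005))  # True: twofold change, FDR below cutoff
```

A twofold change in either direction sits exactly on the |log2Ratio| ≥ 1 boundary and is therefore included.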
Quantitative real-time PCR analysis. To verify the reliability of the DGE results, qRT-PCR was performed using LightCycler 480 SYBR Green I Master Mix (Roche, Basel, Switzerland) on a LightCycler 480 II Real-Time PCR instrument (Roche, Basel, Switzerland). Briefly, 1 μL of cDNA template from each group was used per reaction. Each gene was measured at least three times. Expression changes of the candidate genes were analyzed using the 2^(−ΔΔCt) method. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as the endogenous control.
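As an illustration of the 2^(−ΔΔCt) calculation, the sketch below normalizes a target gene to the endogenous control (GAPDH in this study) and then to a control sample. The function name and the Ct values are hypothetical and serve only to show the arithmetic.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    ct_ref_* are the Ct values of the endogenous control gene (e.g. GAPDH).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample      # dCt in the test sample
    delta_ct_control = ct_target_control - ct_ref_control   # dCt in the control sample
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 2 cycles earlier
    # (relative to the reference gene) in the test sample than in the control.
    print(relative_expression(22.0, 18.0, 24.0, 18.0))  # 4.0
```

A result above 1 indicates higher expression in the test sample than in the control sample; a result below 1 indicates lower expression.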

Results
Assembly of transcriptome sequencing. Because the genome of AS has not yet been sequenced, the AS transcriptome had to be sequenced to provide a reference for screening genes differentially expressed during early flowering. After filtering out adaptors and low-quality sequences, 49,183,534 clean reads remained (Table 1). The GC percentage of the AS library was 44.41% and the Q20 percentage was 98.18%. Subsequently, 133,010 contigs were assembled from short reads with an average length of 90 bp (Table 2). A total of 68,262 unigenes (25,560 clusters and 42,702 singletons) with an average length of 728 bp were then assembled by Trinity (Fig. 1, Table 2). The E-value and similarity distributions against the NR database are shown in Fig. 2A (Table 3). In terms of species distribution, the homologous genes matching the unique sequences of AS were mainly from Vitis vinifera (46.9%), followed by Ricinus communis (12.9%) and Populus trichocarpa (12.2%) (Fig. 2C).
Classification of AS transcripts. Using the WEGO software, 326,207 unigene annotations were categorized into three categories: biological process, cellular component and molecular function (Fig. 3). Because some unigenes matched several functional groups, the number of matches was 161,138 for biological process, 120,997 for cellular component and 44,072 for molecular function. In the molecular function category, 18,759 unigenes were assigned to "catalytic activity" and 18,353 to "binding"; these were the largest groups, together accounting for 84.21% of the molecular function matches. In the cellular component category, "cell" (30,491), "cell part" (30,491) and "organelle" (24,576) were highly represented. In the biological process category, "cellular process" (24,859) and "metabolic process" (23,613) were the main groups.
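The reported GO counts can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; treating 84.21% as the share of molecular function matches (rather than of all unigenes) is an interpretation that the numbers support, not a statement taken from the source.

```python
# Number of GO matches per category, as reported in the text.
go_matches = {
    "biological process": 161_138,
    "cellular component": 120_997,
    "molecular function": 44_072,
}
total_matches = sum(go_matches.values())

# The two largest molecular function groups.
catalytic_activity = 18_759
binding = 18_353
share = 100 * (catalytic_activity + binding) / go_matches["molecular function"]

print(total_matches)    # 326207
print(round(share, 2))  # 84.21
```

The total exceeds the 68,262 assembled unigenes because a single unigene can match several GO terms.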
The Clusters of Orthologous Groups (COG) database was used for functional prediction and classification of the unigenes. By searching against the COG database, 28,513 unigenes were assigned to 25 categories based on COG functional classification (Fig. 4). "General function prediction only" contained the most unigenes (4,749), followed by "Transcription" (2,682), "Replication, recombination and repair" (2,521), "Post-translational modification, protein turnover, chaperones" (2,044) and "Signal transduction mechanisms" (1,967). "Nuclear structure" (7) and "Extracellular structures" (7) contained the fewest.

Digital gene expression library sequencing and mapping. Gene expression changes involved in the early flowering of AS were identified by DGE analysis. Sequencing saturation, homogenization and randomness were used to reflect the quality of sequencing and to decide whether the data were suitable for further differential expression analysis. The distribution of gene coverage was considered one of the most important parameters for measuring the quality of the DGE library sequence datasets. In our results, 56% of unigenes had coverage exceeding 50% in all DGE libraries (Fig. 6).
Read mapping. The differentially expressed genes (DEGs) between samples were identified using an algorithm. The percentage of clean reads matching reference genes ranged from 87.47% to 88.90% in the three DGE libraries. Among all reads, 67.50-69.74% per library were uniquely mapped to the reference transcriptome, and 78.62-80.20% of reads were a perfect match to the reference gene (Table 4).

Differential gene expression during early flowering. Gene expression changes at different stages of the early flowering of AS were screened by DGE analysis. RPKM was used to assess the changes in gene expression. Between ZC and ZT1, 5,094 genes changed markedly, of which 2,921 were up-regulated and 2,173 down-regulated. Between ZC and ZT2, 4,556 DEGs were identified, of which 2,818 were up-regulated and 1,738 down-regulated. Between ZT1 and ZT2, 1,111 DEGs were identified, of which 736 were up-regulated and 375 down-regulated. These data are presented as a histogram in Fig. 7.
To further study the functions of the DEGs, pathway enrichment analysis was performed on the annotated DEGs. A KEGG pathway was considered significantly enriched at a corrected P value < 0.05. The top 10 enriched KEGG pathways related to the DEGs observed in the ZC, ZT1 and ZT2 samples are listed in Table S1, Table S2 and Table S3, respectively. The DEGs between ZC and ZT were concentrated in pathways such as "Plant hormone signal transduction", "Biosynthesis of secondary metabolites" and "Plant-pathogen interaction".
Key genes involved in flower development. Genes differentially expressed between ZC and ZT1 or ZT2 were screened. Genes with an adjusted log2 ratio ≥ 1 or ≤ −1 in the DGE analysis were designated as DEGs. Many genes showed significantly different expression levels (Table 5).
In studies of plant flowering mechanisms, most key genes are involved in the photoperiodic, vernalization, autonomous or gibberellin pathways. In the photoperiodic pathway, the expression of four genes (PHYA, CO, FT and GI) increased in early flowering AS, the expression of two genes (PHYB and ELF4) decreased, and the expression of five genes (PHYC, CRY1, CRY2, LHY and CCA1) remained unchanged. In the vernalization pathway, the expression of four genes (FLC, FRIGIDA, VRN1 and VIN3) increased in early flowering AS, while the expression of SOC1 decreased. In the gibberellin pathway, the expression of two genes (GA3ox and LFY) increased in early flowering AS, and the expression of gibberellin 20-oxidase remained unchanged. In the autonomous pathway, none of the four key genes (FCA, FPA, FY and FVE) showed a significant difference.
The 2^(−ΔΔCt) method was applied to calculate the relative expression of the genes. The DGE sequencing data were expressed as log2 values. DGE sequencing and qRT-PCR showed a significant positive correlation (R² = 0.951) in linear regression analysis (Fig. 9), suggesting that the DGE results agreed well with qRT-PCR and supporting the reliability of the sequencing results.
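The agreement between the two methods is summarized by the coefficient of determination of a simple linear regression, which for one predictor equals the squared Pearson correlation. The sketch below shows that calculation in plain Python; the function name and the paired values are hypothetical and are not the study's data.

```python
def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (squared Pearson correlation)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical log2 fold changes for the same genes measured by both methods.
    dge_log2 = [1.2, -0.8, 2.5, 3.1, -1.9]
    qpcr_log2 = [1.0, -1.1, 2.2, 3.4, -1.5]
    print(round(r_squared(dge_log2, qpcr_log2), 3))
```

Values close to 1 indicate that the qRT-PCR fold changes track the DGE fold changes closely.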

Discussion
At present, the lack of genomic and transcriptome data limits research on the mechanism of early flowering in AS. In the present study, Illumina sequencing technology was used for de novo reference transcriptome assembly from flower buds of early flowering AS and the apical meristem of vegetative growth AS. After RNA sequencing, 68,262 unigenes were assembled, of which 49,477 (72.5%) matched entries in public databases. Our results will contribute to future genomic studies on AS and other Umbelliferae species. However, nearly one third of the unigenes could not be matched in public databases. Similar phenomena have been reported in transcriptome assemblies of other plants, such as Lycoris aurea 11 and Tagetes erecta 12. The reason may be that little gene expression information is available for Umbelliferae and that gene expression in Umbelliferae is distinctive. DGE is often used in combination with RNA sequencing to screen for differences in gene expression among plant tissues or to study disease mechanisms. Thus, a DGE analysis of the apical meristem of vegetative growth AS and the flower buds of early flowering AS was carried out to preliminarily clarify the mechanism of early flowering. According to the DGE results, 5,094 and 4,556 transcripts were differentially expressed between ZT1 and ZC and between ZT2 and ZC, respectively.
In Arabidopsis, four classic pathways control flowering time. In our study, some key genes in the photoperiodic, vernalization and gibberellin pathways were up-regulated in early flowering AS. By contrast, none of the key genes in the autonomous pathway changed. Gene expression during early flowering in AS resembles that of normal flowering in model plants, but some differences remain. The genes that differ are the focus of our future research.
In the photoperiodic pathway, four genes (PHYA, CO, FT and GI) were expressed at higher levels in ZT. By contrast, only two genes (PHYB and ELF4) were expressed at higher levels in ZC. Five genes (PHYC, CRY1, CRY2, LHY and CCA1) showed no significant difference. Among the phytochrome genes, PHYA has been reported to promote flowering, whereas PHYB, PHYD and PHYE inhibit flowering [13][14][15]. Our results agree with these previous findings. Among the cryptochrome genes, CRY1 and CRY2 have both been reported to promote flowering, but in our results their expression was much the same between groups 16. GI and CO are regulated by the circadian clock, and CO is considered a gene that accelerates flowering in response to long days. FT is a target gene of CO, and its expression is restricted to a similar time of day as that of CO. FT is considered one of the three integrators that promote flowering 17. CO, FT and GI were all highly expressed in early flowering AS. The photoperiodic pathway is therefore likely involved in the early flowering of AS.
In the vernalization pathway, four genes (FLC, FRIGIDA, VRN1 and VIN3) were expressed at higher levels in early flowering AS. By contrast, only one gene (SOC1) was expressed at a higher level in ZC. SOC1 is a major floral pathway integrator that encodes a MADS-box transcription factor and is one of the key floral activators integrating multiple floral inductive pathways, namely the long-day, vernalization, autonomous and gibberellin-dependent pathways 18, but SOC1 expression was clearly decreased in our experiment. FLC, an upstream negative regulator of SOC1, was highly expressed, even though VRN1 and VIN3, which control vernalization-mediated FLC silencing, were both highly expressed 19. This may explain the decrease in SOC1 expression.
Gibberellins (GAs) are essential for the development of fertile flowers in many plants, and may also be required immediately after fertilization 20,21. In the GA biosynthetic pathway, GA 20-oxidases and gibberellin 3 beta-hydroxylase 2 are both key enzymes 22. In our study, gibberellin 3 beta-hydroxylase 2 was expressed at a higher level in early flowering AS, and there was no significant difference in gibberellin 20-oxidase expression. LFY homologs play a major role in the initiation of flowering 23. LFY is also considered one of the three integrators that promote flowering 24 and is positively regulated by GA. In our study, LFY was expressed at a higher level in early flowering AS. The gibberellin pathway is therefore likely involved in the early flowering of AS.
A central player in the floral transition is the floral repressor FLC 25, a MADS-box transcriptional regulator that inhibits the activity of genes required to switch the meristem from vegetative to floral development 26,27. One of the many pathways that regulate FLC expression is the autonomous promotion pathway, composed of FCA, FY, FLD, FPA, FVE, LD and FLK 28. In our experiment, none of the four key genes (FCA, FPA, FY and FVE) showed a significant difference. The expression of genes in the autonomous pathway therefore did not change during early flowering in AS.
In addition to the four classic pathways that regulate plant flowering, we also found changes in the expression of other genes. Plant polyamines are an important class of plant growth regulators. Arginine decarboxylase (ADC) 29, S-adenosylmethionine synthetase (SAMS), S-adenosylmethionine decarboxylase (SAMDC) 30, spermidine synthase (SPDS) 31 and polyamine oxidase (PAO) are key enzymes in polyamine metabolism. ADC, SAMDC and SPDS expression was up-regulated in the early flowering samples.
In conclusion, early flowering of AS was mainly affected by genes involved in the photoperiodic and GA pathways. The vernalization and autonomous pathways showed no significant changes in early flowering. This may also be the difference between early flowering and normal flowering. These results provide basic information for exploring the molecular mechanisms that influence the early flowering of AS.

Conclusion
At present, effective genetic information on AS is very limited. Here, we combined RNA-Seq and DGE to study the molecular mechanism of early flowering in AS. We obtained 49,183,534 clean reads and assembled them into 68,262 unigenes with an average length of 728 bp.
The sequencing results provide effective gene expression profile information for genomic research on AS. Based on the DGE study, many important genes regulating early flowering of AS were discovered and further analyzed. In this paper, we propose a putative network giving an overview of known floral regulators that are present and differentially regulated during floral induction of AS (Fig. 10), which provides an important reference for the study of the molecular mechanisms of early flowering in AS.