Aging affects artemisinin synthesis in Artemisia annua

Artemisinin (ART) is the most effective component in malaria treatment, however, the extremely low content restricts its clinical application. Therefore, it is urgent to increase the yield of ART. ART gradually accumulates with aging, small RNA (sRNA) and transcriptome analysis were applied on the leaves of 2-week-old (2 w) and 3-month-old (3 m) A. annua respectively. Among all the annotated sRNAs, 125 were upregulated and 128 downregulated in the 3 m sample compared to the 2 w one. Whereas 2183 genes were upregulated and 2156 downregulated. Notably, the level of miR156 and several annotated miRNAs gradually decreased while SPLs increased. In addition, the genes on ART biosynthesis pathway were significantly upregulated including ADS, CYP71AV1, ADH1, DBR2 and ALDH1, and so were the positive transcription factors like AaERF1, AaORA and AaWRKY1 indicating that age influences the ART biosynthesis by activating the expression of the synthesizing genes as well as positive transcription factors. This study contributes to reveal the regulatory effects of age on ART biosynthesis both in sRNA and transcription levels.


Results
Procession and statistics of the raw data. In order to address the link between age and artemisinin (ART) biosynthesis, the ART content in the leaves of A. annua in different age were analyzed. The ART content increased gradually with age though it was undetectable in half month (0.5 M) and the rate lowed down in 3 months (3 m) (Fig. 1A). Thus the leaves of 2-week-old (2 w) and 3-month-old (3 m) A. annua were sampled for RNA sequencing respectively, and named as S2 w and S3 m. The leaves of at least three individual lines of 2 w and 3 m mixed.
The original data from the sequencing contains low quality sequences with joints. In order to ensure the quality of information analysis, the original data must be filtered to obtain clean data, and subsequent analysis is based on clean data. The quality assessment of sample sequencing output data is shown in the following table (Table 1). Then the Bowtie 2 v2.1.0 was used to map the reads of each sample to the reference of A. annua genome [12][13][14] , and the parameters are defaulted, only 10.63% to 11.32% of the sRNA could mapped while 43.63% to 44.09% of the transcriptome could uniquely mapped to the reference genome (Table 2).  18,19 , miR845b, miR1134, SPL4 and SPL5 were found from the sequencing data. Table 1. Statistics of the sequencing data. Known miRNA: The sRNAs annotated by plant miRBase database. Other anno sRNA: annotated sRNAs other than rRNA, tRNA, sRNA, snRNA and miRNA. Unanno sRNA: unannotated sRNAs. Raw Reads: The number of sequencing sequences of each document is counted in four action units by counting the original sequence data. Clean reads: The calculation method was the same as Raw Reads, but the statistical files were filtered data, and subsequent analysis was based on this. Q20, Q30: The percentages of bases with phred value greater than 20 and 30 in the total bases were calculated respectively.  16 . The difference of gene expression is considered to be significant in the condition of |Fold change| > 2, FDR (q value) < 0.001, and at least one sample RPKM > 20. Statistical analysis was conducted on all the genes with significant expression difference between the S3m and S2w samples ( Table 3). The expression levels of reported miRNA and predicted novel miRNA are calculated between these two samples, 254 DEGs were obtained among which 50.79% were upregualted and 49.21% downregulated (Table 3). It has been predicted that miRNAs belong to miR414 and miR1310 families may target genes in ART biosynthesis like HMG-CoA reductase (HMGR), amorpha-4,11-diene synthase (ADS), farnesyl pyrophosphate synthase (FPS) and cytochrome P450 17 , unfortunately these miRNAs have not been detected in our sequence data. However, there were still six known miRNAs downregulated significantly (Supplement Fig. 1A). In addition, miR156 which is conserved in plant kingdom and abundant in the junior plant is also not detected in the sequence data, however, the expression of some predicted SPL genes show vigorous varieties (Supplement Fig. 1B). Luckily miR156 and several AaSPLs have been reported by other groups 18,19 . The SPL genes could be targeted or non-targeted by miR156, the reported AaSPLs and those from the our RNAseq data were aligned with SPLs from Arabidopsis (https:// www. ebi. ac. uk/ Tools/ msa/ clust alo/), AaSPL7, AaSPL9, AaSPL12, AaSPL13, SPL4 and SPL5 showing more similarities to miR156-target-SPLs in Arabidopsis were further confirmed (Fig. 2). The level of miR156 18 and two novel miRNAs, miR845b and miR1134 declined with age, and the picked SPLs 19 enhanced to varying degrees ( Fig. 1B-D). Indicating miR156 and targeted SPLs works as age cue in A. annua too. Because of the low abundance of miR156 in A. annua 18 , there may be other miRNAs to complement its role.

Items
There are 4339 differentially expressed genes in transcriptome, of which 49.69% were upregulated and 50.31% were downregulated. The transcription levels of the key genes in the biosynthesis of ART including ADS, CYP71AV1, ADH1, DBR2 and ALDH1 were all increased in S3m compared to S2w (Supplement Fig. 1C), which was in accordance with the phenomenon that ART accumulated with age. In addition, the positive regulatory transcription factor AaWRKY1 was obviously upregulated, and several other genes identical to the reported positive transcription factors like AaERF1, AaMYC2, AabHLH1 and AabZIP1 were upregulated as well (Supplement Fig. 1D). ART biosynthetic genes and reported regulation factors were further analyzed by qRT-PCR. Expression of most genes increased along with age, but the level of DBR2 descended in the four-month plant while AaHD8, AabHLH1, AaMIXTA and TAR1 didn't change much (Fig. 3). These transcription factors may construct a link between age and the ART biosynthesis which needs further detection.
Process involved in growth and metabolism are enriched with age. To gain functional annotations of the genes involved in age-related ART accumulation, GO analysis was carried out for DEGs. GO enrichment analysis using GOseq showed that the FDR (q value) value was less than 0.05 and the GO term was regarded as the enrichment term 20 . 1,237 DEGs were annoted in the GO analysis. Both the upregulated and downregulated genes of DEGs of A. annua's transcriptome were classified into three functions: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). Many terms of plant metabolites catalysis process were significantly enriched in these biological processes (FDR ≤ 0.05 or P-value ≤ 0.05). Among the upregulated genes 1770 were functional annotated including 133 terms for BP, 27 terms for CC, and 281 terms for MF. Photosynthesis, metabolic process including the biosynthetic and catabolic process, morphogenesis of shoot, and developmental process of embryo and seed are included in BP. Chloroplast, plasmids and NAD(P)H Table 2. Statistics of the clean reads mapped to the genome. Total reads: The total number of clean reads obtained by sequencing the sample. Mapped reads and Mapped ratio: The number and the ratio of the clean reads mapped to the reference genome. Unique mapped reads and Unique mapped ratio: The number and the ratio of the unique reads mapped to the reference genome. − no statistics.   (Table 4), and the majority of the DEGs are involved in DNA binding, redox enzyme activity, transcription factor activity (sequence-specific binding), transcriptional regulation and redox process are shown (Fig. 4).
Biochemical metabolic pathways and signal transduction pathways concentrated following age. Generally genes coordinate with each other to perform their biological functions. KEGG 21 , pathway analysis was conducted to determine the main biochemical metabolic pathways and signal transduction pathways in age related ART biosynthesis. Pathway with FDR (q value) less than 0.05 was defined as pathway with significant enrichment (Table 5). A total of 522 DEGs were annotated in KEGG classification. 334 upregulated DEGs with functional annotations were involved in 174 metabolic pathways while that was 220 metabolic pathways in 458 downregulated DEGs. Carbon metabolism, plant hormone signal transduction and biosynthesis of amino acids are the three significantly fluctuated ones in both upregualted and downregulated pathways (Fig. 5), www.nature.com/scientificreports/ which indicates their conserved and major roles in the aging process of artemisia. The results suggesting that the higher metabolic pathways associated with ART biosynthesis were ribosome, carbon metabolism, and plant hormone signal transduction .

Discussion
ART is a sesquiterpene lactone compound extracted from sweet wormwood A. annua, which contains the specific endoperoxide bridge, and it is synthesized by isoprenoid metabolic pathway. ADS, CYP71AV1, ADH1, DBR2 and ALDH1, as the key enzyme genes in the ART biosynthesis pathway, play an important role in ART biosynthesis. Overexpressing these genes is an important way of increasing ART content by genetic engineering. The transcripts of these genes are more abundant in the elder plants than the younger ones, which construct the link between ART biosynthesis and plant aging and provide the possibility of achieving higher ART production by harvesting the plants after blooming in addition with overexpressing enzyme genes. www.nature.com/scientificreports/ Another way to improve ART yield is to overexpress the positive transcription factors of ART biosynthetic pathway enzymes. As we know transcription factors often simultaneously regulate the expression of more than one enzyme in the biosynthetic pathway, which could be a more effective way to increase ART content. Several transcription factors have been reported to elevate ART biosynthesis by targeting different enzyme genes on ART biosynthesis pathway, like AaWRKY1, AabZIP1, AabHLH1, AaMYC2, AaERF1 and AaERF2. Those positive regulators of ART synthesis amounted with plant age. Our transcriptome data imply more unknown transcription factors should be investigated especially these AP2/ERF ones, and this will further illuminate the way of enlarging ART production by combining these ways. miR156 has been the only reported plant age cue, and it contributes to the anthocyanin biosynthesis in Arabidopsis and sesquiterpene biosynthesis in both Arabidopsis and Patchouli along with its critical roles in plant development 4,22 . The phenomenon that biosynthesis of ART mounts with age and is most vigorous after blooming indicating the effect of age on ART biosynthesis. Even though the miR156 is not detected in our small RNA sequencing data maybe because of the sequencing depth, selected samples or low abundance. Another group reported the miR156 sequence of A. annua with very low transcript level. Though miR156 gradually decreases while SPLs increases with age, whether other miRNAs or transcription factors play roles on aging and ART biosynthesis shall be further explored.
This study provide the molecular basis for the common sense that the aerial parts of A. annua are collected for ART extraction after blooming when both the biomass and ART content are high. The transcriptome data provide cues for key transcription factor mining for the regulation of ART biosynthesis. Meanwhile the small RNA sequencing data give some clues on investigation of A. annua aging and ART accumulation.

Materials and methods
All experiments were conducted in accordance with relevant institutional, national, and international guidelines and legislation.
Total RNA extraction and quality control. A. annua L. cv. QT was used in this paper. Total RNA was extracted from the leaves of two-week (2 w) and three-month (3 m) old A. annua.with Trizol (Invitrogen). The quality of total RNA was tested using 2100 Bioanalyzer. The qualified RNA samples were digested by DNaseI (TaKaRa, Japan) at 37 °C for 30 min.
cDNA library construction and sequencing. DNase digested RNA was treated with Dynabeads Oligo (dT)25 kit (Life, USA) to get the purified mRNA. 100 ng purified mRNA was then treated with NEBNext Ultra RNA Library Prep Kit for Illumina (NEB, USA) to build the cDNA Library. The quantity of the cDNA library was further tested by Qubit quantification, 2% agarose gel electrophoresis detection and high-sensitivity DNA chip detection. 10 ng cDNA was used to cluster generation in cBot with TruSeq PE Cluster Kit (illumina, USA), and then bidirectional sequencing was performed in Illumina Hiseq 4000. The transcripts were gained by referring to the whole genome sequencing data of A. Annua 14,23 , and then the differentially expressed genes (DEGs) were analyzed.
Seperation of the small RNA (sRNA). 100 mg samples were first grinded into powder in liquid netrogen, and then the total RNA was extracted by Trizol kit (Invitrogen). The total RNA was seperated by 15% poly-