Introduction

Malaria still threatens global human health which infects 214 million people and causes 430,000 deaths every year according to the world health organization's report. Scientists have been always working on preventing the spread of malaria since the 1940s, but malaria remains a tough problem in sub-Saharan Africa1. Artemisinin (ART) were first reported by Chinese scientists in Chinese science bulletin in 1977. Nowadays ART-based combination therapies (ACTs) is the primary treatment for malaria, especially the falciparum ones. In 2015, Dr. Youyou Tu won the Nobel Prize for her great contribution to the discovery of ART and its anti-malaria efficacy. In addition, ART and its derivatives also have anti-cancer, anti-parasite and other pharmacological effects2. Despite the huge market demand, the supply of ART is extremely insufficient for the following reasons. First, ART is mainly extracted from the wild A. annua plants, but the planting area and biomass of A. annua fluctuate greatly every year. Second, the content of ART in the wild A. annua is extremely low only 0.1% to 0.8% of dry weight3. Therefore, it is urgent to increase the yield of ART. Over-expression of the biosynthesis pathway genes, transcriptional regulation and blocking the competitive pathways have been applied to increase the content of ART in A. annua, however, how age regulates the biosynthesis of ART is still far been explored considering the fact that the content of ART accumulates with plant age and reaches maximum after bloom. The sRNA and transcriptome analysis will provide the differentially expressed genes related to age, which will aid to elucidate the link between age and ART biosynthesis.

We mainly focus on three key targets in transcriptome analysis. Firstly, miR156 and its targeted SPL genes. miR156 is the only reported age cue in plant, and it functions by targeting the SPL genes. Secondly, ART biosynthesis genes. ART is a sesquiterpene lactone with a specific endo-peroxide bridge, which is synthesized by isoprenoid metabolic pathway. FPP is catalyzed by ADS (amorpha-4,11-diene synthase) to form amorpha-4,11-diene; then artemisinic alcohol(AAOH) is formed under the catalysis of a cytochrome P450 monooxygenase CYP71AV1 (cytochrome P450-dependent hydroxylase); artemisinic aldehyde(AAA) is formed under the combined action of CYP71AV1 and ADH1 (alcohol dehydrogenase 1); DBR2 (artemisinic aldehyde Δ11(13) reductase) converts artemisinic aldehyde to dihydroartemisinic aldehyde (DHAAA), which is further oxidized to dihydroartemisinic acid(DHAA) by ALDH14. Thirdly, transcription factors. Many transcription factors have been reported to positively regulate the ART biosynthesis by activating the expression of pathway genes. AaWRKY1 activates the transcription of CYP71AV15. AaORA regulates the expression of ADS, CYP71AV1 and DBR26. AaERF1, AaERF2, TAR1, AabZIP1 and AabHLH1 specifically increase the expression of ADS and CYP71AV14,7,8,9. AaNAC1 upregulates the expression of ADS and CYP71AV110. AaMYC2, an activator of jasmonic acid (JA) signal pathway can improve the expression levels of CYP71AV1 and DBR211. This transcriptome study will aid to construct the link between age and ART biosynthesis by analyzing these above mentioned key targets.

Results

Procession and statistics of the raw data

In order to address the link between age and artemisinin (ART) biosynthesis, the ART content in the leaves of A. annua in different age were analyzed. The ART content increased gradually with age though it was undetectable in half month (0.5 M) and the rate lowed down in 3 months (3 m) (Fig. 1A). Thus the leaves of 2-week-old (2 w) and 3-month-old (3 m) A. annua were sampled for RNA sequencing respectively, and named as S2 w and S3 m. The leaves of at least three individual lines of 2 w and 3 m mixed.

Figure 1
figure 1

ART content and expression of miRNA and SPLs accumulated with age. (A) ART content in leaves of 0.5, 1, 2, 3 and 4-month old A. annua. The ART content was compared to the dry weight of the sample to gain the percent (%). (BD) qRT-PCR analysis of miRNAs and genes in in leaves of A. annua in different age. miR156, AaSPL7, AaSPL9, AaSPL12 and AaSPL13 were reported18,19, miR845b, miR1134, SPL4 and SPL5 were found from the sequencing data.

The original data from the sequencing contains low quality sequences with joints. In order to ensure the quality of information analysis, the original data must be filtered to obtain clean data, and subsequent analysis is based on clean data. The quality assessment of sample sequencing output data is shown in the following table (Table 1). Then the Bowtie 2 v2.1.0 was used to map the reads of each sample to the reference of A. annua genome12,13,14, and the parameters are defaulted, only 10.63% to 11.32% of the sRNA could mapped while 43.63% to 44.09% of the transcriptome could uniquely mapped to the reference genome (Table 2).

Table 1 Statistics of the sequencing data.
Table 2 Statistics of the clean reads mapped to the genome.

Several miRNAs and ART biosynthesis related genes accumulated with age

To estimate the differentially expressed genes (DEGs) between different ages, the concept of RPKM was introduced15. The MARS (MA-plot-based method with Random sampling model) model in DEGseq v1.20.0 package was used to analyze the differences16. The difference of gene expression is considered to be significant in the condition of |Fold change| > 2, FDR (q value) < 0.001, and at least one sample RPKM > 20. Statistical analysis was conducted on all the genes with significant expression difference between the S3m and S2w samples (Table 3).

Table 3 Statistics of the genes expression with significant difference.

The expression levels of reported miRNA and predicted novel miRNA are calculated between these two samples, 254 DEGs were obtained among which 50.79% were upregualted and 49.21% downregulated (Table 3). It has been predicted that miRNAs belong to miR414 and miR1310 families may target genes in ART biosynthesis like HMG-CoA reductase (HMGR), amorpha-4,11-diene synthase (ADS), farnesyl pyrophosphate synthase (FPS) and cytochrome P45017, unfortunately these miRNAs have not been detected in our sequence data. However, there were still six known miRNAs downregulated significantly (Supplement Fig. 1A). In addition, miR156 which is conserved in plant kingdom and abundant in the junior plant is also not detected in the sequence data, however, the expression of some predicted SPL genes show vigorous varieties (Supplement Fig. 1B). Luckily miR156 and several AaSPLs have been reported by other groups18,19. The SPL genes could be targeted or non-targeted by miR156, the reported AaSPLs and those from the our RNAseq data were aligned with SPLs from Arabidopsis (https://www.ebi.ac.uk/Tools/msa/clustalo/), AaSPL7, AaSPL9, AaSPL12, AaSPL13, SPL4 and SPL5 showing more similarities to miR156-target-SPLs in Arabidopsis were further confirmed (Fig. 2). The level of miR15618 and two novel miRNAs, miR845b and miR1134 declined with age, and the picked SPLs19 enhanced to varying degrees (Fig. 1B–D). Indicating miR156 and targeted SPLs works as age cue in A. annua too. Because of the low abundance of miR156 in A. annua18, there may be other miRNAs to complement its role.

Figure 2
figure 2

Phylogenetic tree of the SBP domain of SPLs from A. annua.and Arabidopsis. The tree was constructed using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) with Neighbor-Joining method. The black triangle (▲) or blank triangle () indicates miR156 targeted or non-targeted Arabidopsis SPLs respectively.

There are 4339 differentially expressed genes in transcriptome, of which 49.69% were upregulated and 50.31% were downregulated. The transcription levels of the key genes in the biosynthesis of ART including ADS, CYP71AV1, ADH1, DBR2 and ALDH1 were all increased in S3m compared to S2w (Supplement Fig. 1C), which was in accordance with the phenomenon that ART accumulated with age. In addition, the positive regulatory transcription factor AaWRKY1 was obviously upregulated, and several other genes identical to the reported positive transcription factors like AaERF1, AaMYC2, AabHLH1 and AabZIP1 were upregulated as well (Supplement Fig. 1D). ART biosynthetic genes and reported regulation factors were further analyzed by qRT-PCR. Expression of most genes increased along with age, but the level of DBR2 descended in the four-month plant while AaHD8, AabHLH1, AaMIXTA and TAR1 didn’t change much (Fig. 3). These transcription factors may construct a link between age and the ART biosynthesis which needs further detection.

Figure 3
figure 3

Expression of ART biosynthetic genes and transcription factors increased with age. qRT-PCR analysis of ART biosynthetic genes (A, B) and transcription factors (CF) in leaves of 0.5, 1, 2, 3 and 4-month old A. annua.

Process involved in growth and metabolism are enriched with age

To gain functional annotations of the genes involved in age-related ART accumulation, GO analysis was carried out for DEGs. GO enrichment analysis using GOseq showed that the FDR (q value) value was less than 0.05 and the GO term was regarded as the enrichment term20. 1,237 DEGs were annoted in the GO analysis. Both the upregulated and downregulated genes of DEGs of A. annua’s transcriptome were classified into three functions: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). Many terms of plant metabolites catalysis process were significantly enriched in these biological processes (FDR ≤ 0.05 or P-value ≤ 0.05). Among the upregulated genes 1770 were functional annotated including 133 terms for BP, 27 terms for CC, and 281 terms for MF. Photosynthesis, metabolic process including the biosynthetic and catabolic process, morphogenesis of shoot, and developmental process of embryo and seed are included in BP. Chloroplast, plasmids and NAD(P)H dehydrogenase complex are involved in CC. While ribosome construction, activity of transcription factor and binding of RNA polymerase and chlorophyll are contained in MF. While in the downregulated genes, 2023 were annotated with functions, comprising of 112 terms in BP, 37 in CC, and 209 in MF. GO enrichment histogram of DEGs can directly reflect the distribution of the number of differentially enriched genes in each GO entry. 12 GO items with the most significant enrichment were selected for display (Table 4), and the majority of the DEGs are involved in DNA binding, redox enzyme activity, transcription factor activity (sequence-specific binding), transcriptional regulation and redox process are shown (Fig. 4).

Table 4 12 GO items with the most significant enrichment.
Figure 4
figure 4

GO enrichment of DEGs. 1. DNA binding. 2. Structural constituent of ribosome. 3. Hydrolase activity. 4. Xyloglucan: xyloglucosyl transferase activity. 5. Oxidoreductase activity. 6. Transcription factor activity; sequence-specific binding. 7. Ribosome. 8. Apoplast. 9. Translational elongation. 10. Regulation. 11. oxidation–reduction process. 12. Cellular glucan metabolic process.

Biochemical metabolic pathways and signal transduction pathways concentrated following age

Generally genes coordinate with each other to perform their biological functions. KEGG21, pathway analysis was conducted to determine the main biochemical metabolic pathways and signal transduction pathways in age related ART biosynthesis. Pathway with FDR (q value) less than 0.05 was defined as pathway with significant enrichment (Table 5). A total of 522 DEGs were annotated in KEGG classification. 334 upregulated DEGs with functional annotations were involved in 174 metabolic pathways while that was 220 metabolic pathways in 458 downregulated DEGs. Carbon metabolism, plant hormone signal transduction and biosynthesis of amino acids are the three significantly fluctuated ones in both upregualted and downregulated pathways (Fig. 5), which indicates their conserved and major roles in the aging process of artemisia. The results suggesting that the higher metabolic pathways associated with ART biosynthesis were ribosome, carbon metabolism, and plant hormone signal transduction .

Table 5 Enrichment results of KEGG of DEGs.
Figure 5
figure 5

KEGG enrichment of DEGs. 1. Alpha-Linolenic acid metabolism. 2. Carbon fixation in photosynthetic organisms. 3. Ribosome. 4. Glyoxylate and dicarboxylate metabolism. 5. Carbon metabolism. 6. Cell cycle—Caulobacter. 7. Cyanoamino acid metabolism. 8. Methane metabolism. 9. Glycolysis/gluconeogenesis. 10. Cysteine and methionine metabolism. 11. Plant hormone signal transduction. 12. Glycine, serine and threonine metabolism. 13. Photosynthesis—antenna proteins. 14. Biosynthesis of amino acids.

Discussion

ART is a sesquiterpene lactone compound extracted from sweet wormwood A. annua, which contains the specific endoperoxide bridge, and it is synthesized by isoprenoid metabolic pathway. ADS, CYP71AV1, ADH1, DBR2 and ALDH1, as the key enzyme genes in the ART biosynthesis pathway, play an important role in ART biosynthesis. Overexpressing these genes is an important way of increasing ART content by genetic engineering. The transcripts of these genes are more abundant in the elder plants than the younger ones, which construct the link between ART biosynthesis and plant aging and provide the possibility of achieving higher ART production by harvesting the plants after blooming in addition with overexpressing enzyme genes.

Another way to improve ART yield is to overexpress the positive transcription factors of ART biosynthetic pathway enzymes. As we know transcription factors often simultaneously regulate the expression of more than one enzyme in the biosynthetic pathway, which could be a more effective way to increase ART content. Several transcription factors have been reported to elevate ART biosynthesis by targeting different enzyme genes on ART biosynthesis pathway, like AaWRKY1, AabZIP1, AabHLH1, AaMYC2, AaERF1 and AaERF2. Those positive regulators of ART synthesis amounted with plant age. Our transcriptome data imply more unknown transcription factors should be investigated especially these AP2/ERF ones, and this will further illuminate the way of enlarging ART production by combining these ways.

miR156 has been the only reported plant age cue, and it contributes to the anthocyanin biosynthesis in Arabidopsis and sesquiterpene biosynthesis in both Arabidopsis and Patchouli along with its critical roles in plant development4,22. The phenomenon that biosynthesis of ART mounts with age and is most vigorous after blooming indicating the effect of age on ART biosynthesis. Even though the miR156 is not detected in our small RNA sequencing data maybe because of the sequencing depth, selected samples or low abundance. Another group reported the miR156 sequence of A. annua with very low transcript level. Though miR156 gradually decreases while SPLs increases with age, whether other miRNAs or transcription factors play roles on aging and ART biosynthesis shall be further explored.

This study provide the molecular basis for the common sense that the aerial parts of A. annua are collected for ART extraction after blooming when both the biomass and ART content are high. The transcriptome data provide cues for key transcription factor mining for the regulation of ART biosynthesis. Meanwhile the small RNA sequencing data give some clues on investigation of A. annua aging and ART accumulation.

Materials and methods

All experiments were conducted in accordance with relevant institutional, national, and international guidelines and legislation.

Total RNA extraction and quality control

A. annua L. cv. QT was used in this paper. Total RNA was extracted from the leaves of two-week (2 w) and three-month (3 m) old A. annua.with Trizol (Invitrogen). The quality of total RNA was tested using 2100 Bioanalyzer. The qualified RNA samples were digested by DNaseI (TaKaRa, Japan) at 37 °C for 30 min.

cDNA library construction and sequencing

DNase digested RNA was treated with Dynabeads Oligo (dT)25 kit (Life, USA) to get the purified mRNA. 100 ng purified mRNA was then treated with NEBNext Ultra RNA Library Prep Kit for Illumina (NEB, USA) to build the cDNA Library. The quantity of the cDNA library was further tested by Qubit quantification, 2% agarose gel electrophoresis detection and high-sensitivity DNA chip detection. 10 ng cDNA was used to cluster generation in cBot with TruSeq PE Cluster Kit (illumina, USA), and then bidirectional sequencing was performed in Illumina Hiseq 4000.

The transcripts were gained by referring to the whole genome sequencing data of A. Annua14,23, and then the differentially expressed genes (DEGs) were analyzed.

Seperation of the small RNA (sRNA)

100 mg samples were first grinded into powder in liquid netrogen, and then the total RNA was extracted by Trizol kit (Invitrogen). The total RNA was seperated by 15% polyacrylamide gel electrophoresis, and the RNA located within 15-35nt by referring to the RNA TrackIt 10 bp DNA Ladder (Invitrogen) were purified as the sRNA.

sRNA library construction and sequencing

The sRNA library was constructed using Truseq Small RNA Sample Preparation Kit (illumina), and the PCR products were seperated by 6% polyacrylamide gel electrophoresis and those located at 147 bp was purified as the sRNA cDNA library. The library was further amplified with TruSeq PE Cluster Kit (illumina, USA) and then sequenced by illumina Hiseq. After sequencing the clean data were gained by eliminating the low quanlity data, adaptors and contamination from the raw data, then the clean data were aligned and annotated with known rRNA, tRNA, nRNA, snoRNA and fragments of degraded mRNA in the database to eliminate the non-miRNA sequences. The unknown sRNA were further analyzed by novel miRNA software.

Expression analysis

One microgram of total RNA was used for cDNA synthesis with oligo (dT) primers and M-MLV Reverse Transcriptase (Invitrogen) while miRNAs were inverse transcripted with miRNA 1st strand cDNA synthesis kit (Accurate Biotechnology). Quantitative real-time RT-PCR was performed with SYBR-Green PCR Mastermix (Accurate Biotechnology). A. annua ACTIN (EU531837) and U6 (PWA96665) were used as internal references for mRNA and miRNA respectively.

Chemical analysis

Contents of ART were determined by high-performance liquid chromatography (HPLC) as described4, and the standard substance of ART was bought form Sigma.