Novel insight into theacrine metabolism revealed by transcriptome analysis in bitter tea (Kucha, Camellia sinensis)

Kucha (Camellia sinensis) is a unique wild tea resource from southwest China. It contains sizeable amounts of theacrine (1,3,7,9-tetramethyluric acid), and both its fresh leaves and the made tea have a distinctive bitter taste. Theacrine is regarded locally as beneficial to health, but the molecular mechanism of theacrine metabolism in Kucha remains unclear. To illuminate the biosynthesis and catabolism of theacrine in Kucha plants, we studied three tea cultivars: C. sinensis ‘Shangyou Zhongye’ (SY) with low theacrine, ‘Niedu Kucha 2’ (ND2) with medium theacrine and ‘Niedu Kucha 3’ (ND3) with high theacrine. Purine alkaloid content and the transcriptome of these samples were analyzed by High Performance Liquid Chromatography (HPLC) and RNA-Seq, respectively. The expression levels of genes related to purine alkaloid metabolism were correlated with purine alkaloid content, and quantitative real-time (qRT) PCR confirmed the reliability of the transcriptome data. Based on these data, we found that theacrine biosynthesis is a relatively complex process and that the N-methyltransferase (NMT) encoded by TEA024443 may catalyze methylation at the 9-N position in Kucha. Our findings will help to reveal the molecular mechanism of theacrine biosynthesis and can be applied to the selection and breeding of Kucha tea cultivars in the future.

is probably a three-step pathway with 1,3,7-trimethyluric acid acting as an intermediate 14 . This was the first demonstration of the theacrine biosynthesis pathway in Kucha. However, why theacrine is so highly synthesized and concentrated in Kucha plants, and the molecular mechanism of theacrine metabolism more generally, remain unclear.
To determine the molecular mechanism of theacrine metabolism, RNA-Seq was performed on three tea cultivars with different theacrine contents: Shangyou Zhongye (SY) with low theacrine, Niedu Kucha 2 (ND2) with medium theacrine and Niedu Kucha 3 (ND3) with high theacrine. In this study, genes related to theacrine metabolism were explored and identified by transcriptome analysis, which will help to explain the mechanisms of theacrine metabolism in Kucha.

Results
Analysis of purine alkaloid content. The total purine alkaloid content was relatively stable across the tested samples; in particular, there was no significant difference between SY and ND3. However, theacrine and caffeine differed significantly between them. The theacrine content was 8.04 mg/g in ND2 and 10.45 mg/g in ND3, significantly higher than in SY (0.83 mg/g), whereas caffeine varied in the opposite direction (Fig. 1). The caffeine content was 33.47 mg/g in ND2 and 26.24 mg/g in ND3, considerably lower than in SY (38.62 mg/g). This may be because caffeine is the precursor of theacrine and was partly converted to theacrine in Kucha. Moreover, the theobromine content in ND2 (8.34 mg/g) and ND3 (7.49 mg/g) was considerably higher than in SY (4.60 mg/g).
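As a quick arithmetic check on the contents reported above, the following sketch sums the three alkaloids given in the text for each cultivar and expresses theacrine relative to SY. It assumes that theacrine, caffeine and theobromine together approximate the total purine alkaloid content, which the text does not state explicitly; the dictionary layout and function names are illustrative only.

```python
# Mean contents in mg/g, copied from the paragraph above.
content = {
    "SY":  {"theacrine": 0.83,  "caffeine": 38.62, "theobromine": 4.60},
    "ND2": {"theacrine": 8.04,  "caffeine": 33.47, "theobromine": 8.34},
    "ND3": {"theacrine": 10.45, "caffeine": 26.24, "theobromine": 7.49},
}

def total_alkaloids(cultivar):
    """Sum of the three reported alkaloids (assumed to approximate the total)."""
    return sum(content[cultivar].values())

def theacrine_fold_vs_sy(cultivar):
    """Theacrine content of a cultivar divided by that of the SY control."""
    return content[cultivar]["theacrine"] / content["SY"]["theacrine"]

for name in content:
    print(name, round(total_alkaloids(name), 2), round(theacrine_fold_vs_sy(name), 1))
```

Under this assumption the sums are about 44.05, 49.85 and 44.18 mg/g for SY, ND2 and ND3, which agrees with the statement that SY and ND3 have similar totals while ND2 has the most.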
Sequencing analysis and reference genome alignment. Three cDNA libraries (ND2, ND3 and SY) were constructed and sequenced. In total, 18.87 Gb of high-quality paired-end reads were obtained, with a single read length of ~150 bp. ND2, ND3 and SY yielded 41.28, 42.84 and 41.33 million high-quality reads, respectively. Q20 values were above 96.65%, Q30 values were above 92.56%, and the GC content ranged from 49.10% to 49.29% (Table 1). The average mapping ratio of the three samples to the reference genome was 92.28%. These results show that the transcriptomic data map well to the reference genome and are suitable for further analysis.
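For readers unfamiliar with the Q20 and Q30 metrics, the sketch below shows how such percentages can be derived from FASTQ quality strings. It assumes the Phred+33 encoding; the function names and the example quality string are illustrative and not taken from the study.

```python
def fraction_at_least(quality_strings, threshold, offset=33):
    """Fraction of bases whose Phred score is >= threshold (Phred+33 assumed)."""
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0

def error_probability(q):
    """Base-calling error probability implied by Phred score q."""
    return 10 ** (-q / 10)

# Illustrative quality string: 'I' = Q40, '5' = Q20, '+' = Q10.
example = ["II5+"]
q20 = fraction_at_least(example, 20)
q30 = fraction_at_least(example, 30)
```

A Phred score of 20 corresponds to an error probability of 1% per base and a score of 30 to 0.1%, so the Q20 and Q30 values are the shares of bases called at or above those confidence levels.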

Analysis of differentially expressed genes (DEGs).
A total of 3,588 DEGs were detected among the three tea cultivars. In ND2 vs SY, 2,440 DEGs were found, of which 1,247 were upregulated and 1,193 were downregulated. In ND3 vs SY, 2,015 DEGs were identified, of which 1,025 were upregulated and 990 were downregulated. In addition, 810 differentially co-expressed genes (446 upregulated and 364 downregulated) were identified.
GO enrichment analysis of DEGs. GO (Gene Ontology) is an international standard classification system for gene function with three categories: molecular function (MF), cellular component (CC) and biological process (BP) 15 . In the MF category, molecular function, catalytic activity and binding were the top three enriched GO terms in both comparison groups. In the CC category, DEGs were enriched in the terms cellular component, cell and cell part. In the BP category, biological process, metabolic process and cellular process were the top terms.
The DEGs were grouped into 41, 8 and 25 subcategories of BP, CC and MF, respectively. Moreover, the terms methylation, S-adenosylmethionine-dependent methyltransferase activity and N-methyltransferase activity were enriched with 23, 11 and 7 genes, respectively, in the two groups; these genes may play a key role in purine alkaloid biosynthesis. The top 20 enriched GO terms are shown in Fig. 3.
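The text does not state which statistical test was used for GO enrichment. A common choice is a one-sided hypergeometric test, sketched below under that assumption. The function name and all arguments are hypothetical; the background gene counts that would be needed to reproduce the reported terms are not given in the text.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, term_genes, deg_count, deg_in_term):
    """P(X >= deg_in_term) when deg_count genes are drawn without replacement
    from total_genes background genes, term_genes of which carry the GO term."""
    upper = min(term_genes, deg_count)
    denom = comb(total_genes, deg_count)
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, deg_count - k)
        for k in range(deg_in_term, upper + 1)
    )
    return tail / denom

# Toy example: 10 background genes, 5 annotated to the term, 5 DEGs,
# all 5 DEGs annotated to the term.
p_toy = hypergeom_enrichment_p(10, 5, 5, 5)
```

A small p-value indicates that the term holds more DEGs than expected by chance; in practice the p-values for all tested terms would also be corrected for multiple testing.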

KEGG enrichment analysis of DEGs.
To explore which metabolic pathways were enriched among the DEGs in Kucha, we performed KEGG pathway analysis by mapping the obtained unigenes to the corresponding reference pathways 16 . The DEGs mapped to 192 metabolic pathways in ND2 vs SY and 182 in ND3 vs SY. Among them, flavonoid biosynthesis, phenylpropanoid biosynthesis, carotenoid biosynthesis, ABC transporters and steroid hormone biosynthesis were significantly enriched in both groups (Fig. 4). Moreover, a number of DEGs were found in secondary metabolite biosynthesis pathways, such as purine metabolism, flavone and flavonol biosynthesis, tyrosine metabolism, tropane, piperidine and pyridine alkaloid biosynthesis, and indole alkaloid biosynthesis.
Gene expression in purine alkaloid metabolism. Caffeine metabolism is the main pathway of purine alkaloids in C. sinensis 17 . Theobromine is an intermediate in caffeine synthesis, while theacrine and theophylline are products of caffeine degradation. The pathway "7-Methylxanthosine → 7-Methylxanthine → Theobromine → Caffeine → 1,3,7-Trimethyluric acid → Theacrine" is the only route to theacrine known at present. The purine salvage process is "Adenosine → AMP → XMP → Xanthosine". In addition to "caffeine → theacrine", we annotated three other caffeine degradation pathways in the KEGG database, whose final degradation products are "NH3 + CO2", "7-Methyluric acid" and "1-Methyluric acid", respectively (Fig. 5). Identifying these pathways helps to further explain the metabolism of purine alkaloids in the tea plant.
The expression patterns of key genes in theacrine synthesis, such as TCS, SAMS, APRT and NMT, were examined to find out why theacrine accumulates to high concentrations in Kucha (Fig. 5). TEA015248 and TEA028651 (encoding adenine phosphoribosyltransferase, APRT), which act in adenine donor synthesis, and TEA015791, TEA028050 and TEA028052 (encoding tea caffeine synthase, TCS) were highly expressed in SY, which has a high caffeine concentration. TEA006735 and TEA015661 (encoding S-adenosylmethionine synthetase, SAMS), which act in methyl donor synthesis, were upregulated in ND2, which contains the most purine alkaloids among the three samples. TEA024443 (encoding N-methyltransferase, NMT) was upregulated in ND2 and ND3, which have a high theacrine content, compared with SY. We also identified key genes of purine alkaloid degradation. TEA027082 and TEA011804 (encoding 5′-nucleotidase, 5′-NT), which are involved in the degradation steps "AMP → Adenosine" and "Guanosine → GMP", were upregulated in ND2 and ND3 compared with SY. TEA002149 (encoding allantoinase, ALN) was highly expressed in ND3, which has a high theacrine concentration. In the KEGG enrichment analysis, 5 and 6 DEGs annotated to the cytochrome P450 family were obtained in SY vs ND2 and SY vs ND3, respectively. TEA010267 and TEA004815 (encoding cytochrome P450 family 1 subfamily A polypeptide 2, CYP1A2) were highly expressed in the high-caffeine sample.
Pearson correlation coefficients were used to evaluate the correlation between gene expression levels and purine alkaloid content (Table 2). The results showed that the expression patterns of TEA028049 (TCS), TEA015248 (APRT) and TEA000333 (5′-NT) were negatively correlated with theacrine content (P < 0.05). Meanwhile, TEA028050 (TCS) and TEA028052 (TCS) were positively correlated with caffeine content (P < 0.05), and TEA018201 (APRT) showed a highly significant positive correlation (P < 0.01). We speculate that the proteins encoded by these genes are involved in regulating theacrine and caffeine content in Kucha plants.
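The Pearson coefficient used here can be illustrated with a minimal sketch. Per-gene expression values are not reproduced in the text, so the example instead correlates the mean theacrine and caffeine contents of the three cultivars reported in the Results. This only demonstrates the formula; it is not the SPSS analysis on replicate data described in the Methods, and the function name is illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Mean contents in mg/g for SY, ND2 and ND3, taken from the Results.
theacrine = [0.83, 8.04, 10.45]
caffeine = [38.62, 33.47, 26.24]
r = pearson_r(theacrine, caffeine)
print(round(r, 2))
```

With only three cultivar means the coefficient is strongly negative, in line with the opposite trends of theacrine and caffeine noted in the Results, but three points are too few to support a significance claim on their own.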
qRT-PCR analysis of the DEGs. To further confirm the reliability of the gene expression levels, 12 genes functionally related to purine alkaloid metabolism were selected for verification by quantitative real-time (qRT) PCR; their primers are listed in Table S1. The results showed that the 12 genes, except for TEA027780 (CYP1A2) and TEA010267 (CYP1A2), were generally consistent with the RNA-Seq results, which further indicates the reliability of the transcriptome data (Fig. 6).

Discussion
As a special kind of tea germplasm, Kucha has a classification that is still under discussion. In the Zhang Hongda classification, Kucha was defined as C. sinensis var. kucha Zhang et Wang 18 . [...] indicated that Kucha germplasms were similar to C. sinensis var. assamica in terms of catechin composition, which reflects the degree of tea plant evolution 22 . To classify Kucha, further study should focus on the genomic level by comparing Kucha with other tea species.
Meanwhile, the exploitation and utilization of bitter tea resources is also an important problem for breeders. Jianghua Kucha (a bitter tea) has been studied and utilized extensively and is widely promoted in the broken black tea area of Hunan province because of its suitability for black tea [23][24][25] . In our study, the high level of purine alkaloids in the Kucha cultivars is conducive to processing high-quality black tea, and it also offers a new way to develop natural purine alkaloid products and to exploit Kucha for multiple purposes.
Theacrine is generally the dominant purine alkaloid in Kucha after caffeine, and it significantly affects the quality and flavor of Kucha products. Previous studies indicated that theacrine content varies among Kucha cultivars and is less affected by the environment 22 . Ye et al. found that theacrine content in young leaves of Kucha was stable across seasons, while it was significantly higher in immature leaves than in mature and aged leaves 4 . At present, the metabolic pathway of purine alkaloids and the related biosynthesis genes have been basically determined, which lays the foundation for our research on theacrine metabolism 26,27 . In our study, RNA-Seq was used to profile the transcriptomes of three tea cultivars, and DEGs involved in purine alkaloid metabolism were found in two comparison groups, ND2 vs SY and ND3 vs SY. The theacrine and caffeine contents differed clearly among the three cultivars: theacrine was higher in ND2 and ND3 than in SY, while caffeine was lower.
KEGG pathway analysis indicated that the purine alkaloid metabolism pathway plays a significant role in Kucha metabolism, and a number of key genes related to theacrine and caffeine biosynthesis were selected from the DEGs. Theacrine biosynthesis is catalyzed by a series of enzymes, including S-adenosylmethionine synthetase (SAMS), tea caffeine synthase (TCS), N-methyltransferase (NMT) and adenine phosphoribosyltransferase (APRT). In general, SAMS is the key enzyme in the purine alkaloid metabolism pathway, because S-adenosylmethionine acts as the methyl donor in the biosynthesis of theacrine and other purine alkaloids 28 . Our results show that SAMS gene expression was upregulated in the two groups, in which the total amount of purine alkaloids was higher than in the control variety. TCS is a class of N-methyltransferase that catalyzes methylation at N-3 (theobromine synthase) and N-1 (caffeine synthase) in the tea plant. Previous studies indicate that TCS1 plays a crucial role in the methylation of xanthosine at the 1-N position in caffeine biosynthesis, while some TCS1 alleles have low transcription levels or encode proteins with no TCS activity 29 . TEA015791, encoding TCS1, was downregulated in ND2 and ND3, which may explain the lower caffeine content in the two Kucha samples. However, no research has shown that TCS can catalyze methylation at the 9-N position, and whether TCS is involved in the final step of theacrine biosynthesis needs further research. TEA024443, encoding an NMT, was highly expressed in the high-theacrine samples, and its expression was positively correlated with theacrine content. NMT may play a significant role in theacrine synthesis, dominating the pathway "1,3,7-trimethylxanthine (caffeine) → 1,3,7-trimethyluric acid → 1,3,7,9-tetramethyluric acid (theacrine)"; we speculate that the N-methyltransferase encoded by this gene is responsible for catalyzing methylation at the 9-N position in theacrine synthesis. Meanwhile, the expression of genes in the purine alkaloid degradation pathway was probably correlated with the decrease of purine alkaloids in the Kucha samples. TEA007899 and TEA023610, encoding allantoinase (ALN), and TEA020308, encoding urease (URE), which are involved in the final degradation steps "Allantoin → Allantoic acid → → NH3 + CO2", were all highly expressed in the low-caffeine Kucha plants ND2 and ND3.
Caffeine is regarded as a precursor of theacrine, and its biosynthesis pathway is well understood. The key genes involved in caffeine metabolism have been identified, which provides a reference for further study of theacrine [29][30][31] . However, few studies have so far reported on the molecular mechanism of theacrine biosynthesis in Kucha cultivars. In this study, we propose that an NMT encoded by TEA024443 is likely the key enzyme in the final step of theacrine synthesis. Next, the function of TEA024443 in theacrine biosynthesis should be verified and tested. Further, we will construct a tea genetic population with Kucha cultivars as parents, so that key genes involved in theacrine metabolism can be identified and analyzed by a dense genetic map and QTL mapping. This will make it possible to breed new tea cultivars that are rich in theacrine yet low in caffeine.
To sum up, our study provides new insight into the molecular metabolism of theacrine in Kucha plants, and it may help to improve the breeding of Kucha in subsequent studies.

Methods
Plant materials. Three tea cultivars were used as plant materials in this study: the high-theacrine cultivar 'Niedu Kucha 3' (ND3), the medium-theacrine 'Niedu Kucha 2' (ND2) and the low-theacrine 'Shangyou Zhongye' (SY) as a control. The former two are Kucha accessions from the same local population of a tea landrace named 'Niedu Kucha'; the latter is a locally cultivated tea clone. All of them were collected from the same region in the southwest part of Jiangxi province, and they are currently preserved in the National Germplasm Hangzhou Tea Repository at the Tea Research Institute of the Chinese Academy of Agricultural Sciences, Hangzhou, Zhejiang province, China. Leaves of the two Kucha plants taste very bitter, while those of the control cultivar do not. Fresh healthy leaves were picked from the two terminal leaf positions of ND2, ND3 and SY, quickly frozen in liquid nitrogen and then stored at −80 °C for RNA extraction. Young shoots with "two leaves and a bud" collected from the three cultivars were fixed with hot air at 120 °C for 5 min and then dried to constant weight at 80 °C. These dried samples were used to determine the content of purine alkaloids such as theacrine, caffeine and theobromine by HPLC. All experiments had three biological replicates.
Identification of purine alkaloids. Purine alkaloids such as theacrine, caffeine and theobromine were detected by a previously reported HPLC method 32 using a Waters High Performance Liquid Chromatography (HPLC) instrument. A Waters C18 column (5 μm, 4.6 mm × 250 mm) was run at a flow rate of 1.0 mL/min with a column temperature of 35 °C. Mobile phase A was 1% formic acid and mobile phase B was 100% acetonitrile; 10 μL of sample was injected and analyzed at 280 nm, and the elution time was 45 min.
RNA extraction and library construction. Total RNA was extracted with the EASY-spin Plus Complex Plant RNA kit (Aidlab Biotechnologies Co., Beijing, China). The integrity and quantity of total RNA were assessed with a NanoDrop ultraviolet spectrophotometer (Thermo, Waltham, MA) and agarose gel electrophoresis. cDNA libraries were constructed with the standard mRNA-Seq Library Prep Kit, and library quality was assessed with the Agilent 2100 Bioanalyzer (Agilent, Palo Alto, CA). The cDNA libraries were sequenced on the Illumina HiSeq platform by Personal Biotechnology Co., Ltd. (Beijing, China).

Gene expression analysis.
High-quality reads were obtained from the raw data by filtering out low-quality sequences, adaptor sequences, duplicated sequences and ambiguous reads (with a ratio of N > 5%). The clean reads from all samples were mapped to the C. sinensis 'Shuchazao' genome with HISAT using default parameters and counted with featureCounts 33,34 .
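The ambiguous-read criterion above can be sketched as follows. This covers only the N-ratio rule (reads with more than 5% N are discarded); the function names are illustrative, and the actual filtering software used in the study is not named in the text.

```python
def n_ratio(seq):
    """Fraction of ambiguous bases (N) in a read; an empty read counts as fully ambiguous."""
    if not seq:
        return 1.0
    return seq.upper().count("N") / len(seq)

def keep_read(seq, max_n_ratio=0.05):
    """Keep a read only if its N ratio does not exceed the threshold."""
    return n_ratio(seq) <= max_n_ratio

reads = ["ACGT" * 10, "NNN" + "A" * 37, "NN" + "A" * 38]
clean = [r for r in reads if keep_read(r)]
```

In this toy list the second read (3 of 40 bases are N, i.e. 7.5%) is dropped, while the third (exactly 5%) is kept because the stated rule removes only reads with a ratio above 5%.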
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations were downloaded from http://tpia.teaplant.org/. The DEGs (differentially expressed genes) among tea samples (three biological replicates per group) were identified with DESeq2. Genes with an FDR (false discovery rate) ≤ 0.05 and an absolute fold change value ≥ 1.0 were identified as DEGs 35,36 .
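The thresholding step can be sketched in a few lines. Two assumptions are made here and should be read as such: the fold change criterion is interpreted as an absolute log2 fold change of at least 1.0, which the text does not state, and the FDR is computed with the Benjamini-Hochberg procedure, which is what DESeq2 reports by default as adjusted p-values. Gene names and values are hypothetical, and this is not a substitute for DESeq2 itself.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(records, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """records: list of (gene_id, log2_fold_change, raw_p_value) tuples."""
    padj = benjamini_hochberg([p for _, _, p in records])
    return [
        (gene, lfc, q)
        for (gene, lfc, _), q in zip(records, padj)
        if q <= fdr_cutoff and abs(lfc) >= lfc_cutoff
    ]

# Hypothetical input, for illustration only.
records = [
    ("gene_a", 2.3, 0.0001),
    ("gene_b", -1.5, 0.004),
    ("gene_c", 0.4, 0.003),
    ("gene_d", 1.8, 0.6),
]
degs = select_degs(records)
```

Here gene_c is rejected because its fold change is too small and gene_d because its adjusted p-value is too large, so only gene_a and gene_b pass both filters.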

Validation by quantitative real-time PCR.
To confirm the accuracy of the RNA-Seq and DEG analyses, 12 DEGs associated with the purine alkaloid biosynthesis pathway were selected for qRT-PCR. Gene-specific primers were designed with Primer Premier 5, and the primer pair sequences are listed in Table S1. First-strand cDNA was synthesized from total RNA with the FastKing RT kit (with gDNase) (TianGen Biotech Co, Beijing) and diluted 10-fold. qRT-PCR was performed on an ABI7500 (Applied Biosystems Inc, US), with the GAPDH (glyceraldehyde 3-phosphate dehydrogenase) gene as the internal reference. The total qRT-PCR reaction volume was 20 μL, containing 7.4 μL distilled H2O, 0.6 μL of each primer of the pair, 10 μL 2x FastFire qPCR PreMix, 0.4 μL 50x ROX Reference Dye and 1 μL diluted cDNA. The cycling profile was 95 °C for 60 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 32 s, in 96-well optical reaction plates. Relative gene expression was calculated by the 2^−ΔΔCt method 37 . Each sample was examined in three technical replicates.
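The 2^−ΔΔCt calculation can be written out as a short sketch. The function name and the Ct values are hypothetical; the reference gene corresponds to GAPDH and the control sample to SY in this study's design, but no measured Ct values are given in the text.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^-ddCt method."""
    # dCt: target Ct normalized to the reference gene within each sample.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # ddCt: normalized sample value relative to the control sample.
    delta_delta = delta_sample - delta_control
    return 2 ** -delta_delta

# Hypothetical Ct values: the target amplifies two cycles earlier in the
# sample than in the control, with equal reference gene Ct in both.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
```

A ΔΔCt of −2 corresponds to a four-fold higher relative expression in the sample than in the control, assuming roughly 100% amplification efficiency for both genes.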
Statistical analysis. Three biological replicates of sequencing data and purine alkaloid components were used to calculate the mean and standard deviation (SD), and differences were assessed with Duncan's multiple range test. SPSS 18.0 was used to analyze the correlations between gene expression and purine alkaloid concentrations by Pearson correlation. The volcano plot was produced with Prism 7.0 software. The double-coordinate figures and heat maps were drawn in R.