Physiological and transcriptome analyses of photosynthesis and chlorophyll metabolism in variegated Citrus (Shiranuhi and Huangguogan) seedlings

Citrus species are among the most economically important fruit crops. Physiological characteristics and molecular mechanisms associated with de-etiolation have been partially revealed. However, little is known about the mechanisms controlling the expression and function of genes associated with photosynthesis and chlorophyll biosynthesis in variegated citrus seedlings. The lower biomass, chlorophyll contents, and photosynthetic parameter values recorded for the variegated seedlings suggested that chlorophyll biosynthesis was partially inhibited. Additionally, roots of the variegated seedlings were longer than the roots of green seedlings. We obtained 567.07 million clean reads and 85.05 Gb of RNA-sequencing data, with more than 94.19% of the reads having a quality score of Q30 (sequencing error rate = 0.1%). Furthermore, we detected 4,786 and 7,007 differentially expressed genes (DEGs) between variegated and green Shiranuhi and Huangguogan seedlings. Thirty common pathways were differentially regulated, including pathways related to photosynthesis (GO: 0015979) and the chloroplast (GO: 0009507). Photosynthesis (44 and 63 DEGs), photosynthesis-antenna proteins (14 and 29 DEGs), and flavonoid biosynthesis (16 and 29 DEGs) pathways were the most common KEGG pathways detected in two analyzed libraries. Differences in the expression patterns of PsbQ, PetF, PetB, PsaA, PsaN, PsbP, PsaF, Cluster-2274.8338 (ZIP1), Cluster-2274.38688 (PTC52), and Cluster-2274.78784 might be responsible for the variegation in citrus seedlings. We completed a physiological- and transcriptome-level comparison of the Shiranuhi and Huangguogan cultivars that differ in terms of seedling variegation. We performed mRNA-seq analyses of variegated and green Shiranuhi and Huangguogan seedlings to explore the genes and regulatory pathways involved in the inhibition of chlorophyll biosynthesis and decreases in Chl a and Chl b contents. The candidate genes described herein should be investigated in greater detail to further characterize variegated citrus seedlings.

ethylene responses 6 , riboflavin biosynthesis 7 , endogenous abscisic acid levels 3 , phospholipid hydroperoxide glutathione peroxidase activity 8 , and proteomics analysis 9 have been used to examine plant growth and development induced by etiolation. Many studies have focused on the expression analysis of related genes during photosynthesis in greening, and the regulation of the corresponding protein levels 10 . However, there have been relatively few transcriptome-level studies of variegation in citrus seedlings.
Variegated plants, which have both green and white or yellow areas on the same leaf, are considered as the invaluable materials for studying chloroplast biosynthesis, development and maintenance 11 . The white sectors of var2, which is one of the Arabidopsis (Arabidopsis thaliana) mutants showing leaf variegation, are active tissues that are formed by viable cells with undifferentiated plastids 12 . Chloroplast development and chlorophyll accumulation are inhibited at the albinism growth stage of albino tea cultivars 13 . In photosynthetic organisms, chlorophyll and carotenoid are the main pigments of plants that capture light energy. Earlier studies revealed that the etiolated growth of seedlings or leaves considerably affects chloroplast development and chlorophyll metabolism [14][15][16] . Whether the seedlings undergo etiolation or exhibit variegation, both lead to color changes of leaves.
Next-generation sequencing (NGS) technology is a rapid and cost-effective approach to analyzing a large number of protein-coding genes 17,18 . This technology can be used to address questions related to ecological comparative and evolutionary genomics in non-model organisms 19 . Based on the similar expression pattern, using transcriptomic data to group differentially expressed genes (DEGs) is one of the most effective methods to explore the relationships and predict functions of candidate genes 20,21 . Similarly expressed genes are often functionally related 22 . In the present study, we identified variegation -related genes and examined the pathways associated with photosynthesis and chlorophyll biosynthesis using RNA-sequencing (RNA-seq) combined with the analysis of chlorophyll contents to characterize the variegation in citrus seedlings. Our findings may help clarify the molecular basis of variegation.
We recently analyzed the transcriptome of etiolated citrus seedlings, there were variegated and green seedlings in both Shiranuhi and Huangguogan under the same conditions of seedling germination 23 . Although variegated plants exist widely in nature, the mechanism of citrus leaf variegation and the mechanisms regulating the expression patterns of well-characterized genes involved in photosynthesis and chlorophyll biosynthesis are still unknown. In order to study the color difference of citrus leaf after germination under the same culture condition, transcriptome sequencing was performed on the leaves of citrus seedlings. In the present study, we compared the leaf biomass and chlorophyll contents of variegated and green Shiranuhi and Huangguogan seedlings. In addition to the differentially expressed photosynthesis-related genes, the expression profiles of chlorophyll biosynthesis-related genes were analyzed to reveal the molecular mechanism underlying variegation.

Biomass accumulation of variegated and green seedlings. Variegated and green Shiranuhi and
Huangguogan seedlings were sampled at 20 days after germination to measure dry weight, length, and chlorophyll contents. We observed a significant difference in shoot and root dry weight between the variegated and green seedlings. Additionally, the roots of variegated seedlings were longer than those of green seedlings for both analyzed cultivars, while the opposite trend was observed for the shoots. Furthermore, the shoot lengths of green seedlings were longer than those of variegated seedlings in both Shiranuhi and Huangguogan, while the roots were shorter (Fig. 1). These results indicated that the variegation of citrus leaves had different effects on seedling root and shoot lengths.
Chlorophyll content and photosynthesis of variegated seedlings. The chlorophyll a (Chl a) concentration of variegated Shiranuhi and Huangguogan seedlings were 0.56 and 0.71 mg g −1 fresh weight, respectively ( Fig. 2), while the corresponding concentrations in green seedlings were 1.86 and 2.22 mg g −1 fresh weight, respectively. Similarly, chlorophyll b (Chl b) and carotenoid contents as well as the composition and proportion of photosynthetic pigments exhibited the same trends as the Chl a content.
The photosynthetic parameters differed between the Huangguogan and Shiranuhi variegated and green seedlings (Fig. 3). For example, the net photosynthetic rate (Pn), intercellular CO 2 concentration (Ci), and transpiration rate (Tr) were lower in variegated seedlings than in green seedlings. Additionally, the photosynthetic parameter values were lower for Huangguogan than for Shiranuhi seedlings, except for the Ci.
Sequencing and assembly of expressed genes. To investigate the genome-wide expression patterns of variegated seedlings, more than 10 leaves harvested from each seedling type for each cultivar at 20 days after germination were analyzed by RNA-seq. We constructed four cDNA libraries, namely R_V and R_G, which represent variegated and green Shiranuhi seedlings, respectively, and Y_V and Y_G, which correspond to variegated and green Huangguogan seedlings, respectively. After removing the sequencing adapters and low-quality data, we obtained 567.07 million clean reads and 85.05 Gb of RNA-seq data. More than 94.19% of the reads had a quality score of Q30 (sequencing error rate = 0.1%). The sequencing data statistics are listed in Table 1. The raw data were deposited in the NCBI Gene Expression Omnibus database (accession number GSE90935).
Evaluation of gene expression levels. The number of fragments per kilobase of exon per million fragments mapped (FPKM) was used to quantify the unigene expression levels, which were detected by RNA-seq with high sensitivity. Hierarchical cluster analyses were conducted with 66 DEGs identified for variegated and green Shiranuhi and Huangguogan seedlings. Genes with the same or similar expression profiles were clustered to present the gene sets exhibiting different expression patterns under various experimental conditions (Fig. 4).
Identification and functional annotation of differentially expressed genes. During the screening of DEGs, we used P value < 0.005 24 and |log2 (fold change)| > 1 as the threshold criteria to determine the significance of the differences in gene expression levels. A total of 4,786 and 7,007 DEGs between variegated     To investigate the biological pathways important for the variegation in Shiranuhi and Huangguogan seedlings, the DEGs between variegated and green seedlings were further annotated based on the reference pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (Fig. 7). These DEGs in the R_V and Y_V libraries were assigned to 117 and 119 KEGG pathways, respectively. The pathway search results were sorted based on the number of hits, and photosynthesis (44 and 63 DEGs), photosynthesis-antenna proteins (14 and 29 DEGs), and flavonoid biosynthesis (16 and 29 DEGs) were the most highly represented pathways in both libraries. Genes associated with cysteine and methionine metabolism (47 DEGs), monobactam biosynthesis (9 DEGs), and ABC transporters (15 DEGs) were also common in the R_V library (Fig. 7). In contrast, genes related to carbon fixation in photosynthetic organisms (80 DEGs), ribosome (154 DEGs), and porphyrin and chlorophyll metabolism (41 DEGs) were common in the Y_V library (P value < 0.05).  www.nature.com/scientificreports www.nature.com/scientificreports/

Analysis of chlorophyll biosynthesis and photosynthesis-related transcripts. Chlorophyll is
a key photosynthetic pigment in plant chloroplasts, and chlorophyll metabolism is an important factor influencing crop yield 25 . To investigate whether photosynthesis pathway genes are involved in this decrease, we analyzed the expression patterns of genes encoding regulatory enzymes associated with chlorophyll biosynthesis (Fig. 8). Fifteen enzymes are required for the chlorophyll biosynthesis from glutamyl-tRNA to Chl b 26 . Magnesium (Mg) chelation is the first step in the cholorophyll branch of porphyin metabolism, forming  www.nature.com/scientificreports www.nature.com/scientificreports/ magnesium-protoporphyrin (MgP, EC: 2.1.1.11), which is followed by a SAM-dependent methylation of the 6-propionate side chain to form Mg-protoporphyrin monomethyl ester (MgPME) 27 . We observed that the expression of the gene encoding magnesium chelatase subunit I (MgCE I, EC: 6.6.   www.nature.com/scientificreports www.nature.com/scientificreports/ We observed no changes in the expression levels of the gene encoding HCAR in the Y_V library, but it was down-regulated from 51.14 to 10.30 FPKM (Cluster-2274.20724) in the R_V library.
In comparisons between the R_V and Y_V libraries and between the R_G and Y_G libraries, the following nine common DEGs associated with the photosynthesis pathway were identified: Cluster  (Fig. 9). Additionally, 15 genes involved in photosynthetic activities and chlorophyll biosynthesis may be associated with the differences in leaf color between the R_V and Y_V seedlings (Figs 8,9).
qRT-PCR validation. We selected photosynthetic genes (PsbQ, PetF, PetB, PsaA, PsaN, PsbP, and PsaF) and 6 porphyrin and chlorophyll metabolism pathway genes from common DEGs in R_V and Y_V libraries and evaluated their expression profiles using quantitative real-time PCR. Actin (GenBank: XM 006480741.2) was selected for internal controls. The verification results of the 7 photosynthesis-related and 6 chlorophyll metabolism-related DEGs demonstrated that nearly all of the common photosynthetic genes showed similar expression patterns of RNA-seq analysis (Fig. 10). In the porphyrin and chlorophyll metabolism pathway, there were three common DEGs (Cluster-2274.8338, Cluster-2274.38688, Cluster-2274.78784) with a similar expression pattern in R_V and Y_V seedlings. The other three common DEGs showed the opposite expression pattern, up-regulated in R_V and down-regulated in Y_V (Fig. 10).

Figure 9.
Differentially expressed genes mapped to photosynthesis pathway. The known pathways were obtained from the KEGG database. Red square denotes up-regulated expression in variegated seedlings of Shiranuhi and Huangguogan compared with that of green seedlings, respectively, while green square denotes both up-and down-regulated genes. We have got the formal permission of KEGG imagery (map00195) 29,30 from Kanehisa laboratories.

Discussion
The purpose of this research was to study the differences in the transcriptome expression of variegated and green Huangguogan and Shiranuhi seedlings, especially regarding photosynthesis and chlorophyll biosynthesis. We identified major transcriptomic features in the variegated seedlings.
Photosynthetic activities are affected by stomatal conductance (Gs), Ci, Tr, and chlorophyll fluorescence 31 . The Pn directly reflects the photosynthetic capacity 32 , while Ci is an important indicator of stomatal activity. During photosynthesis, a decrease in Tr caused by a decrease in Gs leads to a significant decrease in transpiration, which can inhibit the absorption and transport of water and nutrients 33 . In addition to shorter roots, the variegated seedlings also had significantly lower seedling biomass, chlorophyll content, and photosynthetic parameter values (P < 0.05) compared with the green seedlings (Figs 1-3). The identification and functional annotation of DEGs indicated that the photosynthesis pathway genes (44 and 63 DEGs) were the most common genes in the R_V and Y_V libraries (based on the number of hits for DEGs), with the lowest corrected P values (8.17212942029e-11 and 6.07238386484e-12) (Fig. 7).
Transcriptional regulation is important for coordinating the conversion of glutamyl-tRNA to the final product (Chl a and Chl b). There are 15 enzymes encoded by 27 genes in Arabidopsis thaliana in chlorophyll synthesis 4 . The six-electron oxidation of protogen IX is catalyzed to protoporphyrin IX (proto IX) by Protoporphyrinogen IX (protogen IX) oxidase (PPOX), with flavin as the cofactor 34 . Previous studies revealed that MgPMT (Mg-protoporphyrin IX methyltransferase) activity is inversely related to the transcriptional activities associated with Mg-chelatase and ALA synthesis, and restricts the formation of chlorophyll 4,35 . Low expression levels of Cluster-2274.8338 (K04035, 1.14.13.81) lead to a decrease in divinyl protochlorophyllide synthesis, while the up-regulated expression of Cluster-2274.78784 (K08099, 3.1.1.14) induces the transformation of Chl a to Chlorophyllide a. Furthermore, the up-regulated expression of the Cluster-2274.56424 (K13071, 1.14.15.17) gene enhances the synthesis of the chlorophyll catabolite during chlorophyll biosynthesis.
Three common differentially expressed genes of porphyrin and chlorophyll metabolism pathway with the same expression pattern (Cluster-2274.8338, Cluster-2274.38688, Cluster-2274.78784) in the variegated Shiranuhi and Huangguogan seedlings were blasted in Citrus Sinensis Annotation Project (http://citrus.hzau. edu.cn/orange/index.php). The sequence of Cluster-2274.8338 was similar to Cs6g16200 (the e-value was 4e-77, and ident was 83%), which is the ZIP1-like gen of Citrus sinensis. In addition, the sequences of ZIP1 (GenBank: FJ940751.1) of Oryza sativa Japonica Group (Japanese rice) in NCBI have been compared with transcriptome data by sequence alignment. The percentage of identical matches between ZIP1 and Cluster-2274.8338 was 99.92%, the total score was 2261, and the expect value (e-value) was 1.37e-26. On the other hand, Cluster-2274.8338 (ZIP1) was down-regulated in R_V and Y_V, and the log2 FC were −2.7399 and −2.5467, respectively (Fig. 10). Previous studies have found that EaZIP transcripts may not accumulate or become unstable in the early stage of plastid development 36 , leading to the loss of MPE cyclase, and the low expression results in chlorophyll losing and forming the diverse phenotype of leaf color 11 . Cluster-2274.38688 was similar to Cs8g15480.1, which encodes protochlorophyllide-dependent translocon component 52 (PTC52). PTC52 is considered to be part of a unique transposon and is most abundant in etiolated plants 37 . In this study, Cluster-2274.38688 gene was found to be up-regulated in variegated seedlings (Fig. 10). In addition, Cluster-2274.78784 was similar to Cs9g07520 (the total score was 3675, query cover was 78%, e-value was 0, and ident was 99%), which was noted to be involved in chlorophyll catabolic process (GO: 0015996). The up-regulated expression of Cluster-2274.78784 might lead to the decomposition of Chl b to chlorophyllide b. These results indicated that the formation of variegated seedlings may be caused by decreased chlorophyll synthesis and increased catabolism.
Nuclear-encoded genes affecting photosynthesis are co-expressed to assemble PSI and PSII during the de-etiolation of plants 34,38 . The transcription levels of several photosynthesis-related genes, such as PsbA 39 , LHCA1-6 and LHCB1-6 40 , as well as LSU and SSU, are regulated by light during chloroplast biogenesis in de-etiolating plants 41 . In this study, we detected 9 common DEGs involved in the photosynthesis pathways of the variegated Shiranuhi and Huangguogan seedlings. Additionally, we identified 6 common DEGs related to porphyrin and chlorophyll metabolism. The differential expression of these genes may be critical for the leaves variegation in citrus seedlings, but need to be confirmed in future work.

Materials and Methods
Plant materials. The Shiranuhi and Huangguogan hybrid citrus cultivars analyzed in this study (Fig. 11) were provided by the Institute of Pomology and Olericulture, Sichuan Agricultural University, China. Refer to the method of Xiong et al. 23 for seed germination and seedling culture. Seeds were presoaked in water for 4 h and then incubated at 25 °C for 3 days, sown in pots (vermiculite: perlite = 1:1) and then transferred to a growth chamber at 25 °C, 50-60% relative humidity, and 12-h light/12-h dark period. After germination (i.e., radicle breaks through the seed coat), seedlings were watered every 2 days. The Shiranuhi variegated seedlings (R_V), Shiranuhi green seedlings (R_G), Huangguogan variegated seedlings (Y_V), and Huangguogan green seedlings (Y_G) sample contained more than 20 seedlings were used for physiological and transcriptome analyses.
Photosynthetic characteristics. Three seedlings of each sample were selected as test materials. The photosynthetic parameters of the second fully unfolded leaves from the base of the seedlings were recorded 20 days after germination. The Pn, Tr, Ci, and Gs were measured using the LI-6400 portable photosynthesis system (Li-Cor, Lincoln, NE, USA). The photosynthetically active radiation and CO 2 concentration were 1,200 µmol m −2 s −1 and 400 µmol m −2 s −1 , respectively. Additionally, Gs and Ci were determined at a saturated light intensity of 1,000 µmol m −2 s −1 and 70% relative humidity 42 . Chlorophyll and carotenoid contents. All leaves, used for the determination of photosynthetic characteristics, were harvested from variegated and green seedlings for each cultivar at 20 days after germination. Once harvested, plant materials were quick frozen using liquid nitrogen and stored at −80 °C until using. The chlorophyll and carotenoid contents of seedlings 20 days after germination were estimated as previously Preparation of cDNA libraries and transcriptome sequencing. We purified mRNA from 1.5 µg total RNA using poly-T oligo-attached magnetic beads. The RNA was fragmented using divalent cations in 5 × NEBNext First Strand Synthesis Reaction Buffer. First-strand cDNA was synthesized using a random hexamer primer and M-MuLV Reverse Transcriptase (RNase H − ). Second-strand cDNA was subsequently synthesized using DNA Polymerase I and RNase H 23 . The remaining overhangs were converted into blunt ends by exonuclease/polymerase activities. The NEBNext adapter with a hairpin loop structure was ligated to adenylated 3′ ends of DNA fragments for a subsequent hybridization. To preferentially select cDNA fragments that were 150-200 bp long, the library fragments were purified with the AMPure XP system (Beckman Coulter, Beverly, USA). The size-selected, adapter-ligated cDNA sequences were treated with 3 µl USER Enzyme (NEB, USA) at 37 °C for 15 min followed by a 5-min incubation at 95 °C. A PCR was then completed using Phusion High-Fidelity DNA polymerase, Universal PCR primers, and an Index (X) Primer. The PCR products were purified using the AMPure XP system, and library quality was assessed using the Bioanalyzer 2100 system 46 . The index-coded samples were clustered with the cBot Cluster Generation System of the TruSeq PE Cluster Kit v3-cBot-HS (Illumina) 47 . The resulting libraries were sequenced using the HiSeq. 4000 system (Illumina), which generated paired-end reads.
Quality control. Refer to the previous method, raw data (raw reads) in a fastq format were first processed using in-house perl scripts. Clean data (clean reads) were obtained by removing reads containing the adapter, reads with poly-N sequences, and low quality reads from the raw data. Additionally, the Q20, Q30, GC content, and sequence duplication level were calculated for the clean reads, which were used in all downstream analyses 48 . Analysis of differentially expressed genes. Gene expression levels were estimated using the RSEM program 51 . Clean reads were mapped based on the assembled transcriptome, and the read count for each gene was obtained from the mapping results. Genes differentially expressed between two groups were analyzed using the DESeq R package (1.10.1), which provides statistical algorithms for detecting DEGs using a model based on a negative binomial distribution. The resulting P values were adjusted using the Benjamini and Hochberg's approach to control the false discovery rate 52 . Genes with an adjusted P value < 0.05 following the DESeq analysis were considered to be differentially expressed between groups 52 .
Analysis of GO and KEGG pathway enrichment of differentially expressed genes. The GO enrichment analysis of the DEGs was completed using the GOseq R packages based Wallenius' non-central  Table 2. Primers for qRT-PCR analysis.