Transcriptome analysis highlights nuclear control of chloroplast development in the shoot apex

In dicots, the key developmental process by which immature plastids differentiate into photosynthetically competent chloroplasts commences in the shoot apical meristem (SAM), within the shoot apex. Using laser-capture microdissection and single-cell RNA sequencing methodology, we studied the changes in the transcriptome along the chloroplast developmental pathway in the shoot apex of tomato seedlings. The analysis revealed the presence of transcripts for different chloroplast functions already in the stem cell-containing region of the SAM. Thereafter, an en masse up-regulation of genes encoding for various proteins occurs, including chloroplast ribosomal proteins and proteins involved in photosynthesis, photoprotection and detoxification of reactive oxygen species. The results highlight transcriptional events that operate during chloroplast biogenesis, leading to the rapid establishment of photosynthetic competence.

to the SAM's PZ, within a few cell divisions from the CZ. The membranes continue to expand and differentiate within developing leaf primordia before reaching their mature form. Thylakoid membrane development and the acquisition of photosynthetic competence thus follow a sharp gradient across the shoot apex. Here, we report on the changes in the cellular transcriptomes that take place along this gradient.

Results and Discussion
Given the small size of the vegetative Arabidopsis SAM (diameter of 50-60 µm), we analyzed the relatively larger SAM of tomato, measuring 150-200 µm in diameter. The pattern of chlorophyll fluorescence in tomato generally resembles that of Arabidopsis 13 , with no fluorescence apparent in the central area of the SAM below the L1 layer, and a sharp increase of fluorescence when moving from the CZ to the PZ, and then to the LP (Fig. 1b). This visual indicator for chloroplast development guided the selection of the desired regions, termed for simplicity as the regions they overlap with -CZ, PZ and LP (Fig. 1). Notably, the pattern of chlorophyll fluorescence exhibited by the different SAM layers and regions, both in Arabidopsis 13 and tomato, does not correspond to the known gene expression patterns delineating the SAM zones 14 . While studies using fluorescently-labeled markers have provided valuable information on gene expression patterns in defined regions of the SAM 15,16 , these do not reproduce (c,d) A typical section of the shoot apex before (c) and after (d) being subjected to laser capture microdissection, to isolate the chlorophyll-less region of the SAM CZ (yellow outline), the PZ, in which chlorophyll fluorescence becomes visible (green outline), and tissue of the LP which harbors still more developed chloroplasts (red outline). For better visibility, the original software lines were re-traced. Scale bars, 50 μm. the chloroplast differentiation path. Thus, following the initial events of chloroplast biogenesis necessitated the use of laser-capture microdissection (LCM). A typical section, before and after LCM, is presented in Fig. 1c,d, respectively. As the number of cells in the samples was exceedingly small, only minute amounts of RNA could be extracted, necessitating the use of single-cell RNA sequencing methodology (CEL-Seq 17 ). Altogether, a total of ~4,000 unique transcripts were identified, with a mostly similar pattern of gene products' cellular localizations as the total transcriptome (see below). Notably, almost all of the transcripts identified were present in the three regions analyzed, including in the proplastid-containing region of the SAM (Supplementary Table 1). A list of putative differentially expressed genes (DEGs) was compiled using a threshold ≥ six reads and a cutoff of 1.5-fold expression change. Subsequently, qPCR was performed on 95 genes selected from the list. The final list of 223 DEGs (Supplementary Table 2) included ones that showed a two-tailed FDR ≤0.05, or individual p-values ≤ 0.05 for genes whose expression levels were validated by qPCR. The thus compiled list primarily represents genes whose expression level in LP differs from those in CZ (>90%). From CZ to PZ, 39 genes were differentially expressed, 19 of which differed also between CZ and LP. From PZ to LP, 20 genes were differently expressed, the level of 17 of which also differed between CZ and LP.
Principle component analysis and hierarchical clustering of the DEGs show that the samples taken from the three regions of the shoot apex are well separated from each other (Fig. 2). This is in spite of the PZ being a transition zone between the SAM and leaf organs. Three distinct expression patterns are observed (Fig. 3), with the majority of the genes being upregulated throughout the developmental path. The second largest group is of genes whose expression is downregulated along the gradient. The third consists of genes whose expression first increases, between CZ and PZ, and then decreases upon the transition from PZ to LP. As shown in the bar graph in Fig. 3a, photosynthesis genes are highly represented within the first group and are barely present in the other two.
The predicted subcellular localizations of the differentially expressed gene products are shown in Fig. 4a (two lower bars). As can be seen, plastids are the most prevalent destination of these products, amounting to 36% of the total. This enrichment, which also pertains to photosynthesis genes, is highly significant (p < 0.01, hypergeometric test). This is despite the fact that these transcripts amount to only 7% of the total we identified in the shoot apex (Fig. 4a, second bar from the top). The other major destinations of the DEG products are the nucleus (20%), and the cytosol (15%). The least common cellular compartment of the DEG products is the ER, which constitutes the target of less than 2% of these products. This is albeit being the predicted site for almost 30% of the proteins encoded by the transcripts identified in the shoot apex. The fact that along the CZ-PZ-LP developmental path almost no changes occur in the expression of genes encoding for ER-targeted proteins may have two explanations. First, it may be that the expression of the relevant genes is already high at the central zone and is maintained as such in the peripheral zone and leaf primordia, perhaps reflecting processes initiated earlier during embryogenesis. Alternatively, low expression levels of ER (and Golgi proteins, see Fig. 4a, two lower bars) may be maintained to reduce secretory load, perhaps as a systemic effort to funnel available resources to thylakoid membrane formation and differentiation and other processes associated with plastid ontogeny. As can be expected, a large fraction (about half) of the plastid-targeted proteins encoded by the up-regulated DEGs are related to photosynthesis (Fig. 4b). Other up-regulated DEG products localized to plastids are involved in RNA and protein synthesis and processing, essential for the construction of the photosynthetic apparatus and its auxiliary components.
Examination of the DEGs operating along the chloroplast developmental path reveals that the relative proportions of the different functional groups to which they belong generally resemble those of the tomato genome and the expressed genes identified (Fig. 5a). The only notable exception are genes encoding photosynthesis-related proteins, which amount to 17% of the DEGs, as opposed to only 2-3% in the genome and the expressed genes. This highlights the allocation of a significant fraction of the cellular resources toward the build-up the photosynthetic machinery.    Overall, the transcripts of 54 nuclear genes encoding chloroplast-targeted proteins were upregulated during the transition (Table 1). These include transcripts encoding for proteins of the chloroplast 30S and 50S ribosomal subunits, constituents of photosystems I and II (PSI and PSII) and their peripheral antenna complements, ATP synthase and NADH dehydrogenase, and ferredoxin and its cognate oxidoreductase. Others were transcripts encoding enzymes of the Calvin-Benson cycle, including the small subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco), glyceraldehyde 3-phosphate dehydrogenase, fructose-bisphosphate aldolases, fructose-1,6-bisphophatase, phosphoribulokinase, and Rubisco activase. Notably, almost all of the above transcripts were already present in the CZ, indicative of an early acquisition of photosynthetic capacity. This is in contrast to maize, where very few photosynthesis-related genes were found to be expressed in the SAM and early-stage leaf primordia 18 . The expression of constituents of the machineries that drive the light-dependent and -independent reactions of photosynthesis was accompanied by the establishment of photoprotective and reactive oxygen species (ROS) detoxification capabilities, along with other oxidative stress resistance mechanisms (Table 1). Such capabilities are especially essential during biogenesis of the photosynthetic apparatus, when chlorophyll and other pigments are synthesized. During this time, light absorption by these pigments can lead to rapid generation of ROS and thus to damage. Non-chloroplastic upregulated DEGs mostly belonged to several major functional groups. These included genes encoding for proteins involved in biotic and abiotic stresses, including oxidative stress, cell wall and carbohydrate metabolism, protein catabolism, and water homeostasis (Supplementary Table 2). The number of downregulated DEGs identified was significantly lower and generally not enriched in particular functional groups, with the exception of several transcription factors (TFs), as described below. Table 2 lists TFs and development-related proteins whose expression was up-or down-regulated along the chloroplast developmental pathway. Amongst the former are two YABBY genes, YABBY1 and YABBY2, encoding key TFs involved in abaxial cell fate determination 19 . An opposite behavior is observed for STM, a key TF required for SAM formation, which is down-regulated in leaf primordia 20 . CUC2, a TF that determines the border of the CZ 21 is also down-regulated outside this region. These trends faithfully capture known expression patterns of key SAM transcriptional regulators along the shoot apex. The expression of other TFs and developmental regulators displayed changes that are also consistent with processes known to occur in the SAM. CRK, a gene encoding an adapter protein involved, amongst other things, in the establishment of cell polarity 22 , was up-regulated along the developmental gradient ( Table 2). The expression of the auxin-efflux transporter MDR1, also essential for cell polarity 23 , likewise increased along the gradient. Transcripts of FD, SPT and SVP, all encoding TFs involved in flowering 24 , were identified in the CZ, and down-regulated outside of it, in accordance with the vegetative state of the SAMs we analyzed. Also down-regulated was the gibberellic acid receptor GID1. This gene was the only one whose expression changed exclusively between the peripheral and central zones of the SAM, consistent with the high rate of cell division in this region. Overall, the expression patterns observed for the aforementioned TFs and regulators matches those expected for ontogenetic processes related to leaf growth and differentiation during the vegetative phase.
Leaves of monocotyledon plants have been widely used as a model for chloroplast development due to their relatively simple architecture and the presence of a linear developmental gradient from the base of the leaf, where proplastids are found, to its tip, possessing the most mature chloroplasts [25][26][27][28] . We thus sought to compare our SAM transcriptomic data to those reported for maize leaves 27 . In the latter study, it was found that genes that were upregulated along the developmental gradient were mostly related to protein translation, tetrapyrrole and carotenoid biosynthesis, plastid targeting, photosynthesis, Calvin cycle, redox regulation, very similar to the upregulated DEGs identified in our study (Fig. 6). Of these, the products of 38 genes were chloroplast-targeted. Down-regulated genes observed in both species were related to chromatin structure, DNA replication, cell cycle, signaling and cell wall biosynthesis. Over 30 DEGs that were down-regulated in maize were up-regulated along the developmental path in tomato (Fig. 6). Inspection of these genes does not offer any obvious explanation for the opposite behavior, which might result from differences in species and/or in carbon fixation. This may also hold for other differences observed between the two plants.
The transcriptome analysis performed here revealed that expression of nuclear genes encoding chloroplast-targeted proteins occurs already in proplastid-containing stem cells of the SAM's CZ, and increases in cells of the PZ and the LP. In this respect, the transition from cell proliferation to cell expansion, highlighted in a previous study 28 as a key stage in which photosynthetic genes are strongly up-regulated, represents only the continuation of a gene expression gradient that is already established in the SAM. The increasing expression of chloroplast and photosynthesis-related genes, which correlates with thylakoid proliferation and chloroplast development, starts in the SAM and continues all the way through to leaf maturation, and thus, is not limited to specific stages in shoot morphogenesis. This trend is evident not only for chloroplast-related genes, but also for at least one TF controlling their expression, WRKY40 29,30 , which by itself is also found to be up-regulated in our dataset.
The parallels between leaf morphogenesis and chloroplast development, beginning already in the SAM, raise the possibility that the two processes are intimately connected. This is also suggested by the similar expression patterns of chloroplast-related genes and genes related to leaf development (Tables 1 and 2). However, tomato leafless (lfs) mutants possess a naked shoot apex that is green 31 . Furthermore, some albino mutants that have only proplastids, nonetheless develop normal-looking leaves 32,33 . It therefore seems that the two processes are independent of each other.
Finally, our data show that there is no sequential order of expression of chloroplast-targeted gene products along the chloroplast developmental pathway. As shown in Table 1, genes encoding chloroplast ribosomal proteins, photosynthesis proteins, enzymes related to carbon fixation as well as proteins involved in photoprotection and ROS detoxification are all concurrently expressed. Expression of genes encoding chloroplast proteins thus appears to proceed en masse, beginning already in the CZ of the SAM and progressively increasing towards the flanking primordial leaves. It is quite remarkable that apart from the extensive up-regulation of genes related to chloroplast functions, there appear to be no major changes in the expression of genes related to other organelles and cellular compartments. This is albeit the fact that the developmental gradient encompasses cells belonging to different clones, regions and organs. Future work should aim at increased coverage and spatial resolution using single-cell approaches 34,35 , that potentially can unravel more subtle expression patterns.

Confocal microscopy.
A solution of phosphate-buffered saline (PBS) containing 34.7% albumin was poured into 10 × 10 × 5 mm cryo-molds (Sakura Finetek). Albumin cross-linking was achieved by mixing glutaraldehyde (GA, final concentration 1.75%) into the solution, resulting in a hardened block within less than a minute. A ~2-mm-long top part of the seedling containing the shoot apex was placed on top of the block followed by another, identical, round of buffer and GA mix to embed the seedling into the block (the high GA/albumin ratio, utilized to achieve rapid cross-linking of albumin, excludes effective diffusion of GA into the seedlings). The embedded seedlings were then cut to 70-µm-thick slices, using an oscillating tissue slicer (OTS-4000, EMS, USA). All steps were carried out at 4 °C. Imaging was carried out as described 13 .
Laser capture microdissection (LCM) of SAM microdomains. The top part of the seedling containing the shoot apex was embedded into TissueTek @ OCT (Sakura Finetek) medium inside of 10 × 10 × 5 cm cryo-molds. Cryo-blocks were prepared by freezing the SAM-containing cryo-molds in liquid nitrogen and storing them at −80 °C until use. The cryo-blocks were cryo-sectioned into 10-μm-thick slices with a cryostat (Leica  annotations from Solgenomics, with the addition of the chloroplast genome, were used for counting reads per gene. Differential expression analysis was performed using the R-based software 'DESeq2' v1.4.0. Differentially expressed genes were those having ≥6 reads in at least one of the tested zones (CZ, PZ or LP), as well as a fold change (increase or decrease) of ≥1.5 in at least one of the three pair-wise comparisons. Additional criteria were either two-tailed FDR ≤0.05, or individual p-value ≤0.05, validated by subsequent qPCR analysis, as detailed below. The differentially expressed signals were log 2 -transformed and normalized by Z-score transformation before PCA and cluster analyses. Unsupervised clustering was performed using Euclidean distance metric and average-linkage agglomeration method. The heat map was based on K-means clustering of the genes, using Pearson correlation coefficient as a distance metric. The optimal number of clusters in the heat map was computed using the gap statistic method and included 1000 Monte Carlo iterations 36,37 . The RNA-Seq data have been deposited in NCBI's Gene Expression Omnibus 38 and are accessible through GEO Series accession number GSE102070 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE102070).
Real time PCR (qPCR). RNA isolation and amplification (for two rounds) were performed as described 3 , except that the primers used were from the MessageAmp II aRNA Amplification Kit (Invitrogene). Three biological replicates were used for each developmental stage, derived from 5-6 SAMs each. cDNA synthesis was carried out using 20 ng of the aRNA. qPCR was performed using the BioMark ™ HD System (Fluidigm). Data was analyzed with the Fluidigm Real-Time PCR Analysis software, using the Linear (Derivative) Baseline Correction and the Auto Ct Threshold Method. Differential expression was determined using two-tailed Student's t-test (p ≤ 0.05).

Figure 6.
Chloroplast development in tomato is distinct from that in monocots (maize). Cellular localization of up-and down-regulated DEGs across the chloroplast developmental gradients in tomato and maize 11 . Of the 223 DEGs we identified in the tomato shoot apex, about 70 showed a similar behavior in maize leaves (see Supplementary Table 2), the majority of which were up-regulated and encode for plastidial proteins.