Stages identifying and transcriptome profiling of the floral transition in Juglans regia

Using paraffin sections, the stages of walnut female flower bud differentiation were divided into the predifferentiation period (F_1), initial differentiation period (F_2) and flower primordium differentiation period (F_3). Leaf buds collected at the same stage as F_2 were designated JRL. Transcriptomic profiling was performed, and a total of 132,154 unigenes were obtained with lengths ranging from 201 bp to 16,831 bp. The analysis of differentially expressed genes (DEGs) showed that there were 597, 784 and 532 DEGs in the three combinations F_1vsF_2, F_1vsF_3, and F_2vsF_3, respectively. The comparison F_2vsJRL showed that 374 DEGs were differentially expressed between female buds and leaf buds. Thirty-one DEGs related to flowering time were further used to construct coexpression networks, and CRY2 and NF-YA were identified as core DEGs in flowering time regulation. Eighteen DEGs related to flowering time were subjected to real-time quantitative analysis. Our work provides a foundation for further research on the walnut floral transition and provides new resources for future research on walnut biology and biotechnology.

unigenes were annotated to at least one database, accounting for 59.56% of all unigenes, and 11,165 unigenes were annotated by all seven major databases, accounting for 8.44% of all unigenes (Fig. 2B).
The annotation results of the NR database show that 11372 unigenes (18.1% of the total annotated unigenes) were compared to Vitis vinifera, 6522 unigenes (10.4% of the total annotated unigenes) were compared to Prunus mume, 6141 unigenes (9.7% of the total annotated unigenes) were compared to Prunus persica, 4834 unigenes (7.7% of the total annotated unigenes) were compared to Jatropha curcas, 3457 unigenes (5.5% of the total annotated unigenes) were compared to Citrus sinensis, and the remaining 48.7% unigenes were compared to other species (Fig. 2C).
GO enrichment. The gene functional classifications were identified in a GO enrichment analysis based on the 52,157 unigenes annotated in the GO database. In the biological process category, cellular process, metabolic process, and single organism process were highly represented. In the cellular component category, significantly enriched genes were associated with cell, cell part, and organelle. In the molecular function category, GO terms related to binding, catalytic activity and transporter activity were significantly enriched (Fig. 3). KEGG pathway enrichment. KEGG pathway enrichment analysis showed that transport and catabolism, signal transduction, transition, carbohydrate metabolism and environmental adaptation had the most unigenes in classifications of Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes and Organismal Systems, respectively (Fig. 4). In particular, Environmental Information Processing and Organismal Systems, which included many subterms associated with flowering, such as the pathways of plant hormone signal transduction (ko04075) and Circadian rhythm-plant (ko04712), were enriched (Table S1).
Transcription factor (TF) enrichment analysis. TFs function alone or with other proteins in complexes by promoting or blocking the recruitment of RNA polymerase to specific genes [19][20][21] . Some TFs have been shown to play key roles in plant flowering [22][23][24] . Among the 132,154 unigenes, 4,835 of the unigenes (approximately 3.7%) were annotated as TFs, and they fell into diverse categories covering nearly all TF families, with MYB, NAC, and AP2-EREBP being the most highly represented (Fig. 5).
DEGs analysis of walnut transcriptome data. Identification of DEGs. The four databases from F_1, F_2, F_3, and JRL were used for paired comparisons, and four combinations (F_1vsF_2, F_1vsF_3, F_2vsF_3, F_2vsJRL) were chosen and analyzed.
The three combinations of female flower bud developmental stages (F_1vsF_2, F_1vsF_3, F_2vsF_3) had 597 DEGs, 784 DEGs and 532 DEGs, respectively. In addition, 16 DEGs were shared by all three combinations, which means they were expressed differently in the three periods of female flower buds (Fig. 6). The expression levels of DEGs between F_1, F_2 and F_3 were plotted in a heatmap (Fig. 7A).
In addition, the female flower buds at the beginning of differentiation (F_2) and the leaf buds of the same stage (JRL) were used for comparison (F_2vsJRL). The results showed that 374 DEGs were expressed differently in female buds (F_2) and leaf buds (JRL), of which 90 unigenes in the female flower buds (F_2) showed a higher expression level than that in the leaf buds (JRL), and the other 284 unigenes showed the opposite trend (Fig. 6E). The expression levels of 374 DEGs between female buds (F_2) and leaf buds (JRL) were plotted in a heatmap (Fig. 7B).  KEGG pathway analysis of DEGs: KEGG pathway analysis of the DEGs of F_1VSF_2, F_1VSF_3, F_2VSF_3 showed that most pathways were associated with flowering, and three pathways (Photosynthesis-antenna proteins, Plant hormone signal transduction, Porphyrin and chlorophyll metabolism) were shared by the three combinations. In addition, the pathway enrichment analysis of female flower buds (F_2) and leaf buds (JRL) also included many flowering-related pathways (Fig. 9).
Analysis of DEGs related to flowering time: The flowering time genes that induce plant flowering are usually attributed to the photoperiod, vernalization, autonomous, GA, and age-regulated pathways, as well as the integrated factors of the intersection of these pathways 3,6,13,25 .
Based on previous studies, we summarized the important flowering time genes in the model plant Arabidopsis thaliana, and these genes were used as query genes (Table 2). Finally, we screened 31 DEGs associated with flowering time in walnut, and eighteen of the flowering time DEGs were chosen for qRT-PCR.  www.nature.com/scientificreports www.nature.com/scientificreports/  www.nature.com/scientificreports www.nature.com/scientificreports/ DEGs related to the walnut floral transition. A total of 31 unigenes were involved in the photoperiod (circadian rhythm), vernalization pathway, GA, and age-related pathways. To investigate the complex molecular mechanisms underlying floral transition in walnut, we set the 31 flowering time-related DEGs in the flowering time regulation network (Fig. 10).
Most of the DEGs were annotated in the photoperiod pathway: In total, twenty-four DEGs were annotated in this pathway, fifteen DEGs were highly expressed in the predifferentiation period (F_1), including CRY2 (1), www.nature.com/scientificreports www.nature.com/scientificreports/ (1), and LHY (2) were highly expressed in the initial differentiation period (F_2). NF-YA (1) and TEM1 (2) were highly expressed in the flower primordium differentiation period (F_3). Two PHR2 genes had higher expression levels in leaf buds (JRL) than in flower buds (F_2), while PRR9 exhibited the opposite trend.
In Arabidopsis thaliana, CYCLING DOF FACTOR 2 (CDF2) and CYCLING DOF FACTOR 3 (CDF3) act redundantly to reduce CONSTANS (CO) expression and are degraded by the complex of GIGANTEA (GI) and FLAVIN-BINDING KELCH REPEAT F-BOX 1 (FKF1), thus releasing repression of CO and FT transcription 26-28 .   www.nature.com/scientificreports www.nature.com/scientificreports/ Blue-light photoreceptors such as PHOTOLYASE-RELATED 2 (PHR2) are essential light detectors for the early development of plants and mediate phototropism and the expression of specific genes. Loss of a blue-light photoreceptor in the hy4 mutants of Arabidopsis thaliana substantially delayed flowering 29,30 . CRYPTOCHROME 2 (CRY2) can promote the expression of FT and is negatively regulated by FLOWERING LOCUS C (FLC) 31 . The NUCLEAR FACTOR Y (NF-Y) transcription factors (heterotrimeric complexes composed of NF-YA and the dimer of NF-YB/NF-YC) can initiate photoperiod-dependent flowering by cooperatively interacting with CO to drive the expression of FT 32 .
Phytochrome Interacting Factor 3 (PIF3) can bind the promoters of LATE ELONGATED HYPOCOTYL (LHY) and CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) to form an in vitro ternary complex. Additionally, the complex of CCA1 and LHY can bind with TIMING OF CAB 1 (TOC1) to form a feedback loop that is necessary for the circadian clock in Arabidopsis thaliana 33 . In addition, the CCA1 and LHY complexes were repressed by the Pseudo Response Regulators (PRRs) encoded by TOC1 6,34 . The activator CO and the repressor TEMPRANILLO (TEM) have a quantitative balance that determines the FT transcription level, as it shifts the CO/TEM balance in favor of CO activity, allowing FT transcription to reach the threshold level required to trigger flowering 35 .
In the age-regulated pathway, three DEGs were annotated as TREHALOSE-6-PHOSPHATE SYNTHASE 1 (TPS1), which is essential for normal vegetative growth and transition to flowering 36,37 . They were differentially expressed between the different flower bud formation stages, and the flower buds showed no difference from the leaf buds in their expression level.
In the vernalization pathway, vernalization can cause epigenetic changes in FLC and indirectly promote flower formation by histone modification, and MADS AFFECTING FLOWERING 5 (MAF5) is an FLC-related gene [38][39][40][41][42][43] . VERNALIZATION INSENSITIVE 3-LIKE 2 (VIL2), which can repress MAF5 and permit more rapid flowering during noninductive photoperiod conditions in Arabidopsis thaliana 44 . In this study, VIL2 was highly expressed in the predifferentiation period (F_1), decreased in the initial differentiation period (F_2), and decreased to the lowest point at the flower primordium differentiation period (F_3). In the leaf buds (JRL), its expression level was higher than that in the flower buds during the same period (F_2).
In the GA pathway, GA regulates LEAFY (LFY) via DELLAs, miR159 and MYBs [45][46][47] . The combination of gibberellin and GIBBERELLIN-INSENSITIVE DWARF1 (GID1) can combine with the DELLA protein to form a GA-GID1-DELLA trimer, and then the SKP1-CUL1-F-box (SCF) polymer can tag the trimer to induce the ubiquitin 26 S proteasome to degrade the DELLA protein, relieving the inhibitory effect of the DELLA protein on plant growth and producing gibberellin effects 48,49 . In this study, GID1 was highly expressed in the predifferentiation period (F_1), and its expression was downregulated in the initial differentiation period (F_2). In addition, the expression of GID1 was higher in flower buds (F_2) than in leaf buds (JRL). www.nature.com/scientificreports www.nature.com/scientificreports/ In addition to the DEGs mentioned above, there are two DEGs annotated as APETALA1 (AP1). In Arabidopsis thaliana, AP1 is required for the floral transition 50 . In this study, the two AP1 genes were nearly undetectable in the predifferentiation period (F_1), and their expression showed a significant upregulation in the initial differentiation period (F_2).

Coexpression networks. Weighted gene coexpression network analysis (WGCNA) is a biology method
for interaction analysis and resolving correlation networks 51 . To search for the genes involved in flowering time regulation in walnut, thirty-one flowering time-related DEGs were used to construct a coexpression network using the WGCNA method, and the results are presented in Fig. 11. In the coexpression network, many of the hub genes that participate in flowering time regulation were identified, such as JrCDF-2, JrCDF-3, JrPRR7, JrPRR7-1, JrPRR7-2, JrTPS-1-1, and the hub genes with the highest edge numbers were JrCRY2 and JrNF-YA-2.

Verification of DEG expression by qRT-PCR. To further verify the gene expression levels shown by
RNA-Seq, we chose eighteen flowering time-related DEGs to perform qRT-PCR. Except for JrTPS1-2, the expression trends of the DEGs showed high similarity between the qPCR data and the RNA-Seq data (Fig. 12).

Identification of the critical periods of female flower bud differentiation in walnut.
Little is known about the development of walnut female flower buds, and initial differentiation is usually identified by a duration of one month after the female flowers bloom or four to six weeks after the medium and short twigs finish elongating. It is essential to ensure that experimental materials can be collected at precise times; however, due to regional differences in phenology, previous research conclusions cannot be used directly. In this experiment, we have preliminarily identified the initiation period of female flower bud differentiation in walnut through two years of observation of tissue sections, external morphological characteristics and local phenology. Generally, the stage of floral differentiation was divided into the physiological differentiation stage and morphological differentiation stage. Although we identified the initiation period of the morphological differentiation stage, the initiation period of the physiological differentiation stage and the DEGs involved in floral bud determination are still unknown in walnut and will be investigated in a future study. In addition, we suggest that some simple experimental methods, such as freehand sectioning or counting the number of bud scales can be used as preliminary judgment methods in field conditions (Fig. S1).

Functional classification of genes.
To functionally categorize the unigenes, the GO and KEGG databases were used to annotate the functions of the unigenes. GO enrichment showed that single organism process (GO:0044699, level 2, BP) was enriched, which included the GO subterm GO:0048449 (level 4) with the function of floral organ formation. The GO analysis of the DEGs indicated that the regulation of RNA metabolism, nucleic acid binding transcription factor activity and transcription factor activity were important during female flower bud differentiation. Interestingly, the GO terms of the DEGs between the flower buds (F_2) and leaf buds (JRL) were mainly enriched in functions related to photosynthesis. In total, 14 DEGs were associated with GO terms related to photosynthesis between F_2 and JRL, and the expression of these 14 DEGs were all upregulated in the leaf buds (Table S3), which suggested that the related genes upregulated in the leaf buds may participate in the process of carbohydrate assimilation. KEGG pathway analysis indicated that most pathways were associated with flowering and were shared by the four combinations of F_1vsF_2, F_1vsF_3, F_2vsF_3 and F_2vsJRL. Interestingly, in the dimer of GI-FKF1 and trimer of CCA1-LHY-PIF3, the expression levels of the components in each complex were not consistent, such as GI, which showed a contrasting expression pattern to FKF1, as did LHY and PIF3. In addition, PHR2 genes are annotated as blue-light photoreceptors, and CRY2 genes also function as blue-light receptors. They are flavoproteins similar in sequence and repressors of the CLOCK/BMAL1 heterodimer 29,52,53 . However, they showed contrasting expression patterns for reasons that remain unclear. www.nature.com/scientificreports www.nature.com/scientificreports/ In the development of the floral transition, PRRs, GI, and FKF1 were all downregulated from F_1 to F_2, while LHY and CDFs were upregulated from F_1 to F_2, suggesting that photoperiod pathway (circadian rhythm) genes may participate in the regulation of the floral transition.
In this study, the three stages of the female floral transition were identified by the morphological characteristics of tissue sections, and transcriptome-wide investigation of the gene expression profiles in walnut flower buds and leaf buds was conducted during the floral transition. Thirty-one DEGs related to flowering time were identified, and among them, CRY2 and NF-YA were screened as core DEGs in flowering time regulation. Eighty-eight of the thirty-one DEGs (including CRY2 and NF-YA) were confirmed by real-time quantitative analysis, and most of the RNA-Seq data were validated by qPCR data. Our work provides a foundation for further research on the walnut floral transition.

Materials and Methods
Plant materials. Walnut (Juglans regia L.) trees were grown under natural conditions in the southern part of the Xinjiang Uyghur Autonomous Region, China. Leaf buds were collected during the floral transition period (JRL), and female flower buds were collected before, during, after the floral transition period (F_1, F_2 and F_3). Each sample was pooled from 3 buds, and 3 biological repeats were performed, for a total of 9 buds for each stage of the floral transition. Three mixed samples from 9 buds were collected for sequencing, and the total RNA of each sample was extracted individually.
Microscope observations. We peeled off the outer scales of the buds and fixed the buds in FAA fixative solution. Then, the fixed buds were dehydrated with a continuous gradient of ethanol and embedded in paraffin. Samples were cut into 8-12 μm slices (Leica Microtome, Germany), deparaffinized with xylene, and hydrated in a decreasing ethanol series. The sections were stained with Safranin and Fast Green and mounted with neutral gum. Finally, we observed the slices under a Motic microscope (Motic AE31, China).
Transcriptome sequencing and library construction. Sequencing was carried out by the Novogene company, Beijing, China. Total RNA was extracted using RNAout 1.0 (Tianenze, Beijing, China). A total of 1.5 µg RNA per sample was used as input material for RNA sample preparation. Sequencing libraries were generated, and the clustering of the index-coded samples was performed. After cluster generation, the library preparations were sequenced on an Illumina HiSeq 2000 platform, and paired-end reads were generated. Quantitative real-time PCR. Total RNA was extracted using RNAout 1.0 (Tianenze, Beijing, China) by Novogene, Beijing, China. We synthesized the cDNA using the PrimeScript RT Reagent Kit (TaKaRa, Dalian, China). We performed real-time quantification using CFX Manager (Bio-Rad, USA) with SYBR Green Realtime PCR Master Mix (Toyobo, Osaka, Japan). The protocol of the real-time PCR was as follows: an initiation step at 95 °C for 5 min; followed by 40 cycles of 30 s at 94 °C, 30 s at 55 °C, and 30 s at 72 °C; and the melting curve step was processed from 65 to 95 °C. Each reaction was repeated for three times. The walnut Actin gene (forward primer: 5′-CCATCCAGGCTGTTCTCTC-3′, and reverse primer: 5′-GCAAGGTCCAGACGAAGG-3′) and walnut GAPDH gene (forward primer: 5′-ATTTGGAATCGTTGAGGGTCTTATG-3′ and reverse primer: 5′-AATGATGTTGAAGGAAGCAGCAC-3′) were used as the internal controls. The results were analyzed by the 2 −ΔΔCt method 54 . Differential expression analysis. For differential gene expression analysis, the read counts of each sequenced library were adjusted with EdgeR software. Differential expression analysis of two samples was performed using the DEGseq (2010) R package. The P values were adjusted using the q values 55 , and thresholds for significantly differential expression were set as q value < 0.005 and |log2(foldchange)| > 1.