Comparative transcriptome profiling of multi-ovary wheat under heterogeneous cytoplasm suppression

DUOII is a multi-ovary wheat line with two or three pistils and three stamens in each floret. The multi-ovary trait of DUOII is controlled by a dominant gene, whose expression can be suppressed by the heterogeneous cytoplasm of TeZhiI (TZI), a line with the nucleus of common wheat and the cytoplasm of Aegilops. DUOII (♀) × TZI (♂) shows multi-ovary trait, while TZI (♀) × DUOII (♂) shows mono-ovary. Observing the developmental process, we found that the critical stage of additional pistil primordium development was when the young spikes were 2–6 mm long. To elucidate the molecular mechanisms that are responsible for the heterogeneous cytoplasmic suppression of the multi-ovary gene, we RNA-sequenced the entire transcriptome of 2–6 mm long young spikes obtained from the reciprocal crosses between DUOII and TZI. A total of 600 differentially expressed genes (DEGs) was identified. Functional annotation of these DEGs showed that the heterogeneous cytoplasmic suppression of additional pistil development mainly involved four pathways, i.e., chloroplast metabolism, DNA replication and repair, hormone signal transduction, and trehalose-6-phosphate in the primordium development stage, which cooperated to modulate the multi-ovary gene expression under heterogeneous cytoplasmic suppression.

The normal development of floral organs depends on the intricate and precise regulation of gene expression. Genome-wide gene expression profiling by high-throughput sequencing is a powerful method to analyze the molecular regulation mechanisms and acquire candidate genes involved in various processes 17,18 . In wheat, RNA sequencing (RNA-seq) has been successfully applied for the transcriptome analysis of glume development 19 , asymmetry in synthetic and natural allotetraploid lines 20 , fertility conversion in male sterile line 21 , and responses to various biotic and abiotic stresses 22,23 .
In this study, we applied RNA-seq transcriptome analysis to identify differential gene expression patterns in the F 1 s derived from the reciprocal crosses between DUOII and TZI, and gain an insight into the underlying molecular mechanisms that control the heterogeneous cytoplasmic suppression of the multi-ovary gene in wheat. To our knowledge, this is the first report on the universal transcript expression patterns involved in the cytoplasmic suppression of wheat floral meristems.

Results
Morphological analysis and cytological examination. Generally, there is only one ovary in each floret of wheat. However, DUOII is a multi-ovary wheat variety, which has 2-3 pistils, three stamens per floret, and the genetics of multi-ovary are very stable. Furthermore, it can set 2-3 seeds in one floret ( Fig. 1a-g). For the reciprocal crosses between multi-ovary wheat DUOII and alloplasmic cytoplasm wheat TZI, the spike phenotypes of F 1 plants had similar spike phenotypes (Fig. 1h,i); however, using DUOII as the female parents, the F 1 plants showed multi-ovary trait with a normal cytoplasm of DUOII (Mu_NC), whereas, using TZI as the female parents, the F 1 plants showed mono-ovary trait with the heterogeneous cytoplasm of TZI (Mo_HC). The results suggested that the heterogeneous cytoplasm influenced the expression of the multi-ovary trait. Other morphological traits, such as plant height, spike length, spike number, or spikelet number on the main stem spike, were also investigated, but no significant differences were identified between Mu_NC and Mo_HC 16 . Identifying the critical stage of additional pistil development. To study the molecular mechanism of the multi-ovary trait, we aimed to identify the critical stage of additional pistil development. In Mu_NC, we observed a small protuberance at the base of the main pistil in the middle floret of 2-3 mm long young spikes, which was between the frontal stamen and lateral stamen; this was considered as the first sign of the additional pistil. The protuberance increased in size with spike development and became obvious in the middle floret of 5-6 mm long young spikes ( Fig. 2a-d). However, no protuberance was observed in Mo_HC ( Fig. 2f-i). Further, when the spike grew to about 25 mm, the protrusion became bigger, and its position was the same as the final additional pistil (Fig. 2e); and there was still no protuberance in Mo_HC (Fig. 2j). So, the protuberance really was the primordium of additional pistil. Since the developmental process of each floret in a spike was different, the critical stage of additional pistil development was identified as when the young spikes were 2-6 mm long.
Assessment of RNA-seq Data. RNA-seq produced a total of 94.54 Gb clean data with a GC content of 54.40-55.43%. For each sample, the average clean data were 15.75 Gb, with a Q30 higher than 86.61%, and an alignment efficiency to the reference genome of 70.41-72.47%, of which more than 88% were uniquely mapped reads (Supplementary Table S1). Additionally, Pearson's correlation between different samples was 0.984-0.992, whereas, R 2 within the biological replicates was higher than that between Mo_HC and Mu_NC (Fig. 3a). The gene expression levels and sequencing quality were identical in Mo_HC and Mu_NC (Fig. 3b,c). Consequently, the sequencing quality was sufficiently high to allow further analysis.

Identification of DEGs.
In total, 56,809 and 55,659 genes were co-expressed in Mo_HC and Mu_NC, respectively, and 1,866 and 716 genes were specifically expressed in Mo_HC and Mu_NC, respectively (Fig. 4a). Using an adjusted P < 0.05 as a selection criterion, we identified 600 DEGs in Mo_HC; 330 upregulated and 270 downregulated ( Fig. 4b; Supplementary Dataset S1). The differential transcription profiles of DEGs were different www.nature.com/scientificreports www.nature.com/scientificreports/ between Mo_HC and Mu_NC (Fig. 4c), and may possibly be related to the mechanism of heterogeneous cytoplasmic suppression of the multi-ovary gene.

GO-based functional analysis of DEGs.
Sequence homology analysis showed that 468 DEGs (78% of all DEGs) were associated with at least one GO term and categorized into 42 functional groups at the second level, consisting of 17 groups in biological process, 14 groups in cellular component, and 11 groups in molecular function (Fig. 5). These findings suggested that a wide range of functional genes were related to the process of heterogeneous cytoplasmic suppression of the multi-ovary gene.
To find the most concentrated gene functional groups in DEGs, GO enrichment analysis was performed. Using a corrected P < 0.05, 72 GO terms were significantly enriched and included 232 DEGs (Supplementary Dataset S2). Biological process consisted of a major portion of the enriched GO terms (53 terms), followed by molecular function (16 terms) and cellular component (three terms). To explore the relationship of enriched GO terms, we constructed a directed acyclic graph (DAG) using the top 10 enriched GO terms as the master nodes. For biological process, 30 enriched GO terms (56.60% of the total enriched GO terms) were drawn to the DAG divided into three categories: trehalose biosynthetic process (GO:0005992), DNA replication, synthesis of RNA primer (GO:0006269), and spermine biosynthetic process (GO:0006597) ( Supplementary Fig. S1a). For cellular component, two enriched GO terms (66.67% of the total enriched GO terms) were drawn to the DAG: checkpoint clamp complex (GO:0030896) and nucleus (GO:0005634). The former was a subset of the latter and also the most detailed GO term in molecular function ( Supplementary Fig. S1b). For molecular function, 13 enriched GO terms (81.25% of the total enriched GO terms) were drawn to the DAG divided into four categories: homoserine O-succinyltransferase activity (GO:0008899), sulfate adenylyltransferase (ATP) activity (GO:0004781), adenosylmethionine decarboxylase activity (GO:0004014), and serine-type endopeptidase inhibitor activity (GO:0004867) ( Supplementary Fig. S1c).
KEGG pathway analysis of DEGs. KEGG    www.nature.com/scientificreports www.nature.com/scientificreports/ repair (11 DEGs), and mismatch repair (nine DEGs). The three terms had a containment relationship, located in the nucleus, and cooperated to ensure the accuracy of DNA replication. Except for Novel 12743 (a new gene predicted as replication protein A subunit), the other 19 DEGs were down-regulated, suggesting that the  www.nature.com/scientificreports www.nature.com/scientificreports/ heterogeneous cytoplasm in Mo_HC might negatively affect the quality of DNA replication. Subset II included four terms: photosynthesis-antenna proteins (6), photosynthesis (14 DEGs), carbon fixation in photosynthetic organisms (12 DEGs), and sulfur metabolism (11 DEGs). The four terms were both located in the chloroplast and related to chloroplast metabolism. Except for those in the photosynthesis term, most of the other DEGs were significantly downregulated, suggesting that chloroplast metabolism might be disrupted in Mo_HC and played a role in the suppression of additional pistil development. Subset III included only one term that was plant hormone signal transduction (25 DEGs). Twenty-one DEGs were upregulated, and four DEGs were downregulated. Additionally, of these 25 DEGs, 12 DEGs were identified as transcription factors (TFs), such as auxin response factor (ARF), auxin/indole-3-acetic acid (AUX/IAA), basic region-leucine zipper (bZIP), ethylene-insensitive3-like (EIL), and Orphans, which implied that the expression of many genes related to hormone signal transduction would be influenced. The influenced hormones included auxin, cytokinin, and ethylene, which can influence cell division, growth, and the producing of additional pistils.

Validation of DEGs through qRT-PCR.
To confirm the accuracy and reproducibility of the transcriptome analysis results, 14 candidate DEGs, related to different biological pathways, were randomly selected for qRT-PCR. Linear regression analysis of the fold change of the gene expression rations between qRT-PCR and RNA-seq showed positive correlation (R 2 = 0.9521, Supplementary Fig. S2). These results confirmed the reliability of the DEGs obtained by RNA-seq in this study.

Discussion
DUOII is an excellent variety for studying the mechanisms of the multi-ovary trait and floral development in wheat. The multi-ovary trait of DUOII is controlled by a dominant gene, which is suppressed by the heterogeneous cytoplasm of TZI 16 . So, the suppression of multi-ovary gene is a nuclear-cytoplasm interaction process. Plant genome consists of both nuclear and cytoplasmic genomes, which cooperate to determine plant developmental process. Although the nuclear genome has a predominant role in determining the inheritance of most traits, the cytoplasmic genome also plays an essential role in plant development 24,25 . Chloroplasts are the site of photosynthesis and multiple anabolic reactions essential for growth, development, and reproduction 26 . Signals from chloroplasts can modulate nuclear gene expression and regulate plant development, including flower development [27][28][29] . Thompson et al. 30 created an Arabidopsis mutant lacking the chloroplast-localized rhomboid. The mutant plants that had either a double stigma or a single stigma with distortions in shape and size showed reduced fertility. In the present study, we identified four significantly enriched pathways located in the chloroplast region of Mo_HC, including photosynthesis-antenna proteins, photosynthesis, carbon fixation in photosynthetic organisms, and sulfur metabolism. Of the 43 DEGs involved in these pathways, 18 were upregulated, and the other downregulated. Chloroplast metabolism was disrupted in Mo_HC, which might play an important role in the suppression of additional pistil development.
Floral organs grow from a specialized structure called the shoot apical meristem (SAM), which comprises a pool of stem cells that continuously divide and replenish 31 . SAM produces floral meristems, in which floral organ primordia are formed and developed into organs by coordinated cell division and differentiation 32 . The disruption of nuclear and cell division can cause alterations in cell fate and organ differentiation 33,34 . As shown in Fig. 2, the additional pistil of the multi-ovary wheat was derived from a protuberance between the frontal stamen and lateral stamen. Meanwhile, Wang et al. 9 found that the primordium producing this additional pistil resulted from the abnormal division of the subcortical cell. In the present study, 20 DEGs were related to the DNA replication process, encoding DNA polymerase alpha-primase complex subunit, DNA polymerase epsilon subunit 2, mini-chromosome maintenance (MCM) complex subunits 4, 5, and 6, proliferating cell nuclear antigen (PCNA), and replication protein A (RPA) subunits 1B, 2A, and 2B. All these products play an important role in DNA replication and repair, as well as in cell proliferation; they also cooperate to ensure the accuracy of DNA replication, and coordinate the timing and order of cell cycle events 35,36 . Except for Novel12743, 19 DEGs were down-regulated in Mo_HC, suggesting that DNA replication process plays an essential role in the differentiation of additional pistil primordium, and heterogeneous cytoplasm can disrupt DNA replication and cell cycle events, inhibiting the differentiation of additional pistil primordium. www.nature.com/scientificreports www.nature.com/scientificreports/ The initiation of a floral organ is a major step in a plant's life cycle, and the timing and positioning of floral organ initiation are fundamental aspects of plant inflorescence architecture 37,38 . Plant hormones play a crucial role in determining inflorescence architecture, and disruption of the hormone status alters inflorescence morphology by modulating cell division and differentiation in the inflorescence primordium 39,40 . For instance, when hormone levels and distribution are disrupted in a vrs2 barley mutant, the two-rowed pattern of spikelets at the base and center of the inflorescence is altered to a six-rowed pattern 41 . In Mo_HC, 25 DEGs were significantly enriched in the plant hormone signal transduction pathway, of which 12 DEGs were TFs. The results showed that the expressions of genes related to hormones, including auxin, cytokinin, and ethylene, were disrupted and could lead to varied hormone signals in Mo_HC.
Disrupted chloroplast metabolism might play an important role in the suppression of additional pistil development. And in the nucleus, the changed DNA replication and hormone signal transduction processes suppressed the differentiation of additional pistil. What act as the messenger to transmit signal between chloroplast and nucleus? Sugars, the main photosynthetic products, not only serve as energy sources in diverse plant functions, but also act as signaling molecules and osmotic regulators, regulating floral signal transduction 42,43 . In the DAG analysis of enriched GO terms in biological process, the DAG divided into three categories: trehalose biosynthetic process, DNA replication, synthesis of RNA primer, and spermine biosynthetic process ( Supplementary  Fig. S1a). Interestingly, six DEGs of trehalose biosynthetic process were engaged in a special KEGG pathway: starch and sucrose metabolism pathway, which implied that these DEGs might play an important role in the suppression of additional pistil development. Functional analysis showed that the products of these six DEGs were all related to trehalose-6-phosphate (T6P).
T6P is the metabolic precursor of the non-reducing disaccharide trehalose, generated from glucose-6-phosphate (G6P) and UDP-glucose by trehalose-6-phosphate synthase (TPS). Previous studies have shown that T6P transmits signals to the nucleus and chloroplast that subsequently modulate the chloroplast metabolism and nuclear gene expression 44,45 . Additionally, T6P interacts with hormones to influence floral signals 43 . T6P is further metabolized to trehalose by trehalose-6-phosphatase (TPP) 45,46 , and can regulate cell division, differentiation, and plant architecture [47][48][49] . RAMOSA3 encodes TPP expressed in discrete domains subtending axillary inflorescence meristems to establish the correct identity and determinacy of axillary meristems in female inflorescences. The ramosa3 maize mutant plants have ears with additional abnormal branches at their bases generated from abnormal axillary meristems 44 . We also found that the additional pistils in multi-ovary wheat were generated from additional axillary meristems at the base of the native ovary. Interestingly, of the six DEGs involved in trehalose biosynthetic process, five DEGs encoded TPP, and one DEG encoded TPS. Therefore, T6P might play an important role in the suppression of additional pistil development in Mo_HC.
Based on the relationships among the chloroplasts, nucleus, hormones, and T6P, we created a hypothetical signaling pathway to better understand their interactions (Fig. 6). In the chloroplast, the disrupted photosynthetic process negatively affected the status of sugars, redox/reactive oxygen species, energy, and metabolites, which interacted with the altered T6P levels to send a signal to the nucleus and TFs through some sensors (e.g., SnRK1 46 and N-PTM 29 ). In the nucleus, DNA replication and repair, as well as cell proliferation, were modulated by DNA polymerase, MCM, PCNA, and RPA, which influenced the genome duplication and cell cycle events. Under the activation or inhibition effect of TFs, the expression of genes related to hormones was either triggered or repressed. Then, the hormones (auxin, cytokinin, and ethylene) modulated cell division and differentiation, which were the basis of the additional pistil meristem development. In addition, the expression of photosynthesis-related genes sent an anterograde signal to the chloroplast, which modulated photosynthesis and the chloroplast metabolism processes. In conclusion, the modified interactions among the chloroplasts, nucleus, hormones, and T6P might be responsible for the heterogeneous cytoplasmic suppression of the multi-ovary trait. To confirm the reliability of this signaling pathway, we randomly selected 14 DEGs related to this pathway and used qRT-PCR to investigated their expression level. The results showed that the expression patterns of these genes revealed by qRT-PCR data were consistent with those derived from RNA-seq (Table 2).
Multi-ovary wheat has the obvious advantage of increased number of grains per spike, thereby potentially increasing the wheat yield. In addition, heterogeneous cytoplasm can suppress the expression of the multi-ovary trait. To our knowledge, this is the first report of the universal transcript expression patterns involved in the cytoplasmic suppression of wheat floral meristems. This signaling pathway provide insights to understand the regulatory role of retrograde signaling from cytoplasm to nucleus, and will be useful in further mechanistic studies on the underlying mechanism that controls the heterogeneous cytoplasm-induced suppression of the nuclear multi-ovary gene in wheat. Our results provide a foundation for understanding the development of the multi-ovary trait in wheat, and can be implemented in future breeding activities focused on the development of high yield wheat cultivars.

Methods
Plant materials. In this study, we used two inbred wheat lines (20 years of artificial selfing with no segregation); DUOII, which is a common multi-ovary line, and TZI, which is an alloplasmic line developed by the Key Laboratory of Crop Heterosis of Shaanxi Province, using the nucleus of 'Chris' and the cytoplasm of Aegilops. In October 2013, DUOII and TZI were sown in the experimental field of Northwest A & F University, Yangling, China (34°91′N, 106°86′E). In May 2014, reciprocal crosses between DUOII and TZI were performed, and the F 1 seeds were sown in October 2014. In March 2015, 2-6 mm young spikes were hand-dissected from approximately 100 plants in each F 1 population, immediately frozen in liquid nitrogen, and stored at −80 °C until RNA extraction.
Morphological analysis and cytological examination. Photographs of the F 1 spikes were taken using Nikon D600 digital camera (Nikon, Tokyo, Japan), whereas those of the pistils were taken using a Nikon E995 www.nature.com/scientificreports www.nature.com/scientificreports/ digital camera (Nikon) mounted on a Motic K400 dissecting microscope (Preiser Scientific, Louisville, KY, USA). For cytological examination, young spikes were processed following the method described by Zhang et al. 50 and observed with a JSM-6360LV scanning electron microscope (JEOL, Tokyo, Japan).
RNA Extraction, RNA-seq library preparation, and sequencing. Young spikes from three biological replicates (0.2 g each) were separately ground into a fine powder in liquid nitrogen by constant crushing using sterilized and chilled pestle and mortar, and subsequently used for total RNA extraction with TRIzol reagent (Invitrogen Life Technologies, Waltham, MA, USA) following the manufacturer's instructions. RNA degradation and contamination were monitored by performing 1% agarose gel electrophoresis. RNA concentration was measured using Qubit ® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA) with Qubit ® RNA Assay Kit (Life Technologies). RNA purity was tested using NanoPhotometer ® spectrophotometer (Implen, Munich, Germany).
RNA integrity was assessed using Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA) with RNA Nano 6000 Assay Kit. Sequencing libraries were generated using NEBNext ® Ultra TM RNA Library Prep Kit for Illumina ® (NEB, Ipswich, MA, USA), following the manufacturer's instructions, and index codes were added to attribute sequences to each sample. Index-coded samples were clustered using cBot Cluster Generation System with TruSeq PE Cluster Kit 3-cBot-HS (Illumina, San Diego, CA, USA), following the manufacturer's instructions. RNA-seq was performed by Beijing Novogene Technologies (Beijing, China) using Illumina Hiseq platform, and 150 bp paired-end reads were generated.

Sequence alignment.
To obtain high-quality, clean reads, we removed all reads with adaptor sequences, with a percentage of ambiguous bases (N) higher than 10%, and with a percentage of low-quality bases (Q ≤ 20) higher than 50%. The Q30 and GC content of clean data were calculated, and the paired-end clean reads were aligned to the reference genome using TopHat 2.0.12 51 . The reference genome and gene model annotation files were obtained from ftp://ftp.ensemblgenomes.org/pub/release-25/plants/fasta/triticum_aestivum/dna/, and the index of the reference genome was constructed using Bowtie 2.2.3 52 . www.nature.com/scientificreports www.nature.com/scientificreports/

Measurement of gene expression levels and detection of differentially expressed genes (DEGs).
The expected number of fragments per kilobase of transcript per million mapped reads (FPKM) was used for estimating the gene expression levels 51 . The number of reads mapped to each gene was counted by HTSeq. 0.6.1 53 , and then the FPKM of each gene was calculated based on the length of the gene and the number of mapped reads. Differential expression analysis between Mu_NC and Mo_HC was performed using DESeq. 1.18.0 for R 54 . The resulting P-value was adjusted using the Benjamini-Hochberg procedure for controlling the false discovery rate. An adjusted P-value was computed by DESeq for each gene and those with an adjusted P < 0.05 were assigned as DEGs.
Functional analysis of DEGs. To perform functional annotation, the identified transcript genes were annotated using the National Center for Biotechnology Information (NCBI) non-redundant protein database (Nr) and the Swiss-Prot database. Gene Ontology (GO) enrichment analysis of DEGs was implemented by GOseq for R 55 , in which the gene length bias was corrected. GO terms with corrected P < 0.05 were considered significantly enriched by DEGs. Following this, we employed topGO 56 to constructed the DAG using the top 10 enriched GO terms as the master nodes. For the Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, statistical enrichment of DEGs in the KEGG pathways was tested by KOBAS 57 . In addition, transcription factors (TFs) were identified by iTAK 58 .

Validation of DEG expression with quantitative reverse transcription PCR (qRT-PCR). To val-
idate the RNA-seq results, qRT-PCR analysis was performed. Primers for qRT-PCR were designed using Primer Primer 5.0 software (Primer, Canada) and were synthesized by Invitrogen Life Technologies (Shanghai, China). All primers used for qRT-PCR were list in Supplementary Table S2

Data Availability
The sequence data generated in this study were deposited in the NCBI Sequence Read Archive (Accession number SRP144469). The datasets analyzed during the current study are available from the corresponding author on reasonable request.  Table 2. Validation of the RNA-seq expression profiles of selected DEGs by qRT-PCR. Note: TF, Transcription factor; FPKM: The expected number of fragments per kilobase of transcript per million mapped reads; Mu_NC, F 1 population derived from a cross between DUOII as female parent and TZI as male parent; Mo_HC, F 1 population derived from a cross between TZI as female parent and DUOII as male parent; Log 2 (FC), Log 2 (Fold change) = Log 2 (Mo_HC/Mu_NC).