Identification of Genes/Proteins Related to Submergence Tolerance by Transcriptome and Proteome Analyses in Soybean

Flooding can lead to yield reduction of soybean. Therefore, identification of flooding tolerance genes has great significance in production practice. In this study, Qihuang 34, a highly-resistant variety to flooding stress, was selected for submergence treatments. Transcriptome and proteome analyses were conducted, by which twenty-two up-regulated differentially expressed genes (DEGs)/differentially expressed proteins (DEPs) associated with five KEGG pathways were isolated. The number of the DEGs/DEPs enriched in glycolysis/gluconeogenesis was the highest. Four of these genes were confirmed by RT-qPCR, suggesting that glycolysis/gluconeogenesis may be activated to generate energy for plant survival under anaerobic conditions. Thirty-eight down-regulated DEGs/DEPs associated with six KEGG pathways were identified under submergence stress. Eight DEGs/DEPs enriched in phenylpropanoid biosynthesis were assigned to peroxidase, which catalyzes the conversion of coumaryl alcohol to hydroxy-phenyl lignin in the final step of lignin biosynthesis. Three of these genes were confirmed by RT-qPCR. The decreased expression of these genes led to the inhibition of lignin biosynthesis, which may be the cause of plant softening under submergence stress for a long period of time. This study revealed a number of up-/down-regulated pathways and the corresponding DEGs/DEPs, by which, a better understanding of the mechanisms of submergence tolerance in soybean may be achieved.

RNA-Seq analysis. RNA sequencing (RNA-Seq) based-transcriptome was performed in order to obtain the key genes involved in submergence tolerance. The samples from soybean root tissues were used for the cDNA library construction and 23.33-23.94 million raw reads were generated by sequencing. After removing adapter sequences and low quality reads, a total of 23.18-23.78 million clean reads were generated for each sample. The ratio of clean reads to total raw reads ranged from 97.17% to 99.9%. The total mapping ratio was from 86.65% to 92.07% (Table S1). The correlation of gene expression patterns between the biologically repeated samples was consistent (Fig. S1).
Comparative analysis of DEGs in response to the submergence treatment. The numbers of the up-regulated DEGs were normally distributed. The number of the genes was the least under the 3-h submergence stress, which began to increase at 6 h, reached the maximum at 12 h, and finally decreased at 24 h. However, the number of the down-regulated DEGs increased gradually with the time of submergence, and reached the maximum at 24 h (Fig. 2a). A cluster heat map indicated that the expression patterns of the DEGs were similar for the plants at these four time points (Fig. 2b). The Venn diagram clearly showed the number of the up-/down-regulated common genes at various time points under the submergence condition. A total of 4188 up-regulated and 4693 down-regulated genes were shared by the plants at these four time points (Fig. 2c). The log 2 FoldChange and other detailed information of the DEGs in all sample combinations are listed in Table S2. www.nature.com/scientificreports www.nature.com/scientificreports/ GO enrichment analysis based on transcriptome. To dissect the function of the DEGs under the submergence stress, we performed gene ontology (GO) enrichment by transcriptome analysis. The GO terms with P value < 0.05 were considered as enriched GO terms. GO terms for the up-/down-regulated DEGs were shown at various time points (Tables 1, 2). The up-regulated DEGs were mainly enriched in binding, phosphotransferase activity, alcohol group as acceptor, kinase activity and protein kinase activity. These GO terms enriched at four time points belonged to molecular function (MF). Transferase activity, transferring phosphorus-containing groups, adenyl nucleotide binding, adenyl ribonucleotide binding, cellular protein modification process and protein modification process were enriched under the 6, 12 and 24-h submergence stresses; these GO terms belong to MF and biological process (BP). The GO terms responding to the submergence stress in the early stage (3 h) included nucleic acid binding transcription factor activity, negative regulation of macromolecule metabolic process, nucleic acid-templated transcription, RNA biosynthetic process, carbohydrate metabolic process and regulation of protein metabolic process. The GO terms associated with signal transducer and transferase activities responded to the late submergence stress (24 h) ( Table 1).   www.nature.com/scientificreports www.nature.com/scientificreports/ Five main KEGG pathways were enriched at four time points for the up-regulated DEGs, including MAPK signaling pathway-plant, taurine and hypotaurine metabolism, plant-pathogen interaction, protein processing in endoplasmic reticulum and plant hormone signal transduction. Glycolysis/gluconeogenesis and fatty acid degradation were enriched under the 3, 6, and 12-h submergence stresses. Circadian rhythm-plant was enriched under the 6, 12 and 24-h submergence stresses. Inositol phosphate metabolism, benzoxazinoid biosynthesis, biosynthesis of secondary metabolites, metabolic pathways, glyoxylate and dicarboxylate metabolism responded to the submergence stress in the early stage (3 h), while ubiquitin mediated proteolysis and RNA transport responded to the submergence stress in the late stage (24 h) ( Table 3).
A total of twenty-seven KEGG pathways were enriched due to the down-regulated DEGs. Biosynthesis of secondary metabolites, carotenoid biosynthesis, metabolic pathways and beta-Alanine metabolism were enriched at these four time points. Six KEGG pathways were enriched at three time points, including arginine and proline metabolism, ascorbate and aldarate metabolism, isoflavonoid biosynthesis, phenylpropanoid biosynthesis, flavonoid biosynthesis and flavone and flavonol biosynthesis. Plant hormone signal transduction, stilbenoid, diarylheptanoid and gingerol biosynthesis, and phenylalanine metabolism responded to the submergence stress in the early stage (3 h), while pyruvate metabolism, glycolysis/gluconeogenesis, amino sugar and nucleotide sugar metabolism responded to the submergence stress in the late stage (24 h) ( Table 4).

GO enrichment analysis based on transcriptome and proteome. Only the samples under the 24-h
submergence treatment were taken for the proteome sequencing analysis (Fig. S2, Table S3). The result demonstrated that the GO terms in the up-regulated DEGs/DEPs were mainly enriched in cation binding, metal ion binding, ion binding and carbohydrate metabolic process, which belong to MF and BP ( Table 5).
The GO terms in the down-regulated DEGs/DEPs were mainly enriched in membrane, membrane part, oxidoreductase activity, oxidoreductase activity, acting on diphenols and related substances as donors, oxygen as acceptor, oxidoreductase activity, acting on diphenols and related substances as donors, phenylpropanoid metabolic process and secondary metabolic process (Table 6).
KEGG pathway enrichment analysis based on transcriptome and proteome. The analysis of the KEGG pathways is the same as that of the GO terms (Fig. S3). The result showed that five KEGG pathways in the up-regulated DEGs were enriched, and twenty-two DEGs/DEPs were identified under the submergence stress. All DEGs/DEPs involved in glycolysis/gluconeogenesis, carbon metabolism, MAPK signaling pathway-plant, fatty acid degradation and plant-pathogen interaction were up-regulated simultaneously. The number of DEGs/DEPs enriched in glycolysis/gluconeogenesis pathway was the highest (Table 7).
Six KEGG pathways in the down-regulated DEGs/DEPs were enriched, and thirty-eight DEGs/DEPs were identified under the submergence stress. The down-regulated DEGs/DEPs were related to phenylpropanoid biosynthesis, tryptophan metabolism, metabolic pathways, isoflavonoid biosynthesis, sesquiterpenoid and triterpenoid biosynthesis and biosynthesis of secondary metabolites (Table 8).    www.nature.com/scientificreports www.nature.com/scientificreports/ Submergence stress repressed the expression of phenylpropanoid biosynthesis-related genes/ proteins. Nine DEGs/DEPs from the phenylpropanoid biosynthesis pathway were isolated by the transcriptome and proteome analyses, all of which were down-regulated simultaneously ( Table 8). Out of these nine DEGs, eight genes encode peroxidase, while Glyma.20G180800 encodes phenylalanine ammonia-lyase 2. We selected three genes to validate the results of RNA-Seq, among which, Glyma.14G053600 and Glyma.06G275900 encode peroxidase P7, and Glyma.03G039800 encodes cationic peroxidase 1. These genes were down-regulated at four time points as shown by RT-qPCR analysis, which is consistent with that of RNA-Seq (Fig. 4a-c). We further measured the lignin content, which showed no difference between the 24-h submergence treatment group and the control group. The lignin content was remarkably different at 96 h, and the difference was significantly increased at 192 h between the submergence treatment group and the control group. The content of lignin decreased with the duration of the submergence treatment (Fig. 4d).
Submergence stress induced the expression of other pathways-related genes/proteins. We also selected representative genes from the other pathways according to the expression levels in RNA-Seq. The expression analysis was performed for the selected genes, Glyma.05G124000, Glyma.16G204600 Glyma.06G176200 and Glyma.18G258000, which belong to MAPK signaling pathway-plant, carbon metabolism, tryptophan metabolism and isoflavonoid biosynthesis. Glyma.05G124000, Glyma.16G204600, Glyma.06G176200 and Glyma.18G258000 encode polygalacturonase inhibitor, enolase, cytochrome P450 71A1 and malonyl-CoA: anthocyanidin 5-O-glucoside-6″-O-malonyltransferase, respectively. The comparative analysis of these genes showed that the expression patterns in the RT-qPCR analysis were similar to those observed in the RNA-Seq data, in which Glyma.05G124000 and Glyma.16G204600 were up-regulated, while Glyma.06G176200 and Glyma.18G258000 were down-regulated at four time points under the submergence stress (Fig. 5). This result confirmed that RNA-Seq and our experimental results were reliable.

Discussion
Submergence stress activated carbon metabolism. Because the oxygen-dependent respiration is greatly limited under submergence conditions, the acceleration of carbohydrate metabolism is critical for plant survival 21 . Many crops, including soybean, are sensitive to flooding stress 22 . Under flooding conditions, oxygen is insufficient for normal energy generation in plants, and glycolysis/gluconeogenesis becomes the main way for plants to obtain energy 2 . Some studies have shown that flooding of soybean seedlings can increase the abundance of proteins in glycolysis and fermentation 11,14,19,23 . Enzyme-encoding genes participating in the glycolysis/gluconeogenesis pathways have been isolated in the early stage of flooding 15,[24][25][26][27] . DEGs encoding glucose-6-phosphate isomerase (GPI), 6-phosphofructokinase (6-PFK), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), FBA, PGAM, and PK are up-regulated in Sesbania cannabina 15 . Several DEGs involved in glycolysis and alcohol fermentation are significantly accumulated in both 'Zaoer-N' and 'Pepino' in cucumber, including 6-phosphogluconate dehydrogenase (G6PD), triose phosphate isomerase (TPI), pyruvate decarboxylase (PDC), PFK, ADH, and PK 27 . Proteins involved in glycolysis/fermentation are increased under flooding stress, such as enolase and ADH 28 . ADH is an important enzyme involved in alcohol fermentation, and has been observed in many plant species such as maize 29 , sorghum 30 , and Sesbania cannabina 15 . The maize mutant with a deficient ADH gene is more sensitive to flooding than the wild type plants 29 . The ADH activity in the flood-tolerant sorghum cultivar SSG-59-3 is significantly higher than that of the sensitive variety S-308 30 . Five up-regulated DEGs encoding ADH involved in alcohol fermentation have been identified, while none of these DEGs show a changed expression in the late stage of flooding in Sesbania cannabina 15 . The expression of ADH is specifically enhanced in waterlogged cotton, suggesting that ADH may play an important role in sustaining cotton growth under waterlogging stress 31 .

Submergence stress repressed lignin biosynthesis. Phenylpropanoid biosynthesis is regulated by
biotic and abiotic stimuli, and phenylpropanoid-based polymers such as lignin, suberin, and tannin contribute substantially to the stability and robustness of plants in the face of mechanical or environmental damages. Lignin plays an important role in mechanical support, water transportation and resistance to the harmful environment for plants 32 . In our study, the GO terms of the down-regulated DEGs related to phenylpropanoid metabolic process and lignin metabolic process were enriched at various time points (Table 2).
Previous reports have demonstrated that genes related to phenylpropanoid biosynthesis are differentially expressed, and many genes have been identified 15,31 . Our experiment identified nine DEGs/DEPs related to phenylpropanoid biosynthesis by the transcriptome and proteome analyses under submergence stress ( Table 8). All of these DEGs/DEPs were down-regulated simultaneously at four time points by RNA-Seq analyses. Eight of these genes encoded peroxidase which catalyzes the conversion of coumaryl alcohol to hydroxy-phenyl lignin in the final step of lignin biosynthesis (Fig. S5). Three key genes were selected based on the differential expression levels. The expression data and pattern revealed by RT-qPCR were highly consistent with those obtained by RNA-Seq (Fig. 4a-c). We further measured the lignin content, and the result showed no difference between the submergence and control group under the 24-h submergence stress. A highly significant difference was observed at 192 h between the submergence treatment group and the control group. The content of lignin decreased with the time of the submergence treatment (Fig. 4d). The decrease of the gene expression inhibited lignin biosynthesis, which might cause plant softening under the submergence treatment for a long period of time.
Activation of MAPKs has been rarely observed in plants exposed to flooding stress. In our study, we identified four genes (Glyma.05G124000, Glyma.10G152200, Glyma.05G123700 and Glyma.05G123900) related to the MAPK signaling pathway -plant under submergence stress. Glyma.10G152200 encodes a respiratory burst oxidase homolog protein B-like, and Glyma.05G124000, Glyma.05G123700 and Glyma.05G123900 encode polygalacturonase inhibitor 2. These three genes were also identified in the plant-pathogen interaction pathway (Table 7). Glyma.05G123900 was the same as that discovered in previous reports (Table S4). The DEGs/ DEPs isolated from the MAPK signaling pathway were up-regulated under submergence stress. We analyzed the expression level of Glyma.05G124000, which showed a consistent result compared with the RNA-Seq data (Fig. 5). According to the map of the MAPK signaling pathway, we found Glyma.05G124000, Glyma.05G123700 and Glyma.05G123900 encoded the same protein, FLS2. Therefore, we predict that submergence stress triggers basal defense responses similar to pathogen attack, and the transmission of signals through FLS2 further activates the MAPK signaling pathway (Fig. S7).
Two genes involved in tryptophan metabolism, Glyma.06G176200 and Glyma.08G350800, were isolated, which encode cytochrome P450 71A1 and cytochrome P450 93E1, respectively. These genes were down-regulated at four time points as revealed by the transcriptome data. Glyma.06G176200 was selected for RT-qPCR verification, and the results were consistent with that of RNA-Seq (Fig. 5). Inhibition of cytochrome P450 77A1 may enhance soybean tolerance to flooding stress 13 . Cytochrome P450 71A1, 93E1 and 77A1 belong to the same protein family, indicating the decreased expression of cytochrome P450 71A1 and cytochrome P450 93E1 may also enhance soybean submergence tolerance.

Conclusion
In the present study, the RNA-Seq technology was used to analyze the DEGs of soybeans subjected to 3, 6, 12 and 24-h submergence stresses, and the iTRAQ technology was used to analyze the DEPs subjected to the 24-h submergence stress. Transcriptome and proteome analyses were performed, which revealed many key DEGs/ DEPs and metabolic pathways responding to submergence stress. Eleven up-regulated enzyme-encoding DEGs/ DEPs involved in glycolysis/gluconeogenesis were isolated, suggesting that the glycolysis/gluconeogenesis pathway was activated for ATP production for plant survival. Eight down-regulated peroxidase encoding DEGs/ DEPs related to phenylpropanoid biosynthesis were identified, which catalyzes the conversion of coumaryl alcohol to hydroxy-phenyl lignin. We measured the lignin content, which showed no difference between the 24-h submergence treatment group and the control group. However, a highly significant difference was observed at 192 h between the submergence treatment group and the control group. The content of lignin decreased with the time of submergence treatment. The decreased expression of these genes inhibited lignin biosynthesis and accumulation, which might cause plant softening under submergence stress. Other up-/down-regulated pathways and DEGs/DEPs related to submergence tolerance were identifed, such as carbon metabolism, MAPK signaling pathway-plant, fatty acid degradation and isoflavonoid biosynthesis. The present study provides a foundation for future genomic studies on submergence tolerance of soybean.

Methods
Plant materials and stress conditions. The focus on the flooding tolerance of Qihuang 34 was from the field. Our previous study showed that Qihuang 34 possesses stronger tolerance to flooding during the entire growth stages 44 . Then, we selected 8 cultivated soybean varieties, including Qihuang 34, Qihuang35, Zhonghuang37, Qihuang 42, Jidou 17, Ludou 1, Fendou 95, and Hedou 19, and Nannong 1138-2 (a sensitive variety) was used as the control. Seeds were sterilized in 1% sodium hypochlorite for 30 min, rinsed with distilled water several times, and then sown on the sandy soil. Ten seeds were planted in each pot (180-mm length × 140mm width × 45-mm depth). A total of ten pots were sowed. The seedlings were grown in a psychometric room illuminated by a photoperiod of 16/8 h light/dark at 25 °C. Five seedlings with the same size were retained in each pot, and each variety eventually retained ten pots. For the submergence treatment, when two true leaves were fully unfolded, the plants were transferred to the white plastic containers filled with water. When the death rate of the seedlings of Nannong 1138-2 increased to 85%, water was released. The survival rate of the seedlings was calculated after de-submergence and seven days recovery. The seedlings without any green leaves were treated as dead seedlings. This experiment was repeated three times. The survival rate of the seedlings was the average of these three replicates.
Qihuang 34 was used as the material for transcriptome and proteome sequencing, and the submergence treatment was the same as above. The samples were collected at 3, 6, 12 and 24 h, respectively, and the untreated plants were used as the control. The root of the seedlings were collected from the control and the submergence treated group, frozen in liquid nitrogen and stored at −80 °C. Three biological replicates were performed for each sample, which contained 10 roots from 10 independent plants.

RNA isolation.
Total RNA was isolated from the root samples of the control and the treated seedlings using TRIzol reagent (Invitrogen, Carlsbad, CA USA). We determined the total RNA through Agilent 2100 Bioanalyzer (Agilent Technologies, USA). The RNA concentration was measured by a NanoDrop 1000 Spectrophotometer (Thermo Scientific, Wilmington, Delaware, USA).
Library construction and sequencing. cDNA library construction and sequencing were performed at Beijing Genomics Institute (BGI). The mRNA with the polyA tail was isolated using oligo(dT) attached with magnetic beads, and fragmented by a fragmentation buffer. The double-strand cDNAs were synthesized by random hexamers, RNase H and DNA Polymerase I. These cDNA fragments were added with a single ' A' base and subsequently ligated with the adapter. The products were then purified and enriched with PCR amplification. The double-stranded PCR products were heat denatured and circularized by the splint oligo sequence. The single strand DNA circles (ssDNA circles) were collected and used for the final library. The cDNA libraries were used for sequencing with the sequencing platform BGISEQ-500 (BGI).
Read mapping to soybean reference genome and expression analyses. Clean reads were obtained by removing reads containing adapters, unknown base with the N content greater than 10% and low quality reads. The filtered clean reads were saved as the FASTQ format. HISAT 45 was used to map clean reads to the genome of Glycine max Wm82.a2.v1. We used Bowtie2 46 to map the clean read to the reference sequence in order to count the rate of gene alignment, and then calculated the expression of genes and transcripts using the RSEM software package 47 .

DEGs, GO and KEGG enrichment analyses.
Prior to identifying the DEGs associated with submergence tolerance in soybean, the gene expression levels in different samples were calculated using FPKM (Fragments per kilobase for a million reads). DEGs analyses (three biological replicates per condition) were performed using the DESeq. 2 package. The DESeq. 2 method was based on the negative two term distribution principle. The DEGs were detected according to the method described by Love et al. 48 . The P-values were adjusted using the Benjamini and Hochberg approach 49 . Genes with the corrected P value < 0.05 were considered differentially expressed.
Gene ontology (GO) and KEGG 50 enrichment analyses of DEGs were implemented by the GOseq R package and KOBAS2.0 software (http://kobas.cbi.pku.edu.cn), respectively. The GO terms and KEGG pathways with P value < 0.05 were considered as enriched GO terms and KEGG pathways, respectively.
Protein purification and digestion. Proteins (150 μg) were purified by phase separation in the organic layer. The volume was adjusted to 150 μL. A total of 600 μL methanol was added to the protein solution, which was thoroughly mixed before 150 μL chloroform and 400 μL water were added. The mixture was vortexed and centrifuged at 20,000 g for 5 min. The supernatant was discarded. A total of 400 μL methanol was added to the organic phase, and the samples were centrifuged at 20,000 g for 5 min. The pellets were dried and re-suspended in 50 mM NH 4 HCO 3 , reduced with 50 mM dithiothreitol for 30 min at 56 °C, and alkylated with 50 mM iodoacetamide for 30 min at 37 °C in darkness.
Trypsin Gold (Promega, Madison, WI, USA) was used to digest the proteins with the ratio of protein: trypsin of 40: 1 at 37 °C overnight. After trypsin digestion, the peptides were desalted with a Strata X C18 column (Phenomenex) and vacuum-dried according to the manufacturer's protocol. Protein quantification. IQuant 52 was used for quantitative analysis of the labeled peptides with isobaric tags. To assess the confidence of peptides, the PSMs were prefiltered at a PSM-level FDR of 1%. Protein FDR at 1% was based on the picked protein FDR strategy to control the rate of false-positive at the protein level 53 . The protein quantification process included the following steps, protein identification, tag impurity correction, data normalization, missing value imputation, protein ratio calculation, statistical analysis, and result presentation.

Mass spectrometry detection.
Protein GO and KEGG pathway enrichment. We defined DEPs to be significantly regulated if the P value was less than 0.05. In GO enrichment analysis, we used the hyper geometric test to get the target GO terms. The principle of the KEGG pathway enrichment analysis of differentially expressed proteins was similar.

GO and KEGG enrichment analyses based on transcriptome and proteome.
Because the accumulation of proteins fell behind the expressions of genes, only the samples under the 24-h submergence treatment were taken for the proteome sequencing analysis. We performed the gene ontology (GO) enrichment analysis by transcriptome and proteome analyses at 24 h under submergence stress. Then, the enriched GO terms at 24 h were integrated with the transcriptome data from the intersection of four time points. The analysis of the KEGG pathways was the same as that of the GO terms.
Analysis of the lignin content. The submergence treatment of Qihuang 34 was the same as above, and the sampling time points included 24, 96 and 192 h. For the control, the plants were untreated with submergence. The root of the seedlings was collected from the control and the submergence treatment groups, grinding in liquid nitrogen and stored in a drying oven until the weight no longer changed. Three biological replicates were performed for each sample, which contained 10 roots from 10 independent plants. The lignin content was measured using a lignin extraction kit (COMIN, http://www.cominbio.com/).